---
title: "fetch"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{fetch}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```


In 2020 I wrote a package called [libr](https://libr.r-sassy.org) to deal 
with the problem of managing multiple related datasets.  This package allows
you to manage data from several popular data sources:
SAS, Excel, CSV, and R. While the package received generally positive feedback, 
users requested some features that the package did not support:  

* Users wanted to examine the data *before* loading it into memory.
* Users wanted to pick only *some* datasets to load into memory.
* Users wanted to subset all datasets in the library based on a condition.

After consideration of these requests, I determined that these features would
be most easily satisfied by creating a new package.  That package is **fetch**.

## Key Features

The **fetch** package allows you to retrieve data from any of several different 
data sources while minimizing memory usage. 
There are only two steps to retrieve data:

* Create a data catalog
* Fetch data from the catalog

The above two steps are accomplished by the following functions:

* `catalog()`: Defines a data catalog for a particular data source
* `fetch()`: Pulls a data item from the catalog and loads it into memory

The package has the following key features:

* The `catalog()` function allows you to explore the data before reading from the source.
* The `fetch()` function loads only the data you need for your analysis.
* The "where" parameter on the `catalog()` function allows you to subset across 
all datasets in the catalog.
* The "select" and "top" parameters on the `fetch()` function let you further
limit the data returned.
* The "import_spec" parameters on the `catalog()` and `fetch()` functions
give you control over the column data types.

## How to use **fetch**

Let's start with a simple example.  In this example we will create a data catalog,
examine its contents, and fetch a dataset from the catalog.  First, create
and view the data catalog:

```{r eval=FALSE, echo=TRUE}
# Get sample data directory
pkg <- system.file("extdata", package = "fetch")
 
# Create catalog
ct <- catalog(pkg, engines$csv)

# View catalog
ct
# data catalog: 6 items
# - Source: C:/packages/fetch/inst/extdata
# - Engine: csv
# - Items:
    # data item 'ADAE': 56 cols 150 rows
    # data item 'ADEX': 17 cols 348 rows
    # data item 'ADPR': 37 cols 552 rows
    # data item 'ADPSGA': 42 cols 695 rows
    # data item 'ADSL': 56 cols 87 rows
    # data item 'ADVS': 37 cols 3617 rows
```
As can be seen above, this catalog has 6 csv files in it.  The number of columns
and rows are displayed for each file. Information about each file is contained
in the catalog.  This information can be accessed using list notion, like so:

```{r eval=FALSE, echo=TRUE}
# View info for catalog item 
ct$ADEX
# data item 'ADEX': 17 cols 348 rows
# - Engine: csv
# - Size: 70.7 Kb
# - Last Modified: 2020-09-18 14:30:22
#    Name   Column     Class Label Format NAs MaxChar
# 1  ADEX  STUDYID character  <NA>     NA   0       3
# 2  ADEX  USUBJID character  <NA>     NA   0      10
# 3  ADEX   SUBJID character  <NA>     NA   0       3
# 4  ADEX   SITEID character  <NA>     NA   0       2
# 5  ADEX     TRTP character  <NA>     NA   8       5
# 6  ADEX    TRTPN   numeric  <NA>     NA   8       1
# 7  ADEX     TRTA character  <NA>     NA   8       5
# 8  ADEX    TRTAN   numeric  <NA>     NA   8       1
# 9  ADEX   RANDFL character  <NA>     NA   0       1
# 10 ADEX    SAFFL character  <NA>     NA   0       1
# 11 ADEX   MITTFL character  <NA>     NA   0       1
# 12 ADEX  PPROTFL character  <NA>     NA   0       1
# 13 ADEX    PARAM character  <NA>     NA   0      45
# 14 ADEX  PARAMCD character  <NA>     NA   0       8
# 15 ADEX   PARAMN   numeric  <NA>     NA   0       1
# 16 ADEX     AVAL   numeric  <NA>     NA  16       4
# 17 ADEX AVALCAT1 character  <NA>     NA  87      10

```
When the catalog item is printed, it shows a data dictionary for the 
specified dataset.  Column names and data types are displayed, along with
some other useful attributes of your data.

Once the catalog is created, you are ready to fetch data from the catalog:

```{r eval=FALSE, echo=TRUE}

# Fetch data from a catalog
dt <- fetch(ct$ADEX)

# View data
dt
# A tibble: 348 × 17
#    STUDYID USUBJID    SUBJID SITEID TRTP  TRTPN TRTA  TRTAN RANDFL SAFFL MITTFL PPROTFL
#    <chr>   <chr>      <chr>  <chr>  <chr> <dbl> <chr> <dbl> <chr>  <chr> <chr>  <chr>  
#  1 ABC     ABC-01-049 049    01     ARM D     4 ARM D     4 Y      Y     Y      Y      
#  2 ABC     ABC-01-049 049    01     ARM D     4 ARM D     4 Y      Y     Y      Y      
#  3 ABC     ABC-01-049 049    01     ARM D     4 ARM D     4 Y      Y     Y      Y      
#  4 ABC     ABC-01-049 049    01     ARM D     4 ARM D     4 Y      Y     Y      Y      
#  5 ABC     ABC-01-050 050    01     ARM B     2 ARM B     2 Y      Y     Y      Y      
#  6 ABC     ABC-01-050 050    01     ARM B     2 ARM B     2 Y      Y     Y      Y      
#  7 ABC     ABC-01-050 050    01     ARM B     2 ARM B     2 Y      Y     Y      Y      
#  8 ABC     ABC-01-050 050    01     ARM B     2 ARM B     2 Y      Y     Y      Y      
#  9 ABC     ABC-01-051 051    01     ARM A     1 ARM A     1 Y      Y     Y      Y      
# 10 ABC     ABC-01-051 051    01     ARM A     1 ARM A     1 Y      Y     Y      Y      
#  338 more rows
#  5 more variables: PARAM <chr>, PARAMCD <chr>, PARAMN <dbl>, AVAL <dbl>,
#   AVALCAT1 <chr>
#  Use `print(n = ...)` to see more rows

```

At this point, you can proceed with your analysis, or fetch additional datasets 
from the catalog as needed.  So easy!


## Additional Features

The above example covers the very basics.  Now let's look at some more useful 
features of the **fetch** package.

### Apply a Where Expression to a Fetch Operation

What if you are performing an analysis on a subset of data? Is there a way 
to apply a subset when the data is fetched?  You can apply subset criteria 
using the "where" parameter on the `fetch()` function.  Like this:

```{r eval=FALSE, echo=TRUE}
# Get sample data directory
pkg <- system.file("extdata", package = "fetch")
 
# Create catalog
ct <- catalog(pkg, engines$csv)

# Subset data for a specific subject
dt <- fetch(ct$ADVS, where = expression(SUBJID == '049'))

# View data subset
dt
# A tibble: 44 × 37
#    STUDYID USUBJID    SUBJID SITEID SRCDOM SRCVAR   SRCSEQ TRTP  TRTPN TRTA  TRTAN RANDFL
#    <chr>   <chr>      <chr>  <chr>  <chr>  <chr>     <dbl> <chr> <dbl> <chr> <dbl> <chr> 
#  1 ABC     ABC-01-049 049    01     VS     VSSTRESN      9 ARM D     4 ARM D     4 Y     
#  2 ABC     ABC-01-049 049    01     VS     VSSTRESN     41 ARM D     4 ARM D     4 Y     
#  3 ABC     ABC-01-049 049    01     NA     NA           NA ARM D     4 ARM D     4 Y     
#  4 ABC     ABC-01-049 049    01     VS     VSSTRESN     34 ARM D     4 ARM D     4 Y     
#  5 ABC     ABC-01-049 049    01     VS     VSSTRESN     35 ARM D     4 ARM D     4 Y     
#  6 ABC     ABC-01-049 049    01     VS     VSSTRESN     36 ARM D     4 ARM D     4 Y     
#  7 ABC     ABC-01-049 049    01     VS     VSSTRESN     37 ARM D     4 ARM D     4 Y     
#  8 ABC     ABC-01-049 049    01     VS     VSSTRESN     38 ARM D     4 ARM D     4 Y     
#  9 ABC     ABC-01-049 049    01     VS     VSSTRESN     39 ARM D     4 ARM D     4 Y     
# 10 ABC     ABC-01-049 049    01     VS     VSSTRESN     40 ARM D     4 ARM D     4 Y     
#  34 more rows
#  25 more variables: SAFFL <chr>, MITTFL <chr>, PPROTFL <chr>, TRTSDT <chr>,
#   TRTEDT <chr>, ADT <chr>, ADY <dbl>, ADTF <lgl>, AVISIT <chr>, AVISITN <dbl>,
#   PARAM <chr>, PARAMCD <chr>, PARAMN <dbl>, PARAMTYP <chr>, AVAL <dbl>, BASE <dbl>,
#   CHG <dbl>, AWRANGE <chr>, AWTARGET <dbl>, AWTDIFF <dbl>, AWLO <dbl>, AWHI <dbl>,
#   AWU <chr>, ABLFL <chr>, ANL01FL <chr>
#  Use `print(n = ...)` to see more rows


```

The `expression` function can be used to define any R expression.  This function 
accepts unquoted variable names, logical operators, and R functions like
`is.null()` and `is.na()`.  You can use the the "where" parameter with an
expression to narrow down the data you are working with and reduce
the memory footprint of your program.


### Get Top N Rows

Another way to limit the data returned by your program is to use the "top"
parameter.  This parameter allows you to return only the "top N" number of rows
in the target data item.  The "top" parameter is useful for exploratory purposes.
Here is an example:


```{r eval=FALSE, echo=TRUE}
# Get sample data directory
pkg <- system.file("extdata", package = "fetch")
 
# Create catalog
ct <- catalog(pkg, engines$csv)

# Subset data for a specific subject
dt <- fetch(ct$ADVS, top = 5, where = expression(SUBJID == '049'))

# View results
dt
# A tibble: 5 × 37
#   STUDYID USUBJID  SUBJID SITEID SRCDOM SRCVAR SRCSEQ TRTP  TRTPN TRTA  TRTAN RANDFL SAFFL
#   <chr>   <chr>    <chr>  <chr>  <chr>  <chr>   <dbl> <chr> <dbl> <chr> <dbl> <chr>  <chr>
# 1 ABC     ABC-01-… 049    01     VS     VSSTR…      9 ARM D     4 ARM D     4 Y      Y    
# 2 ABC     ABC-01-… 049    01     VS     VSSTR…     41 ARM D     4 ARM D     4 Y      Y    
# 3 ABC     ABC-01-… 049    01     NA     NA         NA ARM D     4 ARM D     4 Y      Y    
# 4 ABC     ABC-01-… 049    01     VS     VSSTR…     34 ARM D     4 ARM D     4 Y      Y    
# 5 ABC     ABC-01-… 049    01     VS     VSSTR…     35 ARM D     4 ARM D     4 Y      Y    
#   24 more variables: MITTFL <chr>, PPROTFL <chr>, TRTSDT <chr>, TRTEDT <chr>,
#   ADT <chr>, ADY <dbl>, ADTF <lgl>, AVISIT <chr>, AVISITN <dbl>, PARAM <chr>,
#   PARAMCD <chr>, PARAMN <dbl>, PARAMTYP <chr>, AVAL <dbl>, BASE <dbl>, CHG <dbl>,
#   AWRANGE <chr>, AWTARGET <dbl>, AWTDIFF <dbl>, AWLO <dbl>, AWHI <dbl>, AWU <chr>,
#   ABLFL <chr>, ANL01FL <chr>


```

The above fetch returned the top 5 rows of the dataset for subject '049'.  Note 
that the "top" parameter can be used with or without the "where" parameter.

### Apply a Pattern to a Catalog

The previous examples loaded all datasets from the data source into the data
catalog.  There may be instances, however, when you have a very large number
of datasets, and want to limit the data items loaded into the catalog.  For
this situation, you can use the "pattern" parameter on the `catalog()` function,
like so:

```{r eval=FALSE, echo=TRUE}
# Get sample data directory
pkg <- system.file("extdata", package = "fetch")
 
# Create catalog, applying pattern to dataset names
ct <- catalog(pkg, engines$csv, pattern = "*S*")

# View catalog
ct
# data catalog: 3 items
# - Source: C:/packages/fetch/inst/extdata
# - Engine: csv
# - Pattern: *S*
# - Items:
# data item 'ADPSGA': 42 cols 695 rows
# data item 'ADSL': 56 cols 87 rows
# data item 'ADVS': 37 cols 3617 rows

```
The above example limits the items in the catalog to those with an "S" in the 
name.  While this is not a realistic example, it nevertheless demonstrates
how the "pattern" parameter can be used to reduce the datasets loaded into
the catalog.

### Apply a Where Expression to an Entire Catalog

Sometimes you want to subset all the datasets in your catalog.  This subset
can be accomplished with the "where" parameter on the `catalog()` function.
Observe:

```{r eval=FALSE, echo=TRUE}
# Get sample data directory
pkg <- system.file("extdata", package = "fetch")
 
# Create catalog without where expression
ct1 <- catalog(pkg, engines$csv)

# View catalog
ct1
# data catalog: 6 items
# - Source: C:/packages/fetch/inst/extdata
# - Engine: csv
# - Items:
# data item 'ADAE': 56 cols 150 rows
# data item 'ADEX': 17 cols 348 rows
# data item 'ADPR': 37 cols 552 rows
# data item 'ADPSGA': 42 cols 695 rows
# data item 'ADSL': 56 cols 87 rows
# data item 'ADVS': 37 cols 3617 rows

# Create catalog with where expression
ct2 <- catalog(pkg, engines$csv, where = expression(SUBJID == '049'))

# View catalog
ct2
# data catalog: 6 items
# - Source: C:/packages/fetch/inst/extdata
# - Engine: csv
# - Where: SUBJID == "049"
# - Items:
# # data item 'ADAE': 56 cols 5 rows
# - Where: SUBJID == "049"
# # data item 'ADEX': 17 cols 4 rows
# - Where: SUBJID == "049"
# # data item 'ADPR': 37 cols 7 rows
# - Where: SUBJID == "049"
# # data item 'ADPSGA': 42 cols 10 rows
# - Where: SUBJID == "049"
# # data item 'ADSL': 56 cols 1 rows
# - Where: SUBJID == "049"
# # data item 'ADVS': 37 cols 44 rows
# - Where: SUBJID == "049"
```

Notice that a where expression has been added to each data item in the catalog, 
and the row counts for each item have been reduced.  Now
let's fetch some data from this catalog:


```{r eval=FALSE, echo=TRUE}
# Subset data for a specific subject
dt1 <- fetch(ct2$ADVS)

# View results of ADVS fetch
dt1
# A tibble: 44 × 37
#    STUDYID USUBJID    SUBJID SITEID SRCDOM SRCVAR   SRCSEQ TRTP  TRTPN TRTA  TRTAN RANDFL SAFFL
#    <chr>   <chr>      <chr>  <chr>  <chr>  <chr>     <dbl> <chr> <dbl> <chr> <dbl> <chr>  <chr>
#  1 ABC     ABC-01-049 049    01     VS     VSSTRESN      9 ARM D     4 ARM D     4 Y      Y    
#  2 ABC     ABC-01-049 049    01     VS     VSSTRESN     41 ARM D     4 ARM D     4 Y      Y    
#  3 ABC     ABC-01-049 049    01     NA     NA           NA ARM D     4 ARM D     4 Y      Y    
#  4 ABC     ABC-01-049 049    01     VS     VSSTRESN     34 ARM D     4 ARM D     4 Y      Y    
#  5 ABC     ABC-01-049 049    01     VS     VSSTRESN     35 ARM D     4 ARM D     4 Y      Y    
#  6 ABC     ABC-01-049 049    01     VS     VSSTRESN     36 ARM D     4 ARM D     4 Y      Y    
#  7 ABC     ABC-01-049 049    01     VS     VSSTRESN     37 ARM D     4 ARM D     4 Y      Y    
#  8 ABC     ABC-01-049 049    01     VS     VSSTRESN     38 ARM D     4 ARM D     4 Y      Y    
#  9 ABC     ABC-01-049 049    01     VS     VSSTRESN     39 ARM D     4 ARM D     4 Y      Y    
# 10 ABC     ABC-01-049 049    01     VS     VSSTRESN     40 ARM D     4 ARM D     4 Y      Y    
#  34 more rows
#  24 more variables: MITTFL <chr>, PPROTFL <chr>, TRTSDT <chr>, TRTEDT <chr>, ADT <chr>,
#   ADY <dbl>, ADTF <lgl>, AVISIT <chr>, AVISITN <dbl>, PARAM <chr>, PARAMCD <chr>,
#   PARAMN <dbl>, PARAMTYP <chr>, AVAL <dbl>, BASE <dbl>, CHG <dbl>, AWRANGE <chr>,
#   AWTARGET <dbl>, AWTDIFF <dbl>, AWLO <dbl>, AWHI <dbl>, AWU <chr>, ABLFL <chr>,
#   ANL01FL <chr>
#  Use `print(n = ...)` to see more rows


# View results of ADSL fetch
dt2 <- fetch(ct2$ADSL)
# A tibble: 1 × 56
#   STUDYID USUBJID    SUBJID SITEID   AGE AGEU  AGEGR1   SEX   RACE  RACEN ETHNIC ETHNICN COUNTRY
#   <chr>   <chr>      <chr>  <chr>  <dbl> <chr> <chr>    <chr> <chr> <dbl> <chr>    <dbl> <lgl>  
# 1 ABC     ABC-01-049 049    01        39 YEARS 30-39 y… M     WHITE     5 NOT H…       2 NA     
#   43 more variables: ARM <chr>, ACTARM <lgl>, TRT01P <chr>, TRT01PN <dbl>, TRT01A <chr>,
#   TRT01AN <dbl>, TRTSDT <chr>, TRTEDT <chr>, TRTDURN <dbl>, TRTDURU <chr>, TR01SDT <chr>,
#   TR01EDT <chr>, INCNFL <chr>, RANDFL <chr>, RANDEXC1 <lgl>, RANDEXC2 <chr>, RANDEXC3 <chr>,
#   RANDEXC4 <chr>, SAFFL <chr>, SAFEXC1 <chr>, SAFEXC2 <chr>, MITTFL <chr>, MITTEXC1 <chr>,
#   MITTEXC2 <chr>, PPROTFL <chr>, PPROTEX1 <chr>, PPROTEX2 <chr>, PPROTEX3 <chr>,
#   PPROTEX4 <chr>, COMPLFL <chr>, STDYDISP <chr>, STDYREAS <chr>, INCNDT <chr>, RANDDT <chr>,
#   DTHDT <lgl>, DTHFL <chr>, MISSDOSE <dbl>, TP1TRTR <chr>, TP2TRTR <chr>, TP3TRTR <chr>,...

```
As you can see, both the ADVS and ADSL datasets have been subset for subject '049'.
This feature of the *fetch* package makes it easy to target just the rows 
you are looking for.

### Add Import Specifications to a Catalog

Some data files contain accurate data type information within them, and others
do not.  For example, CSV files do not contain data type information, but 
RDATA files do.  If the data type information is not available, 
the data engine will try to guess the data types.  However, it does not 
always guess accurately. Sometimes you need to help the data engine determine
what the data type of a column is supposed to be.  Import specs allow you to 
to specify the correct data type for a column.

Examine the dictionary for the ADVS dataset:

```{r eval=FALSE, echo=TRUE}
# Get sample data directory
pkg <- system.file("extdata", package = "fetch")
 
# Create catalog without import spec
ct <- catalog(pkg, engines$csv)

# View dictionary for ADVS
ct$ADVS
# data item 'ADVS': 37 cols 3617 rows
# - Engine: csv
# - Size: 1.1 Mb
# - Last Modified: 2020-09-18 14:30:22
#    Name   Column     Class Label Format  NAs MaxChar
# 1  ADVS  STUDYID character  <NA>     NA    0       3
# 2  ADVS  USUBJID character  <NA>     NA    0      10
# 3  ADVS   SUBJID character  <NA>     NA    0       3
# 4  ADVS   SITEID character  <NA>     NA    0       2
# 5  ADVS   SRCDOM character  <NA>     NA   85       2
# 6  ADVS   SRCVAR character  <NA>     NA   85       8
# 7  ADVS   SRCSEQ   numeric  <NA>     NA   85       2
# 8  ADVS     TRTP character  <NA>     NA   85       5
# 9  ADVS    TRTPN   numeric  <NA>     NA   85       1
# 10 ADVS     TRTA character  <NA>     NA   85       5
# 11 ADVS    TRTAN   numeric  <NA>     NA   85       1
# 12 ADVS   RANDFL character  <NA>     NA    0       1
# 13 ADVS    SAFFL character  <NA>     NA    0       1
# 14 ADVS   MITTFL character  <NA>     NA    0       1
# 15 ADVS  PPROTFL character  <NA>     NA    0       1
# 16 ADVS   TRTSDT character  <NA>     NA   54       9 # Character by default
# 17 ADVS   TRTEDT character  <NA>     NA  119       9 # Character by default
# 18 ADVS      ADT character  <NA>     NA    0       9
# 19 ADVS      ADY   numeric  <NA>     NA   54       4
# 20 ADVS     ADTF   logical  <NA>     NA 3617       0
# 21 ADVS   AVISIT character  <NA>     NA   50      14
# 22 ADVS  AVISITN   numeric  <NA>     NA   50       2
# 23 ADVS    PARAM character  <NA>     NA    0      35
# 24 ADVS  PARAMCD character  <NA>     NA    0       6
# 25 ADVS   PARAMN   numeric  <NA>     NA    0       1
# 26 ADVS PARAMTYP character  <NA>     NA 3532       7
# 27 ADVS     AVAL   numeric  <NA>     NA    0       5
# 28 ADVS     BASE   numeric  <NA>     NA   70       5
# 29 ADVS      CHG   numeric  <NA>     NA 1312       4
# 30 ADVS  AWRANGE character  <NA>     NA 1331      25
# 31 ADVS AWTARGET   numeric  <NA>     NA 1331       3
# 32 ADVS  AWTDIFF   numeric  <NA>     NA 1331       2
# 33 ADVS     AWLO   numeric  <NA>     NA 1331       3
# 34 ADVS     AWHI   numeric  <NA>     NA 1331       3
# 35 ADVS      AWU character  <NA>     NA 1331       4
# 36 ADVS    ABLFL character  <NA>     NA 2869       1
# 37 ADVS  ANL01FL character  <NA>     NA  448       1

```
Note that the data types of the "TRTSDT" and "TRTEDT" are listed as character.
In fact, these columns are supposed to be a date.  You can force 
the date data type assignment with an import spec, as follows:

```{r eval=FALSE, echo=TRUE}
# Get sample data directory
pkg <- system.file("extdata", package = "fetch")
 
# Create import spec
spc <- import_spec(TRTSDT = "date=%d%b%Y",
                   TRTEDT = "date=%d%b%Y")

# Create catalog with import spec
ct <- catalog(pkg, engines$csv, import_specs = spc)

# View dictionary for ADVS with Import Spec
ct$ADVS
# data item 'ADVS': 37 cols 3617 rows
# - Engine: csv
# - Size: 1.1 Mb
# - Last Modified: 2020-09-18 14:30:22
#    Name   Column     Class Label Format  NAs MaxChar
# 1  ADVS  STUDYID character  <NA>     NA    0       3
# 2  ADVS  USUBJID character  <NA>     NA    0      10
# 3  ADVS   SUBJID character  <NA>     NA    0       3
# 4  ADVS   SITEID character  <NA>     NA    0       2
# 5  ADVS   SRCDOM character  <NA>     NA   85       2
# 6  ADVS   SRCVAR character  <NA>     NA   85       8
# 7  ADVS   SRCSEQ   numeric  <NA>     NA   85       2
# 8  ADVS     TRTP character  <NA>     NA   85       5
# 9  ADVS    TRTPN   numeric  <NA>     NA   85       1
# 10 ADVS     TRTA character  <NA>     NA   85       5
# 11 ADVS    TRTAN   numeric  <NA>     NA   85       1
# 12 ADVS   RANDFL character  <NA>     NA    0       1
# 13 ADVS    SAFFL character  <NA>     NA    0       1
# 14 ADVS   MITTFL character  <NA>     NA    0       1
# 15 ADVS  PPROTFL character  <NA>     NA    0       1
# 16 ADVS   TRTSDT      Date  <NA>     NA   54      10 # Converted to Date 
# 17 ADVS   TRTEDT      Date  <NA>     NA  119      10 # Converted to Date
# 18 ADVS      ADT character  <NA>     NA    0       9
# 19 ADVS      ADY   numeric  <NA>     NA   54       4
# 20 ADVS     ADTF   logical  <NA>     NA 3617       0
# 21 ADVS   AVISIT character  <NA>     NA   50      14
# 22 ADVS  AVISITN   numeric  <NA>     NA   50       2
# 23 ADVS    PARAM character  <NA>     NA    0      35
# 24 ADVS  PARAMCD character  <NA>     NA    0       6
# 25 ADVS   PARAMN   numeric  <NA>     NA    0       1
# 26 ADVS PARAMTYP character  <NA>     NA 3532       7
# 27 ADVS     AVAL   numeric  <NA>     NA    0       5
# 28 ADVS     BASE   numeric  <NA>     NA   70       5
# 29 ADVS      CHG   numeric  <NA>     NA 1312       4
# 30 ADVS  AWRANGE character  <NA>     NA 1331      25
# 31 ADVS AWTARGET   numeric  <NA>     NA 1331       3
# 32 ADVS  AWTDIFF   numeric  <NA>     NA 1331       2
# 33 ADVS     AWLO   numeric  <NA>     NA 1331       3
# 34 ADVS     AWHI   numeric  <NA>     NA 1331       3
# 35 ADVS      AWU character  <NA>     NA 1331       4
# 36 ADVS    ABLFL character  <NA>     NA 2869       1
# 37 ADVS  ANL01FL character  <NA>     NA  448       1

```

Observe that the columns "TRTSDT" and "TRTEDT" are now converted to 
Date columns.  This import spec will apply to all "TRTSDT" and "TRTEDT" 
columns in all datasets. The format "%d%b%Y" was written specifically to read in the 
data value this files uses.  See the `import_spec()` documentation for 
more information on these data formats, and for other possible data type
specifications.  Also see the `specs()` documentation if you want to 
assign a unique import specifications to each data item in the catalog.

### Add Import Specifications to a Fetch

You may also add an import specification to the fetch operation.
The import specification is defined with the `import_spec()` function in the
same way as the catalog operation:

```{r eval=FALSE, echo=TRUE}
# Get sample data directory
pkg <- system.file("extdata", package = "fetch")
 
# Create import spec
spc <- import_spec(TRTSDT = "date=%d%b%Y",
                   TRTEDT = "date=%d%b%Y")

# Create catalog with import spec
ct <- catalog(pkg, engines$csv, import_specs = spc)

# View dictionary for ADVS with Import Spec
ct$ADVS
# data item 'ADVS': 37 cols 3617 rows
# - Engine: csv
# - Size: 1.1 Mb
# - Last Modified: 2020-09-18 14:30:22
#    Name   Column     Class Label Format  NAs MaxChar
# 1  ADVS  STUDYID character  <NA>     NA    0       3
# 2  ADVS  USUBJID character  <NA>     NA    0      10
# 3  ADVS   SUBJID character  <NA>     NA    0       3
# 4  ADVS   SITEID character  <NA>     NA    0       2
# 5  ADVS   SRCDOM character  <NA>     NA   85       2
# 6  ADVS   SRCVAR character  <NA>     NA   85       8
# 7  ADVS   SRCSEQ   numeric  <NA>     NA   85       2
# 8  ADVS     TRTP character  <NA>     NA   85       5
# 9  ADVS    TRTPN   numeric  <NA>     NA   85       1
# 10 ADVS     TRTA character  <NA>     NA   85       5
# 11 ADVS    TRTAN   numeric  <NA>     NA   85       1
# 12 ADVS   RANDFL character  <NA>     NA    0       1
# 13 ADVS    SAFFL character  <NA>     NA    0       1
# 14 ADVS   MITTFL character  <NA>     NA    0       1
# 15 ADVS  PPROTFL character  <NA>     NA    0       1
# 16 ADVS   TRTSDT      Date  <NA>     NA   54      10
# 17 ADVS   TRTEDT      Date  <NA>     NA  119      10
# 18 ADVS      ADT character  <NA>     NA    0       9
# 19 ADVS      ADY   numeric  <NA>     NA   54       4
# 20 ADVS     ADTF   logical  <NA>     NA 3617       0
# 21 ADVS   AVISIT character  <NA>     NA   50      14
# 22 ADVS  AVISITN   numeric  <NA>     NA   50       2
# 23 ADVS    PARAM character  <NA>     NA    0      35
# 24 ADVS  PARAMCD character  <NA>     NA    0       6
# 25 ADVS   PARAMN   numeric  <NA>     NA    0       1
# 26 ADVS PARAMTYP character  <NA>     NA 3532       7
# 27 ADVS     AVAL   numeric  <NA>     NA    0       5
# 28 ADVS     BASE   numeric  <NA>     NA   70       5
# 29 ADVS      CHG   numeric  <NA>     NA 1312       4
# 30 ADVS  AWRANGE character  <NA>     NA 1331      25
# 31 ADVS AWTARGET   numeric  <NA>     NA 1331       3
# 32 ADVS  AWTDIFF   numeric  <NA>     NA 1331       2
# 33 ADVS     AWLO   numeric  <NA>     NA 1331       3
# 34 ADVS     AWHI   numeric  <NA>     NA 1331       3
# 35 ADVS      AWU character  <NA>     NA 1331       4
# 36 ADVS    ABLFL character  <NA>     NA 2869       1
# 37 ADVS  ANL01FL character  <NA>     NA  448       1
```
The results of this operation are similar to the previous example, when
we applied the import spec to the catalog.  The difference is that now
the import spec applies only to this specific fetch operation, 
not the entire catalog.


## Next Steps

For next steps, it is suggested to play around with the **fetch** functions, and
note how easy these functions are to use.  You may also wish to review the 
documentation for the `catalog()`, `fetch()`, `import_spec()` functions. 
The documentation contains some additional examples and descriptions 
of all parameters.



