---
title: "Automatic Script Sourcing"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Automatic Script Sourcing}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options: 
  markdown: 
    wrap: 72
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

The `autos` configuration automatically sources R scripts from the
directories you specify, making custom functions available without
manual `source()` calls. This suits project-specific utility functions
and shared code libraries.

## Basic AUTOS Configuration

``` yaml
default:
  autos:
    script_library: '/path/to/your/scripts'
```

**Note**: By default, auto-sourcing overwrites any existing objects
with the same name. You can change this with the `overwrite` argument
of `rprofile()`, covered in the overwrite sections later in this
vignette.

## Working Example: Single Script Library

Let's create a practical example where Tidy McVerse has custom functions
in a script library:

```{r}
library(envsetup)

# Create temporary directory structure
dir <- fs::file_temp()
dir.create(dir)
dir.create(file.path(dir, "/demo/DEV/username/project1/script_library"), recursive = TRUE)

# Create a custom function
file_conn <- file(file.path(dir, "/demo/DEV/username/project1/script_library/test.R"))
writeLines(
"test <- function(){print('Hello from auto-sourced function!')}", file_conn)
close(file_conn)

# Write the configuration
config_path <- file.path(dir, "_envsetup.yml")
file_conn <- file(config_path)
writeLines(
  paste0(
"default:
  autos:
    dev_script_library: '", dir,"/demo/DEV/username/project1/script_library'"
  ), file_conn)
close(file_conn)
```

## Loading and Using Auto-Sourced Functions

```{r}
# Load configuration and apply it
envsetup_config <- config::get(file = config_path)
rprofile(envsetup_config)
```

The auto-sourced functions are now available:

```{r}
# See what functions are available
objects()

# Use the function directly (no manual sourcing needed!)
test()
```

## Multiple Script Libraries

Real projects often have multiple script libraries for different
purposes:

```{r}
# Create production script library
dir.create(file.path(dir, "/demo/PROD/project1/script_library"), recursive = TRUE)

# Add production functions
file_conn <- file(file.path(dir, "/demo/PROD/project1/script_library/test2.R"))
writeLines(
"test2 <- function(){print('Hello from production function!')}", file_conn)
close(file_conn)

# Update configuration with multiple libraries
file_conn <- file(config_path)
writeLines(
  paste0(
"default:
  autos:
    dev_script_library: '", dir,"/demo/DEV/username/project1/script_library'
    prod_script_library: '", dir,"/demo/PROD/project1/script_library'"
  ), file_conn)
close(file_conn)

# Reload configuration
envsetup_config <- config::get(file = config_path)
rprofile(envsetup_config)
```

## Using Multiple Libraries

```{r}
# Functions from both libraries are now available
objects()

# Use functions from both libraries
test()   # From dev library
test2()  # From prod library
```

## Understanding Function Conflicts and the Overwrite Parameter

When auto-sourcing functions, you might encounter situations where
function names conflict with existing objects in your environment. The
`overwrite` parameter controls how these conflicts are handled.

### Quick Example of Function Conflicts

```{r}
# Create a function that might conflict
summary_stats <- function(data) {
  print("Original summary function")
}

# Create a script with the same function name
conflict_dir <- file.path(dir, "conflict_demo")
dir.create(conflict_dir)

file_conn <- file(file.path(conflict_dir, "stats.R"))
writeLines(
"summary_stats <- function(data) {
  print('Updated summary function from the new conflict_demo script')
}", file_conn)
close(file_conn)

# Add to configuration
file_conn <- file(config_path)
writeLines(
  paste0(
"default:
  autos:
    dev_script_library: '", dir,"/demo/DEV/username/project1/script_library'
    prod_script_library: '", dir,"/demo/PROD/project1/script_library'
    conflict_demo: '", conflict_dir, "'"
  ), file_conn)
close(file_conn)

# When we reload, the auto-sourced version will overwrite the original
envsetup_config <- config::get(file = config_path)
rprofile(envsetup_config)

# Test which version we have now
summary_stats()
```

The output above reports which objects were overwritten, which helps
you track conflicts.

## Environment-Specific Auto-Sourcing

You might want different script libraries for different environments.
For example, exclude development functions when running in production:

```{r}
# Configuration that blanks out dev scripts in production
file_conn <- file(config_path)
writeLines(
  paste0(
"default:
  autos:
    dev_script_library: '", dir,"/demo/DEV/username/project1/script_library'
    prod_script_library: '", dir,"/demo/PROD/project1/script_library'

prod:
  autos:
    dev_script_library: NULL"  # NULL disables this library
  ), file_conn)
close(file_conn)

# Load production configuration
envsetup_config <- config::get(file = config_path, config = "prod")
rprofile(envsetup_config)
```

Only the production functions are now available:

```{r}
# Functions from production only
objects()

# Use functions from production only
test2()  # From prod library
```
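
One way to picture why a `NULL` entry under `prod` removes a library is
to treat each configuration as a nested list and overlay `prod` on
`default`. The sketch below uses base R's `modifyList()` as a stand-in
for that merge. This is an assumption made for illustration: the
`config` package has its own merge logic, which may keep the key with a
`NULL` value instead of dropping it. The list `config_sketch` and its
paths are placeholders.

```r
# Stand-in for the parsed YAML: nested lists keyed by configuration name
# (placeholder paths, not real directories)
config_sketch <- list(
  default = list(autos = list(
    dev_script_library  = "/demo/DEV/username/project1/script_library",
    prod_script_library = "/demo/PROD/project1/script_library"
  )),
  prod = list(autos = list(dev_script_library = NULL))
)

# Overlay the prod values on the defaults; with modifyList(), a NULL in
# the overlay deletes the matching element from the base list
merged <- modifyList(config_sketch$default, config_sketch$prod)

# Only the production library is left to be sourced
names(merged$autos)
```

Either way, the effect is the same for auto-sourcing: the development
library no longer has a path to source from.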

## How Auto-Sourcing Works

When you call `rprofile()` with autos configuration:

1.  **Script Discovery**: Finds all `.R` files in specified directories
2.  **Conflict Detection**: Compares new functions with existing global
    environment objects
3.  **Automatic Sourcing**: Sources each script into its environment
4.  **Conflict Resolution**: Based on the `overwrite` parameter:
    -   `overwrite = TRUE` (default): Replaces existing functions and
        reports what was overwritten
    -   `overwrite = FALSE`: Preserves existing functions and reports
        what was skipped
5.  **Metadata Tracking**: Records which script each function came from
    for debugging
6.  **Function Availability**: Functions become directly accessible
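
The steps above can be sketched in base R. This is a minimal
illustration under stated assumptions, not envsetup's actual
implementation: `source_dir_sketch()` and its arguments are hypothetical
names, and the sketch assigns into an environment you pass in instead of
the global environment.

```r
# Hypothetical sketch of the auto-sourcing steps; not envsetup's real code
source_dir_sketch <- function(path, envir, overwrite = TRUE) {
  # Script discovery: all .R files in the directory
  scripts <- list.files(path, pattern = "\\.R$", full.names = TRUE)
  report <- list(added = character(), overwritten = character(),
                 skipped = character())

  for (script in scripts) {
    # Source into a temporary environment so nothing reaches `envir` yet
    temp_env <- new.env(parent = globalenv())
    sys.source(script, envir = temp_env)

    for (name in ls(temp_env)) {
      # Conflict detection against the target environment
      conflict <- exists(name, envir = envir, inherits = FALSE)
      if (conflict && !overwrite) {
        report$skipped <- c(report$skipped, name)
        next
      }
      # Conflict resolution: copy the object and note what happened
      assign(name, get(name, envir = temp_env), envir = envir)
      key <- if (conflict) "overwritten" else "added"
      report[[key]] <- c(report[[key]], name)
    }
  }
  report
}

# Usage: one script that clashes with an object already in `target`
lib <- tempfile("lib")
dir.create(lib)
writeLines("greet <- function() 'hello from script'",
           file.path(lib, "greet.R"))

target <- new.env()
assign("greet", function() "original", envir = target)

res_keep <- source_dir_sketch(lib, target, overwrite = FALSE)
kept <- target$greet()      # "original"

res_over <- source_dir_sketch(lib, target, overwrite = TRUE)
replaced <- target$greet()  # "hello from script"
```

Sourcing into a temporary environment first is what makes the
`overwrite = FALSE` case possible: the script has already run, but its
objects are only copied across when the rule allows it.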

### Technical Details of Conflict Handling

The auto-sourcing system handles conflicts as follows:

-   **Temporary Environment**: Each script is first sourced into a
    temporary environment
-   **Object Comparison**: New objects are compared against the global
    environment
-   **Selective Assignment**: Only specified objects are moved to the
    global environment
-   **Metadata Recording**: Each function's source script is recorded in
    `object_metadata`
-   **Detailed Reporting**: Users receive clear feedback about what was
    added, skipped, or overwritten
-   **Cleanup Integration**: Metadata enables precise cleanup when using
    `detach_autos()`

The `record_function_metadata()` function maintains an audit trail as a
data frame with two columns:

-   `object_name`: The name of each sourced function
-   `script`: The full path to the source script

The record for a function is updated automatically when a newer version
overwrites it.
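
A table with these two columns, where the latest script wins, might be
maintained as in the sketch below. `record_metadata_sketch()` is a
hypothetical helper written for this illustration, not the real
`record_function_metadata()`, and the script paths are made up.

```r
# Hypothetical helper mirroring the two columns described above;
# not the real record_function_metadata() implementation
record_metadata_sketch <- function(metadata, object_name, script) {
  new_row <- data.frame(object_name = object_name, script = script,
                        stringsAsFactors = FALSE)
  # Drop any earlier record for this object so the latest script wins
  metadata <- metadata[metadata$object_name != object_name, , drop = FALSE]
  rbind(metadata, new_row)
}

# Start with an empty table that has the two documented columns
meta <- data.frame(object_name = character(), script = character(),
                   stringsAsFactors = FALSE)

meta <- record_metadata_sketch(meta, "load_data",
                               "/project/data_functions/io.R")
meta <- record_metadata_sketch(meta, "clean_data",
                               "/project/data_functions/clean.R")

# Sourcing load_data again from another script replaces its record
meta <- record_metadata_sketch(meta, "load_data",
                               "/project/utilities/io_v2.R")
```

After the three calls, `meta` holds one row per object, and the
`load_data` row points at the script that defined it most recently.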

## Benefits of Auto-Sourcing

1.  **No Manual Sourcing**: Functions are automatically available
2.  **Organized Libraries**: Separate directories for different script
    collections
3.  **Controlled Conflicts**: Name clashes are detected, reported, and
    resolved according to `overwrite`
4.  **Dynamic Loading**: Easy to add/remove script libraries
5.  **Team Collaboration**: Shared function libraries across team
    members
6.  **Comprehensive Tracking**: Metadata system tracks function sources
    for debugging
7.  **Intelligent Cleanup**: Precise removal of auto-sourced functions
    via metadata

## Common Use Cases

### Project Utilities

``` yaml
autos:
  project_utils: '/project/utilities'
  data_processing: '/project/data_functions'
  plotting_functions: '/project/viz_functions'
```

### Environment-Specific Functions

``` yaml
default:
  autos:
    dev_helpers: '/dev/helper_functions'
    shared_utils: '/shared/utilities'

prod:
  autos:
    shared_utils: '/shared/utilities'
    # dev_helpers excluded in production
```

### Team Libraries

``` yaml
autos:
  team_functions: '/shared/team_library'
  personal_utils: '~/my_r_functions'
  project_specific: './project_functions'
```

## Managing Function Conflicts with the Overwrite Parameter

The `overwrite` parameter controls how auto-sourcing handles situations
where functions with the same name already exist in your global
environment. Understanding this parameter is crucial for managing
function conflicts effectively.

### Default Behavior: Overwrite = TRUE

By default, auto-sourcing will overwrite existing functions:

```{r}
# Create a function in global environment
my_function <- function() {
  print("Original function from global environment")
}

# Check it works
my_function()

# Create a script with the same function name
dir <- fs::file_temp()
dir.create(dir)
script_dir <- file.path(dir, "scripts")
dir.create(script_dir)

file_conn <- file(file.path(script_dir, "my_function.R"))
writeLines(
"my_function <- function() {
  print('Updated function from auto-sourced script')
}", file_conn)
close(file_conn)

# Configuration with default overwrite = TRUE
config_path <- file.path(dir, "_envsetup.yml")
file_conn <- file(config_path)
writeLines(
  paste0(
"default:
  autos:
    my_scripts: '", script_dir, "'"
  ), file_conn)
close(file_conn)

# Load configuration - this will overwrite the existing function
envsetup_config <- config::get(file = config_path)
rprofile(envsetup_config)

# The function has been overwritten
my_function()
```

### Conservative Behavior: Overwrite = FALSE

When `overwrite = FALSE`, existing functions are preserved:

```{r}
# clean up previous runs, removing all previously attached autos
detach_autos()

# Create a function in global environment
my_function <- function() {
  print("Original function from global environment")
}

# Check it works
my_function()

# Create a script with the same function name
dir <- fs::file_temp()
dir.create(dir)
script_dir <- file.path(dir, "scripts")
dir.create(script_dir)

file_conn <- file(file.path(script_dir, "my_function.R"))
writeLines(
"my_function <- function() {
  print('Updated function from auto-sourced script')
}", file_conn)
close(file_conn)

# Same configuration as before; overwrite is set in rprofile() below
config_path <- file.path(dir, "_envsetup.yml")
file_conn <- file(config_path)
writeLines(
  paste0(
"default:
  autos:
    my_scripts: '", script_dir, "'"
  ), file_conn)
close(file_conn)

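# Load configuration without overwriting existing objects
envsetup_config <- config::get(file = config_path)
rprofile(envsetup_config, overwrite = FALSE)

# The original function has been preserved
my_function()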
```

### Understanding Conflict Detection

The auto-sourcing system provides detailed feedback about what happens
during sourcing:

```{r}
# Create multiple functions to demonstrate conflict detection
existing_func1 <- function() "I exist in global"
existing_func2 <- function() "I also exist in global"

# Create script with mix of new and conflicting functions
file_conn <- file(file.path(script_dir, "mixed_functions.R"))
writeLines(
"# This will conflict with existing function
existing_func1 <- function() {
  'Updated from script'
}

# This is a new function
new_func <- function() {
  'Brand new function'
}

# This will also conflict
existing_func2 <- function() {
  'Also updated from script'
}", file_conn)
close(file_conn)

# Update configuration
file_conn <- file(config_path)
writeLines(
  paste0(
"default:
  autos:
    my_scripts: '", script_dir, "'"
  ), file_conn)
close(file_conn)

# Reload - watch the detailed output
envsetup_config <- config::get(file = config_path)
rprofile(envsetup_config)
```
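
To preview conflicts like these before loading a configuration, you can
parse the scripts without running them and compare the names they
assign against an environment. The helpers below, `assigned_names()`
and `preview_conflicts()`, are hypothetical and not part of envsetup.
They only see plain top-level assignments (`name <- ...` or
`name = ...`), so objects created in other ways, such as with
`assign()`, are missed.

```r
# Hypothetical helper: names assigned at the top level of a script,
# found by parsing the file without evaluating it
assigned_names <- function(script) {
  exprs <- parse(file = script, keep.source = FALSE)
  is_assignment <- vapply(exprs, function(e) {
    is.call(e) &&
      (identical(e[[1]], as.name("<-")) ||
         identical(e[[1]], as.name("="))) &&
      is.name(e[[2]])
  }, logical(1))
  vapply(exprs[is_assignment], function(e) as.character(e[[2]]),
         character(1))
}

# Hypothetical helper: which of those names already exist in `envir`?
preview_conflicts <- function(path, envir = globalenv()) {
  scripts <- list.files(path, pattern = "\\.R$", full.names = TRUE)
  candidates <- unique(as.character(unlist(lapply(scripts, assigned_names))))
  already_there <- vapply(candidates, exists, logical(1),
                          envir = envir, inherits = FALSE)
  candidates[already_there]
}

# Usage: one conflicting and one new function in a throwaway directory
lib <- tempfile("lib")
dir.create(lib)
writeLines(c(
  "existing_func1 <- function() 'Updated from script'",
  "new_func <- function() 'Brand new function'"
), file.path(lib, "mixed_functions.R"))

target <- new.env()
assign("existing_func1", function() "I exist already", envir = target)

conflicts <- preview_conflicts(lib, envir = target)  # "existing_func1"
```

Because nothing is evaluated, this check has no side effects and can be
run safely on scripts you have not reviewed yet.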

## Function Metadata Tracking

The auto-sourcing system records where every sourced function came
from. This metadata is useful for debugging, auditing, and
understanding which scripts supply which functions.

### How Metadata Tracking Works

Every time a function is sourced through the autos system, the
`record_function_metadata()` function captures:

-   **Object Name**: The name of the function or object
-   **Source Script**: The full path to the script file that contains
    the function

This information is stored in a special `object_metadata` data frame
within the `envsetup_environment`.

### Accessing Function Metadata

```{r}
# After sourcing functions, you can access the metadata
# Note: This example shows the concept - actual access depends on envsetup internals

# Create some functions to demonstrate metadata tracking
metadata_demo_dir <- file.path(dir, "metadata_demo")
dir.create(metadata_demo_dir)

# Create multiple scripts with different functions
file_conn <- file(file.path(metadata_demo_dir, "data_functions.R"))
writeLines(
"load_data <- function(file) {
  paste('Loading data from:', file)
}

clean_data <- function(data) {
  paste('Cleaning data with', nrow(data), 'rows')
}", file_conn)
close(file_conn)

file_conn <- file(file.path(metadata_demo_dir, "plot_functions.R"))
writeLines(
"create_plot <- function(data) {
  paste('Creating plot for', ncol(data), 'variables')
}

save_plot <- function(plot, filename) {
  paste('Saving plot to:', filename)
}", file_conn)
close(file_conn)

# Update configuration to include metadata demo
file_conn <- file(config_path)
writeLines(
  paste0(
"default:
  autos:
    metadata_demo: '", metadata_demo_dir, "'"
  ), file_conn)
close(file_conn)

# Source the functions
envsetup_config <- config::get(file = config_path)
rprofile(envsetup_config)

# The system now tracks which script each function came from
cat("Functions sourced with metadata tracking:\n")
knitr::kable(envsetup_environment$object_metadata)
```

### Benefits of Metadata Tracking

#### 1. **Debugging Function Issues**

When a function isn't working as expected, metadata helps you quickly
identify which script file contains the function.
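
A lookup of this kind is a one-line filter on the metadata table. The
sketch below builds a stand-in data frame with the two documented
columns, `object_name` and `script`; the rows, the paths, and the
helper `script_for()` are made up for illustration.

```r
# Stand-in for the metadata table, using the two documented columns;
# the rows and paths are invented for this example
metadata <- data.frame(
  object_name = c("load_data", "clean_data", "create_plot"),
  script = c("/project/data_functions.R", "/project/data_functions.R",
             "/project/plot_functions.R"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup helper: which script defined this object?
script_for <- function(metadata, name) {
  metadata$script[metadata$object_name == name]
}

origin <- script_for(metadata, "create_plot")

# Grouping the other way round lists the objects each script supplies
by_script <- split(metadata$object_name, metadata$script)
```

In a real session the same filter could be applied to
`envsetup_environment$object_metadata`, as displayed in the previous
section.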

#### 2. **Audit Trail**

Metadata records the source script of every auto-sourced function,
giving you an audit trail of where your functions come from.

### Metadata and the detach_autos() Function

The metadata tracking system integrates closely with cleanup
operations. `detach_autos()` uses the metadata to:

1.  Identify all auto-sourced functions
2.  Remove them from the global environment
3.  Clean up the metadata records
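
A metadata-driven cleanup along these lines could look like the sketch
below. `cleanup_sketch()` is a hypothetical function, not the real
`detach_autos()`, and it works on an environment you pass in instead of
the global environment. The metadata row is invented for the example.

```r
# Hypothetical sketch of metadata-driven cleanup; not the real detach_autos()
cleanup_sketch <- function(metadata, envir) {
  # Identify recorded objects that still exist in the environment
  still_there <- vapply(metadata$object_name, exists, logical(1),
                        envir = envir, inherits = FALSE)
  present <- metadata$object_name[still_there]

  # Remove only those objects; anything not in the metadata is untouched
  rm(list = present, envir = envir)

  # Return an empty metadata table with the same columns
  metadata[0, , drop = FALSE]
}

# Usage: one auto-sourced function and one the user defined by hand
target <- new.env()
assign("load_data", function(file) file, envir = target)
assign("my_own_helper", function() "keep me", envir = target)

metadata <- data.frame(object_name = "load_data",
                       script = "/project/data_functions.R",
                       stringsAsFactors = FALSE)

metadata <- cleanup_sketch(metadata, target)
remaining <- ls(target)  # "my_own_helper"
```

Driving the removal from the metadata, not from a directory listing,
means only objects that were actually auto-sourced are removed.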

## Best Practices

Even though your functions are not a part of a package, you should
follow best practices to ensure your functions work as expected.

1.  **Use Clear Names**: Library names should indicate their purpose
2.  **Monitor Conflicts**: Regularly check for and resolve function name
    conflicts
3.  **Document Functions**: Include roxygen2 comments in your functions
4.  **Test Functions**: Ensure auto-sourced functions work correctly
5.  **Package Prefix**: Call functions from other packages with an
    explicit namespace prefix, for example `dplyr::filter()`

```{r echo = FALSE}
# Clean up
unlink(dir, recursive=TRUE)
```
