---
title: "Advanced Random Contexts"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Advanced Random Contexts}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(fcaR)
```

## Introduction

Creating synthetic datasets is essential for testing algorithms and validating results in Formal Concept Analysis. However, simple uniform random generation (where every cell has a fixed probability $p$ of being 1) often fails to capture the structure of real-world data.

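`fcaR` now implements three more advanced methods, covered in the sections below: Dirichlet-distributed row sizes, marginal-preserving randomization, and random distributive contexts. They allow for more realistic and controllable simulations.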

## 1\. Dirichlet Distribution for Realistic Data

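Real-world contexts often have "clumpy" or "sparse" rows: some objects have many attributes, while others have very few. Uniform generation cannot reproduce this. When every cell is 1 with the same probability $p$, the number of attributes per object follows a Binomial distribution, so all rows end up with roughly the same size.

To mimic real variability, we use a **Dirichlet distribution** to sample the probability of an object having exactly $k$ attributes. The concentration parameter `alpha` controls how uneven these probabilities are.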

```{r dirichlet}
# Uniform Context (Standard)
# All objects have roughly 20% of attributes
fc_uni <- RandomContext(n_objects = 20, n_attributes = 10, density = 0.2, distribution = "uniform")

# Dirichlet Context (Realistic)
# Some objects will be empty, some full, some in between.
# alpha = 0.1 -> High skewness (Very sparse or very dense rows)
# alpha = 1.0 -> Uniform distribution of row sizes
fc_dir <- RandomContext(n_objects = 20, n_attributes = 10, distribution = "dirichlet", alpha = 0.2)

# Compare Row Sums
barplot(rowSums(fc_uni$incidence()), main = "Uniform: Row Sums", ylim = c(0, 10))
barplot(rowSums(fc_dir$incidence()), main = "Dirichlet: Row Sums", ylim = c(0, 10))
```
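
The sampling idea can be sketched in base R. This is a minimal illustration under stated assumptions, not the code `fcaR` runs internally: the helper names `rdirichlet_sym` and `sketch_dirichlet_matrix` are hypothetical, and the sketch assumes that row sizes are drawn first and attributes are then chosen uniformly at random within each row.

```{r dirichlet_sketch}
set.seed(42)

# Draw one probability vector from a symmetric Dirichlet(alpha, ..., alpha)
# by normalising independent Gamma(alpha, 1) variates.
rdirichlet_sym <- function(k, alpha) {
  g <- rgamma(k, shape = alpha, rate = 1)
  g / sum(g)
}

# Hypothetical helper (not part of fcaR): build a binary
# objects x attributes matrix with Dirichlet-driven row sizes.
sketch_dirichlet_matrix <- function(n_objects, n_attributes, alpha) {
  # p[k + 1] is the probability that a row has exactly k attributes
  p <- rdirichlet_sym(n_attributes + 1, alpha)
  sizes <- sample(0:n_attributes, n_objects, replace = TRUE, prob = p)

  m <- matrix(0L, nrow = n_objects, ncol = n_attributes)
  for (i in seq_len(n_objects)) {
    # Choose which attributes the object has, uniformly at random
    cols <- sample.int(n_attributes, sizes[i])
    m[i, cols] <- 1L
  }
  list(matrix = m, sizes = sizes, p = p)
}

res <- sketch_dirichlet_matrix(n_objects = 20, n_attributes = 10, alpha = 0.2)

# Distribution of row sizes: with a small alpha, a few sizes dominate
table(rowSums(res$matrix))
```

With a small `alpha`, most of the probability mass falls on a handful of row sizes, which is what produces the mix of nearly empty and nearly full rows.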

## 2\. Randomization via Edge Swapping

When performing statistical analysis on a concept lattice, we often ask: *"Is this pattern significant, or could it happen by chance?"*

To answer this, we need to compare our context against a random null model. A widely used and conservative choice is a random matrix that preserves both marginals:

1.  The number of attributes per object (Row sums).
2.  The frequency of each attribute (Column sums).

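This is achieved by **edge swapping**: the algorithm repeatedly picks two objects and two attributes whose 2×2 submatrix forms a "checkerboard" (ones on one diagonal, zeros on the other) and flips it. Each swap rewires connections without altering any row or column sum. The Curveball algorithm is a related, faster procedure that samples from the same null model.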

```{r swapping}
data(planets)
fc <- FormalContext$new(planets)

# Original Marginals
orig_col_sums <- colSums(fc$incidence())
print(orig_col_sums)

# Randomize using Swap
fc_random <- randomize_context(fc, method = "swap")

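# Verify that the marginals are preserved
new_col_sums <- colSums(fc_random$incidence())
print(new_col_sums)
# Row sums should match as well
print(all(rowSums(fc$incidence()) == rowSums(fc_random$incidence())))

# The cell-level structure, however, usually differs:
# FALSE here means at least one cell changed
print(all(fc$incidence() == fc_random$incidence()))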
```

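This allows you to generate many randomized versions of your data (for example, 1000) and compare an observed statistic, such as concept stability or support, against its distribution under the null model to judge whether it is statistically significant.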
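
The swap step and the significance check can be sketched together in base R. This is an illustrative sketch, not the implementation behind `randomize_context()`: the function `swap_randomize`, the toy matrix, and the choice of test statistic (co-occurrence of the first two attributes) are all assumptions made for the example.

```{r swap_sketch}
set.seed(1)

# Hypothetical helper (not part of fcaR): repeated checkerboard swaps.
# Pick two rows and two columns; if the 2x2 submatrix has ones on one
# diagonal and zeros on the other, flip it. Marginals never change.
swap_randomize <- function(m, n_attempts = 100) {
  for (s in seq_len(n_attempts)) {
    r <- sample.int(nrow(m), 2)
    k <- sample.int(ncol(m), 2)
    sub <- m[r, k]
    is_checkerboard <- sub[1, 1] == sub[2, 2] &&
      sub[1, 2] == sub[2, 1] &&
      sub[1, 1] != sub[1, 2]
    if (is_checkerboard) {
      m[r, k] <- 1L - sub
    }
  }
  m
}

# Toy binary matrix (made-up data): objects in rows, attributes in columns
m <- matrix(c(1, 1, 0, 0,
              1, 1, 0, 0,
              1, 1, 1, 0,
              0, 0, 1, 1,
              0, 0, 1, 1,
              0, 1, 0, 1), nrow = 6, byrow = TRUE)
storage.mode(m) <- "integer"

# Test statistic: number of objects having both attribute 1 and attribute 2
co_occurrence <- function(x) sum(x[, 1] == 1L & x[, 2] == 1L)

observed <- co_occurrence(m)

# Null distribution from 1000 independently randomized matrices
null_stats <- replicate(1000, co_occurrence(swap_randomize(m)))

# One-sided empirical p-value with the usual +1 correction
p_value <- (1 + sum(null_stats >= observed)) / (1 + length(null_stats))
print(p_value)
```

The `+1` in the numerator and denominator counts the observed matrix as one draw from the null model, which keeps the estimated p-value strictly above zero.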

## 3\. Generating Distributive Lattices

We can also generate random contexts that are **mathematically guaranteed** to produce a distributive concept lattice. The order ideals (down-sets) of a poset are closed under union and intersection, so they always form a distributive lattice. Birkhoff's representation theorem gives the converse: every finite distributive lattice is isomorphic to the lattice of order ideals of some poset.

The function `RandomDistributiveContext` generates a random poset and builds its associated formal context.

```{r distributive_gen}
# Generate a random distributive context based on a Poset with 15 elements
fc_dist <- RandomDistributiveContext(n_elements = 15, density = 0.2)
fc_dist$find_concepts()

# Verify the mathematical guarantee.
# Every distributive lattice is also modular, so both checks should be TRUE.
print(paste("Is Distributive?", fc_dist$concepts$is_distributive()))
print(paste("Is Modular?",      fc_dist$concepts$is_modular()))
```

This is particularly useful for testing algorithms designed for distributive lattices or for educational purposes.
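
One standard way to obtain such a context from a poset $(P, \leq)$ is to use $P$ as both the object set and the attribute set, with object $x$ having attribute $y$ exactly when $x \geq y$ fails. The extents of this context are then exactly the down-sets of $P$. The base R sketch below illustrates that construction on a small random poset and checks the result by brute force. Whether `RandomDistributiveContext` uses this exact construction is an assumption, and all variable names here are illustrative.

```{r distributive_sketch}
set.seed(7)
n <- 6

# Random partial order: allow only pairs i < j so the relation is acyclic,
# then take the reflexive-transitive closure (Warshall's algorithm).
# leq[y, x] is TRUE when y <= x in the poset.
leq <- matrix(FALSE, n, n)
leq[upper.tri(leq)] <- runif(n * (n - 1) / 2) < 0.3
diag(leq) <- TRUE
for (k in seq_len(n)) {
  leq <- leq | outer(leq[, k], leq[k, ], "&")
}

# Incidence of the context: x has attribute y when y <= x does NOT hold
incidence <- !t(leq)

# Brute force: the extent of an attribute set B is the set of objects
# that have every attribute in B. Enumerate all 2^n attribute sets.
extents <- unique(lapply(0:(2^n - 1), function(code) {
  B <- which((code %/% 2^(0:(n - 1))) %% 2 == 1)
  which(rowSums(!incidence[, B, drop = FALSE]) == 0)
}))

# Every extent should be a down-set: if x is in it, so is every y <= x
is_down_set <- function(e) {
  all(which(rowSums(leq[, e, drop = FALSE]) > 0) %in% e)
}
all_down_sets <- all(vapply(extents, is_down_set, logical(1)))

# The extents should be closed under union and intersection, which makes
# the lattice a ring of sets and therefore distributive
key <- function(s) paste(sort(s), collapse = ",")
keys <- vapply(extents, key, character(1))
closed <- all(vapply(extents, function(a) {
  all(vapply(extents, function(b) {
    key(union(a, b)) %in% keys && key(intersect(a, b)) %in% keys
  }, logical(1)))
}, logical(1)))

print(c(all_down_sets = all_down_sets, closed = closed))
```

The brute-force enumeration is exponential in `n`, so it is only meant as a check on small examples.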
