---
title: "Fuzzy Spectral Clustering with Variable-Weighted Adjacency Matrices"
author: "Jesse S. Ghashti and John R. J. Thompson"
date: "`r format(Sys.Date())`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Fuzzy Spectral Clustering with Variable-Weighted Adjacency Matrices}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.align = "center",
  fig.width = 6,
  fig.height = 5,
  message = FALSE,
  warning = FALSE
)
library(mclust)
library(fclust)
library(ggplot2)
library(patchwork)
library(mvtnorm)
library(stats)
library(knitr)
library(np)
library(MASS)
library(rmarkdown)
```

# Introduction

The **FuzzySpec** package implements the **FVIBES** (Fuzzy Variable-Importance Based Eigenspace Separation) algorithm, a fuzzy spectral clustering procedure that incorporates variable-weighted distance metrics and adaptive adjacency matrix constructions. This package accompanies the paper _Variable-Weighted Adjacency Constructions for Fuzzy Spectral Clustering_ by Ghashti, Hare, and Thompson (2025).


The key features of this package include:

- a variable-weighted distance metric that automatically determines variable importance using nonparametric kernel density estimation, 

- an adaptive adjacency construction framework with multiple options for building similarity graphs including locally-adaptive scaling (Zelnik-Manor and Perona, 2004), 

- clustering outputs that return fuzzy membership matrices rather than just hard cluster assignments, and

- built-in synthetic data generators for benchmarking fuzzy clustering algorithms.


## Package Overview
FVIBES clustering involves three primary steps:

1. Build an adjacency matrix from the data using `make.adjacency()`

2. Perform fuzzy spectral clustering using `fuzzy.spectral.clustering()`

3. Optionally, examine the results with the 2D visualization function `plot.fuzzy()`, or compare them to the true class labels using `clustering.accuracy()`.

# Installation

Install the latest version of **FuzzySpec** from [GitHub](https://github.com/ghashti-j/FuzzySpec) with the following:

```{r, eval = FALSE}
library(devtools)
install_github("ghashti-j/FuzzySpec")
library(FuzzySpec)
```

# Sample Usage

The basic steps using the built-in functions are provided below.

1. First, we generate the synthetic dataset `spirals`; see the help file for `gen.fuzzy()` for more options and information.

```{r, fig.align='center'}
set.seed(1)
data <- FuzzySpec::gen.fuzzy(n = 300, dataset = "spirals", noise = 0.15) # data generation
FuzzySpec::plot.fuzzy(data, plotFuzzy = TRUE, colorCluster = TRUE) # plot data generating process
```

2. Build a variable-weighted locally-adaptive adjacency matrix, corresponding to the adjacency $\mathbf{W}^{(\text{vwla-id})}$ in Ghashti et al. (2025):

```{r, message = FALSE}
W <- FuzzySpec::make.adjacency(
  data = data$X,
  method = "vw",           # variable-weighted distances
  isLocWeighted = TRUE,    # Locally-adaptive scaling
  scale = FALSE            # scaling not required for kernel methods
)
```

3. Perform fuzzy spectral clustering given the adjacency matrix $\mathbf{W}$, the number of clusters `k = 3`, and the commonly chosen fuzziness parameter `m = 1.5`. We display the first 5 rows of the membership matrix $\mathbf{U}$:

```{r}
res <- FuzzySpec::fuzzy.spectral.clustering(
  W = W, k = 3, m = 1.5, method = "CM"           
)
res$u[1:5,]
```
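
For intuition, the pipeline behind this step can be sketched in base R. The sketch below is illustrative only and is not the **FuzzySpec** implementation: it assumes the normalized-affinity embedding of Ng et al. (2001) followed by fuzzy c-means (Bezdek, 1981), and it assumes that `method = "CM"` refers to c-means. The helper `fuzzy_spectral_sketch()` and the toy matrix `W_toy` are hypothetical names introduced for this example.

```r
# Illustrative sketch only; this is NOT the FuzzySpec implementation.
# `fuzzy_spectral_sketch` is a hypothetical helper written for this example.
fuzzy_spectral_sketch <- function(W, k, m = 1.5, iters = 100, tol = 1e-8) {
  # 1. Normalized affinity D^{-1/2} W D^{-1/2} (Ng, Jordan and Weiss, 2001)
  d <- rowSums(W)
  L <- W / sqrt(outer(d, d))
  # 2. Embed each observation with the k leading eigenvectors,
  #    then rescale every row to unit length
  V <- eigen(L, symmetric = TRUE)$vectors[, seq_len(k), drop = FALSE]
  Y <- V / sqrt(rowSums(V^2))
  # 3. Fuzzy c-means on the embedding, started from k-means centers
  C <- kmeans(Y, centers = k, nstart = 10)$centers
  for (it in seq_len(iters)) {
    # squared distance from every embedded point to every center (n x k)
    D2 <- sapply(seq_len(k), function(j) rowSums(sweep(Y, 2, C[j, ])^2))
    D2 <- pmax(D2, 1e-12)                    # guard against division by zero
    U <- D2^(-1 / (m - 1))                   # unnormalized memberships
    U <- U / rowSums(U)                      # each row sums to one
    C_new <- (t(U^m) %*% Y) / colSums(U^m)   # membership-weighted centers
    converged <- max(abs(C_new - C)) < tol
    C <- C_new
    if (converged) break
  }
  list(u = U, cluster = max.col(U, ties.method = "first"))
}

# Toy example: two well-separated Gaussian blobs and a Gaussian-kernel adjacency
set.seed(1)
X_toy <- rbind(matrix(rnorm(40), ncol = 2),
               matrix(rnorm(40, mean = 6), ncol = 2))
W_toy <- exp(-as.matrix(dist(X_toy))^2 / 2)
diag(W_toy) <- 0
fit <- fuzzy_spectral_sketch(W_toy, k = 2)
round(head(fit$u, 3), 3)   # fuzzy memberships of the first three observations
table(fit$cluster)         # hard labels obtained from the largest membership
```

Each row of the returned membership matrix sums to one, and the hard labels are the column with the largest membership in each row, mirroring the roles of `res$u` and `res$cluster` above.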

4. We can compare the hard clustering results to the true class labels:

```{r}
acc <- FuzzySpec::clustering.accuracy(data$y, res$cluster)
cat("Clustering accuracy:", round(acc, 3), "\n")
```
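
Because cluster labels are arbitrary, an accuracy measure has to match predicted labels to true labels before counting agreements. A minimal sketch of one common approach, which maximizes agreement over all label permutations, is given below. It is an assumption made for illustration and not necessarily how `clustering.accuracy()` is implemented; `permutations()` and `best_match_accuracy()` are hypothetical helpers, and the brute-force search is only practical for a small number of clusters.

```r
# Hypothetical helpers for illustration; not part of FuzzySpec.
# All permutations of a vector, built recursively (fine for small k).
permutations <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in permutations(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
  }
  out
}

best_match_accuracy <- function(truth, pred) {
  tab <- table(truth, pred)            # rows = true labels, columns = predicted
  stopifnot(nrow(tab) == ncol(tab))    # sketch assumes equally many labels
  k <- nrow(tab)
  # number of agreements under each one-to-one relabelling of the clusters
  hits <- sapply(permutations(seq_len(k)),
                 function(p) sum(tab[cbind(seq_len(k), p)]))
  max(hits) / length(truth)
}

truth <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
pred  <- c(2, 2, 2, 3, 3, 1, 1, 1, 1)  # same grouping, relabelled, one error
best_match_accuracy(truth, pred)       # 8 of 9 observations agree (about 0.889)
```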

5. We can compare the membership matrix $\mathbf{U}$ determined by FVIBES to the true probabilistic cluster memberships with the function `fari()`, which computes fuzzy generalizations of the Adjusted Rand Index (FARI) based on Frobenius inner products of membership matrices (Andrews, Browne and Hvingelby, 2022).

```{r}
far <- FuzzySpec::fari(data$U, res$u)
cat("FARI:", round(far, 3), "\n")
```
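
To see why Frobenius inner products are convenient for comparing fuzzy partitions, note that the co-membership matrix $\mathbf{U}\mathbf{U}^\top$ does not change when cluster labels are permuted, so no label matching is required. The base R sketch below illustrates only this building block with a simple cosine-type similarity. It is not the FARI formula (in particular, it applies no adjustment for chance), and the toy matrices are assumptions made for the example.

```r
# Illustration of label-invariant co-membership matrices; NOT the FARI itself.
U1 <- matrix(c(0.9, 0.1,
               0.8, 0.2,
               0.3, 0.7), ncol = 2, byrow = TRUE)
U2 <- U1[, c(2, 1)]                   # same fuzzy partition, labels swapped

A <- tcrossprod(U1)                   # U1 %*% t(U1): n x n co-membership matrix
B <- tcrossprod(U2)

frob <- function(P, Q) sum(P * Q)     # Frobenius inner product
isTRUE(all.equal(A, B))               # TRUE: relabelling leaves U U^T unchanged

# Unadjusted cosine-type similarity between the two co-membership matrices
cos_sim <- frob(A, B) / sqrt(frob(A, A) * frob(B, B))
cos_sim                               # equals 1 here because A and B coincide
```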

6. Finally, we can visualize the clustering results, with observations colored by their hard cluster labels and sized by their memberships in $\mathbf{U}$:

```{r, fig.align='center'}
resDF <- list(
  X = data$X, U = res$u, y = factor(res$cluster), k = 3
)
FuzzySpec::plot.fuzzy(resDF, plotFuzzy = TRUE, colorCluster = TRUE)
```

## Adjacency Construction
Here we give a basic overview of the arguments of `make.adjacency()`, which supports the flexible adjacency matrix constructions described in Ghashti et al. (2025); see the help file of each function for full details. The main arguments are as follows:

* `method`: distance metric
    + `"eu"`: squared Euclidean distance
    + `"vw"`: variable-weighted distance using kernel density bandwidth estimation
* `isLocWeighted`: scaling approach
    + `TRUE`: locally-adaptive scaling (Zelnik-Manor & Perona, 2004)
    + `FALSE`: global scaling with parameter `sig`
* `isModWeighted`: apply similarity weightings
    + `ModMethod = "snn"`: shared nearest neighbors (Jarvis & Patrick, 1973)
    + `ModMethod = "sim"`: similarity-based weighting
    + `ModMethod = "both"`: combined SNN and SIM
* `isSparse`: returns a sparse matrix when using weightings
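
To make the `isLocWeighted = TRUE` option more concrete, the locally-adaptive scaling of Zelnik-Manor and Perona (2004) can be sketched in base R as $W_{ij} = \exp\{-d_{ij}^2 / (\sigma_i \sigma_j)\}$, where $\sigma_i$ is the distance from observation $i$ to its $K$-th nearest neighbor. This is a minimal sketch with plain Euclidean distances, not the package's implementation; `local_scaling_adjacency()` and its argument `k_nn` are hypothetical names, and the default of 7 neighbors follows the choice reported in that paper.

```r
# Sketch of locally-adaptive (self-tuning) scaling; NOT the FuzzySpec code.
# `local_scaling_adjacency` and `k_nn` are hypothetical names for illustration.
local_scaling_adjacency <- function(X, k_nn = 7) {
  D <- as.matrix(dist(X))                # pairwise Euclidean distances
  # sigma_i: distance from point i to its k_nn-th nearest neighbor
  # (index k_nn + 1 after sorting, because index 1 is the point itself)
  sigma <- apply(D, 1, function(d) sort(d)[k_nn + 1])
  W <- exp(-D^2 / outer(sigma, sigma))   # exp(-d_ij^2 / (sigma_i * sigma_j))
  diag(W) <- 0                           # no self-loops
  W
}

set.seed(1)
X_demo <- rbind(matrix(rnorm(40), ncol = 2),
                matrix(rnorm(40, mean = 5), ncol = 2))
W_loc <- local_scaling_adjacency(X_demo)
isSymmetric(W_loc)   # the scaled adjacency remains symmetric
range(W_loc)         # similarities lie between 0 and 1
```

Because each pair of observations is scaled by its own local neighborhood sizes, dense and sparse regions of the data receive comparable similarity values without tuning a single global `sig`.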

**References**

* Andrews, J.L., Browne, R. and C.D. Hvingelby (2022). On Assessments of Agreement Between Fuzzy Partitions. _Journal of Classification, 39_, 326–342.

* Bezdek, J. C. (1981). _Pattern Recognition with Fuzzy Objective Function Algorithms_. Plenum Press, New York.

* Coombes, K. R. (2025). _Thresher: Threshing and Reaping for Principal Components_. R package version 1.1.5.

* Ferraro, M.B., Giordani, P., and A. Serafini (2019). fclust: An R Package for Fuzzy Clustering. _The R Journal, 11_.
  
* Ghashti, J. S., Hare, W., and J. R. J. Thompson (2025). Variable-weighted adjacency constructions for fuzzy spectral clustering. Submitted.

* Jarvis, R. A., and A. E. Patrick (1973). Clustering using a similarity measure based on shared near neighbors. _IEEE Transactions on Computers, 22_(11), 1025–1034.
  
* Hayfield, T., and J. S. Racine (2008). Nonparametric Econometrics: The np Package. _Journal of Statistical Software, 27_(5).

* McLachlan, G. and T. Krishnan (2008). _The EM algorithm and extensions_, Second Edition. John Wiley & Sons.

* Ng, A., Jordan, M., and Y. Weiss (2001). On spectral clustering: Analysis and an algorithm. _Advances in Neural Information Processing Systems, 14_.

* Scrucca, L., Fraley, C., Murphy, T.B., and A. E. Raftery (2023). _Model-Based Clustering, Classification, and Density Estimation Using mclust in R_. Chapman & Hall.
  
* Wickham, H. (2016). _ggplot2: Elegant Graphics for Data Analysis_. Springer-Verlag New York.

* Zelnik-Manor, L., and P. Perona (2004). Self-tuning spectral clustering. _Advances in Neural Information Processing Systems, 17_.

* Zhu, Q., Feng, J., and J. Huang (2016). Natural neighbor: A self-adaptive neighborhood method without parameter K. _Pattern Recognition Letters, 80_, 30–36.
