---
title: "Differentially Expressed Heterogeneous Overdispersion Gene Test for Count Data"
author:
  - name: "Qi Xu"
    email: "qixu@andrew.cmu.edu"
    affiliation: "Carnegie Mellon University"
  - name: "Arlina Shen"
    email: "arlinashen@yahoo.com"
    affiliation: "Independent Researcher"
  - name: "Yubai Yuan"
    email: "yvy5509@psu.edu"
    affiliation: "Pennsylvania State University"
  - name: "Annie Qu"
    email: "aqu2@uci.edu"
    affiliation: "University of California, Irvine"
date: "`r BiocStyle::doc_date()`"
package: "`r BiocStyle::pkg_ver('DEHOGT')`"
output:
  BiocStyle::html_document:
    toc: true
    toc_depth: 2
vignette: >
  %\VignetteIndexEntry{DEHOGT: Differentially Expressed Heterogeneous Overdispersion Gene Test for Count Data}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
abstract: >
  This vignette describes the use of DEHOGT, a package for identifying differentially expressed genes using a generalized linear model approach. The package supports quasi-Poisson and negative binomial models to accommodate overdispersion in gene expression count data, integrating factors such as treatment effects, normalization, and significance testing.
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```


# Introduction

DEHOGT is designed to handle overdispersion in count data using a generalized linear model (GLM) framework. The package supports quasi-Poisson and negative binomial models, making it useful for differential expression analysis of RNA-seq and other count-based data types.

# Installation


```{r installation, eval=FALSE}
if (!require("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}

BiocManager::install("DEHOGT")
```

# Example Worklow

In this example, we simulate gene expression data and perform differential expression analysis using the quasi-Poisson model. We also show how to incorporate covariates and normalization factors.

```{r exampleWorkflow}
## Simulate gene expression data (100 genes, 10 samples)
data <- matrix(rpois(1000, 10), nrow = 100, ncol = 10)

## Randomly assign treatment groups
treatment <- sample(0:1, 10, replace = TRUE)
## Load DEHOGT package
library(DEHOGT)

## Run the function with 2 CPU cores
result <- dehogt_func(data, treatment, num_cores = 2)

## Display results
head(result$pvals)

# Example: Adding covariates and normalization factors
covariates <- matrix(rnorm(1000), nrow = 100, ncol = 10)
norm_factors <- rep(1, 10)

# Run with covariates and normalization factors
result_cov <- dehogt_func(data, treatment, covariates = covariates, norm_factors = norm_factors, num_cores = 2)
```
# Session Info
```{r sessionInfo, results='asis'}
sessionInfo()
```

