%\VignetteIndexEntry{Implementation of lm.beta}
%\VignetteKeywords{linear regression, standardizing, standardized coefficient, beta}
% !Rnw weave = knitr
% \VignetteEngine{knitr::knitr}
\documentclass{article}
\usepackage[latin1]{inputenc}
\usepackage[english]{babel}

\usepackage{amsmath}
\usepackage{hyperref}

\usepackage[left=2.5cm,right=2cm,top=1.5cm,bottom=1.5cm]{geometry}
\pagestyle{empty}

\author{Stefan Behrendt}
\title{Implementation of \texttt{lm.beta}}
\date{January 01, 2023}

\parindent0pt
\parskip 10pt plus 1pt minus 1pt

\begin{document}
\maketitle\thispagestyle{empty}

The package \texttt{lm.beta} is based on equation (\ref{fct:impl}) to estimate the standardized regression coefficients.

\begin{equation}\label{fct:impl}
\hat{\beta}_i = \hat{b}_i \cdot \dfrac{s(X_i)}{s(Y)}
\end{equation}

using

\begin{equation*}
s(A) = \sqrt{\dfrac{\sum_j w_j\cdot (A_j - m(A) \cdot I)^2}{(n_w - 1) / n_w \cdot\sum_j w_j}}
\end{equation*}

\begin{equation*}
m(A) = \dfrac{\sum_j w_j\cdot A_j}{\sum_j w_j}
\end{equation*}

with
\begin{itemize}
\item $\hat{\beta}_i$ the $i$-th standardized regression coefficient
\item $\hat{b}_i$ the $i$-th unstandardized regression coefficient
\item $I = \left\lbrace \begin{matrix} 0/1 & \text{for models without intercept*} \\ 1 & \text{for models with intercept} \end{matrix}  \right. $
\begin{itemize}
\item[*] argument \texttt{complete.standardization} chooses the factor: \texttt{complete.standardization = FALSE} $\Rightarrow I=0$ / \texttt{complete.standardization = TRUE} $\Rightarrow I=1$
\item[*] IBM\textsuperscript{\textregistered} SPSS Statistics\textsuperscript{\textregistered}, e.g., always uses $I=0$ for models without intercept
\item[*] see e.g. \url{https://online.stat.psu.edu/~ajw13/stat501/SpecialTopics/Reg_thru_origin.pdf}\footnote{Eisenhauer J.G. (2003). Regression through the Origin. \textit{Teaching Statistics}, 25(3), p. 76-80.} for further information on which $I$ to choose
\end{itemize}
\item $Y$ the dependent variable
\item $X_i$ the $i$-th independent variable
\item $w$ the case weights
\item $n_w$ the number of non-zero weights
\end{itemize}

\clearpage
A simplification for $I=1$ is shown in equation (\ref{fct:i}) and for $I=0$ in equation (\ref{fct:ii}).

\begin{equation}\label{fct:i}
\hat{\beta}_i = \hat{b}_i \cdot \dfrac{s_{X_i}}{s_Y}
\end{equation}

\begin{equation}\label{fct:ii}
\hat{\beta}_i = \hat{b}_i \cdot \dfrac{\sigma_{X_i}}{\sigma_Y}
\end{equation}

with (additionally to above)
\begin{itemize}
\item $s_A$ the standard deviation of $A$ (*)
\item $\sigma_A = \sqrt{\sum_j A_j^2}$ an estimate of the uncentered second moment of $A$ (*)
\begin{itemize}
\item[*] The sample size---and the different methods for correcting it---doesn't have to be considered when estimating the moments, because the factors would be similar in numerater and denominater, and therefore would be reduced.
\end{itemize}
\end{itemize}

Simplifications of non-weighted cases are
\begin{equation*}
s(A) = \sqrt{\dfrac{\sum_j (A_j - m(A) \cdot I)^2}{n - 1}}
\end{equation*}

\begin{equation*}
m(A) = \dfrac{\sum_j A_j}{n}
\end{equation*}

with (additionally to above)
\begin{itemize}
\item $n$ the number of non-empty cases
\end{itemize}

\end{document}