generalized linear model


Given a random vector \textbf{Y}, called the \emph{response variable}, a \emph{generalized linear model}, or GLM for short, is a statistical model $\lbrace f_\textbf{Y}(\boldsymbol{y}\mid\boldsymbol{\theta})\rbrace$ such that
\begin{enumerate}
\item the components of \textbf{Y} are mutually independent,
\item $f_{Y_i}(y_i\mid\theta_i)$ belongs to the exponential family of distributions and has the following canonical form:
$$f_{Y_i}(y_i\mid\theta_i)=\operatorname{exp}\left[y_i\theta_i-b(\theta_i)+c(y_i)\right],$$
where the parameter $\theta_i$ is called the \emph{canonical parameter} and $b(\theta_i)$ is called the \emph{cumulant function},
\item for each component or variate $Y_i$, with a corresponding set of $p$ covariates $X_{ij}$, there exists a monotone differentiable function $g$, called the \emph{link function}, such that
$$g(\operatorname{E}[Y_i])={\textbf{X}_i}^{\operatorname{T}}\boldsymbol{\beta},$$
where ${\textbf{X}_i}^{\operatorname{T}}=(X_{i1},\ldots,X_{ip})$, and $\boldsymbol{\beta}=(\beta_1,\ldots,\beta_p)^{\operatorname{T}}$ is a parameter vector.
\end{enumerate}
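
As a worked illustration of the canonical form in the second condition, consider a Poisson variate $Y$ with mean $\mu$. This is a standard textbook case chosen here as an example; it is not part of the definition itself. Its probability function can be rewritten as

```latex
\[
f_Y(y\mid\mu)=\frac{\mu^{y}e^{-\mu}}{y!}
=\exp\left[\,y\ln\mu-\mu-\ln(y!)\,\right],
\]
```

so that $\theta=\ln\mu$, $b(\theta)=e^{\theta}$ and $c(y)=-\ln(y!)$. With the link function $g(\mu)=\ln\mu$, the canonical parameter itself equals ${\textbf{X}_i}^{\operatorname{T}}\boldsymbol{\beta}$.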

In practice, an extra parameter called the \emph{dispersion parameter}, $\phi$, is introduced into the model to account for a phenomenon known as overdispersion.  The density in a GLM then takes the form: $$f_{Y_i}(y_i\mid\theta_i)=\operatorname{exp}\left[\frac{y_i\theta_i-b(\theta_i)}{a(\phi)}+c(y_i,\phi)\right]$$
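
The form with a dispersion parameter can be checked numerically. The following Python sketch uses only the standard library and assumes the usual decomposition of the normal density $N(\mu,\sigma^2)$: $\theta=\mu$, $b(\theta)=\theta^2/2$, $a(\phi)=\phi=\sigma^2$ and $c(y,\phi)=-y^2/(2\phi)-\frac{1}{2}\ln(2\pi\phi)$. The function names are illustrative and do not come from any library.

```python
import math
from statistics import NormalDist


def glm_density(y, theta, b, a_phi, c):
    """Evaluate exp[(y*theta - b(theta)) / a(phi) + c(y, phi)]."""
    return math.exp((y * theta - b(theta)) / a_phi + c(y))


def normal_as_glm(y, mu, sigma):
    """Normal density written in the dispersion form (assumed decomposition)."""
    phi = sigma ** 2                      # dispersion parameter, a(phi) = phi
    b = lambda theta: theta ** 2 / 2      # cumulant function
    c = lambda v: -v ** 2 / (2 * phi) - 0.5 * math.log(2 * math.pi * phi)
    return glm_density(y, mu, b, phi, c)  # canonical parameter theta = mu


def max_abs_error(mu=1.3, sigma=0.7, ys=(-2.0, -0.5, 0.0, 1.0, 2.5)):
    """Largest gap between the GLM form and the reference normal density."""
    reference = NormalDist(mu, sigma)
    return max(abs(normal_as_glm(y, mu, sigma) - reference.pdf(y)) for y in ys)


if __name__ == "__main__":
    # The gap should be at the level of floating-point rounding.
    print(max_abs_error())
```

The two expressions agree because $(y\mu-\mu^2/2)/\sigma^2-y^2/(2\sigma^2)=-(y-\mu)^2/(2\sigma^2)$.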

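\item Below is a table of canonical parameters and cumulant functions for some well-known distributions from the exponential family. The entries are written for the form with dispersion parameter, where $a(\phi)$ equals $\sigma^2$ for the normal distribution, $1$ for the Poisson distribution, $1/m$ for the binomial proportion, and $1/\alpha$ for the gamma distribution with mean $\mu$:

\begin{tabular}{|l|l|l|l|}
\hline
distribution & notation & canonical parameter $\theta$ & cumulant function $b(\theta)$ \\
\hline
Normal & $N(\mu,\sigma^2)$ & $\mu$ & $\theta^2/2$ \\
Poisson & $Poisson(\mu)$ & $\operatorname{ln}\mu$ & $e^{\theta}$ \\
Binomial & $Bin(m,\pi)$ & $\operatorname{ln}\frac{\pi}{1-\pi}$ & $\operatorname{ln}(1+e^{\theta})$ \\
Gamma & $Gamma(\alpha,\lambda)$ & $-1/\mu$ & $-\operatorname{ln}(-\theta)$ \\
\hline
\end{tabular}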
\item GLM is a direct generalization of the general linear model, which includes linear regression models, ANOVA and ANCOVA.  The link function for the general linear model is the identity function $g(\mu)=\mu$.
\item For a GLM, $\operatorname{E}[Y]=b^{\prime}(\theta)$ and $\operatorname{Var}[Y]=b^{\prime\prime}(\theta)$; when a dispersion parameter is present, the variance becomes $a(\phi)\,b^{\prime\prime}(\theta)$. The function $b^{\prime\prime}(\theta)$, when expressed in terms of $\mu=\operatorname{E}[Y]$, is known as the \emph{variance function} $V(\mu)$.  Below are some examples of variance functions:

\begin{tabular}{|l|l|l|}
\hline
distribution & notation & variance function \\
\hline
Normal & $N(\mu,\sigma^2)$ & $1$ \\
Poisson & $Poisson(\mu)$ & $\mu$ \\
\PMlinkname{Binomial}{BernoulliDistribution2} & $Bin(m,\pi)$ & $\pi(1-\pi)$ \\
Gamma & $Gamma(\alpha,\lambda)$ & $\mu^2$ \\
\hline
\end{tabular}
\item The logistic regression model, where the response variable $Y$ is categorical in nature, is a special case of GLM. Possible link functions include the logit function, $\operatorname{logit}(\pi)=\operatorname{ln}(\operatorname{odds}(\pi))=\operatorname{ln}\frac{\pi}{1-\pi}$; the probit function $\Phi^{-1}(\pi)$, which is the inverse of the standard normal cumulative distribution function; and the complementary-log-log function, $\operatorname{ln}(-\operatorname{ln}(1-\pi))$. Here the parameter $\pi$ lies between 0 and 1 and is usually measured as the relative frequency of occurrence of a certain event.
\item The log-linear model, where the response variable $Y$ has a Poisson distribution, is also a special case of GLM, with the natural logarithm as link function: $g(\mu)=\operatorname{ln}\mu$, where $\mu=\operatorname{E}[Y]$.  The Poisson distribution is typically used to model count or frequency data.
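
Several of the remarks above lend themselves to a quick numerical check. The Python sketch below, standard library only, implements the three link functions named for logistic regression. It also approximates $b^{\prime}(\theta)$ and $b^{\prime\prime}(\theta)$ by central finite differences for the Poisson cumulant function $b(\theta)=e^{\theta}$, which is assumed here as a standard result. The helper names and the step size \texttt{h} are illustrative choices, not part of any library.

```python
import math
from statistics import NormalDist


def logit(p):
    """Log-odds: ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))


def probit(p):
    """Inverse of the standard normal cumulative distribution function."""
    return NormalDist().inv_cdf(p)


def cloglog(p):
    """Complementary log-log: ln(-ln(1 - p))."""
    return math.log(-math.log(1.0 - p))


def derivatives(b, theta, h=1e-4):
    """Central finite-difference approximations to b'(theta) and b''(theta)."""
    first = (b(theta + h) - b(theta - h)) / (2 * h)
    second = (b(theta + h) - 2 * b(theta) + b(theta - h)) / h ** 2
    return first, second


def poisson_mean_and_variance(mu):
    """Approximate E[Y] = b'(theta) and Var[Y] = b''(theta) for b = exp."""
    theta = math.log(mu)  # canonical parameter under the log link
    return derivatives(math.exp, theta)


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        print(p, logit(p), probit(p), cloglog(p))
    # Both values should be close to mu, since V(mu) = mu for the Poisson case.
    print(poisson_mean_and_variance(4.0))
```

All three links map the interval $(0,1)$ monotonically onto the real line, which is what allows $\pi$ to be modelled by an unrestricted linear combination of the covariates.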