quasibin {aod} | R Documentation |
The function fits the generalized linear model “II” proposed by Williams (1982) accounting for overdispersion in clustered binomial data (n, y).
quasibin(formula, data, link = c("logit", "cloglog"), phi = NULL, tol = 0.001)
formula |
A formula for the fixed effects. The left-hand side of the formula must be of the form
|
link |
The link function for the mean p: “logit” or “cloglog”. |
data |
A data frame containing the response ( |
phi |
When |
tol |
A positive scalar (default to 0.001). The algorithm stops at iteration r + 1 when the condition χ{^2}[r+1] - χ{^2}[r] <= tol is met by the chi-squared statistics . |
For a given cluster (n, y), the model is:
y | λ ~ Binomial(n, λ)
with λ a random variable of mean E[λ] = p
and variance Var[λ] = φ * p * (1 - p).
The marginal mean and variance are:
E[y] = p
Var[y] = p * (1 - p) * [1 + (n - 1) * φ]
The overdispersion parameter φ corresponds to the intra-cluster correlation coefficient,
which is here restricted to be positive.
The function uses the function glm
and the parameterization: p = h(X b) = h(η), where h is the
inverse of a given link function, X is a design-matrix, b is a vector of fixed effects and η = X b
is the linear predictor.
The estimate of b maximizes the quasi log-likelihood of the marginal model.
The parameter φ is estimated with the moment method or can be set to a constant
(a regular glim is fitted when φ is set to zero). The literature recommends to estimate φ
from the saturated model. Several explanatory variables are allowed in b. None is allowed in φ.
An object of formal class “glimQL”: see glimQL-class
for details.
Matthieu Lesnoff matthieu.lesnoff@cirad.fr, Renaud Lancelot renaud.lancelot@cirad.fr
Moore, D.F., 1987, Modelling the extraneous variance in the presence of extra-binomial variation.
Appl. Statist. 36, 8-14.
Williams, D.A., 1982, Extra-binomial variation in logistic linear models. Appl. Statist. 31, 144-148.
glm
, geese
in the contributed package geepack,
glm.binomial.disp
in the contributed package dispmod.
data(orob2) fm1 <- glm(cbind(y, n - y) ~ seed * root, family = binomial, data = orob2) fm2 <- quasibin(cbind(y, n - y) ~ seed * root, data = orob2, phi = 0) fm3 <- quasibin(cbind(y, n - y) ~ seed * root, data = orob2) rbind(fm1 = coef(fm1), fm2 = coef(fm2), fm3 = coef(fm3)) # show the model fm3 # dispersion parameter and goodness-of-fit statistic c(phi = fm3@phi, X2 = sum(residuals(fm3, type = "pearson")^2)) # model predictions predfm1 <- predict(fm1, type = "response", se = TRUE) predfm3 <- predict(fm3, type = "response", se = TRUE) New <- expand.grid(seed = levels(orob2$seed), root = levels(orob2$root)) predict(fm3, New, se = TRUE, type = "response") data.frame(orob2, p1 = predfm1$fit, se.p1 = predfm1$se.fit, p3 = predfm3$fit, se.p3 = predfm3$se.fit) fm4 <- quasibin(cbind(y, n - y) ~ seed + root, data = orob2, phi = fm3@phi) # Pearson's chi-squared goodness-of-fit statistic # compare with fm3's X2 sum(residuals(fm4, type = "pearson")^2)