Model Diagnostics¤

glmax.check(...) computes diagnostics from a fitted model without refitting. The high-level philosophy is that diagnostics are explicit strategy objects: check applies one concrete diagnostic at a time and returns the typed result for that diagnostic, which keeps the workflow easy to compose and type-check.

glmax.check(fitted: glmax.FittedGLM, *, diagnostic: glmax.AbstractDiagnostic[~T] = glmax.GoodnessOfFit()) -> ~T ¤

Assess model fit with one diagnostic and return its typed result.

The canonical check verb of the glmax grammar. Accepts one concrete AbstractDiagnostic[T] instance and returns the corresponding result T = diagnostic.diagnose(fitted).

Decorated with eqx.filter_jit; the first call JIT-compiles the diagnostic, and subsequent calls with the same input structure reuse the cached compilation.

Compute multiple diagnostics with tree_map

import jax.tree_util as jtu
import glmax

diagnostics = (
    glmax.PearsonResidual(),
    glmax.DevianceResidual(),
    glmax.GoodnessOfFit(),
)

results = jtu.tree_map(
    lambda diagnostic: glmax.check(fitted, diagnostic=diagnostic),
    diagnostics,
    is_leaf=lambda node: isinstance(node, glmax.AbstractDiagnostic),
)
pearson, deviance, gof = results

Arguments:

  • fitted: the glmax.FittedGLM to diagnose.
  • diagnostic: a concrete glmax.AbstractDiagnostic[T] instance. Defaults to glmax.GoodnessOfFit().

Returns:

One diagnostic result of type T.


Diagnostics¤

AbstractDiagnostic defines the strategy interface behind check.

glmax.AbstractDiagnostic

glmax.AbstractDiagnostic ¤

Abstract base for pluggable GLM diagnostic strategies.

Subclass and implement diagnose to define a diagnostic computation. Each concrete diagnostic encapsulates one computation and returns a typed result T (either a JAX array or an eqx.Module of arrays).

__init__(self) ¤

Initialize self. See help(type(self)) for accurate signature.

diagnose(self, fitted: glmax.FittedGLM) -> ~T ¤

Compute the diagnostic from a fitted GLM.

Example

class MyDiag(AbstractDiagnostic[Array]):
    def diagnose(self, fitted: glmax.FittedGLM) -> Array:
        return fitted.y - fitted.mu

Arguments:

  • fitted: the glmax.FittedGLM to diagnose.

Returns:

Diagnostic result of type T: a JAX array or an eqx.Module containing only JAX arrays (pytree-compatible).

glmax.PearsonResidual(glmax.AbstractDiagnostic) ¤

Pearson residuals \((y_i - \mu_i) / \sqrt{V(\mu_i)}\).

These residuals normalize the raw residual \(y_i - \mu_i\) by the square root of the family variance function \(V(\mu_i)\), where \(y_i\) is the observed response for observation \(i\) and \(\mu_i\) is the fitted mean.

__init__(self) ¤

Initialize self. See help(type(self)) for accurate signature.

diagnose(self, fitted: glmax.FittedGLM) -> Array ¤

Compute Pearson residuals.

The residual for observation \(i\) is \(r_i = (y_i - \mu_i) / \sqrt{V(\mu_i)}\), where \(V(\mu_i)\) is the family variance function evaluated at the fitted mean \(\mu_i\).

Arguments:

  • fitted: the glmax.FittedGLM whose residuals are computed.

Returns:

Pearson residuals, shape (n,).
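For the Poisson family the variance function is \(V(\mu) = \mu\), so the formula above reduces to \((y_i - \mu_i)/\sqrt{\mu_i}\). A minimal plain-Python sketch of that arithmetic (the y and mu values are made up for illustration; this is not the glmax implementation):

```python
# Illustrative sketch: Pearson residuals for a Poisson fit, where V(mu) = mu.
import math

y = [2.0, 0.0, 5.0, 3.0]   # observed counts (hypothetical)
mu = [1.5, 0.5, 4.0, 3.0]  # fitted means (hypothetical)

# r_i = (y_i - mu_i) / sqrt(V(mu_i)) with V(mu) = mu for Poisson.
pearson = [(yi - mi) / math.sqrt(mi) for yi, mi in zip(y, mu)]
```

When the fit is exact for an observation (here the last one, y = mu = 3), its Pearson residual is zero.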


glmax.DevianceResidual(glmax.AbstractDiagnostic) ¤

Deviance residuals \(\operatorname{sign}(y_i - \mu_i) \sqrt{d_i}\).

Here \(y_i\) is the observed response, \(\mu_i\) is the fitted mean, and \(d_i\) is the deviance contribution for observation \(i\).

__init__(self) ¤

Initialize self. See help(type(self)) for accurate signature.

diagnose(self, fitted: glmax.FittedGLM) -> Array ¤

Compute deviance residuals.

The residual for observation \(i\) is \(r_i = \operatorname{sign}(y_i - \mu_i)\sqrt{d_i}\), where \(d_i\) is the per-observation deviance contribution.

Arguments:

  • fitted: the glmax.FittedGLM whose residuals are computed.

Returns:

Deviance residuals, shape (n,).
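For the Poisson family the unit deviance is \(d_i = 2\,(y_i \log(y_i/\mu_i) - (y_i - \mu_i))\), with the \(y \log(y/\mu)\) term taken as 0 (its limit) when \(y = 0\). A hedged plain-Python sketch, with made-up y and mu values, showing how the sign and square root combine:

```python
# Illustrative sketch: deviance residuals for a Poisson fit.
import math

y = [2.0, 0.0, 5.0]   # observed counts (hypothetical)
mu = [1.5, 0.5, 4.0]  # fitted means (hypothetical)

def poisson_unit_deviance(yi, mi):
    # d_i = 2*(y*log(y/mu) - (y - mu)); y*log(y/mu) -> 0 as y -> 0.
    term = yi * math.log(yi / mi) if yi > 0 else 0.0
    return 2.0 * (term - (yi - mi))

# r_i = sign(y_i - mu_i) * sqrt(d_i)
dev = [math.copysign(math.sqrt(poisson_unit_deviance(yi, mi)), yi - mi)
       for yi, mi in zip(y, mu)]
```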


glmax.QuantileResidual(glmax.AbstractDiagnostic) ¤

Deterministic quantile residuals via a mid-quantile approximation.

For discrete families (Poisson, Binomial, NegativeBinomial) this uses \(\Phi^{-1}((F(y_i) + F(y_i - 1))/2)\). For continuous families (Gaussian, Gamma) it uses \(\Phi^{-1}(F(y_i))\). Here \(F\) is the fitted cumulative distribution function and \(\Phi^{-1}\) is the standard normal quantile function.

CDF values are clamped to \([\varepsilon, 1-\varepsilon]\) before the normal quantile function to prevent infinite outputs, where \(\varepsilon\) is machine epsilon for float64.

__init__(self) ¤

Initialize self. See help(type(self)) for accurate signature.

diagnose(self, fitted: glmax.FittedGLM) -> Array ¤

Compute deterministic quantile residuals.

For discrete responses this uses the mid-quantile correction \(\Phi^{-1}((F(y_i) + F(y_i - 1))/2)\). For continuous responses this uses \(\Phi^{-1}(F(y_i))\).

Arguments:

  • fitted: the glmax.FittedGLM whose residuals are computed.

Returns:

Quantile residuals, shape (n,).
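A stdlib-only sketch of the discrete mid-quantile recipe for a Poisson response, using statistics.NormalDist for \(\Phi^{-1}\) and direct summation for the Poisson CDF. The inputs are made up for illustration; this is not the glmax implementation:

```python
# Illustrative sketch: mid-quantile residual for a Poisson observation.
import math
import sys
from statistics import NormalDist

def poisson_cdf(k, mu):
    # F(k) = sum_{j=0}^{k} e^{-mu} mu^j / j!, with F(-1) = 0 by convention.
    if k < 0:
        return 0.0
    return math.exp(-mu) * sum(mu ** j / math.factorial(j) for j in range(k + 1))

def mid_quantile_residual(y, mu):
    # Discrete case: Phi^{-1}((F(y) + F(y-1)) / 2),
    # clamped to [eps, 1 - eps] to avoid infinite outputs.
    eps = sys.float_info.epsilon
    u = 0.5 * (poisson_cdf(y, mu) + poisson_cdf(y - 1, mu))
    u = min(max(u, eps), 1.0 - eps)
    return NormalDist().inv_cdf(u)

r = mid_quantile_residual(3, 2.5)
```

For a well-fitting model these residuals are approximately standard normal.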


glmax.GoodnessOfFit(glmax.AbstractDiagnostic) ¤

Goodness-of-fit summary diagnostic.

Computes scalar summaries based on deviance, Pearson residual scale, and information criteria derived from the fitted model.

__init__(self) ¤

Initialize self. See help(type(self)) for accurate signature.

diagnose(self, fitted: glmax.FittedGLM) -> glmax.GofStats ¤

Compute goodness-of-fit statistics.

This computes \(D\), \(\chi^2\), \(\hat{\phi}\), \(\mathrm{AIC}\), and \(\mathrm{BIC}\), where \(D\) is total deviance, \(\chi^2\) is the Pearson chi-squared statistic, and \(\hat{\phi}\) is the fitted dispersion.

Arguments:

  • fitted: the glmax.FittedGLM to summarize.

Returns:

GofStats with scalar array fields.
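The information-criterion and dispersion definitions can be checked by hand. A sketch with made-up values for the log-likelihood and Pearson statistic (these are hypothetical numbers, not outputs of glmax):

```python
# Illustrative arithmetic for the GofStats quantities.
import math

loglik = -142.3   # fitted log-likelihood (hypothetical)
chi2 = 103.7      # Pearson chi-squared statistic (hypothetical)
n, p = 100, 4     # observations and coefficients

aic = -2.0 * loglik + 2.0 * p          # AIC = -2l + 2p
bic = -2.0 * loglik + p * math.log(n)  # BIC = -2l + p*log(n)
df_resid = n - p                       # residual degrees of freedom
dispersion = chi2 / df_resid           # Pearson-based dispersion estimate
```

Since \(\log n > 2\) whenever \(n > e^2 \approx 7.4\), BIC penalizes the coefficient count more heavily than AIC here.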


glmax.Influence(glmax.AbstractDiagnostic) ¤

Leverage and Cook's distance via Cholesky-based hat-matrix computation.

Recomputes \(\operatorname{chol}(X^\top W X)\) from the fitted weights, where \(X\) is the design matrix and \(W\) is the diagonal matrix of working weights. It does not rely on the Cholesky factor from IRLS because that factor is not persisted in glmax.FitResult.

__init__(self) ¤

Initialize self. See help(type(self)) for accurate signature.

diagnose(self, fitted: glmax.FittedGLM) -> glmax.InfluenceStats ¤

Compute leverage and Cook's distance.

The leverage values are the diagonal elements \(h_{ii}\) of the hat matrix. Cook's distance is computed from \(h_{ii}\), the Pearson residual, and the coefficient count \(p\).

Arguments:

  • fitted: the glmax.FittedGLM whose influence measures are computed.

Returns:

InfluenceStats with leverage and cooks_distance, each shape (n,).
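A NumPy sketch of the same Cholesky-based recipe: leverage values as squared column norms of \(L^{-1} X^\top W^{1/2}\) where \(L L^\top = X^\top W X\), and Cook's distance from one common GLM convention, \(D_i = r_i^2 h_{ii} / (p\,\phi\,(1 - h_{ii})^2)\) with Pearson residuals \(r_i\). The design matrix, weights, and residuals are randomly generated for illustration; this is not the glmax implementation:

```python
# Illustrative sketch: leverage and Cook's distance via a Cholesky solve.
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])  # design matrix
w = rng.uniform(0.5, 2.0, size=n)                               # working weights

# Leverage: diagonal of H = W^{1/2} X (X^T W X)^{-1} X^T W^{1/2},
# computed without forming the n x n hat matrix or an explicit inverse.
XtWX = X.T @ (w[:, None] * X)
L = np.linalg.cholesky(XtWX)
Z = np.linalg.solve(L, (np.sqrt(w)[:, None] * X).T)  # Z = L^{-1} X^T W^{1/2}
leverage = np.sum(Z**2, axis=0)                      # h_ii = ||L^{-1} a_i||^2

# Cook's distance from leverage and Pearson residuals (hypothetical values).
r = rng.normal(size=n)   # stand-in Pearson residuals
phi = 1.0                # stand-in dispersion
cooks = (r**2 / (p * phi)) * leverage / (1.0 - leverage) ** 2
```

The squared-column-norm trick keeps the cost at one \(p \times p\) Cholesky factorization plus a triangular solve, and the leverage values sum to \(p\), a useful sanity check.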

Diagnostic results¤

glmax.GofStats ¤

Goodness-of-fit statistics for a fitted GLM.

All fields are scalar JAX arrays. Pytree-compatible.

Fields:

  • deviance: total deviance \(D = \sum_i d_i\), where \(d_i\) is the deviance contribution for observation \(i\).
  • pearson_chi2: Pearson chi-squared statistic \(\chi^2 = \sum_i (y_i - \mu_i)^2 / V(\mu_i)\).
  • df_resid: residual degrees of freedom \(n - p\), where \(n\) is the number of observations and \(p\) is the number of coefficients.
  • dispersion: fitted dispersion estimate \(\hat{\phi}\).
  • aic: Akaike information criterion \(\mathrm{AIC} = -2 \ell + 2p\), where \(\ell\) is the fitted log-likelihood.
  • bic: Bayesian information criterion \(\mathrm{BIC} = -2 \ell + p \log n\).
__init__(self, deviance: Array, pearson_chi2: Array, df_resid: Array, dispersion: Array, aic: Array, bic: Array) ¤

Initialize self. See help(type(self)) for accurate signature.

glmax.InfluenceStats ¤

Per-observation influence statistics.

Fields:

  • leverage: hat-matrix diagonal \(h_{ii} \in (0, 1)\), shape (n,).
  • cooks_distance: Cook's distance \(D_i \geq 0\), shape (n,), where \(D_i\) measures the influence of observation \(i\) on the fitted coefficient vector.
__init__(self, leverage: Array, cooks_distance: Array) ¤

Initialize self. See help(type(self)) for accurate signature.