.. _glm:

Generalized Linear Model
========================

Let :math:`\left\{\phi_i(\cdot)\right\}_{i=1}^m` be a set of basis functions
(see :ref:`map-basis`). We think of a **Generalized Linear Model** (GLM) as a
parametrization of a subspace of the functions
:math:`\mathbf{f}:\mathbb{R}^d\rightarrow \mathbb{R}^q`:

.. math::
    :label: glm

    \mathbf{f}(\mathbf{x}; \mathbf{W}) = \boldsymbol{\phi}(\mathbf{x})^T\mathbf{W},

where :math:`\mathbf{W}\in\mathbb{R}^{m\times q}` is the weight matrix, and

.. math::
    :label: basis

    \boldsymbol{\phi}(\mathbf{x}) = \left(\phi_1(\mathbf{x}), \dots, \phi_m(\mathbf{x})\right).

Usually, the weights :math:`\mathbf{W}` are not fixed. Instead, each of its
columns has a multivariate Gaussian distribution:

.. math::
    :label: post-weights

    \mathbf{W}_j \sim \mathcal{N}_m\left(\mathbf{W}_j | \mathbf{M}_j, \boldsymbol{\Sigma}\right),

for :math:`j=1,\dots,q`, where :math:`\mathbf{A}_j` denotes the
:math:`j`-th column of a matrix :math:`\mathbf{A}`, and
:math:`\mathbf{M}_j\in\mathbb{R}^m` and the positive semi-definite
:math:`\boldsymbol{\Sigma}\in\mathbb{R}^{m\times m}` are the mean of
column :math:`j` and the covariance matrix, respectively.
Notice that we have restricted our attention to a covariance matrix that is
shared by all output dimensions. This is very restrictive but, in practice,
there are ways around this problem. A more general definition would make it
extremely difficult to store all the required information, since we would
need a :math:`(qm)\times(qm)` covariance matrix. In any case, this is the
model we use in our RVM paper.

.. note::

    The distribution of the weights is to be thought of as the posterior
    distribution of the weights that occurs when you attempt to fit the
    model to some data.

Allowing for the possibility of some Gaussian noise, the predictive
distribution for the output :math:`\mathbf{y}` at the input point
:math:`\mathbf{x}` is given by:

.. math::
    :label: pred-y

    p(\mathbf{y} | \mathbf{x}) = \mathcal{N}_q\left(\mathbf{y} | \mathbf{m}(\mathbf{x}), \boldsymbol{\sigma}^2(\mathbf{x})\mathbf{I}_q\right),

where :math:`\mathbf{I}_q` is the :math:`q`-dimensional identity matrix, while
the mean and the variance at :math:`\mathbf{x}` are given by:

.. math::
    :label: pred-mean-var

    \mathbf{m}(\mathbf{x}) = \boldsymbol{\phi}(\mathbf{x})^T \mathbf{M},\;\;
    \boldsymbol{\sigma}^2(\mathbf{x}) = \beta^{-1} + \boldsymbol{\phi}(\mathbf{x})^T\boldsymbol{\Sigma} \boldsymbol{\phi}(\mathbf{x}),

with :math:`\mathbf{M}` being the matrix of mean weights and :math:`\beta`
being the noise precision (i.e., the inverse variance).

In BEST, we represent the GLM by a :class:`best.maps.GeneralizedLinearModel`
class which inherits from :class:`best.maps.Function`. It is essentially a
function that evaluates the predictive mean of the model. However, it also
offers access to several other methods that are useful for uncertainty
quantification. Here is the definition of
:class:`best.maps.GeneralizedLinearModel`:

.. class:: GeneralizedLinearModel

    :inherits: :class:`best.maps.Function`

    A class that represents a Generalized Linear Model.

    .. method:: __init__(basis[, weights=None[, sigma_sqrt=None[, \
                         beta=None[, \
                         name='Generalized Linear Model']]]])

        Initialize the object.

        .. note::

            Notice that instead of the covariance matrix
            :math:`\boldsymbol{\Sigma}`, we initialize the object with
            its square root. The square root of
            :math:`\boldsymbol{\Sigma}` is any matrix
            :math:`\mathbf{R}\in \mathbb{R}^{k\times m}` such that:

            .. math::

                \boldsymbol{\Sigma} = \mathbf{R}^T\mathbf{R}.

            This is useful because it allows for the treatment of a
            positive semi-definite (rank-deficient) covariance, i.e.,
            the case :math:`k < m`. It is up to the user to supply an
            appropriate :math:`\mathbf{R}`.

        :param basis: A set of basis functions.
        :type basis: :class:`best.maps.Function`
        :param weights: The mean weights :math:`\mathbf{M}`. If
                        ``None``, then it is assumed to be all zeros.
        :type weights: 2D numpy array of shape :math:`m\times q`
        :param sigma_sqrt: The square root of the covariance matrix.
                           If ``None``, then it is assumed to be all
                           zeros.
        :type sigma_sqrt: 2D numpy array of shape
                          :math:`k\times m, k\le m`
        :param beta: The noise precision (inverse variance). If
                     unspecified, it is assumed to be a very large
                     number (i.e., negligible noise).
        :type beta: ``float``
        :param name: A name for the object.
        :type name: str

    .. method:: __call__(x[, hyp=None])

        :overloads: :func:`best.maps.Function.__call__()`

        Evaluate the mean of the generalized linear model at ``x``.

        Essentially, it computes :math:`\mathbf{m}(\mathbf{x})`.

    .. method:: d(x[, hyp=None])

        :overloads: :func:`best.maps.Function.d()`

        Evaluate the Jacobian of the generalized linear model at ``x``.

        This is :math:`\nabla \mathbf{m}(\mathbf{x})`.

    .. method:: get_predictive_covariance(x)

        Evaluate the predictive covariance at ``x``.

        Assume that ``x`` represents :math:`n` input points
        :math:`\left\{\mathbf{x}^{(i)}\right\}_{i=1}^n`. Then, this
        method computes the positive semi-definite matrix
        :math:`\mathbf{C}\in\mathbb{R}^{n\times n}`, given by

        .. math::

            C_{ij} = \sum_{k=1}^m\sum_{l=1}^m \phi_k\left(\mathbf{x}^{(i)}\right) \Sigma_{kl} \phi_l\left(\mathbf{x}^{(j)}\right).

    .. method:: get_predictive_variance(x)

        Evaluate the predictive variance at ``x``.

        This is the diagonal of :math:`\mathbf{C}` of
        :func:`best.maps.GeneralizedLinearModel.get_predictive_covariance()`.
        However, it is computed without ever building :math:`\mathbf{C}`.

    .. attribute:: basis

        Get the underlying basis.

    .. attribute:: weights

        Get the weights.

    .. attribute:: sigma_sqrt

        Get the square root of the covariance matrix.

    .. attribute:: beta

        Get the noise precision (inverse variance).
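
The quantities described above can be illustrated with a minimal NumPy sketch.
It does not use the BEST API. The polynomial basis, the helper names
(``polynomial_basis``, ``predictive_mean``, ``predictive_covariance``,
``predictive_variance``) and the random test data are all assumptions made for
illustration only. The sketch shows how the predictive mean, the matrix
:math:`\mathbf{C}`, and its diagonal might be computed from the mean weights
:math:`\mathbf{M}` and the square root :math:`\mathbf{R}` without ever forming
:math:`\boldsymbol{\Sigma}`.

```python
import numpy as np


def polynomial_basis(x, m):
    """Hypothetical 1D basis phi_i(x) = x**(i-1); returns the (n, m) design matrix."""
    x = np.asarray(x, dtype=float).reshape(-1)
    return np.vander(x, N=m, increasing=True)


def predictive_mean(Phi, M):
    """Row i is m(x_i) = phi(x_i)^T M, with M of shape (m, q)."""
    return Phi @ M


def predictive_covariance(Phi, R):
    """C = Phi Sigma Phi^T with Sigma = R^T R, formed as (Phi R^T)(Phi R^T)^T."""
    A = Phi @ R.T  # shape (n, k)
    return A @ A.T


def predictive_variance(Phi, R):
    """Diagonal of C, computed as row-wise squared norms so C is never built."""
    A = Phi @ R.T
    return np.sum(A * A, axis=1)


# Illustrative sizes and data (assumptions, not taken from BEST).
rng = np.random.default_rng(0)
m, q, k, n = 4, 2, 3, 5              # k < m gives a rank-deficient Sigma
M = rng.standard_normal((m, q))      # mean weights
R = rng.standard_normal((k, m))      # square root of Sigma
beta = 100.0                         # noise precision

x = np.linspace(-1.0, 1.0, n)
Phi = polynomial_basis(x, m)

mean = predictive_mean(Phi, M)       # shape (n, q)
C = predictive_covariance(Phi, R)    # shape (n, n)
var = predictive_variance(Phi, R)    # shape (n,)
sigma2 = 1.0 / beta + var            # sigma^2(x) including the noise term
```

Working with :math:`\mathbf{\Phi}\mathbf{R}^T` keeps the cost of the variance
at :math:`O(nkm)` and avoids storing an :math:`n\times n` matrix, which is the
motivation the text gives for computing the diagonal separately.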