algorithms.clustering.bgmm¶
Module: algorithms.clustering.bgmm¶
Inheritance diagram for nipy.algorithms.clustering.bgmm:
Bayesian Gaussian Mixture Model classes: contains the basic fields and methods of Bayesian GMMs; the high-level functions are/should be bound in C.
The base class BGMM relies on an implementation that performs Gibbs sampling.
A derived class VBGMM uses Variational Bayes inference instead.
A third class is introduced to take advantage of the old C bindings, but it is limited to diagonal covariance models.
Author: Bertrand Thirion, 2008-2011
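A minimal usage sketch of the two estimation routes (Gibbs sampling with BGMM, Variational Bayes with VBGMM). This assumes nipy is importable; the data and parameter values are illustrative, and the call sequence follows the method descriptions below.

import numpy as np
from nipy.algorithms.clustering.bgmm import BGMM, VBGMM

x = np.random.randn(100, 2)   # toy data: 100 samples in 2 dimensions

# Gibbs sampling route
b = BGMM(k=3, dim=2)
b.guess_priors(x)             # weakly informative priors
b.initialize(x)               # k-means initialization of the labels
b.sample(x, niter=100)        # Gibbs sampling of indicators and parameters
z_gibbs = b.map_label(x)      # MAP labelling of the data

# Variational Bayes route
v = VBGMM(k=3, dim=2)
v.guess_priors(x)
v.initialize(x)
v.estimate(x)                 # VB iterations until convergence
z_vb = v.map_label(x)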
Classes¶
BGMM¶
- class nipy.algorithms.clustering.bgmm.BGMM(k=1, dim=1, means=None, precisions=None, weights=None, shrinkage=None, dof=None)¶
Bases:
GMM
This class implements Bayesian GMMs.
This class contains the following fields:
- k: int,
the number of components in the mixture
- dim: int,
the dimension of the data
- means: array of shape (k, dim)
all the means of the components
- precisions: array of shape (k, dim, dim)
the precisions of the components
- weights: array of shape (k):
weights of the mixture
- shrinkage: array of shape (k):
scaling factor of the posterior precisions on the mean
- dof: array of shape (k)
the degrees of freedom of the components
- prior_means: array of shape (k, dim):
the prior on the component means
- prior_scale: array of shape (k, dim, dim):
the prior on the component precisions
- prior_dof: array of shape (k):
the prior on the dof (should be at least equal to dim)
- prior_shrinkage: array of shape (k):
scaling factor of the prior precisions on the mean
- prior_weights: array of shape (k)
the prior on the component weights
- __init__(k=1, dim=1, means=None, precisions=None, weights=None, shrinkage=None, dof=None)¶
Initialize the structure with the dimensions of the problem; optionally provide the various model terms
- average_log_like(x, tiny=1e-15)¶
Return the average log-likelihood of the model for the dataset x
- Parameters:
- x: array of shape (n_samples,self.dim)
the data used in the estimation process
- tiny = 1.e-15: a small constant to avoid numerical singularities
- bayes_factor(x, z, nperm=0, verbose=0)¶
Evaluate the Bayes Factor of the current model using Chib’s method
- Parameters:
- x: array of shape (nb_samples,dim)
the data from which the Bayes factor is computed
- z: array of shape (nb_samples), type = np.int_
the corresponding classification
- nperm=0: int
the number of permutations to sample, to model the label-switching issue in the computation of the Bayes factor. By default, exhaustive permutations are used
- verbose=0: verbosity mode
- Returns:
- bf (float) the computed evidence (Bayes factor)
Notes
See: Siddhartha Chib, “Marginal Likelihood from the Gibbs Output”, Journal of the American Statistical Association, Vol. 90, 1995.
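Example
A sketch of evaluating the Bayes factor after a Gibbs run; the iteration and permutation counts are illustrative.

import numpy as np
from nipy.algorithms.clustering.bgmm import BGMM

x = np.random.randn(200, 2)
b = BGMM(k=2, dim=2)
b.guess_priors(x)
b.initialize(x)
b.sample(x, niter=300)                # Gibbs sampling
z = b.map_label(x)                    # classification used by Chib's method
bf = b.bayes_factor(x, z, nperm=100)  # sampled permutations for label switching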
- bic(like, tiny=1e-15)¶
Compute the BIC approximation of the evidence
- Parameters:
- like, array of shape (n_samples, self.k)
component-wise likelihood
- tiny=1.e-15, a small constant to avoid numerical singularities
- Returns:
- the bic value, float
- check()¶
Check the shapes of the different matrices involved in the model
- check_x(x)¶
Essentially check that x.shape[1] == self.dim;
x is returned, possibly reshaped
- conditional_posterior_proba(x, z, perm=None)¶
Compute the probability of the current parameters of self given x and z
- Parameters:
- x: array of shape (nb_samples, dim),
the data from which the probability is computed
- z: array of shape (nb_samples), type = np.int_,
the corresponding classification
- perm: array of shape (nperm, self.k), type=np.int_, optional
all the permutations of z under which the probability is recomputed. By default, no permutation is performed
- estimate(x, niter=100, delta=0.0001, verbose=0)¶
Estimation of the model given a dataset x
- Parameters:
- x array of shape (n_samples,dim)
the data from which the model is estimated
- niter=100: maximal number of iterations in the estimation process
- delta = 1.e-4: increment of data likelihood at which
convergence is declared
- verbose=0: verbosity mode
- Returns:
- bic: an asymptotic approximation of model evidence
- evidence(x, z, nperm=0, verbose=0)¶
See bayes_factor(self, x, z, nperm=0, verbose=0)
- guess_priors(x, nocheck=0)¶
Set the priors so that they are weakly informative; this follows Fraley and Raftery, Journal of Classification 24:155-181 (2007)
- Parameters:
- x, array of shape (nb_samples,self.dim)
the data used in the estimation process
- nocheck: boolean, optional,
if nocheck==True, check is skipped
- guess_regularizing(x, bcheck=1)¶
Set the regularizing priors as weakly informative, according to Fraley and Raftery, Journal of Classification 24:155-181 (2007)
- Parameters:
- x array of shape (n_samples,dim)
the data used in the estimation process
- initialize(x)¶
initialize z using a k-means algorithm, then update the parameters
- Parameters:
- x: array of shape (nb_samples,self.dim)
the data used in the estimation process
- initialize_and_estimate(x, z=None, niter=100, delta=0.0001, ninit=1, verbose=0)¶
Estimation of self given x
- Parameters:
- x array of shape (n_samples,dim)
the data from which the model is estimated
- z = None: array of shape (n_samples)
a prior labelling of the data to initialize the computation
- niter=100: maximal number of iterations in the estimation process
- delta = 1.e-4: increment of data likelihood at which
convergence is declared
- ninit=1: number of initialization performed
to reach a good solution
- verbose=0: verbosity mode
- Returns:
- the best model is returned
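Example
A sketch of a typical call with several random initializations (parameter values are illustrative):

import numpy as np
from nipy.algorithms.clustering.bgmm import BGMM

x = np.random.randn(150, 2)
model = BGMM(k=3, dim=2)
model.guess_priors(x)
best = model.initialize_and_estimate(x, niter=100, ninit=5)  # best of 5 runs
labels = best.map_label(x)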
- likelihood(x)¶
Return the likelihood of the model for the data x; the values are weighted by the component weights
- Parameters:
- x array of shape (n_samples,self.dim)
the data used in the estimation process
- Returns:
- like, array of shape(n_samples,self.k)
component-wise likelihood
- map_label(x, like=None)¶
return the MAP labelling of x
- Parameters:
- x array of shape (n_samples,dim)
the data under study
- like=None: array of shape (n_samples, self.k)
component-wise likelihood; if like is None, it is recomputed
- Returns:
- z: array of shape(n_samples): the resulting MAP labelling
of the rows of x
- mixture_likelihood(x)¶
Returns the likelihood of the mixture for x
- Parameters:
- x: array of shape (n_samples,self.dim)
the data used in the estimation process
- plugin(means, precisions, weights)¶
Manually set the weights, means and precisions of the model
- Parameters:
- means: array of shape (self.k,self.dim)
- precisions: array of shape (self.k,self.dim,self.dim)
or (self.k, self.dim)
- weights: array of shape (self.k)
- pop(z)¶
compute the population, i.e. the statistics of allocation
- Parameters:
- z array of shape (nb_samples), type = np.int_
the allocation variable
- Returns:
- hist: array of shape (self.k), the count variable
- probability_under_prior()¶
Compute the probability of the current parameters of self given the priors
- sample(x, niter=1, mem=0, verbose=0)¶
sample the indicator and parameters
- Parameters:
- x array of shape (nb_samples,self.dim)
the data used in the estimation process
- niter=1: the number of iterations to perform
- mem=0: if mem, the best values of the parameters are computed
- verbose=0: verbosity mode
- Returns:
- best_weights: array of shape (self.k)
- best_means: array of shape (self.k, self.dim)
- best_precisions: array of shape (self.k, self.dim, self.dim)
- possibleZ: array of shape (nb_samples, niter)
the sampled z; the one that gives the highest posterior of the data is returned first
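Example
A sketch assuming the four Returns above are unpacked in order when mem is set (parameter values are illustrative):

import numpy as np
from nipy.algorithms.clustering.bgmm import BGMM

x = np.random.randn(120, 2)
b = BGMM(k=2, dim=2)
b.guess_priors(x)
b.initialize(x)
w, m, p, possible_z = b.sample(x, niter=50, mem=1)
z_best = possible_z[:, 0]   # per the note above, the highest-posterior z comes first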
- sample_and_average(x, niter=1, verbose=0)¶
Sample the indicator and parameters; the average values of weights, means and precisions are returned
- Parameters:
- x: array of shape (nb_samples, dim)
the data used in the sampling
- niter=1: number of iterations
- Returns:
- weights: array of shape (self.k)
- means: array of shape (self.k,self.dim)
- precisions: array of shape (self.k,self.dim,self.dim)
or (self.k, self.dim); these are the average parameters across samplings
Notes
All this makes sense only if no label switching has occurred, so this is wrong in general (asymptotically).
Fix: implement a permutation procedure for component identification
- sample_indicator(like)¶
sample the indicator from the likelihood
- Parameters:
- like: array of shape (nb_samples,self.k)
component-wise likelihood
- Returns:
- z: array of shape(nb_samples): a draw of the membership variable
- set_priors(prior_means, prior_weights, prior_scale, prior_dof, prior_shrinkage)¶
Set the prior of the BGMM
- Parameters:
- prior_means: array of shape (self.k,self.dim)
- prior_weights: array of shape (self.k)
- prior_scale: array of shape (self.k,self.dim,self.dim)
- prior_dof: array of shape (self.k)
- prior_shrinkage: array of shape (self.k)
- show(x, gd, density=None, axes=None)¶
Function to plot a GMM; still in progress. Currently works only in 1D and 2D
- Parameters:
- x: array of shape(n_samples, dim)
the data under study
- gd: GridDescriptor instance
- density: array of shape (prod(gd.n_bins))
density of the model on the discrete grid implied by gd; by default, this is recomputed
- show_components(x, gd, density=None, mpaxes=None)¶
Function to plot a GMM; currently works only in 1D
- Parameters:
- x: array of shape(n_samples, dim)
the data under study
- gd: GridDescriptor instance
- density: array of shape (prod(gd.n_bins))
density of the model on the discrete grid implied by gd; by default, this is recomputed
- mpaxes: axes handle to make the figure, optional,
if None, a new figure is created
- test(x, tiny=1e-15)¶
Returns the log-likelihood of the mixture for x
- Parameters:
- x array of shape (n_samples,self.dim)
the data used in the estimation process
- Returns:
- ll: array of shape(n_samples)
the log-likelihood of the rows of x
- train(x, z=None, niter=100, delta=0.0001, ninit=1, verbose=0)¶
Same as initialize_and_estimate
- unweighted_likelihood(x)¶
Return the likelihood of each datum for each component; the values are not weighted by the component weights
- Parameters:
- x: array of shape (n_samples,self.dim)
the data used in the estimation process
- Returns:
- like, array of shape(n_samples,self.k)
unweighted component-wise likelihood
Notes
Hopefully faster
- unweighted_likelihood_(x)¶
Return the likelihood of each datum for each component; the values are not weighted by the component weights
- Parameters:
- x: array of shape (n_samples,self.dim)
the data used in the estimation process
- Returns:
- like, array of shape(n_samples,self.k)
unweighted component-wise likelihood
- update(x, z)¶
update function (draw a sample of the GMM parameters)
- Parameters:
- x array of shape (nb_samples,self.dim)
the data used in the estimation process
- z array of shape (nb_samples), type = np.int_
the corresponding classification
- update_means(x, z)¶
Given the allocation vector z, and the corresponding data x, resample the mean
- Parameters:
- x: array of shape (nb_samples,self.dim)
the data used in the estimation process
- z: array of shape (nb_samples), type = np.int_
the corresponding classification
- update_precisions(x, z)¶
Given the allocation vector z, and the corresponding data x, resample the precisions
- Parameters:
- x array of shape (nb_samples,self.dim)
the data used in the estimation process
- z array of shape (nb_samples), type = np.int_
the corresponding classification
- update_weights(z)¶
Given the allocation vector z, resample the weights parameter
- Parameters:
- z array of shape (nb_samples), type = np.int_
the allocation variable
VBGMM¶
- class nipy.algorithms.clustering.bgmm.VBGMM(k=1, dim=1, means=None, precisions=None, weights=None, shrinkage=None, dof=None)¶
Bases:
BGMM
Subclass of Bayesian GMMs (BGMM) that implements Variational Bayes estimation of the parameters
- __init__(k=1, dim=1, means=None, precisions=None, weights=None, shrinkage=None, dof=None)¶
Initialize the structure with the dimensions of the problem; optionally provide the various model terms
- average_log_like(x, tiny=1e-15)¶
Return the average log-likelihood of the model for the dataset x
- Parameters:
- x: array of shape (n_samples,self.dim)
the data used in the estimation process
- tiny = 1.e-15: a small constant to avoid numerical singularities
- bayes_factor(x, z, nperm=0, verbose=0)¶
Evaluate the Bayes Factor of the current model using Chib’s method
- Parameters:
- x: array of shape (nb_samples,dim)
the data from which the Bayes factor is computed
- z: array of shape (nb_samples), type = np.int_
the corresponding classification
- nperm=0: int
the number of permutations to sample, to model the label-switching issue in the computation of the Bayes factor. By default, exhaustive permutations are used
- verbose=0: verbosity mode
- Returns:
- bf (float) the computed evidence (Bayes factor)
Notes
See: Siddhartha Chib, “Marginal Likelihood from the Gibbs Output”, Journal of the American Statistical Association, Vol. 90, 1995.
- bic(like, tiny=1e-15)¶
Compute the BIC approximation of the evidence
- Parameters:
- like, array of shape (n_samples, self.k)
component-wise likelihood
- tiny=1.e-15, a small constant to avoid numerical singularities
- Returns:
- the bic value, float
- check()¶
Check the shapes of the different matrices involved in the model
- check_x(x)¶
Essentially check that x.shape[1] == self.dim;
x is returned, possibly reshaped
- conditional_posterior_proba(x, z, perm=None)¶
Compute the probability of the current parameters of self given x and z
- Parameters:
- x: array of shape (nb_samples, dim),
the data from which the probability is computed
- z: array of shape (nb_samples), type = np.int_,
the corresponding classification
- perm: array of shape (nperm, self.k), type=np.int_, optional
all the permutations of z under which the probability is recomputed. By default, no permutation is performed
- estimate(x, niter=100, delta=0.0001, verbose=0)¶
Estimation of self given x
- Parameters:
- x: array of shape (nb_samples, dim)
the data from which the model is estimated
- niter=100: maximal number of iterations in the estimation process
- delta = 1.e-4: increment of data likelihood at which
convergence is declared
- verbose=0: verbosity mode
- evidence(x, like=None, verbose=0)¶
Computation of the evidence bound, aka free energy
- Parameters:
- x array of shape (nb_samples,dim)
the data from which evidence is computed
- like=None: array of shape (nb_samples, self.k), optional
component-wise likelihood; if None, it is recomputed
- verbose=0: verbosity mode
- Returns:
- ev (float) the computed evidence
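Example
Since the free energy approximates the log evidence, it can be used for model selection over k; a sketch (the data and the range of k are illustrative):

import numpy as np
from nipy.algorithms.clustering.bgmm import VBGMM

x = np.random.randn(200, 2)
best_ev, best_model = -np.inf, None
for k in range(1, 6):
    v = VBGMM(k, 2)
    v.guess_priors(x)
    v.initialize(x)
    v.estimate(x)
    ev = v.evidence(x)        # evidence bound (free energy)
    if ev > best_ev:
        best_ev, best_model = ev, v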
- guess_priors(x, nocheck=0)¶
Set the priors so that they are weakly informative; this follows Fraley and Raftery, Journal of Classification 24:155-181 (2007)
- Parameters:
- x, array of shape (nb_samples,self.dim)
the data used in the estimation process
- nocheck: boolean, optional,
if nocheck==True, check is skipped
- guess_regularizing(x, bcheck=1)¶
Set the regularizing priors as weakly informative, according to Fraley and Raftery, Journal of Classification 24:155-181 (2007)
- Parameters:
- x array of shape (n_samples,dim)
the data used in the estimation process
- initialize(x)¶
initialize z using a k-means algorithm, then update the parameters
- Parameters:
- x: array of shape (nb_samples,self.dim)
the data used in the estimation process
- initialize_and_estimate(x, z=None, niter=100, delta=0.0001, ninit=1, verbose=0)¶
Estimation of self given x
- Parameters:
- x array of shape (n_samples,dim)
the data from which the model is estimated
- z = None: array of shape (n_samples)
a prior labelling of the data to initialize the computation
- niter=100: maximal number of iterations in the estimation process
- delta = 1.e-4: increment of data likelihood at which
convergence is declared
- ninit=1: number of initialization performed
to reach a good solution
- verbose=0: verbosity mode
- Returns:
- the best model is returned
- likelihood(x)¶
Return the likelihood of the model for the data x; the values are weighted by the component weights
- Parameters:
- x: array of shape (nb_samples, self.dim)
the data used in the estimation process
- Returns:
- like: array of shape(nb_samples, self.k)
component-wise likelihood
- map_label(x, like=None)¶
return the MAP labelling of x
- Parameters:
- x array of shape (nb_samples,dim)
the data under study
- like=None: array of shape (nb_samples, self.k)
component-wise likelihood; if like is None, it is recomputed
- Returns:
- z: array of shape(nb_samples): the resulting MAP labelling
of the rows of x
- mixture_likelihood(x)¶
Returns the likelihood of the mixture for x
- Parameters:
- x: array of shape (n_samples,self.dim)
the data used in the estimation process
- plugin(means, precisions, weights)¶
Manually set the weights, means and precisions of the model
- Parameters:
- means: array of shape (self.k,self.dim)
- precisions: array of shape (self.k,self.dim,self.dim)
or (self.k, self.dim)
- weights: array of shape (self.k)
- pop(like, tiny=1e-15)¶
compute the population, i.e. the statistics of allocation
- Parameters:
- like array of shape (nb_samples, self.k):
the likelihood of each item being in each class
- probability_under_prior()¶
Compute the probability of the current parameters of self given the priors
- sample(x, niter=1, mem=0, verbose=0)¶
sample the indicator and parameters
- Parameters:
- x array of shape (nb_samples,self.dim)
the data used in the estimation process
- niter=1: the number of iterations to perform
- mem=0: if mem, the best values of the parameters are computed
- verbose=0: verbosity mode
- Returns:
- best_weights: array of shape (self.k)
- best_means: array of shape (self.k, self.dim)
- best_precisions: array of shape (self.k, self.dim, self.dim)
- possibleZ: array of shape (nb_samples, niter)
the sampled z; the one that gives the highest posterior of the data is returned first
- sample_and_average(x, niter=1, verbose=0)¶
Sample the indicator and parameters; the average values of weights, means and precisions are returned
- Parameters:
- x: array of shape (nb_samples, dim)
the data used in the sampling
- niter=1: number of iterations
- Returns:
- weights: array of shape (self.k)
- means: array of shape (self.k,self.dim)
- precisions: array of shape (self.k,self.dim,self.dim)
or (self.k, self.dim); these are the average parameters across samplings
Notes
All this makes sense only if no label switching has occurred, so this is wrong in general (asymptotically).
Fix: implement a permutation procedure for component identification
- sample_indicator(like)¶
sample the indicator from the likelihood
- Parameters:
- like: array of shape (nb_samples,self.k)
component-wise likelihood
- Returns:
- z: array of shape(nb_samples): a draw of the membership variable
- set_priors(prior_means, prior_weights, prior_scale, prior_dof, prior_shrinkage)¶
Set the prior of the BGMM
- Parameters:
- prior_means: array of shape (self.k,self.dim)
- prior_weights: array of shape (self.k)
- prior_scale: array of shape (self.k,self.dim,self.dim)
- prior_dof: array of shape (self.k)
- prior_shrinkage: array of shape (self.k)
- show(x, gd, density=None, axes=None)¶
Function to plot a GMM; still in progress. Currently works only in 1D and 2D
- Parameters:
- x: array of shape(n_samples, dim)
the data under study
- gd: GridDescriptor instance
- density: array of shape (prod(gd.n_bins))
density of the model on the discrete grid implied by gd; by default, this is recomputed
- show_components(x, gd, density=None, mpaxes=None)¶
Function to plot a GMM; currently works only in 1D
- Parameters:
- x: array of shape(n_samples, dim)
the data under study
- gd: GridDescriptor instance
- density: array of shape (prod(gd.n_bins))
density of the model on the discrete grid implied by gd; by default, this is recomputed
- mpaxes: axes handle to make the figure, optional,
if None, a new figure is created
- test(x, tiny=1e-15)¶
Returns the log-likelihood of the mixture for x
- Parameters:
- x array of shape (n_samples,self.dim)
the data used in the estimation process
- Returns:
- ll: array of shape(n_samples)
the log-likelihood of the rows of x
- train(x, z=None, niter=100, delta=0.0001, ninit=1, verbose=0)¶
Same as initialize_and_estimate
- unweighted_likelihood(x)¶
Return the likelihood of each datum for each component; the values are not weighted by the component weights
- Parameters:
- x: array of shape (n_samples,self.dim)
the data used in the estimation process
- Returns:
- like, array of shape(n_samples,self.k)
unweighted component-wise likelihood
Notes
Hopefully faster
- unweighted_likelihood_(x)¶
Return the likelihood of each datum for each component; the values are not weighted by the component weights
- Parameters:
- x: array of shape (n_samples,self.dim)
the data used in the estimation process
- Returns:
- like, array of shape(n_samples,self.k)
unweighted component-wise likelihood
- update(x, z)¶
update function (draw a sample of the GMM parameters)
- Parameters:
- x array of shape (nb_samples,self.dim)
the data used in the estimation process
- z array of shape (nb_samples), type = np.int_
the corresponding classification
- update_means(x, z)¶
Given the allocation vector z, and the corresponding data x, resample the mean
- Parameters:
- x: array of shape (nb_samples,self.dim)
the data used in the estimation process
- z: array of shape (nb_samples), type = np.int_
the corresponding classification
- update_precisions(x, z)¶
Given the allocation vector z, and the corresponding data x, resample the precisions
- Parameters:
- x array of shape (nb_samples,self.dim)
the data used in the estimation process
- z array of shape (nb_samples), type = np.int_
the corresponding classification
- update_weights(z)¶
Given the allocation vector z, resample the weights parameter
- Parameters:
- z array of shape (nb_samples), type = np.int_
the allocation variable
Functions¶
- nipy.algorithms.clustering.bgmm.detsh(H)¶
Routine for the computation of determinants of symmetric positive matrices
- Parameters:
- H array of shape(n,n)
the input matrix, assumed symmetric and positive
- Returns:
- dh: float, the determinant
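Example
A quick sanity check (sketch): detsh should agree with numpy's determinant on a symmetric positive definite matrix.

import numpy as np
from nipy.algorithms.clustering.bgmm import detsh

a = np.random.randn(4, 4)
h = np.dot(a, a.T) + 4 * np.eye(4)   # symmetric positive definite
print(detsh(h), np.linalg.det(h))    # the two values should agree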
- nipy.algorithms.clustering.bgmm.dirichlet_eval(w, alpha)¶
Evaluate the probability of a certain discrete draw w from the Dirichlet density with parameters alpha
- Parameters:
- w: array of shape (n)
- alpha: array of shape (n)
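Example
Under the flat Dirichlet (alpha all ones) the density is constant on the simplex; a sketch:

import numpy as np
from nipy.algorithms.clustering.bgmm import dirichlet_eval

w = np.array([0.2, 0.3, 0.5])    # a point on the 2-simplex
alpha = np.ones(3)               # flat Dirichlet
print(dirichlet_eval(w, alpha))  # Gamma(3)/Gamma(1)**3 = 2 everywhere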
- nipy.algorithms.clustering.bgmm.dkl_dirichlet(w1, w2)¶
Returns the KL divergence between two Dirichlet distributions
- Parameters:
- w1: array of shape (n),
the parameters of the first Dirichlet density
- w2: array of shape (n),
the parameters of the second Dirichlet density
- nipy.algorithms.clustering.bgmm.dkl_gaussian(m1, P1, m2, P2)¶
Returns the KL divergence between Gaussian densities
- Parameters:
- m1: array of shape (n),
the mean parameter of the first density
- P1: array of shape(n,n),
the precision parameters of the first density
- m2: array of shape (n),
the mean parameter of the second density
- P2: array of shape(n,n),
the precision parameters of the second density
- nipy.algorithms.clustering.bgmm.dkl_wishart(a1, B1, a2, B2)¶
Returns the KL divergence between two Wishart distributions with parameters (a1, B1) and (a2, B2)
- Parameters:
- a1: Float,
degrees of freedom of the first density
- B1: array of shape(n,n),
scale matrix of the first density
- a2: Float,
degrees of freedom of the second density
- B2: array of shape(n,n),
scale matrix of the second density
- Returns:
- dkl: float, the Kullback-Leibler divergence
- nipy.algorithms.clustering.bgmm.generate_Wishart(n, V)¶
Generate a sample from Wishart density
- Parameters:
- n: float,
the number of degrees of freedom of the Wishart density
- V: array of shape (dim, dim)
the scale matrix of the Wishart density
- Returns:
- W: array of shape (dim, dim)
the draw from the Wishart density
- nipy.algorithms.clustering.bgmm.generate_normals(m, P)¶
Generate a Gaussian sample with mean m and precision P
- Parameters:
- m array of shape n: the mean vector
- P array of shape (n,n): the precision matrix
- Returns:
- ng: array of shape (n): a draw from the Gaussian density
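Example
Draw and density evaluation go together naturally; a sketch pairing generate_normals with normal_eval:

import numpy as np
from nipy.algorithms.clustering.bgmm import generate_normals, normal_eval

m = np.zeros(3)                  # mean vector
P = 2.0 * np.eye(3)              # precision matrix (inverse covariance)
s = generate_normals(m, P)       # one draw from N(m, inv(P))
dens = normal_eval(m, P, s)      # density of the draw under the same normal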
- nipy.algorithms.clustering.bgmm.generate_perm(k, nperm=100)¶
Returns an array of shape (nperm, k) representing the permutations of k elements
- Parameters:
- k: int, the number of elements to be permuted
- nperm=100: the maximal number of permutations;
if gamma(k+1) > nperm, only nperm random draws are generated
- Returns:
- p: array of shape (nperm, k): each row is a permutation of the k elements
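Example
For small k, all permutations are enumerated; a sketch:

from nipy.algorithms.clustering.bgmm import generate_perm

p = generate_perm(3)   # shape (6, 3): the 6 permutations of (0, 1, 2)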
- nipy.algorithms.clustering.bgmm.multinomial(probabilities)¶
Generate samples from a multinomial distribution
- Parameters:
- probabilities: array of shape (nelements, nclasses):
likelihood of each element belonging to each class; each row is assumed to sum to 1. One sample is drawn from each row
- Returns:
- z: array of shape (nelements): the draws,
which take values in [0..nclasses-1]
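Example
A sketch of a draw from a row-stochastic probability matrix:

import numpy as np
from nipy.algorithms.clustering.bgmm import multinomial

probabilities = np.array([[0.2, 0.8],
                          [0.5, 0.5],
                          [0.9, 0.1]])   # each row sums to 1
z = multinomial(probabilities)           # one class index in {0, 1} per row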
- nipy.algorithms.clustering.bgmm.normal_eval(mu, P, x, dP=None)¶
Probability of x under normal(mu, inv(P))
- Parameters:
- mu: array of shape (n),
the mean parameter
- P: array of shape (n, n),
the precision matrix
- x: array of shape (n),
the data to be evaluated
- dP: float, optional,
determinant of P
- Returns:
- (float) the density
- nipy.algorithms.clustering.bgmm.wishart_eval(n, V, W, dV=None, dW=None, piV=None)¶
Evaluation of the probability of W under Wishart(n,V)
- Parameters:
- n: float,
the number of degrees of freedom (dofs)
- V: array of shape (dim, dim)
the scale matrix of the Wishart density
- W: array of shape (dim, dim)
the sample to be evaluated
- dV: float, optional,
determinant of V
- dW: float, optional,
determinant of W
- piV: array of shape (dim, dim), optional
inverse of V
- Returns:
- (float) the density
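Example
A sketch pairing generate_Wishart with wishart_eval (the degrees of freedom and scale are illustrative):

import numpy as np
from nipy.algorithms.clustering.bgmm import generate_Wishart, wishart_eval

dof = 10.0
V = np.eye(3)                   # scale matrix
W = generate_Wishart(dof, V)    # one draw from Wishart(dof, V)
dens = wishart_eval(dof, V, W)  # its density under the same Wishart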