algorithms.clustering.imm

Module: algorithms.clustering.imm

[Figure: inheritance diagram for nipy.algorithms.clustering.imm]

Infinite mixture model: a generalization of Bayesian mixture models with an unspecified number of classes

Classes

IMM

class nipy.algorithms.clustering.imm.IMM(alpha=0.5, dim=1)

Bases: BGMM

This class implements an infinite Gaussian mixture model (or Dirichlet process mixture model). It is simply a generalization of Bayesian Gaussian mixture models to an unknown number of classes.

__init__(alpha=0.5, dim=1)
Parameters:
alpha: float, optional,

the parameter for cluster creation

dim: int, optional,

the dimension of the data

Note: use the function set_priors() to set adapted priors
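
In Dirichlet process mixtures, a concentration parameter usually controls how readily new clusters are opened. The sketch below shows the Chinese restaurant process rule, assuming alpha plays that role here. The function name is hypothetical and this is not nipy's code.

```python
def crp_assignment_probabilities(counts, alpha):
    """Return P(join cluster j) for each existing cluster, then P(new cluster)."""
    n = sum(counts)
    denom = n + alpha
    probs = [c / denom for c in counts]
    probs.append(alpha / denom)  # last entry: open a new cluster
    return probs


# Two existing clusters holding 3 and 1 samples, with alpha = 0.5.
probs = crp_assignment_probabilities([3, 1], alpha=0.5)
```

Under this rule a larger alpha makes a new cluster more likely, and the chance of opening one shrinks as the number of samples grows.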
average_log_like(x, tiny=1e-15)

Returns the averaged log-likelihood of the model for the dataset x

Parameters:
x: array of shape (n_samples,self.dim)

the data used in the estimation process

tiny = 1.e-15: a small constant to avoid numerical singularities
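
A minimal sketch of an averaged log-likelihood, assuming tiny acts as a floor on each per-sample mixture likelihood before the logarithm is taken. The exact use of tiny inside nipy is not described here, so this is one plausible reading.

```python
import math


def average_log_like(mixture_like, tiny=1e-15):
    """Mean of log(max(l, tiny)) over the per-sample mixture likelihoods."""
    logs = [math.log(max(l, tiny)) for l in mixture_like]
    return sum(logs) / len(logs)


# The zero likelihood is floored at tiny, so the result stays finite.
value = average_log_like([0.5, 0.25, 0.0])
```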
bayes_factor(x, z, nperm=0, verbose=0)

Evaluate the Bayes Factor of the current model using Chib’s method

Parameters:
x: array of shape (nb_samples,dim)

the data from which bic is computed

z: array of shape (nb_samples), type = np.int_

the corresponding classification

nperm=0: int

the number of permutations to sample to model the label switching issue in the computation of the Bayes factor. By default, exhaustive permutations are used.

verbose=0: verbosity mode
Returns:
bf: float, the computed evidence (Bayes factor)

Notes

See: Siddhartha Chib, "Marginal Likelihood from the Gibbs Output", Journal of the American Statistical Association, Vol. 90, 1995.
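
Chib's method rests on the identity log p(x) = log p(x|theta) + log p(theta) - log p(theta|x), which holds for any parameter value theta. For a mixture, the posterior term has to be estimated from the Gibbs output. The sketch below instead uses a toy Beta-Bernoulli model, chosen only for illustration and not part of nipy, where every term has a closed form and the identity can be checked exactly.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def chib_log_evidence(theta, successes, failures, a=1.0, b=1.0):
    """log p(x) = log p(x|theta) + log p(theta) - log p(theta|x)."""
    log_t, log_1mt = math.log(theta), math.log(1.0 - theta)
    log_like = successes * log_t + failures * log_1mt
    log_prior = (a - 1) * log_t + (b - 1) * log_1mt - log_beta(a, b)
    a_post, b_post = a + successes, b + failures  # conjugate posterior
    log_post = (a_post - 1) * log_t + (b_post - 1) * log_1mt - log_beta(a_post, b_post)
    return log_like + log_prior - log_post


# Closed-form evidence of one sequence with 7 successes and 3 failures.
exact = log_beta(1.0 + 7, 1.0 + 3) - log_beta(1.0, 1.0)
estimate = chib_log_evidence(0.6, successes=7, failures=3)
```

The value of theta does not matter in this exact setting. With a sampled posterior estimate, a high-density point is normally preferred.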

bic(like, tiny=1e-15)

Computation of bic approximation of evidence

Parameters:
like, array of shape (n_samples, self.k)

component-wise likelihood

tiny=1.e-15, a small constant to avoid numerical singularities
Returns:
the bic value, float
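
For orientation, here is a sketch of the textbook BIC form: the data log-likelihood minus a penalty that grows with the number of free parameters. It assumes the rows of like are already weighted by the component weights. The parameter count n_free_params is a hypothetical argument, and nipy's exact penalty and scaling may differ.

```python
import math


def bic_approximation(like, n_free_params, tiny=1e-15):
    """sum_i log(sum_k like[i][k]) - (n_free_params / 2) * log(n_samples)."""
    n_samples = len(like)
    log_like = sum(math.log(max(sum(row), tiny)) for row in like)
    return log_like - 0.5 * n_free_params * math.log(n_samples)


like = [[0.2, 0.1], [0.05, 0.4], [0.3, 0.3]]
value = bic_approximation(like, n_free_params=5)
```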
check()

Checks the shape of the different matrices involved in the model

check_x(x)

essentially check that x.shape[1]==self.dim

x is returned, possibly after reshaping

conditional_posterior_proba(x, z, perm=None)

Compute the probability of the current parameters of self given x and z

Parameters:
x: array of shape (nb_samples, dim),

the data from which bic is computed

z: array of shape (nb_samples), type = np.int_,

the corresponding classification

perm: array of shape (nperm, self.k), type = np.int_, optional

all permutations of z under which things will be recomputed. By default, no permutation is performed.

cross_validated_update(x, z, plike, kfold=10)

This is a step in the sampling procedure that uses internal cross-validation

Parameters:
x: array of shape(n_samples, dim),

the input data

z: array of shape(n_samples),

the associated membership variables

plike: array of shape(n_samples),

the likelihood under the prior

kfold: int, or array of shape(n_samples), optional,

folds in the cross-validation loop

Returns:
like: array of shape(n_samples),

the (cross-validated) likelihood of the data

estimate(x, niter=100, delta=0.0001, verbose=0)

Estimation of the model given a dataset x

Parameters:
x array of shape (n_samples,dim)

the data from which the model is estimated

niter=100: maximal number of iterations in the estimation process
delta = 1.e-4: increment of data likelihood at which convergence is declared

verbose=0: verbosity mode
Returns:
bic: an asymptotic approximation of model evidence
evidence(x, z, nperm=0, verbose=0)

See bayes_factor(self, x, z, nperm=0, verbose=0)

guess_priors(x, nocheck=0)

Set the priors so that they are weakly informative, following Fraley and Raftery, Journal of Classification 24:155-181 (2007)

Parameters:
x, array of shape (nb_samples,self.dim)

the data used in the estimation process

nocheck: boolean, optional,

if nocheck==True, check is skipped

guess_regularizing(x, bcheck=1)

Set the regularizing priors as weakly informative, according to Fraley and Raftery, Journal of Classification 24:155-181 (2007)

Parameters:
x array of shape (n_samples,dim)

the data used in the estimation process

initialize(x)

initialize z using a k-means algorithm, then update the parameters

Parameters:
x: array of shape (nb_samples,self.dim)

the data used in the estimation process

initialize_and_estimate(x, z=None, niter=100, delta=0.0001, ninit=1, verbose=0)

Estimation of self given x

Parameters:
x array of shape (n_samples,dim)

the data from which the model is estimated

z = None: array of shape (n_samples)

a prior labelling of the data to initialize the computation

niter=100: maximal number of iterations in the estimation process
delta = 1.e-4: increment of data likelihood at which convergence is declared

ninit=1: number of initializations performed to reach a good solution

verbose=0: verbosity mode
Returns:
the best model is returned
likelihood(x, plike=None)

Returns the likelihood of the model for the data x. The values are weighted by the component weights.

Parameters:
x: array of shape (n_samples, self.dim),

the data used in the estimation process

plike: array of shape (n_samples), optional,

the density of each point under the prior

Returns:
like, array of shape (nbitem, self.k)
component-wise likelihood
likelihood_under_the_prior(x)

Computes the likelihood of x under the prior

Parameters:
x, array of shape (self.n_samples,self.dim)
Returns:
w, the likelihood of x under the prior model (unweighted)
map_label(x, like=None)

return the MAP labelling of x

Parameters:
x array of shape (n_samples,dim)

the data under study

like=None array of shape(n_samples,self.k)

component-wise likelihood; if like==None, it is recomputed

Returns:
z: array of shape(n_samples), the resulting MAP labelling of the rows of x
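
A maximum a posteriori labelling picks, for each sample, the component with the largest weighted likelihood. The sketch below uses plain Python lists in place of the arrays nipy works with; the function is illustrative and not nipy's code.

```python
def map_label(like):
    """For each row, return the index of the component with the largest likelihood."""
    return [max(range(len(row)), key=row.__getitem__) for row in like]


like = [[0.1, 0.7, 0.2],
        [0.5, 0.3, 0.2],
        [0.0, 0.1, 0.9]]
z = map_label(like)  # [1, 0, 2]
```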

mixture_likelihood(x)

Returns the likelihood of the mixture for x

Parameters:
x: array of shape (n_samples,self.dim)

the data used in the estimation process

plugin(means, precisions, weights)

Set manually the weights, means and precision of the model

Parameters:
means: array of shape (self.k,self.dim)
precisions: array of shape (self.k,self.dim,self.dim) or (self.k, self.dim)

weights: array of shape (self.k)
pop(z)

compute the population, i.e. the statistics of allocation

Parameters:
z array of shape (nb_samples), type = np.int_

the allocation variable

Returns:
hist: array of shape (self.k), count variable
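
The population is a histogram of the allocation variable. The sketch below is illustrative only; ignoring labels outside [0, k) is an assumption made here, not documented behaviour.

```python
def population(z, k):
    """Count how many samples are allocated to each of the k components."""
    hist = [0] * k
    for label in z:
        if 0 <= label < k:  # assumption: labels outside [0, k) are ignored
            hist[label] += 1
    return hist


hist = population([0, 2, 2, 1, 2, 0], k=4)  # [2, 1, 3, 0]
```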
probability_under_prior()

Compute the probability of the current parameters of self given the priors

reduce(z)

Reduce the assignments by removing empty clusters and update self.k

Parameters:
z: array of shape(n),

a vector of membership variables changed in place

Returns:
z: the remapped values
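
Removing empty clusters amounts to remapping the labels that are actually used onto 0..k-1. The sketch below assumes that the sorted order of the labels is preserved and that negative labels are left untouched; neither detail is stated in this reference.

```python
def reduce_labels(z):
    """Remap the non-negative labels of z to 0..k-1 in place; return (z, k)."""
    used = sorted({label for label in z if label >= 0})
    remap = {old: new for new, old in enumerate(used)}
    for i, label in enumerate(z):
        if label >= 0:
            z[i] = remap[label]
    return z, len(used)


z = [4, 1, 1, 7, 4]
z, k = reduce_labels(z)  # z == [1, 0, 0, 2, 1], k == 3
```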
sample(x, niter=1, sampling_points=None, init=False, kfold=None, verbose=0)

sample the indicator and parameters

Parameters:
x: array of shape (n_samples, self.dim)

the data used in the estimation process

niter: int,

the number of iterations to perform

sampling_points: array of shape(nbpoints, self.dim), optional

points where the likelihood will be sampled; this defaults to x

kfold: int or array, optional,

parameter of cross-validation control. By default, no cross-validation is used: the procedure is faster but less accurate.

verbose=0: verbosity mode
Returns:
likelihood: array of shape(nbpoints)

total likelihood of the model

sample_and_average(x, niter=1, verbose=0)

Sample the indicator and parameters. The average values for weights, means and precisions are returned.

Parameters:
x = array of shape (nb_samples,dim)

the data from which bic is computed

niter=1: number of iterations
Returns:
weights: array of shape (self.k)
means: array of shape (self.k,self.dim)
precisions: array of shape (self.k,self.dim,self.dim) or (self.k, self.dim)

these are the average parameters across samplings

Notes

All this makes sense only if no label switching has occurred, so this is wrong in general (asymptotically).

fix: implement a permutation procedure for components identification
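
The label switching caveat can be seen on a toy case: when two draws describe the same components under swapped labels, an index-by-index average lands between them. The relabelling by sorted means shown at the end is only a one-dimensional illustration of a permutation step, not the fix nipy intends.

```python
def average_means(samples):
    """Average each component's mean across sampling iterations, index by index."""
    n = len(samples)
    return [sum(s[j] for s in samples) / n for j in range(len(samples[0]))]


# Two well-separated 1D components; in the second draw the labels are swapped.
consistent = [[0.0, 10.0], [0.0, 10.0]]
switched = [[0.0, 10.0], [10.0, 0.0]]

print(average_means(consistent))  # [0.0, 10.0]
print(average_means(switched))    # [5.0, 5.0]

# Aligning each draw before averaging (here: sorting the means) recovers the components.
aligned = [sorted(s) for s in switched]
print(average_means(aligned))     # [0.0, 10.0]
```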

sample_indicator(like)

Sample the indicator from the likelihood

Parameters:
like: array of shape (nbitem,self.k)

component-wise likelihood

Returns:
z: array of shape(nbitem): a draw of the membership variable

Notes

The behaviour is different from standard bgmm in that z can take arbitrary values
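
Sampling the indicator is a categorical draw per sample, with probabilities proportional to the entries of the corresponding row of like. This sketch covers only that draw; how IMM encodes newly created clusters in z is not shown, and the function is not nipy's code.

```python
import random


def sample_indicator(like, rng):
    """Draw one label per row, with probability proportional to the row's entries."""
    z = []
    for row in like:
        z.append(rng.choices(range(len(row)), weights=row, k=1)[0])
    return z


rng = random.Random(0)  # seeded for reproducibility
z = sample_indicator([[1.0, 0.0], [0.0, 1.0], [0.3, 0.7]], rng)
```

The first two rows put all their mass on one component, so their labels are forced; only the third is actually random.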

set_constant_densities(prior_dens=None)

Set the null and prior densities as constant (assuming a compact domain)

Parameters:
prior_dens: float, optional

constant for the prior density

set_priors(x)

Set the priors in order of having them weakly uninformative this is from Fraley and raftery; Journal of Classification 24:155-181 (2007)

Parameters:
x, array of shape (n_samples,self.dim)

the data used in the estimation process

show(x, gd, density=None, axes=None)

Function to plot a GMM, still in progress. Currently works only in 1D and 2D.

Parameters:
x: array of shape(n_samples, dim)

the data under study

gd: GridDescriptor instance
density: array of shape(prod(gd.n_bins))

density of the model on the discrete grid implied by gd. By default, this is recomputed.

show_components(x, gd, density=None, mpaxes=None)

Function to plot a GMM – Currently, works only in 1D

Parameters:
x: array of shape(n_samples, dim)

the data under study

gd: GridDescriptor instance
density: array of shape(prod(gd.n_bins))

density of the model on the discrete grid implied by gd. By default, this is recomputed.

mpaxes: axes handle to make the figure, optional,

if None, a new figure is created

simple_update(x, z, plike)

This is a step in the sampling procedure that uses internal cross-validation

Parameters:
x: array of shape(n_samples, dim),

the input data

z: array of shape(n_samples),

the associated membership variables

plike: array of shape(n_samples),

the likelihood under the prior

Returns:
like: array of shape(n_samples),

the likelihood of the data

test(x, tiny=1e-15)

Returns the log-likelihood of the mixture for x

Parameters:
x array of shape (n_samples,self.dim)

the data used in the estimation process

Returns:
ll: array of shape(n_samples)

the log-likelihood of the rows of x

train(x, z=None, niter=100, delta=0.0001, ninit=1, verbose=0)

Idem initialize_and_estimate

unweighted_likelihood(x)

Returns the likelihood of each data point for each component. The values are not weighted by the component weights.

Parameters:
x: array of shape (n_samples,self.dim)

the data used in the estimation process

Returns:
like, array of shape(n_samples,self.k)

unweighted component-wise likelihood

Notes

Hopefully faster
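
As an illustration of a component-wise, unweighted likelihood, the sketch below evaluates a Gaussian density for every sample under every component. It assumes diagonal precisions, i.e. the (self.k, self.dim) layout, and uses plain lists; it is not nipy's implementation.

```python
import math


def unweighted_likelihood(x, means, precisions):
    """Gaussian density of every sample under every component (diagonal precisions)."""
    like = []
    for sample in x:
        row = []
        for mean, prec in zip(means, precisions):
            log_density = 0.0
            for xi, mi, p in zip(sample, mean, prec):
                # 1D Gaussian log-density with precision p, summed over dimensions
                log_density += 0.5 * math.log(p / (2 * math.pi)) - 0.5 * p * (xi - mi) ** 2
            row.append(math.exp(log_density))
        like.append(row)
    return like


x = [[0.0, 0.0], [3.0, 3.0]]
means = [[0.0, 0.0], [3.0, 3.0]]
precisions = [[1.0, 1.0], [1.0, 1.0]]
like = unweighted_likelihood(x, means, precisions)  # shape (n_samples, k)
```

Multiplying each column by the matching component weight would give the weighted likelihood.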

unweighted_likelihood_(x)

Returns the likelihood of each data point for each component. The values are not weighted by the component weights.

Parameters:
x: array of shape (n_samples,self.dim)

the data used in the estimation process

Returns:
like, array of shape(n_samples,self.k)

unweighted component-wise likelihood

update(x, z)

Update function (draw a sample of the IMM parameters)

Parameters:
x array of shape (n_samples,self.dim)

the data used in the estimation process

z array of shape (n_samples), type = np.int_

the corresponding classification

update_means(x, z)

Given the allocation vector z, and the corresponding data x, resample the mean

Parameters:
x: array of shape (nb_samples,self.dim)

the data used in the estimation process

z: array of shape (nb_samples), type = np.int_

the corresponding classification

update_precisions(x, z)

Given the allocation vector z, and the corresponding data x, resample the precisions

Parameters:
x array of shape (nb_samples,self.dim)

the data used in the estimation process

z array of shape (nb_samples), type = np.int_

the corresponding classification

update_weights(z)

Given the allocation vector z, resample the weights parameter

Parameters:
z array of shape (n_samples), type = np.int_

the allocation variable

MixedIMM

class nipy.algorithms.clustering.imm.MixedIMM(alpha=0.5, dim=1)

Bases: IMM

Particular IMM with an additional null class. The data is supplied together with a sample-related probability of being under the null.
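
To make the role of the sample-related null probability concrete, here is a two-class Bayes rule sketch. It assumes a constant null density and treats alt_like as each sample's likelihood under the alternative mixture. The names null_dens and alt_like are hypothetical arguments, and this is not nipy's code.

```python
def null_posterior(null_class_proba, null_dens, alt_like):
    """Posterior probability of the null for each sample, by Bayes' rule."""
    post = []
    for p0, f1 in zip(null_class_proba, alt_like):
        num = p0 * null_dens
        post.append(num / (num + (1.0 - p0) * f1))
    return post


post = null_posterior([0.5, 0.5, 0.9], null_dens=0.1, alt_like=[0.1, 0.9, 0.1])
# approximately [0.5, 0.1, 0.9]
```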

__init__(alpha=0.5, dim=1)
Parameters:
alpha: float, optional,

the parameter for cluster creation

dim: int, optional,

the dimension of the data

Note: use the function set_priors() to set adapted priors
average_log_like(x, tiny=1e-15)

Returns the averaged log-likelihood of the model for the dataset x

Parameters:
x: array of shape (n_samples,self.dim)

the data used in the estimation process

tiny = 1.e-15: a small constant to avoid numerical singularities
bayes_factor(x, z, nperm=0, verbose=0)

Evaluate the Bayes Factor of the current model using Chib’s method

Parameters:
x: array of shape (nb_samples,dim)

the data from which bic is computed

z: array of shape (nb_samples), type = np.int_

the corresponding classification

nperm=0: int

the number of permutations to sample to model the label switching issue in the computation of the Bayes factor. By default, exhaustive permutations are used.

verbose=0: verbosity mode
Returns:
bf: float, the computed evidence (Bayes factor)

Notes

See: Siddhartha Chib, "Marginal Likelihood from the Gibbs Output", Journal of the American Statistical Association, Vol. 90, 1995.

bic(like, tiny=1e-15)

Computation of bic approximation of evidence

Parameters:
like, array of shape (n_samples, self.k)

component-wise likelihood

tiny=1.e-15, a small constant to avoid numerical singularities
Returns:
the bic value, float
check()

Checks the shape of the different matrices involved in the model

check_x(x)

essentially check that x.shape[1]==self.dim

x is returned, possibly after reshaping

conditional_posterior_proba(x, z, perm=None)

Compute the probability of the current parameters of self given x and z

Parameters:
x: array of shape (nb_samples, dim),

the data from which bic is computed

z: array of shape (nb_samples), type = np.int_,

the corresponding classification

perm: array of shape (nperm, self.k), type = np.int_, optional

all permutations of z under which things will be recomputed. By default, no permutation is performed.

cross_validated_update(x, z, plike, null_class_proba, kfold=10)

This is a step in the sampling procedure that uses internal cross-validation

Parameters:
x: array of shape(n_samples, dim),

the input data

z: array of shape(n_samples),

the associated membership variables

plike: array of shape(n_samples),

the likelihood under the prior

kfold: int, optional, or array

number of folds in cross-validation loop or set of indexes for the cross-validation procedure

null_class_proba: array of shape(n_samples),

prior probability to be under the null

Returns:
like: array of shape(n_samples),

the (cross-validated) likelihood of the data

z: array of shape(n_samples),

the associated membership variables

Notes

When kfold is an array, there is an internal reshuffling to randomize the order of updates

estimate(x, niter=100, delta=0.0001, verbose=0)

Estimation of the model given a dataset x

Parameters:
x array of shape (n_samples,dim)

the data from which the model is estimated

niter=100: maximal number of iterations in the estimation process
delta = 1.e-4: increment of data likelihood at which convergence is declared

verbose=0: verbosity mode
Returns:
bic: an asymptotic approximation of model evidence
evidence(x, z, nperm=0, verbose=0)

See bayes_factor(self, x, z, nperm=0, verbose=0)

guess_priors(x, nocheck=0)

Set the priors so that they are weakly informative, following Fraley and Raftery, Journal of Classification 24:155-181 (2007)

Parameters:
x, array of shape (nb_samples,self.dim)

the data used in the estimation process

nocheck: boolean, optional,

if nocheck==True, check is skipped

guess_regularizing(x, bcheck=1)

Set the regularizing priors as weakly informative, according to Fraley and Raftery, Journal of Classification 24:155-181 (2007)

Parameters:
x array of shape (n_samples,dim)

the data used in the estimation process

initialize(x)

initialize z using a k-means algorithm, then update the parameters

Parameters:
x: array of shape (nb_samples,self.dim)

the data used in the estimation process

initialize_and_estimate(x, z=None, niter=100, delta=0.0001, ninit=1, verbose=0)

Estimation of self given x

Parameters:
x array of shape (n_samples,dim)

the data from which the model is estimated

z = None: array of shape (n_samples)

a prior labelling of the data to initialize the computation

niter=100: maximal number of iterations in the estimation process
delta = 1.e-4: increment of data likelihood at which convergence is declared

ninit=1: number of initializations performed to reach a good solution

verbose=0: verbosity mode
Returns:
the best model is returned
likelihood(x, plike=None)

Returns the likelihood of the model for the data x. The values are weighted by the component weights.

Parameters:
x: array of shape (n_samples, self.dim),

the data used in the estimation process

plike: array of shape (n_samples), optional,

the density of each point under the prior

Returns:
like, array of shape (nbitem, self.k)
component-wise likelihood
likelihood_under_the_prior(x)

Computes the likelihood of x under the prior

Parameters:
x, array of shape (self.n_samples,self.dim)
Returns:
w, the likelihood of x under the prior model (unweighted)
map_label(x, like=None)

return the MAP labelling of x

Parameters:
x array of shape (n_samples,dim)

the data under study

like=None array of shape(n_samples,self.k)

component-wise likelihood; if like==None, it is recomputed

Returns:
z: array of shape(n_samples), the resulting MAP labelling of the rows of x

mixture_likelihood(x)

Returns the likelihood of the mixture for x

Parameters:
x: array of shape (n_samples,self.dim)

the data used in the estimation process

plugin(means, precisions, weights)

Set manually the weights, means and precision of the model

Parameters:
means: array of shape (self.k,self.dim)
precisions: array of shape (self.k,self.dim,self.dim) or (self.k, self.dim)

weights: array of shape (self.k)
pop(z)

compute the population, i.e. the statistics of allocation

Parameters:
z array of shape (nb_samples), type = np.int_

the allocation variable

Returns:
hist: array of shape (self.k), count variable
probability_under_prior()

Compute the probability of the current parameters of self given the priors

reduce(z)

Reduce the assignments by removing empty clusters and update self.k

Parameters:
z: array of shape(n),

a vector of membership variables changed in place

Returns:
z: the remapped values
sample(x, null_class_proba, niter=1, sampling_points=None, init=False, kfold=None, co_clustering=False, verbose=0)

sample the indicator and parameters

Parameters:
x: array of shape (n_samples, self.dim),

the data used in the estimation process

null_class_proba: array of shape(n_samples),

the probability to be under the null

niter: int,

the number of iterations to perform

sampling_points: array of shape(nbpoints, self.dim), optional

points where the likelihood will be sampled; this defaults to x

kfold: int, optional,

parameter of cross-validation control. By default, no cross-validation is used: the procedure is faster but less accurate.

co_clustering: bool, optional

if True, return a model of data co-labelling across iterations

verbose=0: verbosity mode
Returns:
likelihood: array of shape(nbpoints)

total likelihood of the model

pproba: array of shape(n_samples),

the posterior of being in the null (the posterior of null_class_proba)

coclust: only if co_clustering==True,

sparse matrix of shape (n_samples, n_samples), frequency of co-labelling of each sample pair across iterations

sample_and_average(x, niter=1, verbose=0)

Sample the indicator and parameters. The average values for weights, means and precisions are returned.

Parameters:
x = array of shape (nb_samples,dim)

the data from which bic is computed

niter=1: number of iterations
Returns:
weights: array of shape (self.k)
means: array of shape (self.k,self.dim)
precisions: array of shape (self.k,self.dim,self.dim) or (self.k, self.dim)

these are the average parameters across samplings

Notes

All this makes sense only if no label switching has occurred, so this is wrong in general (asymptotically).

fix: implement a permutation procedure for components identification

sample_indicator(like, null_class_proba)

sample the indicator from the likelihood

Parameters:
like: array of shape (nbitem,self.k)

component-wise likelihood

null_class_proba: array of shape(n_samples),

prior probability to be under the null

Returns:
z: array of shape(nbitem): a draw of the membership variable

Notes

Here z=-1 encodes the null class
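
A sketch of an indicator draw with a null class, where -1 marks the null and 0..k-1 mark the mixture components. It assumes the null competes with the components through a constant null density weighted by null_class_proba. The extra null_dens and rng arguments are hypothetical and are not part of the method's signature.

```python
import random


def sample_indicator_with_null(like, null_class_proba, null_dens, rng):
    """Draw -1 (null) or a component index for each sample."""
    z = []
    for row, p0 in zip(like, null_class_proba):
        # Position 0 holds the null class, the remaining ones the components.
        weights = [p0 * null_dens] + [(1.0 - p0) * l for l in row]
        choice = rng.choices(range(len(weights)), weights=weights, k=1)[0]
        z.append(choice - 1)  # shift so that the null class is coded as -1
    return z


rng = random.Random(0)  # seeded for reproducibility
like = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
z = sample_indicator_with_null(like, [1.0, 0.0, 0.5], null_dens=1.0, rng=rng)
```

The first sample is certain to be null and the second is certain to fall in component 1; only the third draw is random.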

set_constant_densities(null_dens=None, prior_dens=None)

Set the null and prior densities as constant (over a supposedly compact domain)

Parameters:
null_dens: float, optional

constant for the null density

prior_dens: float, optional

constant for the prior density

set_priors(x)

Set the priors so that they are weakly informative, following Fraley and Raftery, Journal of Classification 24:155-181 (2007)

Parameters:
x, array of shape (n_samples,self.dim)

the data used in the estimation process

show(x, gd, density=None, axes=None)

Function to plot a GMM, still in progress. Currently works only in 1D and 2D.

Parameters:
x: array of shape(n_samples, dim)

the data under study

gd: GridDescriptor instance
density: array of shape(prod(gd.n_bins))

density of the model on the discrete grid implied by gd. By default, this is recomputed.

show_components(x, gd, density=None, mpaxes=None)

Function to plot a GMM – Currently, works only in 1D

Parameters:
x: array of shape(n_samples, dim)

the data under study

gd: GridDescriptor instance
density: array of shape(prod(gd.n_bins))

density of the model on the discrete grid implied by gd. By default, this is recomputed.

mpaxes: axes handle to make the figure, optional,

if None, a new figure is created

simple_update(x, z, plike, null_class_proba)

One step in the sampling procedure (one data sweep)

Parameters:
x: array of shape(n_samples, dim),

the input data

z: array of shape(n_samples),

the associated membership variables

plike: array of shape(n_samples),

the likelihood under the prior

null_class_proba: array of shape(n_samples),

prior probability to be under the null

Returns:
like: array of shape(n_samples),

the likelihood of the data under the H1 hypothesis

test(x, tiny=1e-15)

Returns the log-likelihood of the mixture for x

Parameters:
x array of shape (n_samples,self.dim)

the data used in the estimation process

Returns:
ll: array of shape(n_samples)

the log-likelihood of the rows of x

train(x, z=None, niter=100, delta=0.0001, ninit=1, verbose=0)

Idem initialize_and_estimate

unweighted_likelihood(x)

Returns the likelihood of each data point for each component. The values are not weighted by the component weights.

Parameters:
x: array of shape (n_samples,self.dim)

the data used in the estimation process

Returns:
like, array of shape(n_samples,self.k)

unweighted component-wise likelihood

Notes

Hopefully faster

unweighted_likelihood_(x)

Returns the likelihood of each data point for each component. The values are not weighted by the component weights.

Parameters:
x: array of shape (n_samples,self.dim)

the data used in the estimation process

Returns:
like, array of shape(n_samples,self.k)

unweighted component-wise likelihood

update(x, z)

Update function (draw a sample of the IMM parameters)

Parameters:
x array of shape (n_samples,self.dim)

the data used in the estimation process

z array of shape (n_samples), type = np.int_

the corresponding classification

update_means(x, z)

Given the allocation vector z, and the corresponding data x, resample the mean

Parameters:
x: array of shape (nb_samples,self.dim)

the data used in the estimation process

z: array of shape (nb_samples), type = np.int_

the corresponding classification

update_precisions(x, z)

Given the allocation vector z, and the corresponding data x, resample the precisions

Parameters:
x array of shape (nb_samples,self.dim)

the data used in the estimation process

z array of shape (nb_samples), type = np.int_

the corresponding classification

update_weights(z)

Given the allocation vector z, resample the weights parameter

Parameters:
z array of shape (n_samples), type = np.int_

the allocation variable

Functions

nipy.algorithms.clustering.imm.co_labelling(z, kmax=None, kmin=None)

return a sparse co-labelling matrix given the label vector z

Parameters:
z: array of shape(n_samples),

the input labels

kmax: int, optional,

considers only the labels in the range [0, kmax)

Returns:
colabel: a sparse coo_matrix,

yields the co-labelling of the data, i.e. c[i,j] = 1 if z[i]==z[j], 0 otherwise
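
The co-labelling rule can be sketched with coordinate triplets, the layout a sparse COO matrix is built from. The function below is a plain-Python illustration, not nipy's code; skipping negative labels and labels at or above kmax is an assumption based on the stated range.

```python
def co_labelling_triplets(z, kmax=None):
    """Return (rows, cols, data) with data = 1 wherever z[i] == z[j]."""
    rows, cols, data = [], [], []
    for i, zi in enumerate(z):
        if zi < 0 or (kmax is not None and zi >= kmax):
            continue  # label outside [0, kmax): the sample is skipped
        for j, zj in enumerate(z):
            if zi == zj:
                rows.append(i)
                cols.append(j)
                data.append(1)
    return rows, cols, data


rows, cols, data = co_labelling_triplets([0, 1, 0, 5], kmax=2)
pairs = sorted(zip(rows, cols))
# [(0, 0), (0, 2), (1, 1), (2, 0), (2, 2)]
```

These triplets are in the form that scipy.sparse.coo_matrix((data, (rows, cols)), shape=(n, n)) accepts.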

nipy.algorithms.clustering.imm.main()

Illustrative example of the behaviour of imm