
SSNMF


SSNMF contains a class for (S)SNMF models and several multiplicative update methods to train different models.


Installation

To install SSNMF, run this command in your terminal:

    $ pip install -U ssnmf

This is the preferred method to install SSNMF, as it will always install the most recent stable release.

If you don't have pip installed, the pip installation guide can walk you through the process.

Usage

First, import the ssnmf package and the relevant class SSNMF. We import numpy and scipy for experimentation.

>>> import ssnmf
>>> from ssnmf import SSNMF
>>> import numpy as np
>>> import scipy
>>> import scipy.sparse as sparse
>>> import scipy.optimize

Training an unsupervised model

Declare an unsupervised NMF model with data matrix X and number of topics k.

>>> X = np.random.rand(100,100)
>>> k = 10
>>> model = SSNMF(X,k)

You may access the factor matrices initialized in the model, e.g., to check relative reconstruction error ||X-AS||_F/||X||_F.

>>> rel_error = np.linalg.norm(model.X - model.A @ model.S, 'fro')/np.linalg.norm(model.X,'fro')

Run the multiplicative updates method for this unsupervised model for N iterations. This method tries to minimize the objective function ||X-AS||_F.

>>> N = 100
>>> model.mult(numiters = N)

This method updates the factor matrices N times. You can see how much the relative reconstruction error improves.

>>> rel_error = np.linalg.norm(model.X - model.A @ model.S, 'fro')/np.linalg.norm(model.X,'fro')
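For intuition, the multiplicative update for the Frobenius objective is the classical Lee-Seung rule. A minimal NumPy sketch of that rule (not SSNMF's internal implementation, just the standard update it is based on) looks like:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((100, 100))   # nonnegative data matrix
k = 10
A = rng.random((100, k))     # dictionary factor
S = rng.random((k, 100))     # representation factor
eps = 1e-10                  # guard against division by zero

rel_errs = []
for _ in range(100):
    # Lee-Seung multiplicative updates for ||X - AS||_F^2
    A *= (X @ S.T) / (A @ S @ S.T + eps)
    S *= (A.T @ X) / (A.T @ A @ S + eps)
    rel_errs.append(np.linalg.norm(X - A @ S, 'fro') / np.linalg.norm(X, 'fro'))
```

Because the updates are elementwise multiplications by nonnegative ratios, both factors stay nonnegative, and the objective is non-increasing across iterations.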

Training a supervised model

We begin by generating some synthetic data for testing.

>>> labelmat = np.concatenate(
...     (np.concatenate((np.ones([1,10]), np.zeros([1,30])), axis=1),
...      np.concatenate((np.zeros([1,10]), np.ones([1,10]), np.zeros([1,20])), axis=1),
...      np.concatenate((np.zeros([1,20]), np.ones([1,10]), np.zeros([1,10])), axis=1),
...      np.concatenate((np.zeros([1,30]), np.ones([1,10])), axis=1)))
>>> B = sparse.random(4,10,density=0.2).toarray()
>>> S = np.zeros([10,40])
>>> for i in range(40):
...     S[:,i] = scipy.optimize.nnls(B, labelmat[:,i])[0]
>>> A = np.random.rand(40,10)
>>> X = A @ S
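The label matrix built above is block structured: class i labels the ten consecutive samples 10i through 10i+9. The same 4x40 one-hot matrix can be constructed more compactly, for example with np.kron:

```python
import numpy as np

# 4 classes x 40 samples, 10 consecutive samples per class
labelmat = np.kron(np.eye(4), np.ones([1, 10]))
```

This is equivalent to the nested np.concatenate construction and is easier to adapt to other class sizes.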

Declare a supervised NMF model with data matrix X, number of topics k, label matrix Y, and weight parameter lam.

>>> k = 10
>>> model = SSNMF(X,k,Y = labelmat,lam=100*np.linalg.norm(X,'fro'))

You may access the factor matrices initialized in the model, e.g., to check relative reconstruction error ||X-AS||_F/||X||_F and classification accuracy.

>>> rel_error = np.linalg.norm(model.X - model.A @ model.S, 'fro')/np.linalg.norm(model.X,'fro')
>>> acc = model.accuracy()

Run the multiplicative updates method for this supervised model for N iterations. This method tries to minimize the objective function ||X-AS||_F^2 + lam ||Y - BS||_F^2. This also saves the errors and accuracies in each iteration.

>>> N = 100
>>> [errs,reconerrs,classerrs,classaccs] = model.snmfmult(numiters = N,saveerrs = True)

This method updates the factor matrices N times. You can see how much the relative reconstruction error and classification accuracy improve.

>>> rel_error = reconerrs[99]/np.linalg.norm(X,'fro')
>>> acc = classaccs[99]
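To see how the weight parameter lam trades off the two terms of the supervised objective, you can evaluate ||X-AS||_F^2 + lam ||Y-BS||_F^2 directly. A sketch with random matrices standing in for the model's learned factors (the shapes mirror the synthetic example above):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((40, 40))                   # data matrix
Y = np.kron(np.eye(4), np.ones([1, 10]))   # 4x40 one-hot label matrix
A = rng.random((40, 10))                   # data dictionary
B = rng.random((4, 10))                    # label dictionary
S = rng.random((10, 40))                   # shared representation
lam = 100 * np.linalg.norm(X, 'fro')

recon_err = np.linalg.norm(X - A @ S, 'fro') ** 2   # reconstruction term
class_err = np.linalg.norm(Y - B @ S, 'fro') ** 2   # classification term
objective = recon_err + lam * class_err
```

A larger lam pushes the updates to prioritize reproducing the labels Y over reconstructing the data X.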

Training a supervised model with KL-divergence

We begin by generating some synthetic data for testing.

>>> labelmat = np.concatenate(
...     (np.concatenate((np.ones([1,10]), np.zeros([1,30])), axis=1),
...      np.concatenate((np.zeros([1,10]), np.ones([1,10]), np.zeros([1,20])), axis=1),
...      np.concatenate((np.zeros([1,20]), np.ones([1,10]), np.zeros([1,10])), axis=1),
...      np.concatenate((np.zeros([1,30]), np.ones([1,10])), axis=1)))
>>> B = sparse.random(4,10,density=0.2).toarray()
>>> S = np.zeros([10,40])
>>> for i in range(40):
...     S[:,i] = scipy.optimize.nnls(B, labelmat[:,i])[0]
>>> A = np.random.rand(40,10)
>>> X = A @ S

Declare a supervised NMF model with data matrix X, number of topics k, label matrix Y, and weight parameter lam.

>>> k = 10
>>> model = SSNMF(X,k,Y = labelmat,lam=100*np.linalg.norm(X,'fro'))

You may access the factor matrices initialized in the model, e.g., to check relative reconstruction error ||X-AS||_F/||X||_F, classification accuracy, and KL-divergence.

>>> rel_error = np.linalg.norm(model.X - model.A @ model.S, 'fro')/np.linalg.norm(model.X,'fro')
>>> acc = model.accuracy()
>>> div = model.kldiv()

Run the multiplicative updates method for this supervised model for N iterations. This method tries to minimize the objective function ||X-AS||_F^2 + lam D(Y||BS). This also saves the errors and accuracies in each iteration.

>>> N = 100
>>> [errs,reconerrs,classerrs,classaccs] = model.klsnmfmult(numiters = N,saveerrs = True)

This method updates the factor matrices N times. You can see how much the relative reconstruction error, classification accuracy, and KL-divergence improve.

>>> rel_error = reconerrs[99]/np.linalg.norm(X,'fro')
>>> acc = classaccs[99]
>>> div = classerrs[99]
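The divergence term D(Y||BS) in the objective is the generalized (I-)divergence commonly used in NMF: D(Y||M) = sum_ij [Y_ij log(Y_ij/M_ij) - Y_ij + M_ij], with 0 log 0 taken as 0. Assuming that convention, it can be computed as follows (an illustrative helper, not SSNMF's own kldiv method):

```python
import numpy as np

def gen_kl_div(Y, M, eps=1e-12):
    """Generalized KL (I-)divergence D(Y || M), with 0*log 0 taken as 0."""
    Y = np.asarray(Y, dtype=float)
    M = np.asarray(M, dtype=float)
    # only accumulate Y*log(Y/M) where Y > 0; eps guards the log's argument
    logterm = np.where(Y > 0, Y * np.log((Y + eps) / (M + eps)), 0.0)
    return np.sum(logterm - Y + M)

# the divergence of a matrix from itself is zero
Y = np.kron(np.eye(2), np.ones([1, 3]))
print(gen_kl_div(Y, Y))  # → 0.0
```

Unlike the Frobenius term, this divergence is not symmetric in its arguments, which is why the objective is written D(Y||BS) rather than a norm.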

Citing

If you use our code in an academic setting, please consider citing our code.

Development

See CONTRIBUTING.md for information related to developing the code.
