SSNMF
SSNMF contains a class for the (SS)NMF model and several multiplicative update methods to train different models.
Installation
To install SSNMF, run this command in your terminal:
$ pip install -U ssnmf
This is the preferred method to install SSNMF, as it will always install the most recent stable release.
If you don't have pip installed, these installation instructions can guide you through the process.
Usage
First, import the ssnmf package and the relevant class SSNMF. We also import numpy and scipy for experimentation.
>>> import ssnmf
>>> from ssnmf import SSNMF
>>> import numpy as np
>>> import scipy
>>> import scipy.sparse as sparse
>>> import scipy.optimize
Training an unsupervised model
Declare an unsupervised NMF model with data matrix X and number of topics k.
>>> X = np.random.rand(100,100)
>>> k = 10
>>> model = SSNMF(X,k)
You may access the factor matrices initialized in the model, e.g., to check the relative reconstruction error ||X-AS||_F/||X||_F.
>>> rel_error = np.linalg.norm(model.X - model.A @ model.S, 'fro')/np.linalg.norm(model.X,'fro')
Run the multiplicative updates method for this unsupervised model for N iterations. This method attempts to minimize the objective function ||X-AS||_F.
>>> N = 100
>>> model.mult(numiters = N)
This method updates the factor matrices N times. You can see how much the relative reconstruction error improves.
>>> rel_error = np.linalg.norm(model.X - model.A @ model.S, 'fro')/np.linalg.norm(model.X,'fro')
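For intuition, the kind of multiplicative update this method is based on, the classic Lee–Seung rules for minimizing ||X-AS||_F, can be sketched in plain NumPy. This is an illustrative sketch only; SSNMF's internal implementation may differ in details such as initialization, stopping criteria, and numerical safeguards.

```python
import numpy as np

# Illustrative sketch of Lee-Seung multiplicative updates for ||X - AS||_F.
rng = np.random.default_rng(0)
X = rng.random((100, 100))
k = 10
A = rng.random((100, k))
S = rng.random((k, 100))
eps = 1e-10  # guard against division by zero

rel_error_init = np.linalg.norm(X - A @ S, 'fro') / np.linalg.norm(X, 'fro')

for _ in range(100):
    # Update S: S <- S * (A^T X) / (A^T A S)
    S *= (A.T @ X) / (A.T @ A @ S + eps)
    # Update A: A <- A * (X S^T) / (A S S^T)
    A *= (X @ S.T) / (A @ S @ S.T + eps)

rel_error = np.linalg.norm(X - A @ S, 'fro') / np.linalg.norm(X, 'fro')
```

Because the updates are elementwise multiplications by nonnegative ratios, A and S stay nonnegative throughout, and the objective decreases monotonically.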
Training a supervised model
We begin by generating some synthetic data for testing.
>>> labelmat = np.concatenate((np.concatenate((np.ones([1,10]),np.zeros([1,30])),axis=1),np.concatenate((np.zeros([1,10]),np.ones([1,10]),np.zeros([1,20])),axis=1),np.concatenate((np.zeros([1,20]),np.ones([1,10]),np.zeros([1,10])),axis=1),np.concatenate((np.zeros([1,30]),np.ones([1,10])),axis=1)))
>>> B = sparse.random(4,10,density=0.2).toarray()
>>> S = np.zeros([10,40])
>>> for i in range(40):
...     S[:,i] = scipy.optimize.nnls(B,labelmat[:,i])[0]
>>> A = np.random.rand(40,10)
>>> X = A @ S
Declare a supervised NMF model with data matrix X, number of topics k, label matrix Y, and weight parameter lam.
>>> k = 10
>>> model = SSNMF(X,k,Y = labelmat,lam=100*np.linalg.norm(X,'fro'))
You may access the factor matrices initialized in the model, e.g., to check the relative reconstruction error ||X-AS||_F/||X||_F and the classification accuracy.
>>> rel_error = np.linalg.norm(model.X - model.A @ model.S, 'fro')/np.linalg.norm(model.X,'fro')
>>> acc = model.accuracy()
Run the multiplicative updates method for this supervised model for N iterations. This method attempts to minimize the objective function ||X-AS||_F^2 + lam ||Y - BS||_F^2, and it also saves the errors and accuracies at each iteration.
>>> N = 100
>>> [errs,reconerrs,classerrs,classaccs] = model.snmfmult(numiters = N,saveerrs = True)
This method updates the factor matrices N times. You can see how much the relative reconstruction error and classification accuracy improve.
>>> rel_error = reconerrs[99]/np.linalg.norm(X,'fro')
>>> acc = classaccs[99]
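To make the tracked quantities concrete, the supervised objective and a standard notion of classification accuracy can be computed directly from the factor matrices. This is a sketch under the assumption that a sample's predicted class is the row of B @ S with the largest entry in that sample's column; how SSNMF computes accuracy internally may differ.

```python
import numpy as np

# One-hot label matrix: 4 classes, 10 samples each (4 x 40), as in the
# synthetic data above.
Y = np.eye(4)[:, np.repeat(np.arange(4), 10)]

rng = np.random.default_rng(0)
A = rng.random((40, 10))
S = rng.random((10, 40))
B = rng.random((4, 10))
X = rng.random((40, 40))
lam = 100 * np.linalg.norm(X, 'fro')

# Supervised objective: ||X - AS||_F^2 + lam * ||Y - BS||_F^2
obj = (np.linalg.norm(X - A @ S, 'fro')**2
       + lam * np.linalg.norm(Y - B @ S, 'fro')**2)

# Accuracy: fraction of columns whose largest entry of B @ S falls in the
# row of the true class.
pred = np.argmax(B @ S, axis=0)
true = np.argmax(Y, axis=0)
acc = np.mean(pred == true)
```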
Training a supervised model with KL-divergence
We begin by generating some synthetic data for testing.
>>> labelmat = np.concatenate((np.concatenate((np.ones([1,10]),np.zeros([1,30])),axis=1),np.concatenate((np.zeros([1,10]),np.ones([1,10]),np.zeros([1,20])),axis=1),np.concatenate((np.zeros([1,20]),np.ones([1,10]),np.zeros([1,10])),axis=1),np.concatenate((np.zeros([1,30]),np.ones([1,10])),axis=1)))
>>> B = sparse.random(4,10,density=0.2).toarray()
>>> S = np.zeros([10,40])
>>> for i in range(40):
...     S[:,i] = scipy.optimize.nnls(B,labelmat[:,i])[0]
>>> A = np.random.rand(40,10)
>>> X = A @ S
Declare a supervised NMF model with data matrix X, number of topics k, label matrix Y, and weight parameter lam.
>>> k = 10
>>> model = SSNMF(X,k,Y = labelmat,lam=100*np.linalg.norm(X,'fro'))
You may access the factor matrices initialized in the model, e.g., to check the relative reconstruction error ||X-AS||_F/||X||_F, the classification accuracy, and the KL-divergence.
>>> rel_error = np.linalg.norm(model.X - model.A @ model.S, 'fro')/np.linalg.norm(model.X,'fro')
>>> acc = model.accuracy()
>>> div = model.kldiv()
Run the multiplicative updates method for this supervised model for N iterations. This method attempts to minimize the objective function ||X-AS||_F^2 + lam D(Y||BS), and it also saves the errors and accuracies at each iteration.
>>> N = 100
>>> [errs,reconerrs,classerrs,classaccs] = model.klsnmfmult(numiters = N,saveerrs = True)
This method updates the factor matrices N times. You can see how much the relative reconstruction error, classification accuracy, and KL-divergence improve.
>>> rel_error = reconerrs[99]/np.linalg.norm(X,'fro')
>>> acc = classaccs[99]
>>> div = classerrs[99]
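The label term in this variant is a divergence rather than a Frobenius norm. A common choice for nonnegative matrices is the generalized KL (I-)divergence, D(Y||BS) = sum(Y log(Y/BS) - Y + BS), with the convention 0 log 0 = 0. The sketch below computes it under that assumption; the exact form SSNMF's kldiv method uses may differ.

```python
import numpy as np

# One-hot label matrix (4 x 40) and random nonnegative factors, as above.
Y = np.eye(4)[:, np.repeat(np.arange(4), 10)]
rng = np.random.default_rng(0)
B = rng.random((4, 10))
S = rng.random((10, 40))
BS = B @ S

# Generalized KL divergence D(Y || BS); entries where Y == 0 contribute
# only their BS term, since 0 * log 0 = 0.
mask = Y > 0
div = np.sum(Y[mask] * np.log(Y[mask] / BS[mask])) - Y.sum() + BS.sum()
```

Each term x log(x/y) - x + y is nonnegative for x, y >= 0, so the divergence is zero exactly when Y = BS.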
Citing
If you use our code in an academic setting, please consider citing our code.
Development
See CONTRIBUTING.md for information related to developing the code.