Scikit-multilearn-ng is the follow-up to scikit-multilearn, a BSD-licensed library for multi-label classification that is built on top of the well-known scikit-learn ecosystem.
scikit-multilearn-ng
scikit-multilearn-ng is a Python module for multi-label learning tasks and is the follow-up to scikit-multilearn. It is built on top of several scientific Python packages (numpy, scipy) and follows an API similar to that of scikit-learn.
Features
- Native Python implementation. A native Python implementation of a variety of multi-label classification algorithms. The documentation lists all supported classifiers.
- Interface to MEKA. A MEKA wrapper class is implemented for reference purposes and integration. This provides access to all methods available in MEKA, MULAN, and WEKA, the reference standard in the field.
- Builds upon giants! Team up with the power of numpy and scikit-learn. You can use scikit-learn's base classifiers as scikit-multilearn's classifiers. In addition, the two packages follow a similar API.
Installation & Dependencies
To install scikit-multilearn-ng, run the following command:
$ pip install scikit-multilearn-ng
This will install the latest release from the Python Package Index. If you wish to install the bleeding-edge version, clone this repository and run setup.py:
$ git clone https://github.com/scikit-multilearn-ng/scikit-multilearn-ng.git
$ cd scikit-multilearn-ng
$ python setup.py install
In most cases, the requirements are installed automatically when you run pip install scikit-multilearn-ng or python setup.py install. There are also optional dependencies: pip install scikit-multilearn-ng[gpl,keras,meka] installs the GPL-licensed igraph for the igraph-based clusterers, keras for the Keras classifiers, and the requirements for the MEKA bridge, respectively.
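To see which optional extras are already importable in the current environment, a small check along the following lines may help. This is only a sketch: the mapping from extras to import names (igraph, keras) is an assumption based on the package names mentioned above, and the helper available_extras is hypothetical, not part of scikit-multilearn-ng.

```python
import importlib.util

# Assumed import names for two of the optional extras; adjust if they differ.
OPTIONAL_MODULES = {"gpl": "igraph", "keras": "keras"}


def available_extras(modules=OPTIONAL_MODULES):
    """Map each extra to True if its module can be located, else False."""
    return {
        extra: importlib.util.find_spec(name) is not None
        for extra, name in modules.items()
    }


status = available_extras()
for extra, ok in status.items():
    print(f"{extra}: {'available' if ok else 'missing'}")
```

The check only locates the modules without importing them, so it has no side effects even when a heavy dependency such as keras is installed.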
To install openNE, run:
pip install 'openne @ git+https://github.com/thunlp/OpenNE.git@master#subdirectory=src'
Note that installing the GPL-licensed graphtool, needed for the graphtool-based clusterers, is complicated and must be done manually; please see the graphtool install instructions.
Basic Usage
Note: After installation, use the same import statement as with scikit-multilearn (import skmultilearn). This makes it quicker to switch to this follow-up version.
Before proceeding to classification, this library assumes that you have a dataset with the following matrices:
- X_train, X_test: training and test feature matrices of size (n_samples, n_features)
- y_train, y_test: training and test label matrices of size (n_samples, n_labels)
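To make the label-matrix layout concrete, the sketch below converts per-sample label lists into a binary indicator matrix of shape (n_samples, n_labels) using plain Python. The helper to_indicator_matrix and the toy label names are hypothetical and only serve as an illustration; they are not part of the library.

```python
def to_indicator_matrix(label_sets, label_names):
    """Build rows where entry j is 1 if the sample carries label_names[j]."""
    index = {name: j for j, name in enumerate(label_names)}
    matrix = []
    for labels in label_sets:
        row = [0] * len(label_names)
        for name in labels:
            row[index[name]] = 1
        matrix.append(row)
    return matrix


# Toy data: four samples, three possible labels (illustrative only).
label_names = ["news", "sports", "politics"]
samples = [["news"], ["sports", "news"], [], ["politics"]]

y_train = to_indicator_matrix(samples, label_names)
for row in y_train:
    print(row)
```

Each row corresponds to one sample and each column to one label, so a sample may have several 1s or none at all. In practice such a matrix would typically be converted to a numpy array or scipy sparse matrix before training.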
Suppose we want to apply a problem-transformation method called Binary Relevance, which treats each label as a separate single-label classification problem, with a support-vector machine (SVM) as the base classifier. We simply perform the following tasks:
# Import BinaryRelevance from skmultilearn
from skmultilearn.problem_transform import BinaryRelevance
# Import SVC classifier from sklearn
from sklearn.svm import SVC
# Setup the classifier
classifier = BinaryRelevance(classifier=SVC(), require_dense=[False,True])
# Train
classifier.fit(X_train, y_train)
# Predict
y_pred = classifier.predict(X_test)
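For intuition about what Binary Relevance does, the following dependency-free sketch trains one independent classifier per label column and stacks the per-label predictions. It is an illustration of the idea only, not the library's implementation: BinaryRelevanceSketch and the NearestNeighbour stand-in base classifier are hypothetical names, and the toy data is made up.

```python
class NearestNeighbour:
    """Hypothetical stand-in base classifier: 1-nearest-neighbour."""

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X):
        def closest(x):
            # Squared Euclidean distance to every training sample.
            dists = [sum((a - b) ** 2 for a, b in zip(x, row)) for row in self.X]
            return self.y[dists.index(min(dists))]

        return [closest(x) for x in X]


class BinaryRelevanceSketch:
    """Fit one single-label classifier per label column."""

    def __init__(self, make_classifier):
        self.make_classifier = make_classifier

    def fit(self, X, y):
        n_labels = len(y[0])
        self.models = []
        for j in range(n_labels):
            column = [row[j] for row in y]  # binary target for label j
            self.models.append(self.make_classifier().fit(X, column))
        return self

    def predict(self, X):
        per_label = [model.predict(X) for model in self.models]
        # Transpose from (n_labels, n_samples) to (n_samples, n_labels).
        return [list(row) for row in zip(*per_label)]


X_train = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y_train = [[0, 0], [0, 1], [1, 0], [1, 1]]
X_test = [[0.1, 0.9], [0.9, 0.2]]

model = BinaryRelevanceSketch(NearestNeighbour).fit(X_train, y_train)
y_pred = model.predict(X_test)
print(y_pred)  # [[0, 1], [1, 0]]
```

Because every label is learned in isolation, this decomposition ignores any correlation between labels, which is the main trade-off of the approach.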
More examples and use-cases can be seen in the documentation.
Contributing
This project is open for contributions. Here are some of the ways for you to contribute:
- Bug reports and fixes
- Feature requests
- Use-case demonstrations
- Documentation updates
If you want to implement your own multi-label classifier, please read our Developer's Guide to help you integrate your implementation into our API.
To make a contribution, just fork this repository, push the changes in your fork, open up an issue, and make a Pull Request!
Cite
If you used scikit-multilearn-ng in your research or project, please cite the original package scikit-multilearn:
@ARTICLE{2017arXiv170201460S,
author = {{Szyma{\'n}ski}, P. and {Kajdanowicz}, T.},
title = "{A scikit-based Python environment for performing multi-label classification}",
journal = {ArXiv e-prints},
archivePrefix = "arXiv",
eprint = {1702.01460},
year = 2017,
month = feb
}
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Hashes for scikit-multilearn-ng-0.0.7.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 318324b91317b13abfa6b71f0e05eccb59f735a4199fe673d62bdcf944fc4413
MD5 | 52336d1cc6f1f3b63798e257c7f40641
BLAKE2b-256 | 4cd4e5f2e996535a8955c94d906d3c95c693f6708d0d5a291bdec6f23dfc9adf
Built Distribution
Hashes for scikit_multilearn_ng-0.0.7-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 546c336ec26ee09dae3b2eef5a8fbb5d9563cfde6d23b16621c304314e52e3f9
MD5 | a6f73b4f1091442a8deb91fe9e2e48f3
BLAKE2b-256 | 9f5b419cb03d973225ba4b04dd1bd1f6c00c4a98411f508f8762fe41cf3e3afa