MCDA algorithms

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

lincs is a collection of MCDA algorithms, usable as a C++ library, a Python package and a command-line utility.

lincs is licensed under the GNU Lesser General Public License v3.0 as indicated by the two files COPYING and COPYING.LESSER.

@todo (When we have a paper to actually cite) Add a note asking academics to kindly cite our work.

Questions? Remarks? Bugs? Want to contribute? Open an issue or a discussion!

Contributors and previous work

lincs is developed by the MICS research team at CentraleSupélec.

Its main authors are (alphabetical order):

Laurent Cabaret (performance optimization)
Vincent Jacques (engineering)
Vincent Mousseau (domain expertise and project leadership)

It's based on work by:

Olivier Sobrie (The "weights, profiles, breed" learning strategy for MR-Sort models, and the profiles improvement heuristic, developed in his Ph.D thesis, and implemented in Python)
Emma Dixneuf, Thibault Monsel and Thomas Vindard (C++ implementation of Sobrie's heuristic)

Project goals

Provide MCDA tools usable out of the box

You should be able to use lincs without being a specialist of MCDA and/or NCS models. Just follow the Get started section below.

Provide a base for developing new MCDA algorithms

lincs is designed to be easy to extend with new algorithms of even replace parts of existing algorithms. @todo Write doc about that use case.

linc also provides a benchmark framework to compare algorithms (@todo Write and document). This should make it easier to understand the relative strengths and weaknesses of each algorithm.

Get started

Install

First, you need to install a few dependencies (@todo build binary wheel distributions to make installation easier):

# System packages
sudo apt-get install --yes g++ libboost-python-dev python3-dev libyaml-cpp-dev

# CUDA
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub
sudo add-apt-repository 'deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /'
sudo apt-get update
sudo apt-get install --yes cuda-cudart-dev-12-1 cuda-nvcc-12-1

# OR-tools
wget https://github.com/google/or-tools/releases/download/v8.2/or-tools_ubuntu-20.04_v8.2.8710.tar.gz
tar xf or-tools_ubuntu-20.04_v8.2.8710.tar.gz
sudo cp -r or-tools_Ubuntu-20.04-64bit_v8.2.8710/include/* /usr/local/include
sudo cp -r or-tools_Ubuntu-20.04-64bit_v8.2.8710/lib/*.so /usr/local/lib
sudo ldconfig
rm -r or-tools_Ubuntu-20.04-64bit_v8.2.8710 or-tools_ubuntu-20.04_v8.2.8710.tar.gz

# Header-only libraries
cd /usr/local/include
sudo wget https://raw.githubusercontent.com/Neargye/magic_enum/v0.8.2/include/magic_enum.hpp
sudo wget https://raw.githubusercontent.com/d99kris/rapidcsv/v8.75/src/rapidcsv.h
sudo wget https://raw.githubusercontent.com/jacquev6/lov-e-cuda/13e45bc/lov-e.hpp
sudo wget https://raw.githubusercontent.com/doctest/doctest/v2.4.11/doctest/doctest.h

Finally, lincs is available on the Python Package Index, so pip install lincs should finalize the install.

Concepts and files

lincs is based on the following concepts:

a "domain" describes the objects to be classified (a.k.a. the "alternatives"), the criteria used to classify them, and the existing categories they can belong to;
a "model" is used to actually assign a category to each alternative, based on the values of the criteria for that alternative;
a "classified alternative" is an alternative, with its category.

Start using lincs' command-line interface

The command-line interface is the easiest way to get started with lincs, starting with lincs --help, which should output something like:

Usage: lincs [OPTIONS] COMMAND [ARGS]...

  lincs (Learn and Infer Non-Compensatory Sorting) is a set of tools for
  training and using MCDA models.

Options:
  --help  Show this message and exit.

Commands:
  classification-accuracy  Compute a classification accuracy.
  classify                 Classify alternatives.
  generate                 Generate synthetic data.
  learn                    Learn a model.
  visualize                Make graphs from data.

It's organized using sub-commands, the first one being generate, to generate synthetic pseudo-random data.

Generate a classification domain with 4 criteria and 3 categories (@todo Link to concepts and file formats):

lincs generate classification-domain 4 3 --output-domain domain.yml

The generated domain.yml should look like:

kind: classification-domain
format_version: 1
criteria:
  - name: Criterion 1
    value_type: real
    category_correlation: growing
  - name: Criterion 2
    value_type: real
    category_correlation: growing
  - name: Criterion 3
    value_type: real
    category_correlation: growing
  - name: Criterion 4
    value_type: real
    category_correlation: growing
categories:
  - name: Category 1
  - name: Category 2
  - name: Category 3

Then generate a classification model (@todo Link to concepts and file formats):

lincs generate classification-model domain.yml --output-model model.yml

It should look like:

kind: classification-model
format_version: 1
boundaries:
  - profile:
      - 0.255905151
      - 0.0551739037
      - 0.162252158
      - 0.0526000932
    sufficient_coalitions:
      kind: weights
      criterion_weights:
        - 0.147771254
        - 0.618687689
        - 0.406786472
        - 0.0960085914
  - profile:
      - 0.676961303
      - 0.324553937
      - 0.673279881
      - 0.598555863
    sufficient_coalitions:
      kind: weights
      criterion_weights:
        - 0.147771254
        - 0.618687689
        - 0.406786472
        - 0.0960085914

@todo Use YAML anchors and references to avoid repeating the same sufficient coalitions in all profiles

You can visualize it using:

lincs visualize classification-model domain.yml model.yml model.png

It should output something like:

Model visualization

And finally generate a set of classified alternatives (@todo Link to concepts and file formats):

lincs generate classified-alternatives domain.yml model.yml 1000 --output-classified-alternatives learning-set.csv

It should start with something like this, and contain 1000 alternatives:

name,"Criterion 1","Criterion 2","Criterion 3","Criterion 4",category
"Alternative 1",0.37454012,0.796543002,0.95071429,0.183434784,"Category 3"
"Alternative 2",0.731993914,0.779690981,0.598658502,0.596850157,"Category 2"
"Alternative 3",0.156018645,0.445832759,0.15599452,0.0999749228,"Category 1"
"Alternative 4",0.0580836125,0.4592489,0.866176128,0.333708614,"Category 3"
"Alternative 5",0.601114988,0.14286682,0.708072603,0.650888503,"Category 2"

You can visualize its first five alternatives using:

lincs visualize classification-model domain.yml model.yml --alternatives learning-set.csv --alternatives-count 5 alternatives.png

It should output something like:

Alternatives visualization

@todo Improve how this graph looks:

display categories as stacked solid colors
display alternatives in a color that matches their assigned category
remove the legend, place names (categories and alternatives) directly on the graph

You now have a (synthetic) learning set. You can use it to train a new model:

# @todo Rename the command to `train`?
lincs learn classification-model domain.yml learning-set.csv --output-model trained-model.yml

The trained model has the same structure as the original (synthetic) model because they are both MR-Sort models for the same domain, but the trained model is numerically different because information was lost in the process:

kind: classification-model
format_version: 1
boundaries:
  - profile:
      - 0.00751833664
      - 0.0549556538
      - 0.162616938
      - 0.193127945
    sufficient_coalitions:
      kind: weights
      criterion_weights:
        - 0.499998987
        - 0.5
        - 0.5
        - 0
  - profile:
      - 0.0340298451
      - 0.324480206
      - 0.672487617
      - 0.427051842
    sufficient_coalitions:
      kind: weights
      criterion_weights:
        - 0.499998987
        - 0.5
        - 0.5
        - 0

If the training is effective, the resulting trained model should behave closely to the original one. To see how close a trained model is to the original one, you can reclassify a testing set.

First, generate a testing set:

lincs generate classified-alternatives domain.yml model.yml 10000 --output-classified-alternatives testing-set.csv

And ask the trained model to classify it:

lincs classify domain.yml trained-model.yml testing-set.csv --output-classified-alternatives reclassified-testing-set.csv

There are a few differences between the original testing set and the reclassified one:

diff testing-set.csv reclassified-testing-set.csv

That command should show a few alternatives that are not classified the same way by the original and the trained model:

2595c2595
< "Alternative 2594",0.234433308,0.780464768,0.162389532,0.622178912,"Category 2"
---
> "Alternative 2594",0.234433308,0.780464768,0.162389532,0.622178912,"Category 1"
5000c5000
< "Alternative 4999",0.074135974,0.496049821,0.672853291,0.782560945,"Category 2"
---
> "Alternative 4999",0.074135974,0.496049821,0.672853291,0.782560945,"Category 3"
5346c5346
< "Alternative 5345",0.815349102,0.580399215,0.162403136,0.995580792,"Category 2"
---
> "Alternative 5345",0.815349102,0.580399215,0.162403136,0.995580792,"Category 1"
9639c9639
< "Alternative 9638",0.939305425,0.0550933145,0.247014269,0.265170485,"Category 1"
---
> "Alternative 9638",0.939305425,0.0550933145,0.247014269,0.265170485,"Category 2"
9689c9689
< "Alternative 9688",0.940304875,0.885046899,0.162586793,0.515185535,"Category 2"
---
> "Alternative 9688",0.940304875,0.885046899,0.162586793,0.515185535,"Category 1"
9934c9934
< "Alternative 9933",0.705289483,0.11529737,0.162508503,0.0438248962,"Category 2"
---
> "Alternative 9933",0.705289483,0.11529737,0.162508503,0.0438248962,"Category 1"

You can also measure the classification accuracy of the trained model on that testing set:

lincs classification-accuracy domain.yml trained-model.yml testing-set.csv

It should be close to 100%:

9994/10000

Once you're comfortable with the tooling, you can use a learning set based on real-world data and train a model that you can use to classify new real-world alternatives.

User guide

@todo Write the user guide.

Reference

@todo Generate a reference documentation using Sphinx:

Python using autodoc
C++ using Doxygen+Breath
CLI using https://sphinx-click.readthedocs.io/en/latest/
YAML file formats using JSON Schema and https://sphinx-jsonschema.readthedocs.io/en/latest/

Develop lincs itself

Run ./run-development-cycle.sh.

Project details

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Release history Release notifications | RSS feed

1.1.0

Feb 9, 2024

1.1.0a20 pre-release

Feb 8, 2024

1.1.0a18 pre-release

Feb 8, 2024

1.1.0a17 pre-release

Feb 8, 2024

1.1.0a16 pre-release

Feb 8, 2024

1.1.0a6 pre-release

Feb 6, 2024

1.1.0a5 pre-release

Feb 1, 2024

1.1.0a4 pre-release

Jan 29, 2024

1.1.0a3 pre-release

Jan 29, 2024

1.1.0a1 pre-release

Jan 11, 2024

1.1.0a0 pre-release

Jan 10, 2024

1.0.0

Nov 22, 2023

0.11.1

Nov 17, 2023

0.11.0

Nov 15, 2023

0.10.3

Oct 25, 2023

0.10.2

Oct 24, 2023

0.10.0

Oct 23, 2023

0.9.1

Oct 18, 2023

0.8.7

Sep 28, 2023

0.8.6

Sep 28, 2023

0.8.4

Sep 27, 2023

0.7.0

Sep 1, 2023

0.6.0

Jul 28, 2023

0.5.1

Jul 26, 2023

0.5.0

Jul 6, 2023

0.4.5

Jun 9, 2023

0.4.4 yanked

Jun 9, 2023

Reason this release was yanked:

Not properly tagged in git

0.4.3 yanked

Jun 9, 2023

Reason this release was yanked:

Not properly tagged in git

0.4.2 yanked

Jun 9, 2023

Reason this release was yanked:

Not properly tagged in git

0.4.1 yanked

Jun 9, 2023

Reason this release was yanked:

Not properly tagged in git

0.4.0

Jun 8, 2023

0.3.7

May 31, 2023

0.3.6

May 31, 2023

0.3.5

May 31, 2023

0.3.3

May 12, 2023

This version

0.3.2

May 12, 2023

0.3.1

May 12, 2023

0.3.0

May 11, 2023

0.2.2

May 7, 2023

0.2.1

May 5, 2023

0.2.0

May 5, 2023

0.1.3

May 4, 2023

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

lincs-0.3.2.tar.gz (51.2 kB view hashes)

Uploaded May 12, 2023 Source

Hashes for lincs-0.3.2.tar.gz

Hashes for lincs-0.3.2.tar.gz
Algorithm	Hash digest
SHA256	`4af72ac1e81b8da28de6abcbdf9ba91b4e8d49d6f9d7906aaa9eb1f593ac3a4b`
MD5	`174ffe5cb66601614b46f12e3babbb47`
BLAKE2b-256	`5aa1f999b353ce7e637aaa07d46db900d61994d980126f45b1e7270b7abbd513`