Pattern clustering
This tool clusters lines of text according to a collection of input patterns, each modeled as a regular expression.
This work has been published in:
[ICPR’2022] A novel pattern-based edit distance for automatic log parsing, Maxime Raynal, Marc-Olivier Buob, Georges Quénot.
Features
Forms groups of homogeneous lines using a pattern-based distance built on customizable patterns.
Configured by default to use common patterns (IP addresses, numeric values, etc.).
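To give a rough intuition for what "pattern-based" grouping means, here is a minimal standard-library sketch. It does not use this package's API and it is not the pattern-based edit distance from the paper: it only groups lines whose tokens match the same sequence of patterns exactly. The pattern names, the regular expressions, and the `signature` and `cluster` helpers are all hypothetical and chosen for illustration.

```python
import re
from collections import defaultdict

# Hypothetical pattern collection: names and regexes are illustrative only,
# not the defaults shipped with the package. Order matters: first match wins.
PATTERNS = [
    ("ipv4", re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")),
    ("int", re.compile(r"\d+")),
    ("word", re.compile(r"[A-Za-z]+")),
]


def signature(line):
    """Map each whitespace-separated token to the first pattern matching it entirely."""
    names = []
    for token in line.split():
        for name, regex in PATTERNS:
            if regex.fullmatch(token):
                names.append(name)
                break
        else:
            # No pattern matched the whole token.
            names.append("any")
    return tuple(names)


def cluster(lines):
    """Group lines sharing the same pattern signature (a simplification:
    exact signature equality instead of a distance threshold)."""
    groups = defaultdict(list)
    for line in lines:
        groups[signature(line)].append(line)
    return dict(groups)


lines = [
    "connect 192.168.0.1 port 22",
    "connect 10.0.0.7 port 8080",
    "user alice logged in",
]
clusters = cluster(lines)
for sig, members in clusters.items():
    print(sig, members)
```

In this sketch the two `connect` lines land in the same group because both reduce to the signature `word, ipv4, word, int`, even though their literal text differs. A distance-based approach, as used by the tool, can additionally merge lines whose signatures are close but not identical.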
License
This project is licensed under the BSD-3-Clause license - see the LICENSE file.
More about pattern-clustering
For more information, feel free to visit the wiki:
Acks
The skeleton package was created with Cookiecutter and the francois-durand/package_helper_2 project template.
The Sphinx part is inspired by Sphinx-Autosummary-Recursion.
History
0.1.0 (2022-05-11): First release
First release on PyPI.
0.2.0 (2022-06-02): CI
Updated tox.ini and GitHub Actions (work in progress).
0.3.0 (2022-06-22): Bug fixes and CI improvements
Fixed Sphinx local build
Fixed bumpversion
Added experiment notebooks and datasets
Improved test suite
0.3.1 (2022-06-22): Bug fixes and CI improvements
Fixed Read the Docs build
0.4.1 (2022-06-24): Bug fixes and CI improvements
Fixed Read the Docs build
Implemented console script (CLI)
Reworked PatternClusteringEnv class
Bug fixes
Updated documentation