audioFlux
A library for audio and music analysis and feature extraction.
It can be used for deep learning, pattern recognition, signal processing, bioinformatics, statistics, finance, etc.
Overview
Description
In the audio domain, feature extraction is particularly important for audio classification, speech enhancement, audio/music separation, music information retrieval (MIR), ASR and other audio tasks.
In these tasks, mel spectrogram and MFCC features are commonly used, both in traditional machine learning based on statistics and in deep learning based on neural networks.
audioFlux provides systematic, comprehensive and multi-dimensional feature extraction and combination, and can be combined with various deep learning network models for research and development in different fields.
Functionality
audioFlux is based on a data-flow design. It decouples each algorithm module structurally, which makes extracting features from large batches convenient, fast and efficient. For the main feature architecture diagrams and specific, detailed descriptions, see the documentation.
The main functions of audioFlux include the transform, feature and mir modules.
1. transform
The main transform algorithms for time-frequency representation are:
- BFT - Based Fourier Transform, similar to the short-time Fourier transform.
- NSGT - Non-Stationary Gabor Transform.
- CWT - Continuous Wavelet Transform.
- PWT - Pseudo Wavelet Transform.
The above transforms all support the following frequency scale types:
- Linear - Short-time Fourier transform spectrogram.
- Linspace - Linspace-scale spectrogram.
- Mel - Mel-scale spectrogram.
- Bark - Bark-scale spectrogram.
- Erb - Erb-scale spectrogram.
- Octave - Octave-scale spectrogram.
- Log - Logarithmic-scale spectrogram.
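To make the idea of a frequency scale concrete, here is a minimal sketch of the mel scale using only the Python standard library. It assumes the common HTK-style formula m = 2595 * log10(1 + f / 700); the source does not say which mel formula audioFlux uses, and the function names below are illustrative only, not part of the audioFlux API.

```python
import math


def hz_to_mel(freq_hz):
    """Convert Hz to mel with the HTK-style formula (an assumption, not audioFlux's API)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(low_hz, high_hz, num_bands):
    """Return num_bands center frequencies (Hz) equally spaced on the mel axis."""
    low_mel = hz_to_mel(low_hz)
    high_mel = hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (num_bands + 1)
    return [mel_to_hz(low_mel + step * (i + 1)) for i in range(num_bands)]


# Eight band centers between 0 Hz and 8000 Hz.
centers = mel_band_centers(0.0, 8000.0, 8)
print([round(c, 1) for c in centers])
```

The centers are evenly spaced in mel but increasingly far apart in Hz, which is why a mel-scale spectrogram gives more resolution to low frequencies than a linear one.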
The following transforms do not support multiple frequency scale types and are used only as independent transforms:
- CQT - Constant-Q Transform.
- VQT - Variable-Q Transform.
- ST - S-Transform/Stockwell Transform.
- FST - Fast S-Transform.
- DWT - Discrete Wavelet Transform.
- WPT - Wavelet Packet Transform.
- SWT - Stationary Wavelet Transform.
For detailed transform functions, descriptions and usage, see the documentation.
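As a rough illustration of what a time-frequency transform computes, the sketch below implements a naive short-time Fourier transform with the Python standard library. It is not the audioFlux BFT implementation or API; the function names, the Hann window and the O(n^2) DFT are assumptions chosen for readability rather than speed.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def dft_magnitude(frame):
    """Magnitudes of the first n // 2 + 1 DFT bins (naive O(n^2) DFT)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags


def stft_magnitude(signal, frame_len, hop):
    """Slide a Hann window over the signal and take the DFT magnitude of each frame."""
    window = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        frames.append(dft_magnitude(frame))
    return frames


# Test tone: a 1000 Hz sine sampled at 8000 Hz. With 64-point frames the bin
# spacing is 8000 / 64 = 125 Hz, so the tone sits exactly on bin 8.
sample_rate = 8000
tone = [math.sin(2.0 * math.pi * 1000.0 * t / sample_rate) for t in range(256)]
spec = stft_magnitude(tone, frame_len=64, hop=32)
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in spec]
print(len(spec), peak_bins)
```

Each row of `spec` is one time frame and each column one linear frequency bin; the other scale types listed above regroup or re-weight these bins.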
Synchrosqueezing or reassignment is a technique for sharpening a time-frequency representation. It includes the following algorithms:
- reassign - reassign transform for STFT.
- synsq - reassign data using STFT data.
- wsst - reassign transform for CWT.
2. feature
The feature module contains the following algorithms:
- spectral - Spectrum feature, supports all spectrum types.
- xxcc - Cepstrum coefficients, supports all spectrum types.
- deconv - Deconvolution for spectrum, supports all spectrum types.
- chroma - Chroma feature, only supports the CQT spectrum and the Linear/Octave spectrum based on BFT.
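The source does not enumerate the individual spectral features, so as one illustrative example of a spectrum feature, here is a minimal sketch of the spectral centroid and spectral spread of a single magnitude frame. The function names are hypothetical and do not come from the audioFlux API.

```python
import math


def spectral_centroid(magnitudes, freqs):
    """Magnitude-weighted mean frequency of one spectrum frame."""
    total = sum(magnitudes)
    if total == 0.0:
        return 0.0  # silent frame: centroid is undefined, 0 is returned by convention here
    return sum(f * m for f, m in zip(freqs, magnitudes)) / total


def spectral_spread(magnitudes, freqs):
    """Magnitude-weighted standard deviation of frequency around the centroid."""
    total = sum(magnitudes)
    if total == 0.0:
        return 0.0
    c = spectral_centroid(magnitudes, freqs)
    var = sum(m * (f - c) ** 2 for f, m in zip(freqs, magnitudes)) / total
    return math.sqrt(var)


# A small symmetric frame whose energy is centered on 200 Hz.
freqs = [0.0, 100.0, 200.0, 300.0, 400.0]
frame = [0.0, 1.0, 2.0, 1.0, 0.0]
print(spectral_centroid(frame, freqs), spectral_spread(frame, freqs))
```

Because both functions only need a list of magnitudes and the matching bin frequencies, the same computation applies to any spectrum type, whatever frequency scale produced it.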
3. mir
The mir module contains the following algorithms:
- pitch - YIN, STFT, etc. algorithms.
- onset - Spectrum flux, novelty, etc. algorithms.
- hpss - Median filtering, NMF algorithms.
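To show how spectrum flux can drive onset detection, the following sketch computes a half-wave rectified spectral flux curve from a toy magnitude spectrogram and picks peaks above a fixed threshold. It is a simplified, assumed formulation (fixed threshold, no smoothing or normalization) with hypothetical function names, not the audioFlux onset implementation.

```python
def spectral_flux(frames):
    """Half-wave rectified frame-to-frame increase in magnitude, one value per frame."""
    flux = [0.0]  # the first frame has no predecessor
    for prev, cur in zip(frames, frames[1:]):
        flux.append(sum(max(c - p, 0.0) for p, c in zip(prev, cur)))
    return flux


def pick_onsets(flux, threshold):
    """Indices of local maxima in the flux curve that exceed a fixed threshold."""
    onsets = []
    for i in range(1, len(flux) - 1):
        if flux[i] > threshold and flux[i] > flux[i - 1] and flux[i] >= flux[i + 1]:
            onsets.append(i)
    return onsets


# Toy magnitude spectrogram: silence, a note starting at frame 2, another at frame 5.
frames = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [1.0, 0.5, 0.0],
    [0.9, 0.4, 0.0],
    [0.8, 0.3, 0.0],
    [0.8, 0.3, 1.2],
    [0.7, 0.2, 1.0],
]
flux = spectral_flux(frames)
print(pick_onsets(flux, threshold=0.5))
```

Only increases in energy contribute to the flux, so the slow decay of a note does not trigger a detection while the start of a new note does.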
Installation
The library is cross-platform and currently supports Linux, macOS, Windows, iOS and Android systems.
Python Package Install
Using PyPI:
$ pip install audioflux
Using Anaconda:
$ conda install -c conda-forge audioflux
Building from source:
$ python setup.py build
$ python setup.py install
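After any of the installation routes above, a quick way to check that the package is visible to the current interpreter is sketched below. It uses only the standard library and assumes that the import name is audioflux, the same as the pip package name; it does not call any audioFlux function.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# "audioflux" is assumed to be the import name (same as the pip package name).
if is_installed("audioflux"):
    print("audioflux is importable")
else:
    print("audioflux not found; try: pip install audioflux")
```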
iOS build
To compile for iOS on a Mac, Xcode Command Line Tools must be installed on the system:
- Install the full Xcode package
- Install Xcode Command Line Tools when prompted by a command, or run the xcode-select command:
$ xcode-select --install
Change to the scripts directory of the audioFlux project and run the following script to build and compile:
$ ./build_iOS.sh
After a successful build, the compilation results are in the build folder.
Android build
The Android NDK (version >= 16) must be installed in the current development environment. After installation, set the environment variable for the NDK path.
For example, if the NDK installation path is ~/Android/android-ndk-r16b:
$ export NDK_ROOT=~/Android/android-ndk-r16b
$ export PATH=$NDK_ROOT:$PATH
The Android audioFlux build uses the fftw library to accelerate performance. Compile the single-precision floating-point version of fftw for the Android platform and, once it has compiled successfully, copy it to the scripts/android/fftw3 directory of the audioFlux project.
Change to the scripts directory of the audioFlux project and run the following script to build and compile:
$ ./build_android.sh
After a successful build, the compilation results are in the build folder.
Documentation
Documentation of the package can be found online:
Contributing
We are more than happy to collaborate and receive your contributions to audioFlux. If you want to contribute, the best way is to submit your code: create a pull request.
You are also more than welcome to suggest improvements. If you need help, have found a bug, have a feature request, want to ask a general question or want to propose new algorithms, open an issue.
License
The audioFlux project is available under the MIT License.