
mbdata 2017.6.2

MusicBrainz Database Tools

SQLAlchemy Models

If you are developing a Python application that needs access to the MusicBrainz data, you can use the mbdata.models module to get SQLAlchemy models mapped to the MusicBrainz database tables.

All tables from the MusicBrainz database are mapped. Every foreign key has a one-way relationship set up, and some models, where access to their related models is essential, also have two-way relationships (collections).

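By default, SQLAlchemy loads a relationship lazily and issues a separate query the first time it is accessed. To work with the relationships efficiently, you should use the appropriate kind of eager loading.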

Example usage of the models:

>>> from sqlalchemy import create_engine
>>> from sqlalchemy.orm import sessionmaker
>>> from mbdata.models import Artist
>>> engine = create_engine('postgresql://musicbrainz:musicbrainz@127.0.0.1/musicbrainz', echo=True)
>>> Session = sessionmaker(bind=engine)
>>> session = Session()
>>> artist = session.query(Artist).filter_by(gid='8970d868-0723-483b-a75b-51088913d3d4').first()
>>> print(artist.name)
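
To illustrate the eager loading mentioned earlier, the sketch below uses SQLAlchemy's joinedload option to fetch an artist together with its area in one query, rather than triggering a second query when the relationship is first accessed. It assumes that Artist exposes an area relationship. The helper name get_artist_with_area is hypothetical and is not part of mbdata.

```python
def get_artist_with_area(session, gid):
    """Return the artist with the given MBID, with its area loaded eagerly.

    Assumption: the Artist model has an `area` relationship. Adjust the
    attribute name if your version of mbdata differs.
    """
    # Imported inside the function so that mbdata.config.configure(), if you
    # use it, can run before the models are imported for the first time.
    from sqlalchemy.orm import joinedload
    from mbdata.models import Artist

    return (
        session.query(Artist)
        .options(joinedload(Artist.area))  # JOIN the area in the same SELECT
        .filter_by(gid=gid)
        .first()
    )
```

With a session created as in the example above, calling get_artist_with_area(session, '8970d868-0723-483b-a75b-51088913d3d4') and then reading artist.area should not emit a further query. With echo=True on the engine you can check this in the SQL log.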

If you use the models in your own application and want to define foreign keys from your own models to the MusicBrainz schema, you will need to let mbdata know which metadata object to add the MusicBrainz tables to:

from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()

# this should be the first place where you import anything from mbdata
import mbdata.config
mbdata.config.configure(base_class=Base)

# now you can import and use the mbdata models
import mbdata.models

You can also use mbdata.config to re-map the MusicBrainz schema names, if your database doesn’t follow the original structure:

import mbdata.config
mbdata.config.configure(schema='my_own_mb_schema')

If you need sample MusicBrainz data for your tests, you can use mbdata.sample_data:

from mbdata.sample_data import create_sample_data
create_sample_data(session)

HTTP API

Note: This is very much a work in progress. It is not ready to use yet. Any help is welcome.

There is also an HTTP API, which you can use to access the MusicBrainz data in JSON or XML format over HTTP. This is useful if you want to abstract away the MusicBrainz PostgreSQL database.

Installation:

virtualenv --system-site-packages e
. e/bin/activate
pip install -r requirements.txt
python setup.py develop

Configuration:

cp settings.py.sample settings.py
vim settings.py

Start the development server:

MBDATA_API_SETTINGS=`pwd`/settings.py python -m mbdata.api.app

Query the API:

curl 'http://127.0.0.1:5000/v1/artist/get?id=b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d'
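
The same request can be made from Python. The following is a minimal sketch using only the Python 3 standard library. The host, port and endpoint path are taken from the curl example above, while the helper names artist_url and fetch_artist are illustrative and not part of mbdata. The response body is returned undecoded, because the format (JSON or XML) depends on how the server is set up.

```python
import urllib.parse
import urllib.request

# Address of the development server, as used in the curl example.
API_ROOT = 'http://127.0.0.1:5000'


def artist_url(mbid, root=API_ROOT):
    """Build the URL of the /v1/artist/get endpoint for the given MBID."""
    query = urllib.parse.urlencode({'id': mbid})
    return '{}/v1/artist/get?{}'.format(root, query)


def fetch_artist(mbid, root=API_ROOT, timeout=10):
    """Fetch the raw response body (bytes) for one artist.

    Decoding is left to the caller, since this sketch does not assume
    a particular response format.
    """
    with urllib.request.urlopen(artist_url(mbid, root), timeout=timeout) as response:
        return response.read()
```

With the development server running, fetch_artist('b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d') should return the same body as the curl command.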

For production use, you should use server software like uWSGI and nginx to run the service.

Solr Index

Create a minimal Solr configuration:

./bin/create_solr_home.py -d /tmp/mbdata_solr

Start Solr:

cd /path/to/solr-4.6.1/example
java -Dsolr.solr.home=/tmp/mbdata_solr -jar start.jar

Development

Normally you should work against a regular PostgreSQL database with MusicBrainz data, but for testing purposes you can use a SQLite database containing the small data subset used in the unit tests. You can create the database using:

./bin/create_sample_db.py sample.db
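
As a quick sanity check after running the script, you can list the tables in the resulting file with the standard library sqlite3 module. This is only a sketch: the helper name list_tables is illustrative, and the file name sample.db follows the command above.

```python
import sqlite3


def list_tables(path):
    """Return the sorted names of all tables in the SQLite file at `path`."""
    conn = sqlite3.connect(path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

For example, print(list_tables('sample.db')) shows which tables the script created. Note that sqlite3.connect creates an empty file if the path does not exist, so an empty list usually means the path is wrong.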

Then you can change your configuration:

DATABASE_URI = 'sqlite:///sample.db'

Running tests:

nosetests -v

If you want to see the SQL queries from a failed test, you can use the following:

MBDATA_DATABASE_ECHO=1 nosetests -v

A Jenkins task automatically runs the tests after each commit.

 
File                          Type    Py Version  Uploaded on  Size
mbdata-2017.6.2.tar.gz (md5)  Source              2017-06-01   100KB