A simple Django app to download and parse California campaign finance data from CAL-ACCESS.

# django-calaccess-parser

A simple Django app to download, extract and load the [CAL-ACCESS](http://www.sos.ca.gov/prd/cal-access/) campaign finance and lobbying activity database.

[![Build Status](https://travis-ci.org/california-civic-data-coalition/django-calaccess-parser.png?branch=master)](https://travis-ci.org/california-civic-data-coalition/django-calaccess-parser)
[![PyPI version](https://badge.fury.io/py/django-calaccess-parser.png)](http://badge.fury.io/py/django-calaccess-parser)
[![Coverage Status](https://coveralls.io/repos/california-civic-data-coalition/django-calaccess-parser/badge.png?branch=master)](https://coveralls.io/r/california-civic-data-coalition/django-calaccess-parser?branch=master)

* Documentation: [http://django-calaccess-parser.rtfd.org](http://django-calaccess-parser.rtfd.org)
* Issues: [https://github.com/california-civic-data-coalition/django-calaccess-parser/issues](https://github.com/california-civic-data-coalition/django-calaccess-parser/issues)
* Packaging: [https://pypi.python.org/pypi/django-calaccess-parser](https://pypi.python.org/pypi/django-calaccess-parser)
* Testing: [https://travis-ci.org/california-civic-data-coalition/django-calaccess-parser](https://travis-ci.org/california-civic-data-coalition/django-calaccess-parser)
* Coverage: [https://coveralls.io/r/california-civic-data-coalition/django-calaccess-parser](https://coveralls.io/r/california-civic-data-coalition/django-calaccess-parser)

## Requirements

- Django 1.6
- MySQL 5.5
- Patience

## Installation

- Install django-calaccess-parser with pip

```bash
$ pip install https://github.com/california-civic-data-coalition/django-calaccess-parser/archive/master.zip
```

- Configure the `DATABASES` dictionary in `settings.py`
```python
DEBUG = False
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'local_calaccess_db',
        'USER': 'calaccessuser',
        'PASSWORD': 'password',
        'HOST': 'localhost',
        'PORT': '3306',
        'OPTIONS': {
            # Lets the MySQL client run LOAD DATA LOCAL INFILE
            'local_infile': 1,
        }
    }
}
```

- Add `calaccess` to your INSTALLED_APPS setting like this:
```python
INSTALLED_APPS = (
    ...
    'calaccess',
)
```

## Loading the data

- Set `CALACCESS_DOWNLOAD_DIR` in `settings.py` to the path where you want the data stored
- Next, sync the database, create a Django admin user, and run the management command to load the CAL-ACCESS data
```bash
$ python manage.py syncdb
$ python manage.py downloadcalaccess
```
This'll take a while. Go grab some coffee or do something else productive with your life.
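
The download directory from the first step might be configured in `settings.py` roughly as follows. This is a minimal sketch: the `calaccess_data` folder name and its placement under the home directory are assumptions for illustration, and presumably any writable path would work.

```python
import os

# Hypothetical location for the downloaded archive and extracted files;
# the folder name and home-directory placement are illustrative assumptions.
CALACCESS_DOWNLOAD_DIR = os.path.join(os.path.expanduser('~'), 'calaccess_data')
```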

## Explore data

Start the development server and visit [http://127.0.0.1:8000/admin/](http://127.0.0.1:8000/admin/)
to inspect the CAL-ACCESS data (you'll need the Admin app enabled).

## Available flags for `downloadcalaccess`
```
Usage: manage.py downloadcalaccess [options]

Download the latest snapshot of the CalAccess database

Options:
  -v VERBOSITY, --verbosity=VERBOSITY
                        Verbosity level; 0=minimal output, 1=normal output,
                        2=verbose output, 3=very verbose output
  --settings=SETTINGS   The Python path to a settings module, e.g.
                        "myproject.settings.main". If this isn't provided, the
                        DJANGO_SETTINGS_MODULE environment variable will be
                        used.
  --pythonpath=PYTHONPATH
                        A directory to add to the Python path, e.g.
                        "/home/djangoprojects/myproject".
  --traceback           Raise on exception
  --skip-download       Skip downloading of the ZIP archive
  --skip-unzip          Skip unzipping of the archive
  --skip-prep           Skip prepping of the unzipped archive
  --skip-clear          Skip clearing out ZIP archive and extra files
  --skip-clean          Skip cleaning up the raw data files
  --skip-load           Skip loading up the raw data files
  --noinput             Download the ZIP archive without asking permission
  --version             show program's version number and exit
  -h, --help            show this help message and exit
```
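
The `--skip-*` flags suggest that part of the pipeline can be rerun on its own, for example reloading files that were already downloaded and unzipped. The helper below is a hypothetical sketch and not part of django-calaccess-parser: `build_command` and `SKIPPABLE_STEPS` are made-up names, and the function only assembles an argument list from the flags shown in the usage text above.

```python
import sys

# Steps that have a matching --skip-<step> flag in the usage text above.
SKIPPABLE_STEPS = ("download", "unzip", "prep", "clear", "clean", "load")


def build_command(skip=(), noinput=False, manage_py="manage.py"):
    """Return an argv list for `manage.py downloadcalaccess` (hypothetical helper)."""
    unknown = [step for step in skip if step not in SKIPPABLE_STEPS]
    if unknown:
        raise ValueError("No --skip flag for: %s" % ", ".join(unknown))
    argv = [sys.executable, manage_py, "downloadcalaccess"]
    # Emit flags in the documented order, regardless of the order passed in.
    argv.extend("--skip-%s" % step for step in SKIPPABLE_STEPS if step in skip)
    if noinput:
        argv.append("--noinput")
    return argv
```

From the project directory, the result could be handed to `subprocess.check_call`, for instance `subprocess.check_call(build_command(skip=["download", "unzip"], noinput=True))`, to skip the download and unzip steps without being prompted.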
