Skip to main content

database of nyc housing data

Project description

# NYCDB

a tool for building a database of NYC housing data

This is a Python library and cli tool for installing, updating and managing NYCDB, a postgres database of NYC Housing Data.

For more background information on this project and links to download copies of full database dump visit: https://github.com/nycdb/nycdb. We use the term nycdb to refer to both the python software and the running copy of the postgres database.

## Using the cli tool

You will need python 3.6+ and Postgres. The latest version can be installed from pypi with pip: ` python3 -m pip install nycdb `

If the installation is successful, you can view a summary of the tool’s options by running nycdb –help

To print a list of datasets: ` nycdb –list-datasets`

nycdb’s main job is to download datasets and import them into postgres. It does not manage the database for you. You can use the flags -U/–user, -D/–database, -P/–password, and -H/–host to instruct nycdb to connect to the correct database. See nycdb –help for the defaults.

Example: downloading, loading, and verifying the dataset hpd_violations:

` sh nycdb --download hpd_violations nycdb --load hpd_violations nycdb --verify hpd_violations `

You can also verify all datasets: ` nycdb –verify-all `

By default the downloaded data files are is stored in ./data. Use –root-dir to change the location of the data directory.

You can export a .sql file for any dataset by using the –dump command

## Development

There are two development workflows: one using python virtual environments and one using docker.

### python3 virtual environments

First check your python version to make sure you have 3.6 or higher: python3 –version

Setup the virtual environment and install the requirements: make init

Install nycdb into the virtual environment: ` make install`

Run the tests: make test

Run only the unit tests: make test-unit

### Using docker

This reuqires docker and docker-compose.

To get started run ` docker-compose up `

After Docker downloads and builds some things, it will start a Postgres server on port 7777 of your local machine, which you can connect to via a desktop client if you like. You can also press <kbd>CTRL</kbd>-<kbd>C</kbd> at any point to stop the server.

In a separate terminal, you can run:

` docker-compose run app bash `

At this point you are inside a bash shell in a container that has everything already set up for you. The initial working directory will be /nycdb, which is mapped to the root of the project’s repository. From here you can run nycdb to access the command-line tool.

To develop on nycdb itself:

  • You can run pytest to run the test suite.

  • Any changes you make to the tool’s source code will automatically be reflected in future invocations of nycdb and/or the test suite.

  • If you don’t have a desktop Postgres client, you can always run nycdb –dbshell to interactively inspect the database with [psql][].

You can leave the bash shell with exit.

If you ever want to wipe the database, run docker-compose down -v.

[install Docker]: https://www.docker.com/get-started [psql]: http://postgresguide.com/utilities/psql.html

### Adding New Datasets

See the [guide Here](ADDING_NEW_DATASETS.md) for the steps to add a new dataset

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nycdb-0.1.28.tar.gz (18.2 kB view hashes)

Uploaded Source

Built Distribution

nycdb-0.1.28-py3-none-any.whl (51.0 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page