Utility bill scraper for extracting data from PDFs and websites.

Utility bill scraper

Download energy usage and carbon emissions data from utility websites or PDF bills.

Supported utilities

The simplest way to get started without installing anything on your computer is to click on one of the following links, which will open a session on https://mybinder.org where you can try downloading some data.

Install

pip install utility-bill-scraper

Get updates

import utility_bill_scraper.canada.on.kitchener_utilities as ku

ku_api = ku.KitchenerUtilitiesAPI(username='username', password='password')

# Get new statements.
updates = ku_api.update()
if updates is not None:
    print(f"{len(updates)} statements downloaded")
ku_api.history().tail()

[Figure: history tail]

Plot monthly gas consumption

import matplotlib.pyplot as plt

df_ku = ku_api.history()

plt.figure()
plt.bar(df_ku.index, df_ku["Gas Consumption"], width=0.9, alpha=0.5)
plt.xticks(rotation=90)
plt.title("Monthly Gas Consumption")
plt.ylabel("m$^3$")

[Figure: monthly gas consumption]

Convert gas consumption to CO2 emissions

from utility_bill_scraper import GAS_KGCO2_PER_CUBIC_METER

df_ku["kgCO2"] = df_ku["Gas Consumption"] * GAS_KGCO2_PER_CUBIC_METER
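
The conversion above is a single multiplication per billing period. As a minimal sketch in plain Python, assuming an illustrative emission factor and made-up consumption figures (neither is taken from the package; in practice use the imported GAS_KGCO2_PER_CUBIC_METER), the arithmetic looks roughly like this:

```python
# Hypothetical emission factor for illustration only; the real value is
# utility_bill_scraper.GAS_KGCO2_PER_CUBIC_METER and may differ.
EXAMPLE_KGCO2_PER_CUBIC_METER = 1.9


def gas_to_kgco2(cubic_meters, factor=EXAMPLE_KGCO2_PER_CUBIC_METER):
    """Convert a gas volume in cubic meters to kilograms of CO2."""
    return cubic_meters * factor


# Made-up monthly consumption values (m^3), keyed by billing month.
consumption = {"2021-01": 300.0, "2021-02": 250.0}

# Per-month emissions in kg, then the total in tonnes.
emissions = {month: gas_to_kgco2(m3) for month, m3 in consumption.items()}
total_tonnes = sum(emissions.values()) / 1e3  # kg -> tonnes
```

The helper name gas_to_kgco2 is hypothetical; it only mirrors the DataFrame column multiplication shown above.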

Plot CO2 emissions versus previous years

import datetime as dt

df_ku["kgCO2"] = df_ku["Gas Consumption"] * GAS_KGCO2_PER_CUBIC_METER
df_ku["year"] = [int(x[0:4]) for x in df_ku.index]
df_ku["month"] = [int(x[5:7]) for x in df_ku.index]

n_years_history = 1

plt.figure()
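# datetime.utcnow() is deprecated since Python 3.12; use a timezone-aware call.
current_year = dt.datetime.now(dt.timezone.utc).year
for year, df_year in df_ku.groupby("year"):
    if year >= current_year - n_years_history:
        # Sort into a new frame; sorting a groupby slice in place can warn.
        df_year = df_year.sort_values("month")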
        plt.bar(
            df_year["month"],
            df_year["Gas Consumption"],
            label=year,
            width=0.9,
            alpha=0.5,
        )
plt.legend()
plt.ylabel("m$^3$")
plt.xlabel("Month")
ylim = plt.ylim()
ax = plt.gca()
ax2 = ax.twinx()
plt.ylabel("tCO$_2$e")
plt.ylim([GAS_KGCO2_PER_CUBIC_METER * y / 1e3 for y in ylim])
plt.title("Monthly CO$_2$e emissions from natural gas")

[Figure: monthly CO2 emissions]

Command line utilities

Update and export your utility data from the command line.

Update data

> python -m utility_bill_scraper.bin.ubs --utility-name "Kitchener Utilities" update --user $USER --password $PASSWORD

Export data

> python -m utility_bill_scraper.bin.ubs --utility-name "Kitchener Utilities" export --output data.csv
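
Once exported, the CSV can be processed with the standard library alone. The sketch below assumes the export contains a "Gas Consumption" column like the history DataFrame shown earlier; the actual column layout of data.csv is not documented here, and the sample rows are made up for illustration. The function name total_gas_consumption is hypothetical.

```python
import csv
import io


def total_gas_consumption(csv_file, column="Gas Consumption"):
    """Sum a numeric column from an exported CSV, skipping blank cells."""
    reader = csv.DictReader(csv_file)
    total = 0.0
    for row in reader:
        # Missing or empty cells are treated as "no reading" and skipped.
        value = (row.get(column) or "").strip()
        if value:
            total += float(value)
    return total


# Made-up sample standing in for data.csv; the real export's columns may differ.
sample = io.StringIO(
    "Date,Gas Consumption\n"
    "2021-01-15,300.0\n"
    "2021-02-15,\n"
    "2021-03-15,250.5\n"
)
total = total_gas_consumption(sample)
```

With a real export, pass an open file instead of the in-memory sample, for example by opening data.csv with open("data.csv", newline="").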

Options

> python -m utility_bill_scraper.bin.ubs --help
usage: ubs.py [-h] [-e ENV] [--data-path DATA_PATH] [--utility-name UTILITY_NAME]
              [--google-sa-credentials GOOGLE_SA_CREDENTIALS]
              {update,export} ...

ubs (Utility bill scraper)

optional arguments:
  -h, --help            show this help message and exit
  -e ENV, --env ENV     path to .env file.
  --data-path DATA_PATH
                        folder containing the history file
  --utility-name UTILITY_NAME
                        name of the utility
  --google-sa-credentials GOOGLE_SA_CREDENTIALS
                        google service account credentials

subcommands:
  {update,export}       available sub-commands

Environment variables

Note that many options can be set via environment variables (useful for continuous integration and/or working with containers). The following can be set in your shell or via a .env file passed using the -e option.

DATA_PATH
UTILITY_NAME
GOOGLE_SA_CREDENTIALS
USER
PASSWORD
SAVE_STATEMENTS
MAX_DOWNLOADS
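
Before running the tool in a CI job or container, it can help to check which of these variables are actually set. The following is a minimal sketch, not part of the package: load_settings is a hypothetical helper, and it returns the raw string values because the expected types of options such as SAVE_STATEMENTS and MAX_DOWNLOADS are not stated here.

```python
import os

# Variable names as listed above.
SETTING_NAMES = [
    "DATA_PATH",
    "UTILITY_NAME",
    "GOOGLE_SA_CREDENTIALS",
    "USER",
    "PASSWORD",
    "SAVE_STATEMENTS",
    "MAX_DOWNLOADS",
]


def load_settings(env=None):
    """Return the documented variables that are set in env (default: os.environ)."""
    env = os.environ if env is None else env
    # Keep only the variables that are present; values stay as raw strings.
    return {name: env[name] for name in SETTING_NAMES if name in env}


# Example with an explicit mapping instead of the real environment.
settings = load_settings({"UTILITY_NAME": "Kitchener Utilities", "UNRELATED": "x"})
```

Passing a plain dictionary, as in the last line, keeps the check easy to test without touching the real environment.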
