Web Scraper for Eurostat data.
Eurostat
The program eurostat.py is a simple interface for parsing Eurostat data.
Executing the module
Parsing data from Eurostat to a file is as easy as
python3 eurostat.py --output data.csv --start 2019-01-01 --verbose
The command downloads the source file from Eurostat, parses it according to the given options, and writes the result in the following format:
sex,age,geo\time,2020W23,2020W22,2020W21, ... ,2019W03,2019W02,2019W01
F,TOTAL,AT,,,, ... ,852,877,914
F,TOTAL,AT1,,, ... ,364,361,387
...
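The output is in wide format, with one column per week. A minimal sketch of reshaping it into one record per week is shown below. It assumes that labels such as 2019W03 denote ISO 8601 weeks; the helper names week_to_date and wide_to_long are hypothetical and not part of eurostat.py, and the sample row is adapted from the output above.

```python
import csv
import io
from datetime import date


def week_to_date(label):
    """Convert a label such as '2019W03' to the Monday of that week.

    Assumption: the labels follow ISO 8601 week numbering.
    """
    year, week = label.split("W")
    return date.fromisocalendar(int(year), int(week), 1)


def wide_to_long(text):
    """Yield (sex, age, geo, week_start, value) for every non-empty cell."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    # The first three columns are sex, age and geo; the rest are weeks.
    weeks = [week_to_date(label) for label in header[3:]]
    for row in reader:
        sex, age, geo = row[:3]
        for week_start, cell in zip(weeks, row[3:]):
            if cell.strip():  # skip weeks without a reported value
                yield sex, age, geo, week_start, int(cell)


# Raw string, so that the backslash in "geo\time" is kept literally.
sample = r"""sex,age,geo\time,2020W23,2019W03,2019W02,2019W01
F,TOTAL,AT,,852,877,914
"""

records = list(wide_to_long(sample))
for record in records:
    print(record)
```

Empty cells, such as the most recent weeks in the sample, are skipped rather than converted to zero, so missing values are not confused with real counts.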
All parameters of the command can be shown with
python3 eurostat.py --help
usage: eurostat.py [-h] [-o OUTPUT] [-n CHUNKSIZE] [-s START] [-v]
optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        Directs the output to a name of your choice.
  -n CHUNKSIZE, --chunksize CHUNKSIZE
                        Number of lines in chunk (in thousands).
  -s START, --start START
                        Start date.
  -v, --verbose         Sets verbose log (logging level INFO).
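For illustration, a command-line interface with these options could be declared with argparse roughly as follows. This is a sketch reconstructed from the help text, not the actual source of eurostat.py: the date format YYYY-MM-DD is inferred from the example call, and the missing defaults and the helper names parse_date and build_parser are assumptions.

```python
import argparse
from datetime import datetime


def parse_date(text):
    """Parse a date given as YYYY-MM-DD (format assumed from the example call)."""
    return datetime.strptime(text, "%Y-%m-%d")


def build_parser():
    """Build a parser that mirrors the options listed in the help output."""
    parser = argparse.ArgumentParser(prog="eurostat.py")
    parser.add_argument("-o", "--output",
                        help="Directs the output to a name of your choice.")
    parser.add_argument("-n", "--chunksize", type=int,
                        help="Number of lines in chunk (in thousands).")
    parser.add_argument("-s", "--start", type=parse_date,
                        help="Start date.")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Sets verbose log (logging level INFO).")
    return parser


# Same arguments as in the example call above.
args = build_parser().parse_args(
    ["--output", "data.csv", "--start", "2019-01-01", "--verbose"]
)
print(args)
```

Options that are not given on the command line are left as None here; the real program may well define its own defaults.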
Importing
The module can also be imported. The following code uses the internal function read_eurostat() to load the data.
The resulting data frame is about 218 MB, so the call takes more than 15 minutes and uses a very large amount of memory.
The module should not be used this way. An implementation based on a Big Data framework, e.g. PySpark, is recommended instead.
from datetime import datetime
import eurostat
data = eurostat.read_eurostat(output=None, start=datetime(2019, 1, 1))
Passing output=None causes the output to be collected into a single data frame and returned.
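If a Big Data framework is not available, one way to avoid holding the whole data set in memory is to read the written CSV file in chunks and aggregate as you go. The sketch below shows the idea with the standard library only; the functions iter_chunks and total_by_geo are hypothetical and not part of the module, and the sample rows are adapted from the output format shown earlier.

```python
import csv
import io
from itertools import islice


def iter_chunks(lines, chunksize):
    """Yield (header, rows) pairs with at most `chunksize` rows each."""
    reader = csv.reader(lines)
    header = next(reader)
    while True:
        chunk = list(islice(reader, chunksize))
        if not chunk:
            return
        yield header, chunk


def total_by_geo(lines, chunksize=1000):
    """Sum all reported weekly values per region, one chunk at a time."""
    totals = {}
    for header, chunk in iter_chunks(lines, chunksize):
        geo_col = header.index("geo\\time")
        for row in chunk:
            # Empty cells mean no value was reported for that week.
            values = [int(cell) for cell in row[geo_col + 1:] if cell.strip()]
            geo = row[geo_col]
            totals[geo] = totals.get(geo, 0) + sum(values)
    return totals


# Raw string, so that the backslash in "geo\time" is kept literally.
sample = r"""sex,age,geo\time,2019W03,2019W02,2019W01
F,TOTAL,AT,852,877,914
F,TOTAL,AT1,364,361,387
"""

# In practice `lines` would be an open file, e.g. open("data.csv", newline="").
totals = total_by_geo(io.StringIO(sample), chunksize=1)
print(totals)
```

Only one chunk of rows and the running totals are kept in memory at any time, so the memory footprint depends on the chunk size rather than on the size of the file.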
Credits
Author: Martin Benes.