A python-facing API for creating and interacting with ZIM files
python-libzim
The Python bindings for libzim.
This library allows you to interact with .zim files via Python.
It provides a thin Python interface on top of the libzim C++ library (maintained by OpenZIM).
It is primarily used by sotoki.
Installation
# Install from PyPI: https://pypi.org/project/libzim/
pip3 install libzim
Quickstart
Reader API
from libzim.reader import File
f = File("test.zim")
article = f.get_article("article/url.html")
print(article.url, article.title)
if not article.is_redirect():
    print(article.content)
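The quickstart above can be wrapped in a small helper. This is a minimal sketch, not part of libzim: `summarize_article` is a hypothetical name, and it relies only on the members the quickstart already uses (`url`, `title`, `is_redirect()` and `content`).

```python
def summarize_article(article):
    """Return a dict describing an article-like object.

    Hypothetical helper: it only touches the members used in the
    quickstart (url, title, is_redirect(), content).
    """
    summary = {"url": article.url, "title": article.title}
    if article.is_redirect():
        # Mirror the quickstart: content is not read for redirects.
        summary["content"] = None
    else:
        summary["content"] = article.content
    return summary
```

With libzim installed, it could be called as `summarize_article(File("test.zim").get_article("article/url.html"))`.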
Writer API
See the example for basic usage of the writer API.
User Documentation
Setup: Ubuntu/Debian and macOS x86_64 (Recommended)
Install the Python libzim package from PyPI.
pip3 install libzim
The x86_64 Linux and macOS wheels automatically include the libzim shared library (libzim.so or libzim.dylib) and headers, but other platforms may need to install libzim and its headers manually.
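To find out whether the installed package can actually load its compiled extension on your platform, a quick import check may help. The following is a minimal sketch using only the standard library; `try_import` is a hypothetical helper name, not something libzim provides.

```python
import importlib


def try_import(module_name):
    """Return (True, None) if the module imports, else (False, error text).

    A failure to load an extension module's shared library is reported
    by Python as an ImportError, which is what this helper catches.
    """
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        return False, str(exc)
    return True, None
```

Calling `try_import("libzim")` after installation should return `(True, None)`. If it returns an error message instead, the manual installation steps described in this document are the likely next step.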
Installing the libzim dylib and headers manually
If you are not on a Linux or macOS x86_64 platform, you will have to install libzim manually, either by getting a prebuilt binary from https://download.openzim.org/release/libzim or by compiling libzim from source.
If you have not installed libzim in a standard directory, you will have to set LD_LIBRARY_PATH so that Python can find the library.
Assuming you have extracted (or installed) the library in LIBZIM_DIR:
export LD_LIBRARY_PATH="${LIBZIM_DIR}/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH"
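The same setting can be prepared from Python when launching a child process, for instance from a build or test script. This is a minimal sketch: `env_with_libzim` is a hypothetical helper, and the `lib/x86_64-linux-gnu` subdirectory is taken from the export line above and may differ on your system.

```python
import os
import posixpath


def env_with_libzim(libzim_dir, base_env=None, lib_subdir="lib/x86_64-linux-gnu"):
    """Return a copy of the environment with libzim's lib dir prepended
    to LD_LIBRARY_PATH. The input environment is not modified."""
    env = dict(os.environ if base_env is None else base_env)
    lib_path = posixpath.join(libzim_dir, lib_subdir)
    existing = env.get("LD_LIBRARY_PATH", "")
    # Avoid a trailing colon when the variable was previously unset or empty.
    env["LD_LIBRARY_PATH"] = f"{lib_path}:{existing}" if existing else lib_path
    return env
```

A possible use is `subprocess.run([sys.executable, "-c", "import libzim"], env=env_with_libzim(os.environ["LIBZIM_DIR"]))`. The variable is passed to a child process here because the dynamic loader normally reads it when a process starts, so changing it inside an already running interpreter is unlikely to help.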
Setup: Docker (Optional)
docker build . --tag openzim:python-libzim
# Run a custom script inside the container
docker run -it openzim:python-libzim ./some_example_script.py
# Or use the python repl interactively
docker run -it openzim:python-libzim
>>> import libzim
Developer Documentation
These instructions are for developers working on the python-libzim
source code itself. If you are simply a user of the library and you don't intend to change its internal source code, follow the User Documentation instructions above instead.
Setup: Ubuntu/Debian
Note: Make sure you've installed the libzim dylib + headers first (see above).
apt install coreutils wget git ca-certificates \
g++ pkg-config libtool automake autoconf make meson ninja-build \
liblzma-dev zlib1g-dev libicu-dev libgumbo-dev libmagic-dev
pip3 install --upgrade pip pipenv
export CFLAGS="-I${LIBZIM_DIR}/include"
export LDFLAGS="-L${LIBZIM_DIR}/lib/x86_64-linux-gnu"
git clone https://github.com/openzim/python-libzim
cd python-libzim
python setup.py build_ext
pipenv install --dev
pipenv run pip install -e .
Setup: Docker
docker build . -f Dockerfile.dev --tag openzim:python-libzim-dev
docker run -it openzim:python-libzim-dev ./some_example_script.py
docker run -it openzim:python-libzim-dev
$ black . && flake8 . && pytest .
$ pipenv install --dev <newpackagehere>
$ python setup.py build_ext
$ python setup.py sdist bdist_wheel
$ python setup.py install
$ python -c "import libzim"
Common Tasks
Run Linters & Tests
# Autoformat code with black
black --exclude=setup.py .
# Lint and check for errors with flake8
flake8 --exclude=setup.py .
# Typecheck with mypy (optional)
mypy .
# Run tests
pytest .
Rebuild Cython extension during development
rm libzim/libzim.cpp
rm -Rf build
rm -Rf *.so
python setup.py build_ext
python setup.py install
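The cleanup part of the rebuild (the three rm commands) can also be scripted in Python, which may be convenient on systems without a POSIX shell. This is a minimal sketch under the assumption that the paths match the commands above; `clean_build_artifacts` is a hypothetical helper and, unlike a plain `rm`, it silently skips anything that is already missing.

```python
import shutil
from pathlib import Path


def clean_build_artifacts(project_root):
    """Delete the generated C++ file, the build directory and top-level
    *.so files under project_root. Returns the list of removed paths."""
    root = Path(project_root)
    removed = []

    # Cython regenerates this file from the .pyx source on the next build.
    generated_cpp = root / "libzim" / "libzim.cpp"
    if generated_cpp.is_file():
        generated_cpp.unlink()
        removed.append(generated_cpp)

    build_dir = root / "build"
    if build_dir.is_dir():
        shutil.rmtree(build_dir)
        removed.append(build_dir)

    # Only top-level matches, like the shell glob *.so.
    for so_path in sorted(root.glob("*.so")):
        if so_path.is_dir():
            shutil.rmtree(so_path)
        else:
            so_path.unlink()
        removed.append(so_path)

    return removed
```

After calling `clean_build_artifacts(".")` from the repository root, the two `python setup.py` commands above still need to be run to rebuild and reinstall the extension.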
Build package sdist and bdist_wheels for PyPI
python setup.py build_ext
python setup.py sdist bdist_wheel
# Upload to PyPI (caution: this is done automatically via GitHub Actions)
twine upload dist/*
Use a specific libzim dylib and headers when compiling python-libzim
export CFLAGS="-I${LIBZIM_DIR}/include"
export LDFLAGS="-L${LIBZIM_DIR}/lib/x86_64-linux-gnu"
export LD_LIBRARY_PATH="${LIBZIM_DIR}/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH"
python setup.py build_ext
python setup.py install
Further Reading
Related Projects
- https://github.com/openzim/sotoki
- https://framagit.org/mgautierfr/pyzim
- https://github.com/pediapress/pyzim
- https://github.com/jarondl/pyzimmer/blob/master/pyzimmer/zim_writer.py
Research
- https://github.com/cython/cython/wiki/AutoPxd
- https://www.youtube.com/watch?v=YReJ3pSnNDo
- https://github.com/openzim/zim-tools/blob/master/src/zimrecreate.cpp
- https://github.com/cython/cython/wiki/enchancements-inherit_CPP_classes
- https://groups.google.com/forum/#!topic/cython-users/vAB9hbLMxRg
Debugging
- https://cython.readthedocs.io/en/latest/src/userguide/debugging.html
- https://github.com/cython/cython/wiki/DebuggingTechniques
- https://stackoverflow.com/questions/2663841/python-tracing-a-segmentation-fault
- https://cython-devel.python.narkive.com/cW3Cn1th/debugging-a-segfault-in-a-cython-generated-module
- https://groups.google.com/forum/#!topic/cython-users/B_Sxj2NV1PE
Packaging
- https://download.openzim.org/release/libzim/
- https://cibuildwheel.readthedocs.io/en/stable/faq/
- https://github.com/pypa/manylinux
- https://github.com/RalfG/python-wheels-manylinux-build/blob/master/full_workflow_example.yml
- https://packaging.python.org/guides/packaging-binary-extensions/#publishing-binary-extensions
License