A Python-facing API for creating and interacting with ZIM files
python-libzim
The libzim module allows you to read and write ZIM files in Python. It provides a shallow Python interface on top of the C++ libzim library.

It is primarily used in openZIM scrapers like sotoki or youtube2zim.
Installation
pip install libzim
The PyPI package is available for x86_64 macOS and GNU/Linux only. It bundles a recent release of the C++ libzim.
On other platforms, you have to compile the C++ libzim from source first, then build this package, adjusting LD_LIBRARY_PATH so the compiled library can be found.
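The platform restriction above can be checked before running pip. The following is a minimal sketch only: `prebuilt_wheel_expected` is a hypothetical helper, not part of libzim, and it encodes nothing more than the statement that wheels target x86_64 macOS and GNU/Linux. It does not distinguish GNU/Linux from other Linux flavours.

```python
import platform


def prebuilt_wheel_expected(system=None, machine=None):
    """Return True when the platform matches the wheels described above.

    Hypothetical helper (not part of libzim). Assumption: "GNU/Linux" is
    approximated by platform.system() == "Linux".
    """
    system = system or platform.system()      # e.g. "Linux", "Darwin", "Windows"
    machine = machine or platform.machine()   # e.g. "x86_64", "arm64"
    return system in ("Linux", "Darwin") and machine == "x86_64"


if prebuilt_wheel_expected():
    print("pip install libzim should find a prebuilt wheel")
else:
    print("expect to compile the C++ libzim from source first")
```

Passing explicit `system` and `machine` values makes it possible to check a target machine other than the one running the script.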
Contributions
git clone git@github.com:openzim/python-libzim.git && cd python-libzim
# python -m venv env && source env/bin/activate
pip install -U setuptools invoke
invoke download-libzim install-dev build-ext test
# invoke --list for available development helpers
See CONTRIBUTING.md for additional details, then open a ticket or submit a Pull Request on GitHub 🤗!
Usage
Read a ZIM file
from libzim.reader import Archive
from libzim.search import Query, Searcher
from libzim.suggestion import SuggestionSearcher
zim = Archive("test.zim")
print(f"Main entry is at {zim.main_entry.get_item().path}")
entry = zim.get_entry_by_path("home/fr")
print(f"Entry {entry.title} at {entry.path} is {entry.get_item().size}b.")
print(bytes(entry.get_item().content).decode("UTF-8"))
# searching using full-text index
search_string = "Welcome"
query = Query().set_query(search_string)
searcher = Searcher(zim)
search = searcher.search(query)
search_count = search.getEstimatedMatches()
print(f"there are {search_count} matches for {search_string}")
print(list(search.getResults(0, search_count)))
# accessing suggestions
search_string = "kiwix"
suggestion_searcher = SuggestionSearcher(zim)
suggestion = suggestion_searcher.suggest(search_string)
suggestion_count = suggestion.getEstimatedMatches()
print(f"there are {suggestion_count} matches for {search_string}")
print(list(suggestion.getResults(0, suggestion_count)))
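The reading example above assumes that "home/fr" exists in the archive. When that is not guaranteed, the lookup can be wrapped in a small helper. This is a sketch rather than an official recipe: `read_entry_text` is a hypothetical name, and it assumes the archive object exposes `has_entry_by_path` alongside the `get_entry_by_path` call used above.

```python
def read_entry_text(archive, path, encoding="UTF-8"):
    """Return the decoded content stored at `path`, or None if it is missing.

    Hypothetical helper: `archive` is expected to behave like
    libzim.reader.Archive. Assumption: it provides has_entry_by_path().
    """
    if not archive.has_entry_by_path(path):
        return None
    entry = archive.get_entry_by_path(path)
    # The item content is a buffer; copy it to bytes before decoding,
    # exactly as the reading example above does.
    return bytes(entry.get_item().content).decode(encoding)


# Intended use with the archive from the example above:
#   zim = Archive("test.zim")
#   text = read_entry_text(zim, "home/fr")
```

Returning None keeps the caller in charge of deciding whether a missing path is an error or an expected case.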
Write a ZIM file
from libzim.writer import Creator, Item, StringProvider, FileProvider, Hint

class MyItem(Item):
    def __init__(self, title, path, content="", fpath=None):
        super().__init__()
        self.path = path
        self.title = title
        self.content = content
        self.fpath = fpath

    def get_path(self):
        return self.path

    def get_title(self):
        return self.title

    def get_mimetype(self):
        return "text/html"

    def get_contentprovider(self):
        # Serve the content from a file when one is given, else from the string.
        if self.fpath is not None:
            return FileProvider(self.fpath)
        return StringProvider(self.content)

    def get_hints(self):
        return {Hint.FRONT_ARTICLE: True}


content = """<html><head><meta charset="UTF-8"><title>Web Page Title</title></head>
<body><h1>Welcome to this ZIM</h1><p>Kiwix</p></body></html>"""

item = MyItem("Hello Kiwix", "home", content)
item2 = MyItem("Bonjour Kiwix", "home/fr", None, "home-fr.html")

with Creator("test.zim").config_indexing(True, "eng") as creator:
    creator.set_mainpath("home")
    creator.add_item(item)
    creator.add_item(item2)
    for name, value in {
        "creator": "python-libzim",
        "description": "Created in python",
        "name": "my-zim",
        "publisher": "You",
        "title": "Test ZIM",
    }.items():
        creator.add_metadata(name.title(), value)
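In the writing example, `get_mimetype` always returns "text/html", which only suits HTML items. One possible way to generalize it, sketched here with the standard library's `mimetypes` module, is to guess the type from the file name. `guess_mimetype` and its `default` fallback are illustrative assumptions, not part of libzim.

```python
import mimetypes


def guess_mimetype(path, default="application/octet-stream"):
    """Guess a MIME type from the extension of `path`.

    Hypothetical helper (not part of libzim). Falls back to `default`
    when the extension is missing or unknown.
    """
    mimetype, _encoding = mimetypes.guess_type(path)
    return mimetype or default
```

A `MyItem.get_mimetype` built on this could return `guess_mimetype(self.fpath or self.path)`. ZIM paths such as "home/fr" carry no extension, so an item without a source file would still need an explicit type or a more suitable default.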
License