A python-facing API for creating and interacting with ZIM files
python-libzim
The libzim module allows you to read and write ZIM files in Python. It provides a shallow Python interface on top of the C++ libzim library.
It is primarily used in openZIM scrapers like sotoki or youtube2zim.
Installation
pip install libzim
The PyPI package is available for x86_64 macOS and GNU/Linux only. It bundles a recent release of the C++ libzim.
On other platforms, you have to compile the C++ libzim from source first, then build this package, adjusting LD_LIBRARY_PATH.
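As a rough illustration of the platform note above, the following sketch checks whether the running interpreter is on one of the two platforms named there. The helper name `has_prebuilt_wheel` and the `SUPPORTED` set are hypothetical; this is only an approximation of what pip does when it selects a wheel, and it does not inspect the Python version or libc.

```python
import platform

# Platforms the text lists as having prebuilt wheels (x86_64 macOS and GNU/Linux).
# Assumption: platform.system()/platform.machine() report "Darwin"/"Linux" and "x86_64".
SUPPORTED = {("Darwin", "x86_64"), ("Linux", "x86_64")}


def has_prebuilt_wheel(system=None, machine=None):
    """Return True if (system, machine) is one of the platforms listed above."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return (system, machine) in SUPPORTED


if has_prebuilt_wheel():
    print("pip install libzim should find a prebuilt wheel")
else:
    print("compile the C++ libzim from source first, then build python-libzim")
```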
Contributions
git clone git@github.com:openzim/python-libzim.git && cd python-libzim
# python -m venv env && source env/bin/activate
pip install -U setuptools invoke
invoke download-libzim install-dev build-ext test
# invoke --list for available development helpers
See CONTRIBUTING.md for additional details, then open a ticket or submit a Pull Request on GitHub 🤗!
Usage
Read a ZIM file
from libzim.reader import Archive
from libzim.search import Query, Searcher
from libzim.suggestion import SuggestionSearcher
zim = Archive("test.zim")
print(f"Main entry is at {zim.main_entry.get_item().path}")
entry = zim.get_entry_by_path("home/fr")
print(f"Entry {entry.title} at {entry.path} is {entry.get_item().size}b.")
print(bytes(entry.get_item().content).decode("UTF-8"))
# searching using full-text index
search_string = "Welcome"
query = Query().set_query(search_string)
searcher = Searcher(zim)
search = searcher.search(query)
search_count = search.getEstimatedMatches()
print(f"there are {search_count} matches for {search_string}")
print(list(search.getResults(0, search_count)))
# accessing suggestions
search_string = "kiwix"
suggestion_searcher = SuggestionSearcher(zim)
suggestion = suggestion_searcher.suggest(search_string)
suggestion_count = suggestion.getEstimatedMatches()
print(f"there are {suggestion_count} matches for {search_string}")
print(list(suggestion.getResults(0, suggestion_count)))
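The read example wraps `entry.get_item().content` in `bytes(...)` before decoding, which suggests the attribute is a buffer-like object rather than a `str`. A minimal sketch of that step, using a plain `memoryview` as a stand-in for the item content (the helper name `decode_content` is hypothetical and not part of libzim):

```python
def decode_content(buffer, encoding="utf-8"):
    """Copy a buffer-like object into bytes and decode it to text."""
    return bytes(buffer).decode(encoding)


# Stand-in for entry.get_item().content; assumed to behave like a memoryview.
raw = memoryview("<h1>Bonjour Kiwix</h1>".encode("utf-8"))
text = decode_content(raw)
print(text)
```

Decoding only makes sense for textual entries such as HTML; binary entries (images, for instance) would be used as bytes directly.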
Write a ZIM file
from libzim.writer import Creator, Item, StringProvider, FileProvider, Hint


class MyItem(Item):
    def __init__(self, title, path, content = "", fpath = None):
        super().__init__()
        self.path = path
        self.title = title
        self.content = content
        self.fpath = fpath

    def get_path(self):
        return self.path

    def get_title(self):
        return self.title

    def get_mimetype(self):
        return "text/html"

    def get_contentprovider(self):
        # serve content from a file when a path was given, from the string otherwise
        if self.fpath is not None:
            return FileProvider(self.fpath)
        return StringProvider(self.content)

    def get_hints(self):
        return {Hint.FRONT_ARTICLE: True}


content = """<html><head><meta charset="UTF-8"><title>Web Page Title</title></head>
<body><h1>Welcome to this ZIM</h1><p>Kiwix</p></body></html>"""

item = MyItem("Hello Kiwix", "home", content)
# this item reads its content from home-fr.html, which must exist on disk
item2 = MyItem("Bonjour Kiwix", "home/fr", None, "home-fr.html")

with Creator("test.zim").config_indexing(True, "eng") as creator:
    creator.set_mainpath("home")
    creator.add_item(item)
    creator.add_item(item2)
    # metadata names are title-cased: "creator" becomes "Creator", etc.
    for name, value in {
        "creator": "python-libzim",
        "description": "Created in python",
        "name": "my-zim",
        "publisher": "You",
        "title": "Test ZIM",
    }.items():
        creator.add_metadata(name.title(), value)
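The metadata loop above passes `name.title()` rather than the raw dictionary key. The short sketch below, which uses only the standard library and no libzim calls, shows what that transformation produces for the keys in the example. The variable names are illustrative.

```python
metadata = {
    "creator": "python-libzim",
    "description": "Created in python",
    "name": "my-zim",
    "publisher": "You",
    "title": "Test ZIM",
}

# str.title() upper-cases the first letter of each word in the key.
titled = {name.title(): value for name, value in metadata.items()}

for name, value in titled.items():
    print(f"{name}: {value}")
```

Note that `str.title()` also capitalizes after non-letter characters, so a key such as `"long_description"` would become `"Long_Description"`; for multi-word names it may be safer to spell the name out explicitly.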
License
Download files
Download the file for your platform.
Source Distribution
libzim-1.0.0.tar.gz (8.1 MB)
Built Distributions
Hashes for libzim-1.0.0-cp39-cp39-manylinux1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 3b6b2bbb688ef046d1fb932f5bd7cc584b877b6439fae71e7ff57ce5ea88849d
MD5 | f216c529819794190d6cef44ab33bc0c
BLAKE2b-256 | 868784fb0b847f8ce01a895e3938ac746ffa9ef6e7ce22250ef85369c609a48b
Hashes for libzim-1.0.0-cp39-cp39-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 00adcca361da74768f1365210b24bd563f76617faa06b684e61faa8caa3a2d22
MD5 | 2edbcee08ea225e975f61df27dd4f114
BLAKE2b-256 | e4a67af61e5e033bb7635c4f7ea8cadd01c7d2f36a7ab24a63b1989a6b71f13d
Hashes for libzim-1.0.0-cp38-cp38-manylinux1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | e1624a949a1ebc60574b586f6e35a804e7fa08c35d0aba04a62655b7d3f15d9c
MD5 | 0ba867fb85d7b658cf48338a4e817dcc
BLAKE2b-256 | 3d51fd43fc749085a861f667c4b18a7d97e78926d0e2cdd564345d6708a7dc49
Hashes for libzim-1.0.0-cp38-cp38-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 0af13ceeb03a990a6253ac734bc6ed61f529db7f8252c7e15b892f37f43a1612
MD5 | 292e3a966242b3da1bf6e35beb057128
BLAKE2b-256 | eaac8f34396754b46275a3d360572469d507a01ce05279166829501712079df5
Hashes for libzim-1.0.0-cp37-cp37m-manylinux1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 8457db103e319a55f4a72d39b5b677e994170f0e12172f90785ea77f055d56e8
MD5 | 35eb998ba44bb57729e86fce26377c63
BLAKE2b-256 | 2206de46385bc03a9a087969d97cccfad5ccc31dfb2d988b8feebaf90f6f2e3d
Hashes for libzim-1.0.0-cp37-cp37m-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 1281e1bc4a242cd3d91d44aaf19b11f7f05c7848031da528dc52861665812bff
MD5 | 936fca896b7e391dd1c39e99119075c4
BLAKE2b-256 | 6a43fabcce58c05bf2f7627887a8854ce211434f01fba06827d814dba5724df1
Hashes for libzim-1.0.0-cp36-cp36m-manylinux1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 4907bb0a3880c3a283c7449977e091ff1ea588ac9f86419d777672f77118b21c
MD5 | ce42b1393cad7ef93195322dfda5e984
BLAKE2b-256 | 690002fb5f561c4f69f5ba06e16c37ed375db083b7e14cf5974af01199d9d57b
Hashes for libzim-1.0.0-cp36-cp36m-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 12a52f0c86fe25e963cdefc9cb8e7d71adb7a6f1ece0199c958fc40725367d3a
MD5 | ddca7a860c218213a973e83fbd50c85f
BLAKE2b-256 | caf806332c026376a42e4291dab86c0744f4f4396a191f455b63b1a4ea03e6f0
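The SHA256 digests listed above can be used to check a downloaded file before installing it. The following is a minimal sketch using only the standard library's hashlib; the helper names `sha256_of` and `matches` are hypothetical and not part of libzim or pip.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, after downloading the cp39 manylinux1 wheel into the current directory, `matches("libzim-1.0.0-cp39-cp39-manylinux1_x86_64.whl", "3b6b2bbb688ef046d1fb932f5bd7cc584b877b6439fae71e7ff57ce5ea88849d")` would be expected to return True for an intact file.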