API for http://abclinuxu.cz.

Introduction

This module provides a basic API for crawling the http://abclinuxu.cz website.

Installation

The module is hosted on PyPI and can be installed using pip:

pip install abclinuxuapi

Documentation

Full module documentation is hosted at ReadTheDocs: http://abclinuxuapi.readthedocs.org

Disclaimer

This API was written by me (Bystroushaak) and is not officially affiliated with the http://abclinuxu.cz project.

Examples

Iterate over all published blogs:

>>> import abclinuxuapi
>>> for blog in abclinuxuapi.iter_blogposts():
...     print(blog.title)
...
Czech blacklist 1.0.21
iOS aplikace, filemanager, prehravani multimedii...
ENCFS - lze doporucit? mozna uskali?
Vývoj v C# + Oracle ODP.NET + EntityFramework
Skončila svoboda?
Abclinuxu - vyjádření k útokům
Eliptické křivky - vztah Weierstrass, Montgomery, Edwards
kopirovanie raspbianu na microsd kartu
Půjdem dolem, půjdem horem?
Podotčeno…
Abclinuxu presmerovano...
Dead man
Valentýn 2018 (genderově korektní mikrozápisek)
Textilosaurus - co je nového?
Kvíz: Znáte český kraj?
Název filmu
Trilium Notes jako platforma pro mini-aplikace
Marketingový "průzkum" pro zjištění obětí na další útok
Vítězný únor 2018
Reverse engineering komunikace Xorg a nvidia driveru
Vtipná konstrukce v shellu
Anketa: Kdy budou další presidentské volby v ČR?
Debian 9 a data corruption s detektivní zápletkou
Proč je tolik povyku s meltdownem mezi normálními usery
Tabletové skúsenosti pre ľahší život.
...
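
Because the loop above walks every published blogpost, it can be handy to stop after the first few. Below is a minimal sketch using itertools.islice from the standard library. FakeBlogpost, fake_iter_blogposts() and first_titles() are hypothetical names introduced only so the snippet runs without network access, and the sketch assumes that abclinuxuapi.iter_blogposts() yields items lazily, which this README does not confirm.

```python
from collections import namedtuple
from itertools import islice

# Hypothetical stand-in for the objects yielded by abclinuxuapi.iter_blogposts();
# only the .title attribute used in the example above is modelled.
FakeBlogpost = namedtuple("FakeBlogpost", "title")


def fake_iter_blogposts():
    """Stand-in generator; replace with abclinuxuapi.iter_blogposts()."""
    for title in ("Czech blacklist 1.0.21", "Dead man", "Název filmu"):
        yield FakeBlogpost(title)


def first_titles(blog_iterator, limit):
    """Return the titles of at most `limit` blogposts from the iterator."""
    return [blog.title for blog in islice(blog_iterator, limit)]


titles = first_titles(fake_iter_blogposts(), 2)
print(titles)
```

For real use, pass abclinuxuapi.iter_blogposts() instead of the stand-in generator.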

Get structured information for a specific blogpost:

>>> blog = abclinuxuapi.Blogpost("https://www.abclinuxu.cz/blog/bystroushaak/2017/9/autorske-okenko-neal-asher", lazy=False)
>>> blog.created_ts
1506733800.0
>>> blog.last_modified_ts
1508752260.0
>>> blog.tags
['knihy', 'ProtectedByTagManager', 'recenze', 'sci-fi']
>>> blog.has_tux
False
>>> blog.rating
Rating(100%@5)
>>> blog.readed
1470
>>> blog.comments_n
73
>>> blog.comments[65]
Comment(username=andrea, id=18)
>>> blog.comments[65].registered
False
>>> blog.comments[65].timestamp
1506861120.0
>>> print(blog.comments[65].text)
supr blogísky, ráda je čtu.
<p class="separator"></p>
myslím že jsem tu od Tebe viděla souhrn knih, které jsi přečetl. měl bys třeba top50 sci-fi, které bych si určitě měla přečíst? nebo alespoň top 10, první trojka?
>>> blog.comments[65].responses
[Comment(username=bystroushaak, id=19)]
>>> print(blog.text)
<h2>Autorské okénko: Neal Asher</h2>


<p>Dvacátého září jsem dočetl všechno...
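
The responses attribute shown above holds a list of Comment objects, so a discussion thread can be walked recursively. The following is only a sketch: FakeComment is a hypothetical stand-in that mimics the username, id and responses attributes seen in the example, and it assumes that each response carries its own responses list in the same way. Since blog.comments appears to be a flat list that also includes replies, start the walk from one chosen comment rather than from every item.

```python
class FakeComment(object):
    """Hypothetical stand-in exposing only username, id and responses."""

    def __init__(self, username, id, responses=None):
        self.username = username
        self.id = id
        self.responses = responses or []


def walk_thread(comment, depth=0):
    """Yield (depth, comment) for the comment and all nested responses."""
    yield depth, comment
    for response in comment.responses:
        for item in walk_thread(response, depth + 1):
            yield item


# Mirrors the example: andrea (id 18) answered by bystroushaak (id 19).
thread = FakeComment("andrea", 18, [FakeComment("bystroushaak", 19)])
lines = ["  " * depth + c.username for depth, c in walk_thread(thread)]
print("\n".join(lines))
```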

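The created_ts and last_modified_ts values above look like Unix timestamps in seconds. A small sketch of turning them into datetime objects on Python 3 follows. ts_to_utc() is a hypothetical helper, and interpreting the values as UTC epoch seconds is an assumption, because this README does not state which timezone the library uses when it builds them.

```python
from datetime import datetime, timezone


def ts_to_utc(timestamp):
    """Convert a Unix timestamp (float seconds) to an aware UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


created = ts_to_utc(1506733800.0)    # value of blog.created_ts in the example
modified = ts_to_utc(1508752260.0)   # value of blog.last_modified_ts
print(created.isoformat())
print(modified - created)            # time between creation and last edit
```
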
Changelog

0.4.16

  • abclinuxu_uploader.py; Detect images bigger than 1MB. Added --url parameter to handle these.

  • concept.py; Detect upload of images bigger than 1MB and raise ValueError in such cases.

0.4.15

  • Added better error detection for titles that are too long.

0.4.14

  • Fixed bug in parsing of number of comments from blog description.

0.4.13

0.4.12

  • Added abclinuxuapi.number_of_blog_pages() function to find out how many blog pages there are.

0.4.11

0.4.0 - 0.4.10

  • Added badges to README.

  • Blogpost.comments now defaults to an empty list instead of None.

  • Fixed bugs in uploader.

  • Parsing of the tags updated.

  • Added support for Blog.uid.

  • Fixed bugs in tests (new year parsing).

  • Added possibility to bypass lazy tag parsing.

  • Fixed bug in date parsing function.

  • Added support for parsing of more obscure date formats used by articles on abclinuxu.

  • Fixed another bug in date parsing function.

  • Added verify=False, because the SSL library pisses me off.

  • Added another special case of parsing the date.

  • Fixed another problem with date formats.

  • Fixed a problem with parsing comments on http://abclinuxu.cz/blog/msk/2016/8/hlada-sa-linux-embedded-vyvojar, where there are no links to comments.

  • Fixed comment parsing in case of http://abclinuxu.cz/blog/leos/2007/2/prepis-diskusniho-fora-hw-sekce#31

0.3.0 - 0.3.11

  • Added parsing of comments under blogposts.

  • Fixed bugs.

  • Fixed bugs in user.py.

  • Added iter_blogposts(), first_blog_page() functions for browsing the bloglist.

  • Implemented Blogpost.get_image_urls().

  • Added date_izolator(). Fixed bugs in comments parsing with relative dates.

  • Fixed bug in parsing of Blogpost’s content.

  • Added blog iterator to User object.

  • Fixed #4 - bug in username parsing.

  • Fixed parsing of censored comments.

  • Added Comment.censored.

  • Comment.registered_user renamed to Comment.registered.

  • Fixed bug which skipped censored comments.

  • Fixed problems with old blogs (different HTML).

  • Implemented #6: .__repr__() for all important classes.

  • Fixed #7 - blogs with opening HTML comments in perex.

  • Fixed bug in Blogpost._parse_content_tag().

  • Another attempt to solve shit in old blogs. There are missing tags, crossed tags and a lot of other shitfucks.

  • Fixed bug caused by http://abclinuxu.cz/blog/Mostly_IMDB/2008/6/radeon-hd-4850-a-tak-vubec#17

  • Added a lot of documentation, fixed docstrings and so on.

  • User.has_blog() changed to bool property User.has_blog.

  • Concept class refactored.

  • Added new parameter data for shared.download().

  • User.ts_to_concept_date moved to shared.ts_to_concept_date().

0.2.0

  • Added a lot of features.

  • Fixed broken setup.py.

0.1.0

  • Created.

  • It can now be used to read data from abclinuxu, but it is incomplete and still needs a lot of work.
