cabu

cabu is a simple REST microservice to scrape content from anywhere.

Environment
- Web Environment
Intended Audience
- Developers
License
- OSI Approved :: BSD License
Operating System
- OS Independent
Programming Language
Topic
- Internet :: WWW/HTTP :: Dynamic Content

Project description

Cabu is a simple microservice framework for crawling websites remotely. It is built on Flask and Selenium, and includes a virtual display wrapper and a few crawling methods.

Full documentation here

Usage

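from flask import jsonify

# `app` is the Cabu application instance, created elsewhere.
@app.route('/gizmodo_last_articles_links')
def gizmodo_last_articles():
    # Load the page, then collect the href of every headline link.
    app.webdriver.get('http://www.gizmodo.com')
    headlines = app.webdriver.find_elements_by_css_selector('h1.headline>a')
    articles_links = [i.get_attribute('href') for i in headlines]

    return jsonify({'articles': articles_links})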
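
Once a route like the one above is running, any HTTP client can consume it. The following is a minimal client-side sketch using only the Python standard library. The helper names `parse_articles` and `fetch_articles` and the default `base_url` are assumptions made for illustration; they are not part of Cabu.

```python
import json
from urllib.request import urlopen


def parse_articles(payload):
    """Return the list stored under the 'articles' key of a JSON payload."""
    data = json.loads(payload)
    # Fall back to an empty list if the key is missing.
    return data.get('articles', [])


def fetch_articles(base_url='http://localhost:5000'):
    """Call the example route and return the article links.

    base_url is an assumption: replace it with the address where
    the service is actually reachable.
    """
    with urlopen(base_url + '/gizmodo_last_articles_links') as response:
        return parse_articles(response.read().decode('utf-8'))
```

Keeping the parsing step separate from the network call makes it possible to check the response handling without a running service.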

Installing

$ pip install cabu

Features

Selenium configuration out of the box
Flask wrapping
Crawling methods included
AWS S3 Export
FTP / FTPS
Cookies persistence
Link extractor
Proxy configuration
Headless mode optional, so it can be switched off for local debugging
Docker pre-configured distributed environment
Database handler
Compatible with most Flask extensions (Flask-Admin, Flask-Mail, Flask-OAuth, …)
Twelve-Factor App compliance
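
To illustrate what a link extractor does, here is a rough sketch built on the Python standard library. It is not Cabu's own implementation: the class `LinkExtractor` and the function `extract_links` are hypothetical names chosen for this example, and the approach (parsing an HTML string with `html.parser`) is an assumption about one reasonable way to do it.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        for name, value in attrs:
            # Skip anchors without an href or with an empty one.
            if name == 'href' and value:
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return all absolute links found in an HTML string, in page order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

With Selenium, the HTML string could come from the driver's `page_source` attribute, for example `extract_links(app.webdriver.page_source, 'http://www.gizmodo.com')`.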

(Likely to come soon)

CouchDB support
Couchbase support
Mobile drivers
SFTP
HtmlUnit web driver
Remote webdriver wrapper
Parallelization
Neural Network plugins

Testing

All tests run against Docker services instead of mocks. Mock-based alternatives will be added soon ;)

$ pip install -r requirements-dev.txt
$ py.test cabu/tests

Contributing

Please see the Contribute page.

Copyright

Cabu is an open source project by Théotime Lévèque.

cabu 0.0.2
