Pure-Python robots.txt parser with support for modern conventions
Protego
Overview
Protego is a pure-Python robots.txt parser with support for modern conventions.
Requirements
- Python 2.7 or Python 3.4+
- Works on Linux, Windows, Mac OS X, and BSD
Install
To install Protego, simply use pip:
pip install protego
Usage
>>> from protego import Protego
>>> import requests
>>> r = requests.get('https://google.com/robots.txt')
>>> rp = Protego.parse(r.text)
>>> # That's it! We can now perform queries.
>>> rp.can_fetch('https://google.com/search', 'mybot')
False
>>> rp.can_fetch('https://google.com/search/about', 'mybot')
True
>>> list(rp.sitemaps)
['https://www.google.com/sitemap.xml']
Download files
Source Distribution
- Protego-0.1.dev0.tar.gz (4.9 kB)
Built Distributions
- Protego-0.1.dev0-py3.7.egg (9.3 kB)
- Protego-0.1.dev0-py2.7.egg (9.3 kB)