Palu is a small spider, a fork of Patu.
Palu
A small spider, useful for checking a site for 404s and 500s. It’s a fork of [Patu][1].
Palu requires httplib2 and lxml:
pip install -U httplib2 lxml
Is it safe? [![Build Status](https://secure.travis-ci.org/akrito/palu.png?branch=master)](http://travis-ci.org/akrito/palu)
Quick Usage
To see available options:
palu.py --help
To spider an entire site using 5 workers, only showing errors:
palu.py --spiders=5 www.example.com
To spider, stopping after the first level of links:
palu.py --depth=1 www.example.com
To get a list of every linked page on a site:
palu.py --generate www.example.com > urls.txt
To read URLs from a file instead of spidering for them, and show all responses:
palu.py --input=urls.txt --verbose www.example.com
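The depth option shown above limits how far the spider follows links. As a rough illustration of what a depth limit means, here is a minimal sketch of a depth-limited breadth-first walk over an in-memory link graph. It is not Palu's actual code: `crawl_order`, the `links` dictionary, and the convention that the start page sits at depth 0 are all assumptions made for illustration.

```python
from collections import deque


def crawl_order(links, start, max_depth=None):
    """Breadth-first walk over `links`, a dict mapping a page to the pages it links to.

    Hypothetical helper, not part of Palu. Assumes the start page is depth 0,
    so max_depth=1 visits the start page and the pages it links to, then stops.
    """
    seen = {start}
    queue = deque([(start, 0)])
    visited = []
    while queue:
        page, depth = queue.popleft()
        visited.append((page, depth))
        if max_depth is not None and depth >= max_depth:
            continue  # depth limit reached: do not follow this page's links
        for target in links.get(page, []):
            if target not in seen:  # never queue the same page twice
                seen.add(target)
                queue.append((target, depth + 1))
    return visited


# A made-up four-page site used only to exercise the sketch.
site = {"/": ["/a", "/b"], "/a": ["/c", "/"], "/b": [], "/c": []}
first_level = crawl_order(site, "/", max_depth=1)
everything = crawl_order(site, "/")
```

With the made-up `site` above, the depth-limited walk never reaches `/c`, because that page is only linked from `/a`.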
Format of URLs File
The output produced by <code>--generate</code> is formatted like so:
FIRST_URL<TAB>None
LINK1<TAB>REFERER
LINK2<TAB>REFERER
<code>--input</code> can take a file of that format, or one URL per line with no referer. <code>--input=-</code> reads from stdin.
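If you want to post-process a URLs file yourself, the format described above is easy to read. The following is a minimal sketch, not part of Palu: `parse_url_lines` is a hypothetical helper, and treating the literal text `None` as "no referer" is an assumption based on the first line of the format.

```python
def parse_url_lines(lines):
    """Return (url, referer) pairs from lines in the URLs-file format.

    Hypothetical helper, not part of Palu. Each line is either
    URL<TAB>REFERER or a bare URL. A missing referer, or the literal
    text "None" (an assumption), is returned as Python None.
    """
    pairs = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        # partition() leaves referer empty when the line has no tab
        url, _, referer = line.partition("\t")
        referer = referer.strip()
        if referer in ("", "None"):
            referer = None
        pairs.append((url.strip(), referer))
    return pairs


# Example input mixing both accepted line shapes.
sample = [
    "http://www.example.com\tNone\n",
    "http://www.example.com/a\thttp://www.example.com\n",
    "http://www.example.com/b\n",
    "\n",
]
parsed = parse_url_lines(sample)
```

Because the function accepts any iterable of lines, an open file object or `sys.stdin` can be passed in directly.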
Testing
Palu uses Nose for testing. To install Nose and test:
pip install -U nose
nosetests