light-weight high-level web-crawling framework

Project description

Spydy

spydy is a lightweight, high-level web-crawling framework built for fast development and high performance. Its design is inspired by the Unix pipeline.
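To make the Unix-pipeline idea concrete, here is a minimal sketch in plain Python. It is not spydy's implementation and does not use spydy's API: the names fake_fetch, parse_title, and run_pipeline are hypothetical, and the "fetch" step fabricates a page body instead of touching the network. It only shows how a crawl can be written as a chain of small steps, each consuming the output of the previous one.

```python
from functools import reduce
from typing import Callable, Iterable, Iterator, List


def fake_fetch(urls: Iterable[str]) -> Iterator[dict]:
    """Hypothetical stand-in for a request step (no network access)."""
    for url in urls:
        yield {"url": url, "body": f"<title>Page at {url}</title>"}


def parse_title(pages: Iterable[dict]) -> Iterator[dict]:
    """Hypothetical stand-in for a parser step: strip the title tags."""
    for page in pages:
        title = page["body"].replace("<title>", "").replace("</title>", "")
        yield {"url": page["url"], "title": title}


def run_pipeline(source: Iterable, stages: List[Callable]) -> list:
    """Feed the source through each stage in order, like `a | b | c`."""
    stream = reduce(lambda data, stage: stage(data), stages, source)
    return list(stream)


results = run_pipeline(["https://dmoz-odp.org"], [fake_fetch, parse_title])
print(results)
```

Because every stage is a generator, items flow through one at a time, which is the same property that makes shell pipelines cheap to compose.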

Code

Document

Install

pip install spydy

How to use

There are two ways to run spydy.

One way is to prepare a configuration file and run spydy from the command line:

spydy myconfig.cfg

myconfig.cfg might look like this:

[Globals]
run_mode = async_forever
nworkers = 4

[PipeLine]
url = DummyUrls
request = AsyncHttpRequest
parser = DmozParser
log = MessageLog
store = CsvStore

[url]
url = https://dmoz-odp.org
repeat = 10

[store]
file_name = result.csv
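The file above is INI-style, so its structure can be inspected with Python's standard configparser module. The sketch below is only an illustration of how the sections relate; it is not how spydy itself loads the file. It assumes, based on the example alone, that a section such as [url] or [store] supplies arguments for the PipeLine step with the same key.

```python
import configparser

# The example configuration, inlined so the sketch runs without a file.
CONFIG_TEXT = """
[Globals]
run_mode = async_forever
nworkers = 4

[PipeLine]
url = DummyUrls
request = AsyncHttpRequest
parser = DmozParser
log = MessageLog
store = CsvStore

[url]
url = https://dmoz-odp.org
repeat = 10

[store]
file_name = result.csv
"""

config = configparser.ConfigParser()
config.read_string(CONFIG_TEXT)

# Ordered mapping of step name -> class name, as listed under [PipeLine].
pipeline_steps = dict(config["PipeLine"])

# Assumption: a section named after a step holds that step's arguments.
step_args = {
    name: dict(config[name])
    for name in pipeline_steps
    if config.has_section(name)
}

# configparser returns strings; convert explicitly where a number is needed.
nworkers = config.getint("Globals", "nworkers")

print(pipeline_steps)
print(step_args)
print(nworkers)
```

Note that every value comes back as a string, which matches the "nworkers": "4" string used in the dictionary form of the configuration.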

Alternatively, run it from a Python file (e.g. spider.py):

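from spydy.engine import Engine
from spydy.utils import check_configs
from spydy import urls, request, parsers, logs, store

# Output file for CsvStore; same value as file_name in myconfig.cfg above.
FILE_NAME = "result.csv"

myconfig = {
    "Globals": {
        "run_mode": "async_forever",
        "nworkers": "4",
    },
    # Steps are listed in the order the data flows through them.
    "PipeLine": [
        urls.DummyUrls(url="https://dmoz-odp.org", repeat=10),
        request.AsyncHttpRequest(),
        parsers.DmozParser(),
        logs.MessageLog(),
        store.CsvStore(file_name=FILE_NAME),
    ],
}

# Check the configuration, then build the engine from it and start crawling.
check_configs(myconfig)
spider = Engine.from_dict(myconfig)
spider.run()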

Then run it:

$ python spider.py

Release history

Version	Release date
0.1.25 (this version)	May 2, 2021
0.1.24	May 2, 2021
0.1.23	May 1, 2021
0.1.22	Mar 18, 2021
0.1.21	Mar 17, 2021
0.1.20	Mar 9, 2021
0.1.19	Mar 8, 2021
0.1.18	Mar 4, 2021
0.1.17	Feb 26, 2021
0.1.16	Feb 26, 2021
0.1.15	Feb 24, 2021
0.1.14	Feb 24, 2021
0.1.13	Feb 23, 2021
0.1.12	Feb 20, 2021
0.1.11	Feb 20, 2021
0.1.10	Feb 20, 2021
0.1.9	Feb 19, 2021
0.1.8	Feb 10, 2021
0.1.7	Feb 8, 2021
0.1.6	Feb 6, 2021
0.1.5	Feb 5, 2021
0.1.4	Feb 4, 2021
0.1.3	Jan 22, 2021
0.1.2	Jan 21, 2021
0.1.1	Jan 19, 2021

Download files

Source Distribution

spydy-0.1.25.tar.gz (16.6 kB), uploaded May 2, 2021

Hashes for spydy-0.1.25.tar.gz

Algorithm	Hash digest
SHA256	`bee78e8e9a278b758961f87c463869b6e7fafc76dbc865a7bf4b3f7bde45edf7`
MD5	`1858486eac5219720a87d8677a0434b3`
BLAKE2b-256	`0b7577f9eb441d24d345511f500e14e7cd01e99ac8b00380e4e8d8422ab4ecd6`
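
A downloaded archive can be checked against the SHA256 digest listed above with Python's standard hashlib module. This is a minimal sketch: the helper names sha256_of and matches_expected are hypothetical, and it assumes the archive was saved in the current directory under its original file name.

```python
import hashlib
from pathlib import Path

# SHA256 digest published for spydy-0.1.25.tar.gz in the table above.
EXPECTED_SHA256 = "bee78e8e9a278b758961f87c463869b6e7fafc76dbc865a7bf4b3f7bde45edf7"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Compare a file's digest with the expected one, ignoring letter case."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("spydy-0.1.25.tar.gz")  # assumed download location
    if archive.exists():
        print("digest OK" if matches_expected(archive) else "digest MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

pip can perform the same check at install time when a requirements file pins the package with a --hash option.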