scrapy-util
Some extensions built on top of Scrapy.
pypi: https://pypi.org/project/scrapy-util
github: https://github.com/mouday/scrapy-util
```
pip install scrapy-util
```
Enabling stats collection
This feature is designed to be used together with spider-admin-pro.
```python
# settings.py

# URL that run stats are submitted to; the data is sent as JSON via POST
STATS_COLLECTION_URL = "http://127.0.0.1:5001/api/statsCollection/addItem"

# Enable the stats collection extensions
EXTENSIONS = {
    # ===========================================
    # Optional: if the collected times are in UTC, swap in the
    # local-time extension to record local times instead
    'scrapy.extensions.corestats.CoreStats': None,
    'scrapy_util.extensions.LocaltimeCoreStats': 0,
    # ===========================================
    # Optional: print the job's running duration
    'scrapy_util.extensions.ShowDurationExtension': 100,
    # Enable the stats collection extension
    'scrapy_util.extensions.StatsCollectorExtension': 100,
}
```
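The UTC note above comes down to a timezone conversion. A quick stdlib illustration, assuming the stats hold naive UTC datetimes (this is only an illustration of the idea, not `LocaltimeCoreStats` itself):

```python
from datetime import datetime, timezone

def utc_to_local(utc_dt):
    # Treat the naive datetime as UTC, then convert to the system's local zone
    return utc_dt.replace(tzinfo=timezone.utc).astimezone()

start_time = datetime(2021, 1, 1, 12, 0, 0)  # naive UTC, e.g. a 'start_time' stat
print(utc_to_local(start_time).isoformat())
```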
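To see what the extension submits before wiring up spider-admin-pro, a minimal stand-in endpoint can log the POSTed JSON. This is a sketch only, not spider-admin-pro's actual implementation; the payload fields and the `{'code': 0}` response shape are assumptions:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class StatsCollectionHandler(BaseHTTPRequestHandler):
    """Accepts a POSTed JSON stats payload and acknowledges it."""

    def do_POST(self):
        # The extension submits the run stats as a JSON body via POST
        length = int(self.headers.get('Content-Length', 0))
        payload = json.loads(self.rfile.read(length) or b'{}')
        print('received stats:', payload)
        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.end_headers()
        self.wfile.write(json.dumps({'code': 0}).encode())

    def log_message(self, fmt, *args):
        pass  # silence per-request logging for the demo

def run(host='127.0.0.1', port=5001):
    # Serve the path configured in STATS_COLLECTION_URL (blocks forever)
    HTTPServer((host, port), StatsCollectionHandler).serve_forever()
```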
Using the script spider
```python
# -*- coding: utf-8 -*-
from scrapy import cmdline

from scrapy_util.spiders import ScriptSpider


class BaiduScriptSpider(ScriptSpider):
    name = 'baidu_script'

    def execute(self):
        print("hi")


if __name__ == '__main__':
    cmdline.execute('scrapy crawl baidu_script'.split())
```