Python library that makes web scraping very simple.
Project description
Documentation is hosted at http://learnwebscraping.com/docs. Note: Documentation is currently being written.
Simplewebscraper is a library designed to facilitate web scraping. It includes built-in support for standard web requests, proxy usage, browser cookie imports, and file downloads.
Homepage: https://github.com/alexanderward/simplewebscraper
Simple Usage - More details to come once documentation is complete.
from simplewebscraper import Browser, HTTPMethod, Scraper, ProxyPool

if __name__ == "__main__":
    # Toggle which examples to run.
    example_GET = True
    example_GET_parameters = True
    example_POST = False
    example_Proxy = False
    example_cookie_import = False

    if example_GET:
        # Plain GET request.
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.GET
        my_scraper.url = "https://myip.dnsdynamic.org"
        print(my_scraper.fetch())

    if example_GET_parameters:
        # GET request with parameters.
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.GET
        my_scraper.parameters = {"InData": "75791",
                                 "submit": "Search"}
        my_scraper.url = "http://www.melissadata.com/lookups/GeoCoder.asp"
        print(my_scraper.fetch())

    if example_POST:
        # POST request with parameters.
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.POST
        my_scraper.parameters = {"email": "example@gmail.com",
                                 "pass": "samplepassword"}
        my_scraper.url = "https://www.dnsdynamic.org/auth.php"
        print(my_scraper.fetch())

    if example_Proxy:
        # GET request routed through a proxy pool.
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.GET
        my_scraper.use_per_proxy_count = 5
        # You can also provide your own group of proxies, e.g.
        # {"https": ["https://212.119.246.138:8080"], "http": []}
        my_scraper.proxy_pool = ProxyPool.Hidester
        my_scraper.url = "https://myip.dnsdynamic.org"
        print(my_scraper.fetch())

    if example_cookie_import:
        # GET request using cookies imported from a browser.
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.GET
        my_scraper.cookies = Browser.Chrome  # Chrome or Firefox
        my_scraper.url = "https://myip.dnsdynamic.org"
        print(my_scraper.fetch())
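For readers wondering what the GET-with-parameters example amounts to on the wire, here is a minimal sketch using only the Python 3 standard library. It assumes, without confirmation from the library's source, that `parameters` is URL-encoded into the query string for GET requests, as is conventional for HTTP. The helper name `build_get_url` is hypothetical and is not part of simplewebscraper.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def build_get_url(url, parameters):
    """Return `url` with `parameters` appended as a URL-encoded query string.

    Hypothetical helper for illustration; not part of simplewebscraper.
    """
    parts = urlsplit(url)
    extra = urlencode(parameters)
    # Keep any query string already present on the URL, then append ours.
    query = "&".join(piece for piece in (parts.query, extra) if piece)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


full_url = build_get_url(
    "http://www.melissadata.com/lookups/GeoCoder.asp",
    {"InData": "75791", "submit": "Search"},
)
print(full_url)
# http://www.melissadata.com/lookups/GeoCoder.asp?InData=75791&submit=Search
```

Whether the library builds its request URL exactly this way is an assumption; the sketch only shows the conventional encoding of the same dictionary.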
Download files
Source Distribution
simplewebscraper-1.041.zip (11.9 kB)
Built Distribution
simplewebscraper-1.041.win32.exe (210.6 kB)
Hashes for simplewebscraper-1.041.win32.exe
Algorithm | Hash digest
---|---
SHA256 | e9373bf7e8e2a0ca2ba9151e8c60df2fe577d9a5338005606bb677e79eeff6d2
MD5 | b0f0f168f53824aed58c32ce48a32738
BLAKE2b-256 | 2d319899ec15b0f7f5447bf27b39a13050b79fb35a43d2094fff6715e7d02128
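A downloaded file can be checked against the published SHA256 digest before it is run. The following is a minimal sketch using the standard `hashlib` module; the function names and the local file path in the usage comment are assumptions for illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (assumes the installer was saved to the current directory):
# expected = "e9373bf7e8e2a0ca2ba9151e8c60df2fe577d9a5338005606bb677e79eeff6d2"
# print(matches_published_digest("simplewebscraper-1.041.win32.exe", expected))
```

SHA256 is the better choice of the listed digests for this check, since MD5 is no longer considered collision-resistant.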