Python library that makes web scraping very simple.
Project description
Documentation is hosted at http://learnwebscraping.com/docs. Note: Documentation is currently being written.
Simplewebscraper is a library designed to facilitate web scraping. It includes built-in support for standard web requests, proxy usage, browser cookie imports, and file downloads.
Homepage: https://github.com/alexanderward/simplewebscraper
Simple Usage - More details to come once documentation is complete.
from simplewebscraper import Browser, HTTPMethod, Scraper, ProxyPool

if __name__ == "__main__":
    # Toggle the examples to run.
    example_GET = True
    example_GET_parameters = True
    example_POST = False
    example_Proxy = False
    example_cookie_import = False

    if example_GET:
        # Plain GET request.
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.GET
        my_scraper.url = "https://myip.dnsdynamic.org"
        print(my_scraper.fetch())

    if example_GET_parameters:
        # GET request with parameters.
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.GET
        my_scraper.parameters = {"InData": "75791",
                                 "submit": "Search"}
        my_scraper.url = "http://www.melissadata.com/lookups/GeoCoder.asp"
        print(my_scraper.fetch())

    if example_POST:
        # POST request with form parameters.
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.POST
        my_scraper.parameters = {"email": "example@gmail.com",
                                 "pass": "samplepassword"}
        my_scraper.url = "https://www.dnsdynamic.org/auth.php"
        print(my_scraper.fetch())

    if example_Proxy:
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.GET
        my_scraper.use_per_proxy_count = 5
        # You can provide a group of proxies like this as well:
        # {"https": ["https://212.119.246.138:8080"], "http": []}
        my_scraper.proxy_pool = ProxyPool.Hidester
        my_scraper.url = "https://myip.dnsdynamic.org"
        print(my_scraper.fetch())

    if example_cookie_import:
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.GET
        my_scraper.cookies = Browser.Chrome  # Chrome or Firefox
        my_scraper.url = "https://myip.dnsdynamic.org"
        print(my_scraper.fetch())
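The proxy example sets use_per_proxy_count to 5 without explaining it. One plausible reading, which is an assumption and not confirmed by the library's documentation, is that each proxy serves that many requests before the scraper moves on to the next one. The sketch below illustrates that rotation idea in plain Python. RotatingProxyPool and next_proxy are hypothetical names that are not part of simplewebscraper, and the second proxy address is a placeholder from a documentation-only IP range.

```python
from itertools import cycle


class RotatingProxyPool:
    """Hypothetical helper (not part of simplewebscraper).

    Hands out the same proxy a fixed number of times, then rotates.
    """

    def __init__(self, proxies, uses_per_proxy):
        if not proxies:
            raise ValueError("at least one proxy is required")
        if uses_per_proxy < 1:
            raise ValueError("uses_per_proxy must be at least 1")
        self._proxies = cycle(proxies)  # wraps around after the last proxy
        self._uses_per_proxy = uses_per_proxy
        self._current = next(self._proxies)
        self._uses = 0

    def next_proxy(self):
        """Return the proxy for the next request, rotating once the quota is spent."""
        if self._uses >= self._uses_per_proxy:
            self._current = next(self._proxies)
            self._uses = 0
        self._uses += 1
        return self._current


# Same dictionary shape as the comment in the proxy example above.
# The second address is a placeholder, not a working proxy.
pool = {
    "https": ["https://212.119.246.138:8080", "https://198.51.100.7:3128"],
    "http": [],
}

rotator = RotatingProxyPool(pool["https"], uses_per_proxy=5)
sequence = [rotator.next_proxy() for _ in range(12)]
```

With two proxies and a quota of 5, the first five requests would go through the first proxy, the next five through the second, and the remaining two wrap back to the first.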
Download files
Source Distribution: simplewebscraper-1.03b.zip (11.3 kB)
Built Distribution: simplewebscraper-1.03b.win32.exe (207.6 kB)
Hashes for simplewebscraper-1.03b.win32.exe
Algorithm | Hash digest
---|---
SHA256 | be01c6618029bba8a8cbf5f54ef441df0ac37e3df5834b753b1eb87a01c02f02
MD5 | 2d100751ef80bb387da8e97da17b5ea1
BLAKE2b-256 | 2d0c168f051de5605895f169146d83d89a8f4a6e439c6740c545120543caba99
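The published digests can be used to check a downloaded file before running it. The following is a minimal sketch using only Python's standard hashlib module. The function names sha256_of_file and matches_published_hash are illustrative, and the local file path in the usage comment is an assumption about where the download was saved.

```python
import hashlib

# SHA256 digest listed above for simplewebscraper-1.03b.win32.exe.
EXPECTED_SHA256 = "be01c6618029bba8a8cbf5f54ef441df0ac37e3df5834b753b1eb87a01c02f02"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Compare a file's SHA256 digest with the expected hex digest."""
    return sha256_of_file(path) == expected.lower()


# Usage (assumes the installer was saved to the current directory):
# print(matches_published_hash("simplewebscraper-1.03b.win32.exe"))
```

SHA256 is the better choice of the listed algorithms for this check, since MD5 is no longer considered collision resistant.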