simplewebscraper

Python library that makes web scraping very simple.

Environment
- Web Environment
Intended Audience
- Developers
License
- OSI Approved :: GNU Library or Lesser General Public License (LGPL)
Operating System
- OS Independent
Programming Language
- Python
Topic
- Internet :: WWW/HTTP

Project description

Documentation is hosted at http://learnwebscraping.com/docs. Note: the documentation is still being written.

Simplewebscraper is a library designed to facilitate web scraping. It includes built-in code for standard web requests, proxy usage, browser cookie imports, and file downloads.

Homepage: https://github.com/alexanderward/simplewebscraper

Simple Usage - More details to come once documentation is complete.

from simplewebscraper import Browser, HTTPMethod, Scraper, ProxyPool

if __name__ == "__main__":

    # Toggle which examples run.
    example_GET = True
    example_GET_parameters = True
    example_POST = False
    example_Proxy = False
    example_cookie_import = False

    # Plain GET request.
    if example_GET:
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.GET
        my_scraper.url = "https://myip.dnsdynamic.org"
        print(my_scraper.fetch())

    # GET request with parameters.
    if example_GET_parameters:
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.GET
        my_scraper.parameters = {"InData": "75791",
                                 "submit": "Search"}
        my_scraper.url = "http://www.melissadata.com/lookups/GeoCoder.asp"
        print(my_scraper.fetch())

    # POST request with form parameters.
    if example_POST:
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.POST
        my_scraper.parameters = {"email": "example@gmail.com",
                                 "pass": "samplepassword"}
        my_scraper.url = "https://www.dnsdynamic.org/auth.php"
        print(my_scraper.fetch())

    # GET request routed through a proxy pool.
    if example_Proxy:
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.GET
        my_scraper.use_per_proxy_count = 5
        # You can provide a group of proxies like this as well:
        # {"https": ["https://212.119.246.138:8080"], "http": []}
        my_scraper.proxy_pool = ProxyPool.Hidester
        my_scraper.url = "https://myip.dnsdynamic.org"
        print(my_scraper.fetch())

    # GET request using cookies imported from a local browser.
    if example_cookie_import:
        my_scraper = Scraper()
        my_scraper.HTTP_mode = HTTPMethod.GET
        my_scraper.cookies = Browser.Chrome  # Chrome or Firefox
        my_scraper.url = "https://myip.dnsdynamic.org"
        print(my_scraper.fetch())
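
The comment in the proxy example shows that a custom group of proxies can be given as a dictionary with "https" and "http" lists. As a minimal sketch, a flat list of proxy URLs could be sorted into that shape as follows. The helper name group_proxies_by_scheme is hypothetical and not part of simplewebscraper; only the dictionary layout is taken from the example, and how the library validates it is not confirmed here.

```python
def group_proxies_by_scheme(proxy_urls):
    """Group proxy URLs into {"https": [...], "http": [...]} by URL scheme.

    Hypothetical helper: it only builds the dictionary shape shown in the
    proxy example's comment and does not call simplewebscraper itself.
    """
    grouped = {"https": [], "http": []}
    for url in proxy_urls:
        # Everything before "://" is treated as the scheme.
        scheme = url.split("://", 1)[0].lower()
        if scheme not in grouped:
            raise ValueError("unsupported proxy scheme: %r" % url)
        grouped[scheme].append(url)
    return grouped


if __name__ == "__main__":
    pool = group_proxies_by_scheme(["https://212.119.246.138:8080"])
    print(pool)
```

Under the assumption that the dictionary form is accepted wherever ProxyPool.Hidester is, the result would be assigned to my_scraper.proxy_pool.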

Features

Release history

- 1.43 (Feb 5, 2016)
- 1.42 (Feb 5, 2016)
- 1.042 (Feb 5, 2016), this version
- 1.42rc0 (Feb 5, 2016), pre-release
- 1.42b0 (Feb 5, 2016), pre-release
- 1.42a0 (Feb 5, 2016), pre-release
- 1.041 (Feb 5, 2016)
- 1.04 (Feb 5, 2016)
- 1.03 (Feb 5, 2016)
- 1.03c (Feb 5, 2016), pre-release
- 1.03b (Feb 5, 2016), pre-release
- 1.03a (Feb 5, 2016), pre-release
- 1.02 (Feb 5, 2016)
- 1.01 (Feb 5, 2016)
- 1.0 (Feb 4, 2016)

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

simplewebscraper-1.042.zip (11.9 kB)

Uploaded Feb 5, 2016 Source

Built Distribution

simplewebscraper-1.042.win32.exe (210.6 kB)

Uploaded Feb 5, 2016 Source

Hashes for simplewebscraper-1.042.zip
Algorithm	Hash digest
SHA256	`bc6bd8d86a15708c9f082870ef005bd38e1920bd560d018cbadda4033fba218c`
MD5	`9244f6f9961f107ea14949587a48265b`
BLAKE2b-256	`77616f7b59ee025d94e1e30fd5fd0b85263b52a8553fc264ac57a68fbb44148a`

Hashes for simplewebscraper-1.042.win32.exe
Algorithm	Hash digest
SHA256	`f085745f7ff60e47fd9b9a19969f688afccbe17c77bbb0efb5666b4bab95ecfe`
MD5	`55349a6602fc7bc00b871ac826917972`
BLAKE2b-256	`39418705b4ced2781e701328c76e6138eaa7d44edf813908d6f739d39d05f12d`
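
The published digests can be used to check a downloaded file before installing it. The following is a minimal sketch using Python's standard hashlib module; the function names and the chunk size are illustrative choices, and the expected value is the SHA256 digest listed for simplewebscraper-1.042.zip.

```python
import hashlib

# SHA256 digest listed for simplewebscraper-1.042.zip.
EXPECTED_SHA256 = "bc6bd8d86a15708c9f082870ef005bd38e1920bd560d018cbadda4033fba218c"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

For example, matches_published_hash("simplewebscraper-1.042.zip") should return True only when the local file is byte-for-byte identical to the one the digest was computed from. To check the .exe installer, pass its listed SHA256 value as the expected argument.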