Helpers to fetch & parse text on pages with requests, lxml, & beautifulsoup4

Project description

Install

Install system requirements for lxml

% sudo apt-get install -y libxml2 libxslt1.1 libxml2-dev libxslt1-dev zlib1g-dev

or, with Homebrew on macOS:

% brew install libxml2

Install with pip

% pip3 install parse-helper

Usage

The following scripts are provided: ph-goo, ph-ddg, ph-download-files, ph-download-file-as, and ph-soup-explore.

$ venv/bin/ph-goo --help
Usage: ph-goo [OPTIONS] [QUERY]

  Pass a search query to google

Options:
  --page INTEGER                  page number of results
  --since [|year|month|week|day]  limit results by time
  --site TEXT                     limit results by site/domain
  --filetype [|pdf|xls|ppt|doc|rtf]
                                  limit results by filetype
  --help                          Show this message and exit.
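The --site and --filetype options presumably map onto Google's site: and filetype: search operators. A minimal sketch of how such a query string could be composed (the function name here is illustrative, not part of parse-helper's API):

```python
from urllib.parse import urlencode

def build_google_query(query, site=None, filetype=None):
    """Compose a Google search query string using site:/filetype: operators."""
    parts = [query]
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    return urlencode({"q": " ".join(parts)})

# e.g. build_google_query("scaling redis", site="redis.io", filetype="pdf")
# -> "q=scaling+redis+site%3Aredis.io+filetype%3Apdf"
```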

$ venv/bin/ph-ddg --help
Usage: ph-ddg [OPTIONS] [QUERY]

  Pass a search query to duckduckgo api

Options:
  --help  Show this message and exit.
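ph-ddg talks to the DuckDuckGo API; the public Instant Answer endpoint accepts a query plus a format parameter. A sketch of building such a request URL (whether ph-ddg uses this exact endpoint is an assumption):

```python
from urllib.parse import urlencode

DDG_API = "https://api.duckduckgo.com/"

def ddg_api_url(query):
    """Build a DuckDuckGo Instant Answer API request URL for a query."""
    return DDG_API + "?" + urlencode({"q": query, "format": "json", "no_html": 1})

# e.g. ddg_api_url("redis") -> "https://api.duckduckgo.com/?q=redis&format=json&no_html=1"
```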

$ venv/bin/ph-download-files --help
Usage: ph-download-files [OPTIONS] [ARGS]...

  Download all links to local files

  - args: urls or filenames containing urls

Options:
  --help  Show this message and exit.
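A hedged sketch of the download step: derive a local filename from the last segment of each URL's path, then stream the response to disk with requests (the function names are illustrative, not parse-helper's internals):

```python
import os
from urllib.parse import urlparse

import requests

def local_filename(url):
    """Derive a local filename from the last path segment of a URL."""
    name = os.path.basename(urlparse(url).path)
    return name or "index.html"

def download_file(url, dest=None, user_agent=None):
    """Stream a URL to a local file; return the path written."""
    dest = dest or local_filename(url)
    headers = {"User-Agent": user_agent} if user_agent else {}
    resp = requests.get(url, headers=headers, stream=True)
    resp.raise_for_status()
    with open(dest, "wb") as f:
        for chunk in resp.iter_content(chunk_size=8192):
            f.write(chunk)
    return dest
```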

$ venv/bin/ph-download-file-as --help
Usage: ph-download-file-as [OPTIONS] URL [LOCALFILE]

  Download link to local file

  - url: a string

  - localfile: a string

Options:
  --help  Show this message and exit.

$ venv/bin/ph-soup-explore --help
Usage: ph-soup-explore [OPTIONS] [URL_OR_FILE]

  Create a soup object from a url or file and explore with ipython

Options:
  --help  Show this message and exit.
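What ph-soup-explore does before dropping into IPython can be approximated as: fetch (or read) the document, parse it into a BeautifulSoup object, and hand that over for interactive exploration. A sketch assuming only that the tool builds a soup from either a URL or a local file (the stdlib html.parser is used here so the sketch runs without lxml installed, though the package itself depends on lxml):

```python
import os

import requests
from bs4 import BeautifulSoup

def make_soup(url_or_file, user_agent="Mozilla/5.0"):
    """Return a BeautifulSoup object for a URL or a local HTML file."""
    if os.path.exists(url_or_file):
        with open(url_or_file, encoding="utf-8") as f:
            markup = f.read()
    else:
        resp = requests.get(url_or_file, headers={"User-Agent": user_agent})
        resp.raise_for_status()
        markup = resp.text
    return BeautifulSoup(markup, "html.parser")
```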
In [1]: import parse_helper as ph

In [2]: ph.USER_AGENT
Out[2]: 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/58.0.3029.110 Chrome/58.0.3029.110 Safari/537.36'

In [3]: ph.google_serp('scaling redis')
Out[3]:
[{'link': 'https://redis.io/topics/partitioning',
  'title': 'Partitioning: how to split data among multiple Redis instances. – Redis'},
 {'link': 'http://highscalability.com/blog/2014/9/8/how-twitter-uses-redis-to-scale-105tb-ram-39mm-qps-10000-ins.html',
  'title': 'How Twitter Uses Redis to Scale - 105TB RAM ... - High Scalability'},
 {'link': 'http://docs.aws.amazon.com/AmazonElastiCache/latest/UserGuide/Scaling.RedisReplGrps.html',
  'title': 'Scaling Redis Clusters with Replica Nodes - Amazon ElastiCache'},
 {'link': 'http://docs.aws.amazon.com/AmazonElastiCache/latest/UserGuide/Scaling.RedisStandalone.ScaleUp.html',
  'title': 'Scaling Up Single-Node Redis Clusters - Amazon ElastiCache'},
 {'link': 'https://redislabs.com/ebook/part-3-next-steps/chapter-10-scaling-redis/',
  'title': 'Chapter 10: Scaling Redis - Redis Labs'},
 {'link': 'https://redislabs.com/blog/scaling-out-redis-read-only-slaves-or-cluster/',
  'title': 'Scaling Out Redis: Read-Only Slaves or Cluster? - Redis Labs'},
 {'link': 'http://petrohi.me/post/6323289515/scaling-redis',
  'title': 'ten thousand hours • Scaling Redis'},
 {'link': 'https://www.quora.com/How-scalable-is-Redis',
  'title': 'How scalable is Redis? - Quora'},
 {'link': 'https://www.linkedin.com/pulse/how-twitter-uses-redis-scale-105tb-ram-39mm-qps-10000-iravani',
  'title': 'How Twitter Uses Redis To Scale - 105TB RAM, 39MM QPS ... - LinkedIn'},
 {'link': 'https://docs.microsoft.com/en-us/azure/redis-cache/cache-how-to-scale',
  'title': 'How to Scale Azure Redis Cache | Microsoft Docs'}]
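The list of {'link', 'title'} dicts returned by ph.google_serp is plain data, so post-processing needs nothing beyond the standard library. For instance, counting how many results each domain contributed (sample data below is invented for illustration):

```python
from collections import Counter
from urllib.parse import urlparse

def domains(results):
    """Count how many search results each domain contributed."""
    return Counter(urlparse(r["link"]).netloc for r in results)

results = [
    {"link": "https://redis.io/topics/partitioning", "title": "Partitioning"},
    {"link": "https://redislabs.com/ebook/chapter-10", "title": "Chapter 10"},
    {"link": "https://redislabs.com/blog/scaling-out", "title": "Scaling Out Redis"},
]
# domains(results) -> Counter({'redislabs.com': 2, 'redis.io': 1})
```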

Download files

Built Distribution: parse_helper-0.1.16-py3-none-any.whl (12.9 kB), uploaded for Python 3. No source distribution is available for this release.