hackernews_scraper

Python library for retrieving comments and stories from HackerNews


hackernews-scraper
==================

Scrape [Hacker News](https://news.ycombinator.com) comments and stories
using the [Algolia API](http://hn.algolia.com/api/).
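
For orientation, the sketch below shows what a request to the public Algolia HN Search API for comments newer than a given timestamp can look like. It only builds the URL and does not perform any network call. The helper name `build_comment_query` is hypothetical, and this is not necessarily how the library constructs its requests internally.

```python
from urllib.parse import urlencode

# Public Algolia HN Search endpoint that returns items sorted by date.
BASE_URL = "http://hn.algolia.com/api/v1/search_by_date"


def build_comment_query(since, page=0):
    """Build a URL asking for comments created after `since` (Unix seconds).

    Hypothetical helper for illustration; not part of hackernews_scraper.
    """
    params = {
        "tags": "comment",                            # restrict results to comments
        "numericFilters": "created_at_i>%d" % since,  # only items newer than `since`
        "page": page,                                 # zero-based page index
    }
    return BASE_URL + "?" + urlencode(params)


url = build_comment_query(since=1394039447)
print(url)
```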

Usage
=====

```python
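# The package is imported with an underscore; a hyphen is not valid in a module name.
from hackernews_scraper import CommentScraper

# `since` is a Unix timestamp in seconds (1394039447 is 2014-03-05 UTC).
comments = CommentScraper.getComments(since=1394039447)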
```

`getComments` returns a generator that yields one comment at a time.
It keeps fetching until there are no more comments, or until it reaches
the 50-page limit set by Hacker News. In the latter case, it raises
`TooManyItemsException`.

If an API response is missing any required field, the scraper raises
`KeyError`.
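
Because the exception is raised from inside the generator, the items yielded before the limit was hit are still usable. A minimal sketch of that consumption pattern follows. To keep it runnable without network access, it uses a stand-in generator (`fake_get_comments`) and a locally defined `TooManyItemsException`; both are hypothetical substitutes for the library's own objects, and `collect` is an illustrative helper, not part of the package.

```python
class TooManyItemsException(Exception):
    """Stand-in for the library's exception of the same name (hypothetical)."""


def fake_get_comments(since, page_limit=2, per_page=3):
    """Hypothetical stand-in for CommentScraper.getComments.

    Yields `page_limit * per_page` fake comments, then raises to
    simulate hitting the page limit.
    """
    n = 0
    for _page in range(page_limit):
        for _ in range(per_page):
            n += 1
            yield {"comment_id": str(n), "timestamp": since + n}
    raise TooManyItemsException("page limit reached")


def collect(generator):
    """Drain a generator, keeping whatever was yielded before the limit hit."""
    items = []
    hit_limit = False
    try:
        for item in generator:
            items.append(item)
    except TooManyItemsException:
        hit_limit = True
    return items, hit_limit


comments, hit_limit = collect(fake_get_comments(since=1394039447))
print(len(comments), hit_limit)
```

With the real library, the same `collect` pattern would wrap the call to `CommentScraper.getComments(...)`, assuming the exception is importable from the package.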

Response format
===============

Comments:
```
{
'author': u'dhmholley',
'comment_id': u'7531026',
'comment_text': u'Are people still blowing this whistle?...',
'created_at': u'2014-04-04T12:57:38.000Z',
'parent_id': 7530853,
'points': 1,
'story_id': None,
'story_title': None,
'story_url': None,
'timestamp': 1396616258,
'title': None,
'url': None
}
```

Stories:
```
{
'author': u'sethco',
'created_at': u'2014-04-04T12:56:23.000Z',
'objectID': None,
'points': 1,
'story_text': 1,
'timestamp': 1396616183,
'title': u'Opower IPO today',
'url': u'http://www.businesswire.com/news/home/20140403006541/en#.Uz4cbq1dVih'
}
```
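
Both formats carry the creation time twice: `created_at` as an ISO 8601 string and `timestamp` as a Unix timestamp. As a small sketch using the values from the comment example above, either field can be converted to a timezone-aware `datetime` with the standard library alone. The dictionary here is a trimmed copy of that example, not live data.

```python
from datetime import datetime, timezone

# Trimmed copy of the comment example above.
comment = {
    "created_at": "2014-04-04T12:57:38.000Z",
    "timestamp": 1396616258,
}

# Interpret the Unix timestamp as seconds since the epoch, in UTC.
from_epoch = datetime.fromtimestamp(comment["timestamp"], tz=timezone.utc)

# Parse the ISO 8601 string; the trailing "Z" marks UTC.
from_iso = datetime.strptime(
    comment["created_at"], "%Y-%m-%dT%H:%M:%S.%fZ"
).replace(tzinfo=timezone.utc)

print(from_epoch.isoformat())
print(from_epoch == from_iso)
```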

Testing
=======

The tests require [httpretty](https://github.com/gabrielfalcao/HTTPretty)
and [factory-boy](https://github.com/rbarrois/factory_boy).

Run `nosetests` from either the root folder or the `tests` folder.


Release history
===============

- 1.0.2 (Jul 21, 2014)
- 1.0.1 (Jul 11, 2014)
- 1.0.0 (Jul 11, 2014)

Source distribution
===================

`hackernews_scraper-1.0.2.tar.gz` (5.4 kB), uploaded Jul 21, 2014.

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `83e78a533c0db1e4a5288c2d55efa302c5523072c6fbbbafefb00c8b0b51ef3d` |
| MD5 | `71cab268b526b0997e4e5fdceb744e5b` |
| BLAKE2b-256 | `e442248201768b9bceef4fb7e463d9377e45d9c434ef8ac255df857486729383` |