Skip to main content

No project description provided

Project description

reddit-to-sqlite

Save data from Reddit to SQLite. Dogsheep-based.

Inserts records of posts and comments into a SQLite database. Can be run repeatedly safely; will refresh already-saved results (see Reload, below). Creates posts and comments tables, plus an items view with a unified view.

Usage

reddit-to-sqlite r/python
reddit-to-sqlite u/catherinedevlin 
reddit-to-sqlite --help 

By default, writes to a local reddit.db database (change with --db).

Authorizing

reddit-to-sqlite will look for a file of authorization info (location determined by --auth, defaults to ~/.config/reddit-to-sqlite.json) and, if not found, will query the user and then save the info there. You will need a Reddit username and password, and you will need to register your app with Reddit to get a client_id and client_secret. (More instructions)

Limits

Whether used for users or for subreddits, can't guarantee getting all posts or comments, because

  • Reddit's API only supplies the last 1000 items for each API call, and does not support pagination;
  • Comments nested under a single post won't be retrieved if they are deeply nested in long comment chains (see replace_more)

Reload

reddit_to_sql can be run repeatedly for a given user or subreddit, replacing previously saved results each time. However, to save excessive API use, it works backward through time and stops after it reaches the timestamp of the last saved post, plus an overlap period (default 7 days). That way, recent changes (scores, new comments) will be recorded, but older ones won't unless --post_reload is increased. If posts keep getting comments of interest long after they are posted, you can increase --post_reload.

When loading an individual user's comments, by default reddit_to_sql stops just 1 day after reaching the most recent comment that is already recorded in the database. However, if you're interested in comment scores, you may want to impose a longer --comment_reload, since scores may keep changing for longer than a single day after the comment is posted.

Notes

  • author is saved in case-sensitive form, so case-insensitive searching with LIKE may be helpful.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

reddit-to-sqlite-0.1.0.tar.gz (6.5 kB view hashes)

Uploaded Source

Built Distribution

reddit_to_sqlite-0.1.0-py3-none-any.whl (7.3 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page