qcache

In-memory cache server with analytical query capabilities

Development Status
- 2 - Pre-Alpha
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Natural Language
- English
Programming Language

Project description

======
QCache
======

.. image:: https://badge.fury.io/py/qcache.png
    :target: http://badge.fury.io/py/qcache

.. image:: https://travis-ci.org/tobgu/qcache.png?branch=master
    :target: https://travis-ci.org/tobgu/qcache

.. image:: https://pypip.in/d/qcache/badge.png
    :target: https://crate.io/packages/qcache?version=latest

In-memory cache server with analytical query capabilities.

Features
--------

* TODO

Requirements
------------

- Python 2.7 (the only supported version for now)

License
-------

MIT licensed. See the bundled `LICENSE <https://github.com/tobgu/qcache/blob/master/LICENSE>`_ file for more details.

TODO
----
* Query language for filtering, sorting and pagination
* Cache eviction

  - By age for mutable data
  - By size (number of lines and/or bytes)

* Support both JSON and CSV input/output
* Call the right server using some sort of stable hashing?

  - Use a fixed number of cache servers to start with

* Configurable URL prefix
* One thread/process per container
* Ensure that memory usage is stable over time
* Implement both GET and POST to query (using .../q/)
* Make it possible to execute multiple queries in one request (qs=,/qs/)
* Make it possible to evict explicitly with DELETE
* Dead servers are discovered lazily, when a request to the server is needed
* Allow POST with data and query in one request. This guarantees progress
  as long as the dataset fits in memory: ``{"query": ..., "dataset": ...}``
* Counters available at a special URL (direct and indirect cache hits, misses, dataset size distribution, exception count)
* Send counters to InfluxDB
* Report exceptions to Sentry
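
The stable hashing item above could be sketched roughly as follows. This is only an illustration under assumptions, not QCache's implementation: the server addresses and the ``server_for_key`` helper are hypothetical, and plain modulo hashing is used because the list mentions a fixed number of servers.

```python
import hashlib

# Hypothetical fixed list of cache servers; the addresses are placeholders.
SERVERS = ["10.0.0.1:8888", "10.0.0.2:8888", "10.0.0.3:8888"]


def server_for_key(dataset_key, servers=SERVERS):
    """Map a dataset key to one server in a fixed-size list."""
    # MD5 is used only for its stable, evenly spread output, not for security.
    # The built-in hash() is avoided because string hashing may be randomized
    # between processes, which would break routing across clients.
    digest = hashlib.md5(dataset_key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]


print(server_for_key("big"))
```

With modulo hashing most keys move to a different server when the number of servers changes, which is one reason to keep the server count fixed at first.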

Links
-----
* http://stackoverflow.com/questions/23886030/how-to-post-a-very-long-url-using-python-requests-module
* http://stackoverflow.com/questions/18089667/how-to-estimate-how-much-memory-a-pandas-dataframe-will-need
* http://stackoverflow.com/questions/16524545/how-to-write-a-web-proxy-in-python
* https://groups.google.com/forum/#!topic/python-tornado/TB_6oKBmdlA
* http://stackoverflow.com/questions/16626058/what-is-the-performance-impact-of-non-unique-indexes-in-pandas

Configuration file
------------------
* Maximum size

  - Get the size of a data frame: ``df.values.nbytes + df.index.nbytes + df.columns.nbytes``

* Maximum age

  - Seconds

* List of hosts in the "cluster"

  - IP address and port number
Using cURL to test
------------------
* Upload a CSV file and time the request: ``time curl -X POST --data-binary @my_csv2.csv http://localhost:8888/url_prefix/big``
* Fetch the dataset: ``curl localhost:8888/url_prefix/big``

Query examples
==============

Select all
----------
``{}``

Projection
----------
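``{"select": ["foo", "bar"]}``

Aggregation functions such as max, min and count are given as lists inside ``select``, for example ``{"select": [["max", "baz"]]}``.

Omitting ``select`` is equivalent to ``select *``.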

Filtering
---------
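Filters use Lisp-style prefix notation: the operator comes first, followed by its operands.

Exact match:
``{"where": ["==", "foo", 1]}``

Comparison:
``{"where": ["<", "foo", 1]}``

Supported comparison operators: ``!=``, ``<=``, ``<``, ``>``, ``>=``

In:
``{"where": ["in", "foo", [1, 2]]}``

Clauses:
``{"where": ["&", [">", "foo", 1], ["==", "bar", 2]]}``

Supported logical operators: ``&``, ``|``

Negation:
``{"where": ["!", ["==", "foo", 1]]}``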
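
To make the prefix notation concrete, here is a minimal sketch that evaluates a ``where`` clause against plain Python dicts. It is an illustration only, not how QCache evaluates filters; the ``matches`` function and the sample rows are hypothetical.

```python
import operator

# Comparison operators from the examples above, mapped to Python functions.
COMPARISONS = {
    "==": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}


def matches(row, clause):
    """Evaluate a prefix-notation where clause against one row (a dict)."""
    op = clause[0]
    if op in COMPARISONS:
        column, value = clause[1], clause[2]
        return COMPARISONS[op](row[column], value)
    if op == "in":
        column, values = clause[1], clause[2]
        return row[column] in values
    if op == "&":
        return all(matches(row, sub) for sub in clause[1:])
    if op == "|":
        return any(matches(row, sub) for sub in clause[1:])
    if op == "!":
        return not matches(row, clause[1])
    raise ValueError("Unknown operator: %r" % (op,))


# Hypothetical sample data.
rows = [{"foo": 1, "bar": 2}, {"foo": 2, "bar": 2}, {"foo": 3, "bar": 5}]
where = ["&", [">", "foo", 1], ["==", "bar", 2]]
selected = [row for row in rows if matches(row, where)]
print(selected)
```

Because every clause is a list whose first element is the operator, nested clauses are handled by simple recursion.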

Ordering
--------
``{"order_by": ["foo"]}`` sorts ascending.

``{"order_by": ["-foo"]}`` sorts descending.

Offset
------
``{"offset": 5}``

Limit
-----
``{"limit": 10}``
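
The ordering, offset and limit keys can be pictured with a small sketch over a list of dicts. The order of application (sort, then offset, then limit, then projection) is an assumption, ``apply_query`` is a hypothetical helper, and only plain column names in ``select`` are handled, not aggregations.

```python
def apply_query(rows, query):
    """Apply order_by, offset, limit and a plain-column select to rows."""
    result = list(rows)
    # Sort by the last key first; Python's sort is stable, so the first
    # key in order_by ends up with the highest priority.
    for key in reversed(query.get("order_by", [])):
        descending = key.startswith("-")
        column = key[1:] if descending else key
        result.sort(key=lambda row: row[column], reverse=descending)
    result = result[query.get("offset", 0):]
    if "limit" in query:
        result = result[:query["limit"]]
    if "select" in query:
        result = [{c: row[c] for c in query["select"]} for row in result]
    return result


# Hypothetical sample data.
rows = [{"foo": 3, "bar": "a"}, {"foo": 1, "bar": "b"}, {"foo": 2, "bar": "c"}]
page = apply_query(rows, {"order_by": ["-foo"], "offset": 1, "limit": 1,
                          "select": ["foo"]})
print(page)
```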

Group by
--------
``{"group_by": ["foo"]}``

API examples using curl
-----------------------
::

    curl -G localhost:8888/url_prefix/fairlybig --data-urlencode "q={\"select\": [[\"count\"]], \"where\": [\"<\", \"baz\", 99999999999915], \"offset\": 100, \"limit\": 50}"
    curl -G localhost:8888/url_prefix/fairlybig --data-urlencode "q={\"select\": [[\"count\"]], \"where\": [\"in\", \"baz\", [779889,8958854,8281368,6836605,3080972,4072649,7173075,4769116,4766900,4947128,7314959,683531,6395813,7834211,12051932,3735224,12368089,9858334,4424629,4155280]], \"offset\": 0, \"limit\": 50}"
    curl -G localhost:8888/url_prefix/fairlybig --data-urlencode "q={\"where\": [\"==\", \"foo\", \"\\\"95d9f671\\\"\"], \"offset\": 0, \"limit\": 50}"
    curl -G localhost:8888/url_prefix/fairlybig --data-urlencode "q={\"select\": [[\"max\", \"baz\"]], \"offset\": 0, \"limit\": 500000000000}"
    curl -X POST --data-binary @fairly_big.csv http://localhost:8888/url_prefix/fairlybig
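
The shell escaping in the curl commands above is easy to get wrong. As a sketch, the same kind of query URL can be built from a Python dict with the standard library. This uses Python 3 module names, and ``build_query_url`` is a hypothetical helper, not part of QCache.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit


def build_query_url(base_url, query):
    """Serialize the query as JSON and pass it URL-encoded in the q parameter."""
    return base_url + "?" + urlencode({"q": json.dumps(query)})


query = {"select": [["max", "baz"]], "offset": 0, "limit": 50}
url = build_query_url("http://localhost:8888/url_prefix/fairlybig", query)
print(url)

# Decoding the q parameter again gives back the original query.
decoded = json.loads(parse_qs(urlsplit(url).query)["q"][0])
print(decoded == query)
```

The resulting URL can then be fetched with any HTTP client.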

Release history

0.9.3

Jan 5, 2019

0.9.2

May 23, 2018

0.9.1

Nov 15, 2017

0.9.0

Nov 14, 2017

0.8.1

Apr 6, 2017

0.8.0

Jan 8, 2017

0.7.2

Dec 18, 2016

0.7.1

Nov 30, 2016

0.7.0

Nov 9, 2016

0.6.1

Sep 18, 2016

0.6.0

Sep 18, 2016

0.5.0

Jun 19, 2016

0.4.2

Jun 4, 2016

0.4.1

Jan 31, 2016

0.4.0

Jan 24, 2016

0.3.0

Dec 23, 2015

0.2.1

Dec 15, 2015

0.2.0

Dec 6, 2015

0.1.0

Oct 25, 2015

This version

0.0.1

Oct 3, 2015

Download files


Source Distribution

qcache-0.0.1.tar.gz (8.5 kB)

Uploaded Oct 3, 2015 Source

Hashes for qcache-0.0.1.tar.gz

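=========== ====================================================================
Algorithm   Hash digest
=========== ====================================================================
SHA256      ``370481053c14a133e4ce643ce42679fb1bc465c7803cf5347cf7d6474f6d7018``
MD5         ``da73065099de9d7ac7a49d1c92c825ea``
BLAKE2b-256 ``ce3a62844fee0e6f3c036f2f141f5ff3b3385e93cf930ba0fe9d53ab3b8c4c4e``
=========== ====================================================================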