Pandas utility for handling very wide DataFrames
querycolumn
Simple extension to Pandas that makes it easier to select columns in (very) wide DataFrames. If you name your columns in a hierarchical fashion with a separator (e.g., as you might get from pd.json_normalize()), it lets you select a column or group of columns easily and with tab completion.
Here's a quick demo of what this looks like.
>>> import pandas as pd
>>> from querycolumn import patch_dataframe_with_query_columns
>>> patch_dataframe_with_query_columns()
>>> df = pd.DataFrame([{'person.size.height': 3, 'person.size.weight': 4}])
>>> df.qc.person
<QueryColumns @ person: size>
>>> df.qc.person.size
<QueryColumns @ person.size: height, weight>
>>> df.qc.person.size.weight
0    4
Name: person.size.weight, dtype: int64
>>> df[df.qc.person.size]
>>>
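For comparison, here is a rough sketch of where dotted column names like the ones in the demo can come from, and how you would pick out the same group of columns with plain Pandas. It does not use querycolumn at all; the `records` data and the extra `id` field are made up for illustration.

```python
import pandas as pd

# Nested records, e.g. parsed from JSON (illustrative data, not from the package).
records = [{"id": 1, "person": {"size": {"height": 3, "weight": 4}}}]

# json_normalize flattens nested dicts and joins the keys with "." by default.
df = pd.json_normalize(records)

# Plain-Pandas equivalent of "everything under person.size":
# keep the columns whose name starts with that prefix.
prefix = "person.size."
selected = df.loc[:, df.columns.str.startswith(prefix)]

print(sorted(df.columns))
print(sorted(selected.columns))
```

This works, but you have to type the prefix as a string and get no completion on it, which is the gap `df.qc` is meant to fill.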
QueryColumns does its magic by patching a descriptor into the DataFrame class. When you retrieve the attribute qc from a frame, `QueryColumns.__get__()` is invoked; this is when QueryColumns introspects its parent DataFrame and returns an object with tab completion for segmented column names.
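To make the mechanism concrete, here is a minimal sketch of the same idea in pure Python. It is not the actual querycolumn source: `ToyFrame`, `ColumnAccessor` and `ColumnNode` are hypothetical names, and `ToyFrame` is only a stand-in for a DataFrame so the example runs without Pandas.

```python
class ColumnNode:
    """Hypothetical stand-in for QueryColumns: one level of a dotted hierarchy."""

    def __init__(self, frame, prefix=""):
        self._frame = frame
        self._prefix = prefix

    def _children(self):
        # Next name segment of every column that lives under this prefix.
        start = len(self._prefix)
        return sorted({
            name[start:].split(".")[0]
            for name in self._frame.columns
            if name.startswith(self._prefix)
        })

    def __dir__(self):
        # Interactive shells build tab completion from dir().
        return self._children()

    def __getattr__(self, segment):
        # Only called when normal attribute lookup fails.
        if segment.startswith("_"):
            raise AttributeError(segment)
        full = self._prefix + segment
        if full in self._frame.columns:
            return self._frame[full]  # leaf: return the column itself
        if any(name.startswith(full + ".") for name in self._frame.columns):
            return ColumnNode(self._frame, full + ".")  # group: descend one level
        raise AttributeError(segment)

    def __repr__(self):
        return f"<ColumnNode @ {self._prefix.rstrip('.')}: {', '.join(self._children())}>"


class ColumnAccessor:
    """Descriptor: __get__ runs each time `frame.qc` is read."""

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class, not on a frame
        return ColumnNode(instance)  # introspect the frame at access time


class ToyFrame:
    """Tiny stand-in for a DataFrame: column name -> list of values."""

    def __init__(self, data):
        self._data = data
        self.columns = list(data)

    def __getitem__(self, name):
        return self._data[name]


# Patch the descriptor into the class after the fact.
ToyFrame.qc = ColumnAccessor()

frame = ToyFrame({"person.size.height": [3], "person.size.weight": [4]})
print(frame.qc.person)              # <ColumnNode @ person: size>
print(frame.qc.person.size)         # <ColumnNode @ person.size: height, weight>
print(frame.qc.person.size.weight)  # [4]
```

Because the node is rebuilt on every access, it always reflects the frame's current columns. The cost is that a column segment with the same name as a real attribute of the node would be shadowed.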
Note: I wrote QueryColumns after working with Pandas for about a week. I don't know if it encourages elegant use of the library, but I find it useful for my use case. I'm practically certain that it doesn't extend Pandas as one should (there are explicit APIs for that), but I couldn't think of a way to do it without the descriptor protocol.
Hashes for querycolumns-0.1.0-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 139960401a26e94229a7b57a89298aebd525a8551e208db7a123639ada831d04 |
MD5 | 82c57f57426315b9ca35f313b0ecb64e |
BLAKE2b-256 | f06b83312cc63fcb95c63f937402b7960e284e0cdadb907985582dc30fa1394b |