Simple, composable column selector for loc[], iloc[], assign() and others.

Pandas Selector

Overview

  • Motivation: Make chaining Pandas operations easier and bring functionality to Pandas similar to Spark’s col() function or referencing columns in R’s dplyr.

  • Install from PyPI with pip install pandas-selector. Pandas versions 1.0+ (^1.0) are supported.

  • Documentation can be found at readthedocs.

  • Source code can be obtained from GitHub.

Example: Create new column and filter

Instead of writing “traditional” Pandas like this:

import pandas as pd

df_in = pd.DataFrame({"x": range(5)})
df = df_in.assign(y = df_in.x // 2)
df = df.loc[df.y <= 1]
df
#    x  y
# 0  0  0
# 1  1  0
# 2  2  1
# 3  3  1

One can write:

from pandas_selector import DF
df = (df_in
      .assign(y = DF.x // 2)
      .loc[DF.y <= 1]
     )

This is especially handy when iterating on data frame manipulations interactively, e.g. in a notebook.
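For comparison, plain pandas already accepts callables in assign() and loc[]; each callable receives the intermediate data frame of the chain. The sketch below writes the same chain with lambdas and without pandas_selector. It only illustrates the pandas behaviour that such a selector can build on; it makes no claim about how DF is implemented internally.

```python
import pandas as pd

df_in = pd.DataFrame({"x": range(5)})

# Callables passed to assign() and loc[] are called with the
# intermediate data frame, so no named variable is needed in the chain.
df = (df_in
      .assign(y=lambda d: d.x // 2)
      .loc[lambda d: d.y <= 1]
     )
print(df)
```

The DF form reads closer to the "traditional" version because the lambda boilerplate disappears.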

You can also access all methods and attributes of the data frame from the context:

df = pd.DataFrame({
    "X": range(5),
    "y": ["1", "a", "c", "D", "e"],
})
df.loc[DF.y.str.isupper() | DF.y.str.isnumeric()]
#    X  y
# 0  0  1
# 3  3  D
df.loc[:, DF.columns.str.isupper()]
#    X
# 0  0
# 1  1
# 2  2
# 3  3
# 4  4
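As a cross-check of the row and column selections above, here is a minimal sketch that reproduces both results with plain pandas callables. The names rows and cols are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    "X": range(5),
    "y": ["1", "a", "c", "D", "e"],
})

# Row selection: keep rows whose "y" value is upper-case or numeric.
rows = df.loc[lambda d: d.y.str.isupper() | d.y.str.isnumeric()]

# Column selection: keep columns whose name is upper-case.
cols = df.loc[:, lambda d: d.columns.str.isupper()]

print(rows)
print(cols)
```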

Similar projects for pandas

  • pandas-ply

    • (-) stale(?), last change 6 years ago

    • (-) new API to learn

    • (-) Symbol / pandas_ply.X works only with ply_* functions

  • pandas-select

    • (+) no explicit df necessary

    • (-) new API to learn

  • pandas-selectable

    • (+) simple select accessor

    • (-) usage inside chains is clumsy (needs an explicit df):

      ((df
        .select.A == 'a')
        .select.B == 'b'
      )
    • (-) hard-coded str, dt accessor methods

    • (?) composable?

Development

Development is containerized with [Docker](https://www.docker.com/) to separate it from the host system and improve reproducibility. No other prerequisites are needed on the host system.

Recommendation for Windows users: install WSL 2 (tested on Ubuntu 20.04), and for containerized workflows, Docker Desktop for Windows.

Common tasks are collected in the Makefile (see make help for a complete list):

  • Run the unit tests: make test, or make watch to rerun the tests continuously on code changes.

  • Build the documentation: make docs

  • TODO: Update the poetry.lock file: make lock

  • Add a dependency:

    1. Start a shell in a new container.

    2. Add dependency with poetry add in the running container. This will update poetry.lock automatically:

      # 1. On the host system
      % make shell
      # 2. In the container instance:
      I have no name!@7d0e85b3a303:/app$ poetry add --dev --lock falcon
  • Build the development image: make devimage (Note: this should happen automatically for the other targets.)
