ODPS Python SDK and data analysis framework

An elegant way to access the ODPS API. See the documentation for details.

Installation

The quick way:

pip install pyodps[full]

If you don’t need Jupyter support, install the base package instead:

pip install pyodps

The dependencies will be installed automatically.

Or from source code:

$ virtualenv pyodps_env
$ source pyodps_env/bin/activate
$ pip install git+https://github.com/aliyun/aliyun-odps-python-sdk.git
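
After either install route, it can be useful to confirm which version ended up in the active environment. The following is a minimal sketch using only the standard library; the helper name installed_version is hypothetical, and importlib.metadata requires Python 3.8 or later even though the package itself supports older interpreters.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(dist_name="pyodps"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("pyodps is not installed in this environment")
    else:
        print("pyodps", version)
```

Run it inside the virtualenv created above to check that environment rather than the system interpreter.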

Dependencies

  • Python (>=2.7), including Python 3+ and PyPy; Python 3.7 is recommended

  • setuptools (>=3.0)

Run Tests

  • Install pytest.

  • Copy conf/test.conf.template to odps/tests/test.conf and fill in your account information.

  • Run pytest odps.

Usage

>>> import os
>>> from odps import ODPS
>>> # Make sure the environment variable ALIBABA_CLOUD_ACCESS_KEY_ID is set to the user's Access Key ID
>>> # and ALIBABA_CLOUD_ACCESS_KEY_SECRET is set to the user's Access Key Secret.
>>> # Hardcoding the Access Key ID or Access Key Secret in your code is not recommended.
>>> o = ODPS(
...     os.getenv('ALIBABA_CLOUD_ACCESS_KEY_ID'),
...     os.getenv('ALIBABA_CLOUD_ACCESS_KEY_SECRET'),
...     project='**your-project**',
...     endpoint='**your-endpoint**',
... )
>>> dual = o.get_table('dual')
>>> dual.name
'dual'
>>> dual.table_schema
odps.Schema {
  c_int_a                 bigint
  c_int_b                 bigint
  c_double_a              double
  c_double_b              double
  c_string_a              string
  c_string_b              string
  c_bool_a                boolean
  c_bool_b                boolean
  c_datetime_a            datetime
  c_datetime_b            datetime
}
>>> dual.creation_time
datetime.datetime(2014, 6, 6, 13, 28, 24)
>>> dual.is_virtual_view
False
>>> dual.size
448
>>> dual.table_schema.columns
[<column c_int_a, type bigint>,
 <column c_int_b, type bigint>,
 <column c_double_a, type double>,
 <column c_double_b, type double>,
 <column c_string_a, type string>,
 <column c_string_b, type string>,
 <column c_bool_a, type boolean>,
 <column c_bool_b, type boolean>,
 <column c_datetime_a, type datetime>,
 <column c_datetime_b, type datetime>]
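
Because os.getenv silently returns None when a variable is unset, a missing key only surfaces later as an authentication failure. One way to fail early is sketched below. The helper load_credentials is hypothetical and not part of the SDK; only the two environment variable names come from the example above.

```python
import os


def load_credentials(env=None):
    """Return (access_id, secret) from an environment-style mapping.

    Raises RuntimeError naming every variable that is missing or empty.
    """
    env = os.environ if env is None else env
    names = ("ALIBABA_CLOUD_ACCESS_KEY_ID", "ALIBABA_CLOUD_ACCESS_KEY_SECRET")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return env[names[0]], env[names[1]]


# Intended use (requires pyodps and real credentials, so it is left commented out):
# access_id, secret = load_credentials()
# o = ODPS(access_id, secret, project='**your-project**', endpoint='**your-endpoint**')
```

Accepting the mapping as a parameter keeps the helper easy to exercise with a plain dictionary instead of the real process environment.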

DataFrame API

>>> from odps.df import DataFrame
>>> df = DataFrame(o.get_table('pyodps_iris'))
>>> df.dtypes
odps.Schema {
  sepallength           float64
  sepalwidth            float64
  petallength           float64
  petalwidth            float64
  name                  string
}
>>> df.head(5)
|==========================================|   1 /  1  (100.00%)         0s
   sepallength  sepalwidth  petallength  petalwidth         name
0          5.1         3.5          1.4         0.2  Iris-setosa
1          4.9         3.0          1.4         0.2  Iris-setosa
2          4.7         3.2          1.3         0.2  Iris-setosa
3          4.6         3.1          1.5         0.2  Iris-setosa
4          5.0         3.6          1.4         0.2  Iris-setosa
>>> df[df.sepalwidth > 3]['name', 'sepalwidth'].head(5)
|==========================================|   1 /  1  (100.00%)        12s
          name  sepalwidth
0  Iris-setosa         3.5
1  Iris-setosa         3.2
2  Iris-setosa         3.1
3  Iris-setosa         3.6
4  Iris-setosa         3.9
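
To make the meaning of the last expression concrete, the sketch below reproduces the same filter, column selection, and row limit in plain Python over the five rows printed by df.head(5). It does not use the DataFrame API and says nothing about how that API executes the query; the function name filter_and_project is hypothetical.

```python
from itertools import islice

# The five rows printed by df.head(5) above, as plain dictionaries.
rows = [
    {"sepallength": 5.1, "sepalwidth": 3.5, "petallength": 1.4, "petalwidth": 0.2, "name": "Iris-setosa"},
    {"sepallength": 4.9, "sepalwidth": 3.0, "petallength": 1.4, "petalwidth": 0.2, "name": "Iris-setosa"},
    {"sepallength": 4.7, "sepalwidth": 3.2, "petallength": 1.3, "petalwidth": 0.2, "name": "Iris-setosa"},
    {"sepallength": 4.6, "sepalwidth": 3.1, "petallength": 1.5, "petalwidth": 0.2, "name": "Iris-setosa"},
    {"sepallength": 5.0, "sepalwidth": 3.6, "petallength": 1.4, "petalwidth": 0.2, "name": "Iris-setosa"},
]


def filter_and_project(rows, min_width, columns, limit):
    """Keep rows with sepalwidth > min_width, keep only `columns`, stop after `limit` rows."""
    selected = (
        {col: row[col] for col in columns}
        for row in rows
        if row["sepalwidth"] > min_width
    )
    return list(islice(selected, limit))


result = filter_and_project(rows, 3, ["name", "sepalwidth"], 5)
for row in result:
    print(row["name"], row["sepalwidth"])
```

Only four of these five rows pass the filter, because the row with sepalwidth 3.0 is not strictly greater than 3. The fifth row in the real output (3.9) presumably comes from further down the table.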

Command-line and IPython enhancement

In [1]: %load_ext odps

In [2]: %enter
Out[2]: <odps.inter.Room at 0x10fe0e450>

In [3]: %sql select * from pyodps_iris limit 5
|==========================================|   1 /  1  (100.00%)         2s
Out[3]:
   sepallength  sepalwidth  petallength  petalwidth         name
0          5.1         3.5          1.4         0.2  Iris-setosa
1          4.9         3.0          1.4         0.2  Iris-setosa
2          4.7         3.2          1.3         0.2  Iris-setosa
3          4.6         3.1          1.5         0.2  Iris-setosa
4          5.0         3.6          1.4         0.2  Iris-setosa

Python UDF Debugging Tool

#file: plus.py
from odps.udf import annotate

@annotate('bigint,bigint->bigint')
class Plus(object):
    def evaluate(self, a, b):
        return a + b

$ cat plus.input
1,1
3,2
$ pyou plus.Plus < plus.input
2
5
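
The session above can be approximated without the SDK to see what happens to each input line. The sketch below is not how pyou is implemented. It defines an undecorated copy of Plus and a hypothetical helper, run_udf_locally, that splits each line on commas, casts the fields, and calls evaluate. The casts are passed explicitly and assumed to match the 'bigint,bigint' signature rather than being parsed from the annotation.

```python
import io


class Plus(object):
    """Same logic as plus.py, without the odps.udf decorator so it runs anywhere."""

    def evaluate(self, a, b):
        return a + b


def run_udf_locally(udf_cls, lines, casts):
    """Feed comma-separated lines to udf_cls().evaluate and collect the results."""
    udf = udf_cls()
    results = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        args = [cast(field) for cast, field in zip(casts, line.split(","))]
        results.append(udf.evaluate(*args))
    return results


# Stand-in for the contents of plus.input shown above.
plus_input = io.StringIO("1,1\n3,2\n")
outputs = run_udf_locally(Plus, plus_input, casts=(int, int))
for value in outputs:
    print(value)
```

This prints 2 and 5, matching the pyou output shown above.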

Contributing

For a development install, clone the repository and then install from source:

git clone https://github.com/aliyun/aliyun-odps-python-sdk.git
cd aliyun-odps-python-sdk
pip install -r requirements.txt -e .

Modifying the frontend code requires nodejs/npm. To build and install the frontend code, run

python setup.py build_js
python setup.py install_js

License

Licensed under the Apache License 2.0

