s3 · PyPI

Python module which connects to Amazon's S3 REST API

These details have not been verified by PyPI

Project description

Overview

s3 is a connector to S3, Amazon’s Simple Storage Service REST API.

Use it to upload, download, delete, or copy files in S3, to test whether they exist, or to update their metadata.

S3 files may have metadata in addition to their content. Metadata is a set of key/value pairs. Metadata may be set when the file is uploaded or it can be updated subsequently.

Installation

From PyPI

$ pip install s3

From source

$ hg clone ssh://hg@bitbucket.org/prometheus/s3
$ pip install -e s3

The installation is successful if you can import s3. The following command must produce no errors:

$ python -c 'import s3'

API to remote storage

S3 Filenames

An S3 file name consists of a bucket and a key. This pair of strings uniquely identifies the file within S3.

The S3Name class is instantiated with a key and a bucket; the key is required and the bucket defaults to None.

The RemoteStore class methods take a remote_name argument, which can be either a string (the key) or an instance of the S3Name class. When no bucket is given (or the bucket is None), the default_bucket established when the connection is instantiated is used. If there is no default bucket either, a ValueError is raised.

In other words, the S3Name class provides a means of using a bucket other than the default_bucket.
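The bucket-resolution rule described above can be sketched as follows. This is an illustration only, not the library's code: Name is a hypothetical stand-in for S3Name, and resolve is a hypothetical helper that does not exist in s3.

```python
class Name(object):
    """Hypothetical stand-in for s3.S3Name: a key plus an optional bucket."""

    def __init__(self, key, bucket=None):
        self.key = key
        self.bucket = bucket


def resolve(remote_name, default_bucket=None):
    """Return (bucket, key) following the rules described above."""
    if isinstance(remote_name, str):
        # A plain string is only the key; no bucket was given.
        remote_name = Name(remote_name)
    bucket = remote_name.bucket
    if bucket is None:
        # Fall back to the connection's default bucket.
        bucket = default_bucket
    if bucket is None:
        raise ValueError("no bucket given and no default bucket configured")
    return bucket, remote_name.key
```

With a default bucket of "dflt", resolve("a/b", "dflt") yields the default bucket, while resolve(Name("k", bucket="other"), "dflt") yields "other".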

Headers and Metadata

Additional HTTP headers may be sent using the methods which write data. These methods accept an optional headers argument, which is a Python dict. The headers control various aspects of how the file may be handled. S3 supports a variety of headers, which are not discussed here; see the S3 documentation for more information. Headers whose key begins with the special prefix ‘x-amz-meta-’ are considered metadata headers and are used to set the metadata attributes of the file.

The methods which read files also return metadata, which consists of only those response headers whose key begins with ‘x-amz-meta-’.
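As a rough illustration of that filtering, the sketch below picks the metadata headers out of a dict of response headers. The helper name split_metadata is hypothetical, and the case-insensitive match is an assumption (HTTP header names are case-insensitive); the library may do this differently.

```python
META_PREFIX = "x-amz-meta-"


def split_metadata(headers):
    """Return only the headers whose key starts with the metadata prefix."""
    return {
        key: value
        for key, value in headers.items()
        # Assumption: compare case-insensitively, as HTTP header names are.
        if key.lower().startswith(META_PREFIX)
    }


response_headers = {
    "Content-Type": "text/plain",
    "x-amz-meta-state": "unprocessed",
}
metadata = split_metadata(response_headers)
```

Here metadata holds only the 'x-amz-meta-state' entry; the Content-Type header is left out.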

Methods

The arguments remote_source, remote_destination, and remote_name may be either a string, or an S3Name instance.

local_name is a string and is the name of the file on the local system. This string is passed directly to open().

headers is a python dict used to encode additional request headers.

ok = store.copy(remote_source, remote_destination, headers={}): Copy from remote source to remote destination. If there are no metadata headers in headers, the destination metadata is copied from the source metadata, otherwise it is the metadata in headers.
ok = store.delete(remote_name): Delete file from remote store.
exists, metadata = store.exists(remote_name): Test if file exists in remote store and retrieve its metadata if it does.
ok, metadata = store.read(remote_name, local_name): Download a file from remote store and retrieve its metadata.
ok = store.update_metadata(remote_name, headers): Update a remote file’s metadata.
ok = store.write(local_name, remote_name, headers={}): Upload a file to remote store (and possibly set its metadata).

When a request is sent, the requests module may raise an exception or may return a status. The status may indicate success or failure. So there are two kinds of failure: an exception, or a status that reports an error.

All methods except exists catch connection exceptions, log either type of failure, and return True for success, and False for failure.

The exists method does not catch connection exceptions and may also raise RemoteStoreError. The first element of its return value is True if remote_name exists in remote storage and False otherwise.

copy, update_metadata, and write accept a headers argument which is used to provide additional headers to the request.

exists returns a tuple whose first element indicates if the file exists or not, and whose second element is the file’s metadata when the file exists, and an empty dict otherwise.

read returns a tuple whose first element indicates success or failure and whose second element is the file’s metadata on success, and an empty dict otherwise.
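The "two kinds of failure" behaviour can be sketched like this. It is a simplified model, not the library's implementation: send is a hypothetical callable that returns an integer HTTP status, the built-in ConnectionError stands in for whatever exception the requests module raises, and treating 2xx as success is an assumption.

```python
import logging

log = logging.getLogger("s3-sketch")


def attempt(send):
    """Call send() and collapse both kinds of failure into False."""
    try:
        status = send()
    except ConnectionError as exc:
        # Failure kind 1: the request never completed.
        log.error("connection failed: %s", exc)
        return False
    if not 200 <= status < 300:
        # Failure kind 2: the server answered, but with an error status.
        log.error("request failed with status %s", status)
        return False
    return True
```

A caller then only has to test the boolean, which is how the examples in the Usage section use write, read, and delete.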

Usage

Configuration

First configure your yaml file.

access_key_id and secret_access_key are generated by the S3 account manager. They are effectively the username and password for the account.
default_bucket is the name of the default bucket to use when referencing S3 files. Bucket names must be globally unique, so by convention we use a prefix on all our bucket names: com.prometheus.
endpoint is the Amazon server url to connect to. See http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region for a list of the available endpoints.
tls True => use https://, False => use http://. Default is True.

Here is an example s3.yaml

---
s3:
    access_key_id: "XXXXX"
    secret_access_key: "YYYYYYY"
    default_bucket: "ZZZZZZZ"
    endpoint: "s3-us-west-2.amazonaws.com"
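To make the effect of endpoint and tls concrete, here is a minimal sketch of how the two settings could combine into a URL prefix. The function base_url is hypothetical and is not part of s3; only the https/http choice and its default come from the description above.

```python
def base_url(endpoint, tls=True):
    """Build a URL prefix from the endpoint and tls settings."""
    # tls True => https://, False => http:// (default is True).
    scheme = "https" if tls else "http"
    return "%s://%s" % (scheme, endpoint)


secure = base_url("s3-us-west-2.amazonaws.com")
plain = base_url("s3-us-west-2.amazonaws.com", tls=False)
```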

Next configure your S3 bucket permissions. Eventually, s3 will support bucket management. Until then use Amazon’s web interface:

Log onto your Amazon account.
Create a bucket or click on an existing bucket.
Click on Properties.
Click on Permissions.
Click on Edit Bucket Policy.

Here is an example policy with the required permissions:

{
        "Version": "2008-10-17",
        "Id": "Policyxxxxxxxxxxxxx",
        "Statement": [
                {
                        "Sid": "Stmtxxxxxxxxxxxxx",
                        "Effect": "Allow",
                        "Principal": {
                                "AWS": "arn:aws:iam::xxxxxxxxxxxx:user/XXXXXXX"
                        },
                        "Action": [
                                "s3:AbortMultipartUpload",
                                "s3:GetObjectAcl",
                                "s3:GetObjectVersion",
                                "s3:DeleteObject",
                                "s3:DeleteObjectVersion",
                                "s3:GetObject",
                                "s3:PutObjectAcl",
                                "s3:PutObjectVersionAcl",
                                "s3:ListMultipartUploadParts",
                                "s3:PutObject",
                                "s3:GetObjectVersionAcl"
                        ],
                        "Resource": [
                                "arn:aws:s3:::com.prometheus.cgtest-1/*",
                                "arn:aws:s3:::com.prometheus.cgtest-1"
                        ]
                }
        ]
}
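If you manage several buckets, a policy of this shape can be generated rather than edited by hand. The sketch below is only a convenience under stated assumptions: bucket_policy is a hypothetical helper, the action list is shortened for brevity, and the Id and Sid fields are omitted. Note that the Resource list names both the bucket itself and every key under it, as in the example above.

```python
import json

# Shortened list for illustration; use the full list from the policy above.
ACTIONS = ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"]


def bucket_policy(bucket, principal_arn, actions=ACTIONS):
    """Return a policy dict granting `actions` on `bucket` to one principal."""
    return {
        "Version": "2008-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": list(actions),
                "Resource": [
                    "arn:aws:s3:::%s/*" % bucket,  # every key in the bucket
                    "arn:aws:s3:::%s" % bucket,    # the bucket itself
                ],
            }
        ],
    }


policy = bucket_policy(
    "com.prometheus.cgtest-1",
    "arn:aws:iam::xxxxxxxxxxxx:user/XXXXXXX",  # placeholder, as above
)
print(json.dumps(policy, indent=4))
```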

Examples

Once the yaml file is configured and the bucket policy is set, you can instantiate an S3Connection and use that connection to instantiate a RemoteStore.

import s3
import yaml

with open('s3.yaml', 'r') as fi:
    # safe_load parses plain YAML without constructing arbitrary Python objects
    config = yaml.safe_load(fi)

connection = s3.S3Connection(**config['s3'])
store = s3.RemoteStore(connection)

Then you call methods on the RemoteStore instance.

The following code uploads a file named “example” from the local filesystem as “example-in-s3” in s3. It then checks that “example-in-s3” exists in storage, downloads the file as “example-from-s3”, compares the original with the downloaded copy to ensure they are the same, deletes “example-in-s3”, and finally checks that it is no longer in storage.

import subprocess
assert store.write("example", "example-in-s3")
exists, metadata = store.exists("example-in-s3")
assert exists
ok, metadata = store.read("example-in-s3", "example-from-s3")
assert ok
assert 0 == subprocess.call(['diff', "example", "example-from-s3"])
assert store.delete("example-in-s3")
exists, metadata = store.exists("example-in-s3")
assert not exists
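The example above shells out to diff, which may not be available on every platform. A portable alternative, offered as a suggestion rather than as part of the library, is the standard library's filecmp module. The helper name same_content is made up for this sketch.

```python
import filecmp


def same_content(path_a, path_b):
    """Return True if the two files have identical contents."""
    # shallow=False forces a byte-by-byte comparison instead of
    # trusting the os.stat() signatures (type, size, mtime) alone.
    return filecmp.cmp(path_a, path_b, shallow=False)
```

The diff line could then be written as assert same_content("example", "example-from-s3").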

The following code again uploads “example” as “example-in-s3”. This time it uses the bucket “my_other_bucket” explicitly, and it sets some metadata and checks that the metadata is set correctly. Then it changes the metadata and checks that as well.

headers = {
    'x-amz-meta-state': 'unprocessed',
    }
remote_name = s3.S3Name("example-in-s3", bucket="my_other_bucket")
assert store.write("example", remote_name, headers=headers)
exists, metadata = store.exists(remote_name)
assert exists
assert metadata == headers
headers['x-amz-meta-state'] = 'processed'
assert store.update_metadata(remote_name, headers)
ok, metadata = store.read(remote_name, "example-from-s3")
assert ok
assert metadata == headers

Release history

3.0.0    May 27, 2015
2.0.0    May 18, 2015
1.0.0    May 12, 2015
0.1.3    Jun 11, 2014
0.1.2    Jun 4, 2014
0.1.1    Jun 3, 2014
0.1.0    Jun 3, 2014
0.0.2    May 28, 2014 (this version)
0.0.1    May 20, 2014

Download files

Source Distribution

s3-0.0.2.tar.gz (16.7 kB)

Uploaded May 28, 2014 Source

Hashes for s3-0.0.2.tar.gz

Algorithm	Hash digest
SHA256	`d052a3ca5f73978ccdac01767b9ac8a0eeebebc9acfcd2716f4ab38a81495c4b`
MD5	`24d9c1d3708c615327cb32d4274e9c34`
BLAKE2b-256	`afb1d8f2b020b579c12c37d6416b0922c5352a842e2b68604ff532bb2fee3dcf`