Skip to main content

A data movement app that can use different source/targets tomove data around.

Project description

Data Transfer Microservice
==========================

This microservice is a multi-mode file movement wizard. It can transfer files,
at a scheduled interval, between two different storage devices, using different
transfer protocols and storage types.

The application is built for Python 3, but also tested against Python 2.7. It
is **not** compatible with Python 2.6.

Installing and getting started
------------------------------

The application should be installed using ``pip3`` (or ``pip`` for Python 2.7).

To install from a private PyPI server we suggest using ``~/.pypirc`` to configure
your private PyPI connection details.

``pip3 install data-transfer --extra-index-url <Repo-URL>``

After installing and setting the configuration settings, the application can be
started with the following command:

``data-transfer``


Developing
----------

Start by cloning the project:

``git clone git@github.com:UKHomeOffice/data-transfer.git``

Ensure that ``python3`` is installed and on your ``path``.

Installing for local development

These steps will install the application as a local pip installed package,
using symlinks so that any updates you make to the files are automatically
picked up next time you run the application or tests.

Using venv
""""""""""

To install the app using the standard ``python3 venv`` run the following
commands from the project root folder:

``python3 -m venv ~/.virtualenvs/data-transfer``
``source ~/.virtualenvs/data-transfer/bin/activate``
``pip3 install -e . -r requirements.txt``


Using virtualenvwrapper
"""""""""""""""""""""""

Alternatively, if you are using ``virtualenvwrapper`` then run the following:

``mkvirtualenv data-transfer -p python3``
``pip3 install -e . -r requirements.txt``


Dependancies for local testing
""""""""""""""""""""""""""""""

The project's tests require the following dependencies:

- An AWS S3 bucket or a mock
- An FTP server
- An SFTP server

For local development and testing, we suggest running Docker images.

Test
""""

Once the application is installed and the dependencies are in place, run the
tests:

``pytest tests``


Building & publishing
=====================

This project uses ``setuptools`` to build the distributable package.

Remember to update the ``version`` in ``setup.py`` before building the package.

``python setup.py sdist``

This will create a ``.tar.gz`` distributable package in ``dist/``. This should be
uploaded to an appropriate PyPI registry.

Deploying
---------

The application should be installed using ``pip3`` (or ``pip`` for Python 2.7).

If installing from a private PyPI server then we suggest using ``~/.pypirc`` to
configure your private PyPI connection details.

``pip3 install data-transfer --extra-index-url <Repo-URL>``


Configuration
-------------

The application requires the following environment variables to be set before
running.

All configuration settings automatically default to suitable values for the
tests, based on the local test dependencies running in the Docker images
suggested in this guide.

Application settings
""""""""""""""""""""

These control various application behaviour:

+---------------------+----------------------+-----------+-----------------------------------+
|Environment Variable | Example | Required | Description. |
+=====================+======================+===========+===================================+
|INGEST_SOURCE_PATH | /upload/files | Yes | Source path |
+---------------------+----------------------+-----------+-----------------------------------+
|INGEST_DEST_PATH | /upload/files/done | Yes | Destination path |
+---------------------+----------------------+-----------+-----------------------------------+
|MAX_FILES_BATCH | 5 | No | Number to process each run |
+---------------------+----------------------+-----------+-----------------------------------+
|PROCESS_INTERVAL | 5 | No | Runs the task every (x) seconds. |
+---------------------+----------------------+-----------+-----------------------------------+
|READ_STORAGE_TYPE | See footnote | Yes | The type of read storage |
+---------------------+----------------------+-----------+-----------------------------------+
|WRITE_STORAGE_TYPE | See footnote | Yes | The type of write storage |
+---------------------+----------------------+-----------+-----------------------------------+

Note: the read and write storage types need to be prefixed and options are:

* datatransfer.storage.FolderStorage
* datatransfer.storage.FtpStorage
* datatransfer.storage.SftpStorage
* datatransfer.storage.S3Storage


Source / read settings
""""""""""""""""""""""

Provide the connection settings for either FTP, sFTP or S3. You only need to
configure the settings associated with the source storage type.

+----------------------------+------------------------+--------------------------+
|Environment Variable | Example | Description |
+============================+========================+==========================+
|READ_FTP_HOST | localhost | Hostname or IP of server |
+----------------------------+------------------------+--------------------------+
|READ_FTP_PASSWORD | pass | Password |
+----------------------------+------------------------+--------------------------+
|READ_FTP_PORT | 21 | Port the server uses |
+----------------------------+------------------------+--------------------------+
|READ_AWS_ACCESS_KEY_ID | accessKey1 | Access key for S3 |
+----------------------------+------------------------+--------------------------+
|READ_AWS_S3_BUCKET_NAME | aws-ingest | Bucket name |
+----------------------------+------------------------+--------------------------+
|READ_AWS_S3_HOST | http://localhost:8000 | URL of S3 |
+----------------------------+------------------------+--------------------------+


Target / write settings
"""""""""""""""""""""""

Provide the connection settings for either FTP, sFTP or S3. You only need to
configure the settings associated with the target storage type.

+----------------------------+-----------------------+-------------------------+
|Environment Variable | Example | Description |
+============================+=======================+=========================+
|WRITE_FTP_HOST | localhost | Hostname or IP of server|
+----------------------------+-----------------------+-------------------------+
|WRITE_FTP_USER | user | Username |
+----------------------------+-----------------------+-------------------------+
|WRITE_FTP_PASSWORD | pass | Password |
+----------------------------+-----------------------+-------------------------+
|WRITE_FTP_PORT | 21 | Port for server |
+----------------------------+-----------------------+-------------------------+
|WRITE_AWS_ACCESS_KEY_ID | accesskey1 | Access key for S3 |
+----------------------------+-----------------------+-------------------------+
|WRITE_AWS_SECRET_ACCESS_KEY | verysecret | Secrey key |
+----------------------------+-----------------------+-------------------------+
|WRITE_AWS_S3_BUCKET_NAME | aws-ingest | Bucket name |
+----------------------------+-----------------------+-------------------------+
|WRITE_AWS_S3_HOST | http://localhost:8000 | URL of S3 |
+----------------------------+-----------------------+-------------------------+

Running the application
-----------------------

To run the application from the command line:

``data-transfer``


For production use we recommend running the application using PM2:

``pm2 start bin/data-transfer-start --name data-transfer --interpreter python ``


Contributing
""""""""""""

This project is Open source and we welcome ocntributions to and suggestions to
improve the application. Please raise issues in the usual way on Github and for
contributing code:

- Fork the repo github
- Clone the project locally
- Commit your changes to your own branch
- Push your work back to your fork
- Submit a Pull Request so that we can review the changes


Licensing
"""""""""

This application is released under the [BSD license](LICENSE.txt).

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

data-transfer-1.0.5.tar.gz (12.2 kB view hashes)

Uploaded Source

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page