
TWItter STock market Machine Learning package

TwistML
=======

Disclaimer
----------
This package is still very much under development.

At this point most of the intended functionality is in place, but
documentation is still very spotty.

Installation
------------
You can use pip to install TwistML like so::

    $ pip install twistml

Please make sure you **have numpy, scipy and gensim installed** as
well. I have opted out of adding them to ``install_requires`` because
this caused problems in my own tests on Windows machines. (For numpy the
problem is described `here
<https://github.com/numpy/numpy/issues/2434>`_.) As a result, pip will
not install these packages automatically.
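
Since pip will not pull in these dependencies, it can help to check for
them before using TwistML. The following is a minimal sketch using only
the standard library; the helper name ``missing_packages`` is made up
for this example and is not part of TwistML.

```python
from importlib.util import find_spec

# Packages the installation notes ask you to install yourself.
REQUIRED = ("numpy", "scipy", "gensim")


def missing_packages(names=REQUIRED):
    """Return the names from `names` that cannot be found for import.

    Hypothetical helper for illustration; not part of TwistML.
    """
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("numpy, scipy and gensim are all available.")
```

Any package reported as missing can then be installed separately with
pip or your platform's package manager.
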


Known Issues & Planned Improvements
===================================

- Implement a DateRange class and replace all occurrences of fromdate,
  todate, dateformat.

- Implement find_files() without date ranges at all. It should be
  possible to simply process all files within a directory (also
  recursively).

- TwistML currently assumes raw Twitter data to be available as one
  JSON file per day. Make sure the Internet Archive's file scheme is
  supported as well.

- Add support for hourly time resolution instead of daily only.

- The evaluation subpackage can only deal with binary classification.
  Possibly explore adding multiclass support.

- The way logging is currently set up is weird and should be reworked.

- gensim's LabeledSentence is deprecated; use TaggedDocument instead.
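
The first two items above could look roughly like the sketch below.
This is only an illustration of the idea, not the planned
implementation: the method ``from_strings``, the default date format
``%Y-%m-%d``, the ``pattern`` and ``recursive`` parameters and the
inclusive end date are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date, datetime, timedelta
from pathlib import Path


@dataclass(frozen=True)
class DateRange:
    """Inclusive range of calendar days (hypothetical sketch)."""

    start: date
    end: date

    def __post_init__(self):
        if self.end < self.start:
            raise ValueError("end must not be earlier than start")

    @classmethod
    def from_strings(cls, fromdate, todate, dateformat="%Y-%m-%d"):
        """Build a range from fromdate, todate and dateformat strings."""
        return cls(
            datetime.strptime(fromdate, dateformat).date(),
            datetime.strptime(todate, dateformat).date(),
        )

    def __iter__(self):
        # Yield every day from start to end, both included.
        for offset in range((self.end - self.start).days + 1):
            yield self.start + timedelta(days=offset)

    def __contains__(self, day):
        return self.start <= day <= self.end


def find_files(directory, pattern="*.json", recursive=True):
    """Return all files under `directory` matching `pattern`, sorted by path.

    Assumed signature for illustration; no date range is involved.
    """
    root = Path(directory)
    matches = root.rglob(pattern) if recursive else root.glob(pattern)
    return sorted(path for path in matches if path.is_file())
```

With something like this, ``DateRange.from_strings("2016-01-30",
"2016-02-02")`` would iterate over four days, and ``find_files(some_dir)``
would pick up every JSON file below ``some_dir`` regardless of its date.
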

Changes
=======

Version 0.2.2
-------------

- Added sentiment features based on TextBlob sentiments

Version 0.2.1
-------------

- Added functionality for complex category subsets to
  tml-generate-features

- Also improved documentation for tml-generate-features (on the command
  line as well as in the docstring)

- Improved test coverage

Version 0.2.0
-------------

- Changed Development Status to Alpha

- Removed Sentence2Vec as that functionality is included in current
  gensim versions' Doc2Vec class

- Added Changelog
