theano-lstm

Nano size theano lstm module

These details have not been verified by PyPI

Project links

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

Small Theano LSTM recurrent network module
------------------------------------------

@author: Jonathan Raiman
@date: December 10th 2014

Implements most of the great things that came out
in 2014 concerning recurrent neural networks, and
some good optimizers for these types of networks.

**Note**: Dropout causes gradient issues with theano
if placed in scan, so it should be set to 0 for now,
and will be fixed in the future.

### Key Features

This module contains several Layer types that are useful
for prediction and modeling from sequences:

* A non-recurrent **Layer**, with a connection matrix W, and bias b
* A recurrent **RNN Layer** that takes as input its previous hidden activation and has an initial hidden activation
* A recurrent **LSTM Layer** that takes as input its previous hidden activation and memory cell values, and has initial values for both of those
* An **Embedding** layer that contains an embedding matrix and takes integers as input and returns slices from its embedding matrix (e.g. word vectors)

This module also contains the **SGD**, **AdaGrad**, and **AdaDelta** gradient descent methods that are constructed using an objective function and a set of theano variables, and returns an `updates` dictionary to pass to a theano function (see below).

### Usage

Here is an example of usage with stacked LSTM units, using
Adadelta to optimize, and using a scan op.

# bug for now forces us to use 0.0 with scan,
dropout = 0.0

model = StackedCells(4, layers=[20, 20], activation=T.tanh, celltype=LSTM)
model.layers[0].in_gate2.activation = lambda x: x
model.layers.append(Layer(20, 2, lambda x: T.nnet.softmax(x)[0]))

# in this example dynamics is a random function that takes our
# output along with the current state and produces an observation
# for t + 1

def step(x, *prev_hiddens):
new_states = stacked_rnn.forward(x, prev_hiddens, dropout)
return [dynamics(x, new_states[-1])] + new_states[:-1]

initial_obs = T.vector()
timesteps = T.iscalar()

result, updates = theano.scan(step,
n_steps=timesteps,
outputs_info=[dict(initial=initial_obs, taps=[-1])] + [dict(initial=layer.initial_hidden_state, taps=[-1]) for layer in model.layers if hasattr(layer, 'initial_hidden_state')])

target = T.vector()

cost = (result[0][:,[0,2]] - target[[0,2]]).norm(L=2) / timesteps

updates, gsums, xsums, lr, max_norm = \
create_optimization_updates(cost, model.params, method='adadelta')

update_fun = theano.function([initial_obs, target, timesteps], cost, updates = updates, allow_input_downcast=True)
predict_fun = theano.function([initial_obs, timesteps], result[0], allow_input_downcast=True)

for example, label in training_set:
c = update_fun(example, label, 10)

Project details

These details have not been verified by PyPI

Project links

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Release history Release notifications | RSS feed

0.0.15

Jun 17, 2015

0.0.14

Apr 6, 2015

0.0.13

Jan 27, 2015

0.0.12

Jan 10, 2015

0.0.11

Jan 8, 2015

0.0.10

Jan 5, 2015

0.0.9

Jan 5, 2015

0.0.8

Jan 5, 2015

0.0.7

Dec 30, 2014

0.0.6

Dec 14, 2014

0.0.5

Dec 13, 2014

0.0.4

Dec 12, 2014

This version

0.0.3

Dec 12, 2014

0.0.2

Dec 11, 2014

0.0.1

Dec 10, 2014

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

theano-lstm-0.0.3.tar.gz (6.0 kB view hashes)

Uploaded Dec 12, 2014 Source

Hashes for theano-lstm-0.0.3.tar.gz

Hashes for theano-lstm-0.0.3.tar.gz
Algorithm	Hash digest
SHA256	`bc2f72b3d65032f75c0ed2c8611a8e0fd2fceed2a4594d543ece8872536b9f15`
MD5	`4918fc1b8a5471bbca0e07aae07d8a6c`
BLAKE2b-256	`d14616ea14a58fb8577b9bf9b0f6f5ef0b59e697cdda747a6db6bfff9bfa73be`