<?xml version="1.0" encoding="UTF-8" ?>
<rdf:RDF xmlns="http://usefulinc.com/ns/doap#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"><Project><name>reflex</name>
<shortdesc>A lightweight regex-based lexical scanner library.</shortdesc>
<description>Reflex: A lightweight lexical scanner library.

Reflex supports regular expressions, rule actions, multiple scanner states,
tracking of line/column numbers, and customizable token classes.

Reflex is not a "scanner generator" in the sense of generating source code.
Instead, it generates a scanner object dynamically based on the set of
input rules specified. The rules themselves are ordinary Python regular
expressions, combined with rule actions which are simply Python functions.
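
One plausible way to build such a scanner object at run time is sketched
here. It is an illustration only and is not taken from the reflex source.
Each rule pattern is wrapped in a capturing group, the groups are joined into
a single alternation, and the match's lastindex selects the rule action. The
names build_scanner and scan are hypothetical, and the sketch assumes that
rule patterns contain no capturing groups of their own.

```python
import re

def build_scanner(rules):
    """Combine (pattern, action) pairs into one regex, one group per rule."""
    combined = re.compile("|".join("(%s)" % pattern for pattern, _ in rules))
    actions = [action for _, action in rules]

    def scan(line):
        pos = 0
        while pos != len(line):
            match = combined.match(line, pos)
            if match is None or match.end() == pos:
                raise ValueError("no rule matches at column %d" % pos)
            # lastindex is the number of the group (and so the rule) that matched.
            action = actions[match.lastindex - 1]
            if action is not None:
                yield action(match.group())
            pos = match.end()

    return scan

# Hypothetical rule set mirroring the whitespace, identifier and number rules.
rules = [
    (r"\s+", None),  # no action, so whitespace is skipped
    (r"[a-zA-Z_]\w*", lambda text: ("IDENT", text)),
    (r"0x[\da-fA-F]+|\d+", lambda text: ("NUMBER", text)),
]
tokens = list(build_scanner(rules)("x1 0x1F 42"))
```

Because every rule occupies one capturing group, a design like this inherits
whatever limit the regular expression engine places on the number of groups.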

Example use:

    # Create a scanner. The "start" parameter specifies the name of the
    # starting state. Note: The state argument can be any hashable Python
    # type.
    scanner = reflex.scanner( "start" )
    
    # Add some rules.
    # The whitespace rule has no actions, so whitespace will be skipped
    scanner.rule( r"\s+" )
    
    # Rules for identifiers and numbers.
    TOKEN_IDENT = 1
    TOKEN_NUMBER = 2
    scanner.rule( r"[a-zA-Z_]\w*", token=TOKEN_IDENT )
    scanner.rule( r"0x[\da-fA-F]+|\d+", token=TOKEN_NUMBER )
    
    # The "string" rule kicks us into the string state
    TOKEN_STRING = 3
    scanner.rule( "\"", tostate="string" )

    # Define the string state. "string_escape" and "string_text" are
    # action functions which handle the parsed escape sequences and
    # characters and append them to a buffer. Once a quotation mark
    # is encountered, we set the token type to be TOKEN_STRING
    # and return to the start state.
    scanner.state( "string" )
    scanner.rule( "\"", tostate="start", token=TOKEN_STRING )
    scanner.rule( "\\\\.", string_escape )
    scanner.rule( "[^\"\\\\]+", string_text )
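
The state switching described above can be pictured with a small stand-alone
sketch. It does not use reflex: the STATES table, the scan function and the
(buffer, text) action signature are assumptions made for illustration, and
only the rule patterns and state names follow the example.

```python
import re

TOKEN_IDENT, TOKEN_STRING = 1, 3

def string_text(buffer, text):
    buffer.append(text)

def string_escape(buffer, text):
    buffer.append(text[1])  # keep the escaped character, drop the backslash

# state name -> list of (pattern, token id, next state, action)
STATES = {
    "start": [
        (r"\s+", None, None, None),
        (r"[a-zA-Z_]\w*", TOKEN_IDENT, None, None),
        (r'"', None, "string", None),
    ],
    "string": [
        (r'"', TOKEN_STRING, "start", None),
        (r"\\.", None, None, string_escape),
        (r'[^"\\]+', None, None, string_text),
    ],
}

def scan(line):
    """Yield (token id, text) pairs, switching rule sets as the state changes."""
    state, pos, buffer = "start", 0, []
    while pos != len(line):
        for pattern, token, tostate, action in STATES[state]:
            match = re.compile(pattern).match(line, pos)
            if match is not None:
                break
        else:
            raise ValueError("no rule matches at column %d" % pos)
        if action is not None:
            action(buffer, match.group())
        if token == TOKEN_STRING:
            # The closing quote emits everything the actions collected.
            yield token, "".join(buffer)
            del buffer[:]
        elif token is not None:
            yield token, match.group()
        if tostate is not None:
            state = tostate
        pos = match.end()

tokens = list(scan('say "hi \\"there\\"" now'))
```

In this sketch an unterminated string simply ends the scan without emitting a
token; a real scanner would likely report that as an error.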

Invoking the scanner: The scanner can be called as a function which
takes a reference to a stream (such as a file object) which iterates
over input lines. The "context" argument is for application use; it is
made available to action functions as the token stream's context attribute.
The result is an iterator which produces a series of tokens.
The same scanner can be used to parse multiple input files, by
creating a new stream for each file.

    # Create a token iterator over the input stream.
    token_iter = scanner( istream, context )

Getting the tokens: Here is a simple example of looping through the
input tokens. A real-world use would most likely branch on the type
of the current token (token.id).

    # token.id is the token type (the same as the token= argument in the rule)
    # token.value is the actual characters that make up the token.
    # token.line is the line number on which the token was encountered.
    # token.pos is the column number of the first character of the token.
    for token in token_iter:
        print token.id, token.value, token.line, token.pos
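
Branching on the token type might look like the following sketch. The Token
namedtuple is a stand-in so that the example runs without reflex; only the
attribute names (id, value, line, pos) come from the comments above, and the
summarize function is hypothetical.

```python
from collections import namedtuple

# Stand-in for the tokens produced by the scanner.
Token = namedtuple("Token", "id value line pos")

TOKEN_IDENT, TOKEN_NUMBER, TOKEN_STRING = 1, 2, 3

def summarize(tokens):
    """Collect identifier names and the integer values of number tokens."""
    idents, numbers = [], []
    for token in tokens:
        if token.id == TOKEN_IDENT:
            idents.append(token.value)
        elif token.id == TOKEN_NUMBER:
            # The number rule accepts both decimal and 0x-prefixed hex text.
            base = 16 if token.value.lower().startswith("0x") else 10
            numbers.append(int(token.value, base))
    return idents, numbers

sample = [
    Token(TOKEN_IDENT, "x", 1, 0),
    Token(TOKEN_NUMBER, "0x1F", 1, 2),
    Token(TOKEN_NUMBER, "42", 1, 7),
    Token(TOKEN_STRING, "hi", 2, 0),
]
idents, numbers = summarize(sample)
```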
     
Action functions are Python functions which take a single argument, which
is the token stream instance.

    # Action function to handle string text.
    # Appends the value of the current token to the string data.
    string_data = ""
    def string_text( token_stream ):
        global string_data
        string_data += token_stream.token.value
        
The token_stream object has a number of interesting and usable attributes:

    states:  dictionary of scanner states
    state:   the current state
    stream:  the input line stream
    context: the context pointer that was passed to the scanner
    token:   the current token
    line:    the line number of the current parse position
    pos:     the column number of the current parse position
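
As an illustration of these attributes, an action function could keep its
buffer on the context object instead of in a module-level variable. This is a
sketch under assumptions: the string_data attribute on the context is
hypothetical, and SimpleNamespace objects stand in for the real token stream
so that the example runs without reflex.

```python
from types import SimpleNamespace

def string_text(token_stream):
    """Append the current token's text to a buffer held on the context."""
    token_stream.context.string_data += token_stream.token.value

# Stand-in objects exposing only the attributes the action touches.
context = SimpleNamespace(string_data="")
token_stream = SimpleNamespace(context=context, token=SimpleNamespace(value="hello"))

string_text(token_stream)
token_stream.token = SimpleNamespace(value=" world")
string_text(token_stream)
```

Keeping per-scan data on the context lets the same scanner serve several
input files without the action functions sharing state between them.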
    
Note: reflex currently has a limit of 99 rules for each state. (That is
the maximum number of capturing groups allowed in a Python regular expression.)</description>
<homepage rdf:resource="http://viridia.org/python-projects/" />
<maintainer><foaf:Person><foaf:name>Talin</foaf:name>
<foaf:mbox_sha1sum>5f8fc4a606b7264dd2a1a41c5adffba846c790d2</foaf:mbox_sha1sum></foaf:Person></maintainer>
<release><Version><revision>0.1</revision></Version></release>
</Project></rdf:RDF>