MicroLlama
The smallest possible LLM API. Build a question-and-answer interface for your own content in a few minutes. Uses OpenAI embeddings, GPT-3.5 and Faiss, via Langchain.
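Under the hood this is a standard retrieval-augmented setup: your documents are embedded with OpenAI, stored in a Faiss index, and the best matches for each question are handed to GPT-3.5 to answer from. As a rough illustration only (this is not microllama's actual code, and the faiss_index path is an assumption), the query path looks something like this in the classic Langchain API:

# Illustrative sketch of the retrieval flow, not microllama's internals.
# Load a saved Faiss index, retrieve the chunks most similar to the
# question, and have GPT-3.5 answer from them.
from langchain.chat_models import ChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains import RetrievalQA

index = FAISS.load_local("faiss_index", OpenAIEmbeddings())
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model_name="gpt-3.5-turbo"),
    retriever=index.as_retriever(),
)
print(qa.run("What is MicroLlama?"))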
Usage
- Combine your source documents into a single JSON file called source.json. It should look like this:
[
  {
    "source": "Reference to the source of your content. Typically a title.",
    "url": "URL for your source. This key is optional.",
    "content": "Your content as a single string. If there's a title or summary, put these first, separated by new lines."
  },
  ...
]
See example.source.json for an example; a small script for generating this file is sketched below, after the usage steps.
- Install dependencies:
pip install langchain faiss-cpu openai fastapi "uvicorn[standard]"
- Get an OpenAI API key and add it to the environment, e.g. export OPENAI_API_KEY=sk-etc. Note that indexing and querying require OpenAI credits, which aren't free.
- Run your server with uvicorn serve:app. If the search index doesn't exist, it'll be created and stored.
- Query your documents at /api/ask?your question or use the simple front-end at /
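Once the server is running, any HTTP client will do. A stdlib-only example, assuming the question is simply URL-encoded as the raw query string, as in the endpoint above:

# Query a locally running microllama server. Assumes uvicorn's default
# port 8000 and that the question is passed as the raw query string.
from urllib.parse import quote
from urllib.request import urlopen

question = "What is this project about?"
with urlopen("http://127.0.0.1:8000/api/ask?" + quote(question)) as resp:
    print(resp.read().decode("utf-8"))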
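And if your content starts out as a folder of plain-text files, a small script can produce source.json in the format shown in the first step. This is a hypothetical helper, not part of microllama; the docs/ folder and the use of filenames as titles are assumptions:

# build_source.py -- hypothetical helper, not part of microllama.
# Turns a folder of .txt files into the source.json format above,
# using each filename as the "source" title.
import json
from pathlib import Path

docs = []
for path in sorted(Path("docs").glob("*.txt")):
    docs.append({
        "source": path.stem,  # a title; the optional "url" key is omitted here
        "content": path.read_text(encoding="utf-8"),
    })

Path("source.json").write_text(json.dumps(docs, indent=2), encoding="utf-8")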
Deploying your API
On Fly.io
Sign up for a Fly.io account and install flyctl. Then:
fly launch # answer no to Postgres, Redis and deploying now
fly secrets set OPENAI_API_KEY=sk-etc
fly deploy
On Google Cloud Run
gcloud run deploy --source . --set-env-vars="OPENAI_API_KEY=sk-etc"
For Cloud Run and other serverless platforms you should probably generate the FAISS index at container build time, to reduce cold starts. See the two commented lines in Dockerfile.
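One way to do that is a small build step that creates and saves the index before the server ever starts. A sketch using the classic Langchain API (build_index.py and the faiss_index path are assumptions, not microllama's actual internals):

# build_index.py -- sketch of pre-building the Faiss index at image build time.
# Reads source.json (format above), embeds each document with OpenAI,
# and saves an index the server can load at startup.
import json
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

with open("source.json") as f:
    docs = json.load(f)

index = FAISS.from_texts(
    texts=[d["content"] for d in docs],
    embedding=OpenAIEmbeddings(),
    metadatas=[{"source": d["source"], "url": d.get("url")} for d in docs],
)
index.save_local("faiss_index")

Note that embedding at build time means OPENAI_API_KEY must be available to the build as well, not just at runtime.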
Based on
- Langchain
- Simon Willison's blog post, datasette-openai and datasette-faiss.
- FastAPI
- GPT Index
- Dagster blog post
TODO
- Use a splitter that generates more meaningful fragments, e.g.:

from langchain.text_splitter import SpacyTextSplitter

text_splitter = SpacyTextSplitter(chunk_size=700, chunk_overlap=200, separator=" ")
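Note that SpacyTextSplitter needs spacy itself plus a language pipeline (roughly pip install spacy and python -m spacy download en_core_web_sm), which the install command above doesn't pull in.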