llm-elasticsearch-cache

A caching layer for LLMs that uses Elasticsearch as its store and is fully compatible with LangChain caching.

Install

pip install llm-elasticsearch-cache

Usage

The LangChain cache is used in the same way as the other LangChain cache integrations.

Basic example

from langchain.globals import set_llm_cache
from llmescache.langchain import ElasticsearchCache
from elasticsearch import Elasticsearch

# Connect to a local Elasticsearch node
es_client = Elasticsearch(hosts="http://localhost:9200")
# Register the Elasticsearch-backed cache as the global LangChain LLM cache
set_llm_cache(
    ElasticsearchCache(
        es_client=es_client,
        es_index="llm-langchain-cache",
        metadata={"project": "my_chatgpt_project"},
    )
)

The es_index parameter also accepts an alias. This makes it possible to use index lifecycle management (ILM), which we suggest considering for managing retention and controlling cache growth.
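
As an illustration of the alias-plus-ILM setup, here is a minimal sketch of how a rollover alias could be prepared before it is passed as es_index. The helper names, the policy name, the template name, and the retention values are all hypothetical. The keyword arguments follow the 8.x Elasticsearch Python client (older clients expect a body= argument instead), and the sketch assumes that no concrete index named like the alias exists yet.

```python
from typing import Any, Dict

ALIAS = "llm-langchain-cache"  # the same value later passed as es_index


def build_cache_ilm_policy(
    rollover_after: str = "7d", delete_after: str = "30d"
) -> Dict[str, Any]:
    """Build an ILM policy body: roll over the write index, then delete old indices.

    The retention values are placeholders, not recommendations from the project.
    """
    return {
        "phases": {
            "hot": {"actions": {"rollover": {"max_age": rollover_after}}},
            "delete": {"min_age": delete_after, "actions": {"delete": {}}},
        }
    }


def setup_cache_alias(
    es_client: Any, alias: str = ALIAS, policy_name: str = "llm-cache-policy"
) -> None:
    """Create the policy, an index template that applies it, and the first index."""
    # 1. Register the lifecycle policy (hypothetical name).
    es_client.ilm.put_lifecycle(name=policy_name, policy=build_cache_ilm_policy())
    # 2. Apply the policy to every index that matches "<alias>-*".
    es_client.indices.put_index_template(
        name=f"{alias}-template",
        index_patterns=[f"{alias}-*"],
        template={
            "settings": {
                "index.lifecycle.name": policy_name,
                "index.lifecycle.rollover_alias": alias,
            }
        },
    )
    # 3. Bootstrap the first index and mark it as the alias's write index.
    es_client.indices.create(
        index=f"{alias}-000001",
        aliases={alias: {"is_write_index": True}},
    )
```

After calling setup_cache_alias(es_client) once, the alias name can be used as es_index in the basic example above. Whether rollover suits a given workload depends on how long cached answers should stay valid.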

See the class docstring for the full list of parameters.

Index the generated text

The cached data is not searchable by default. You can customize how the Elasticsearch document is built in order to add indexed text fields that hold, for example, the text generated by the LLM.

This can be done by subclassing ElasticsearchCache and overriding the build_document method:

from llmescache.langchain import ElasticsearchCache
from elasticsearch import Elasticsearch
from langchain_core.caches import RETURN_VAL_TYPE
from typing import Any, Dict, List
from langchain.globals import set_llm_cache
import json


class SearchableElasticsearchCache(ElasticsearchCache):
    """Cache that also stores the generated text in an indexed field."""

    def build_document(
            self, prompt: str, llm_string: str, return_val: RETURN_VAL_TYPE
    ) -> Dict[str, Any]:
        # Start from the default document and add a searchable copy of the output
        body = super().build_document(prompt, llm_string, return_val)
        body["parsed_llm_output"] = self._parse_output(body["llm_output"])
        return body

    @staticmethod
    def _parse_output(data: List[str]) -> List[str]:
        # Each item is a JSON-serialized generation; extract the message content
        return [json.loads(output)["kwargs"]["message"]["kwargs"]["content"] for output in data]


# Reuse an existing cache index and map the new field as analyzed text
es_client = Elasticsearch(hosts="http://localhost:9200")
es_client.indices.put_mapping(
    index="llm-langchain-cache", 
    body={"properties": {"parsed_llm_output": {"type": "text", "analyzer": "english"}}}
)
set_llm_cache(SearchableElasticsearchCache(es_client=es_client, es_index="llm-langchain-cache"))
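
Once parsed_llm_output is mapped as a text field, the cached generations can be queried with a full-text search. The following is a minimal sketch, not part of the library: the helper names are hypothetical, and the query= and size= keyword arguments assume a recent Elasticsearch Python client.

```python
from typing import Any, Dict, List


def build_output_query(text: str, size: int = 5) -> Dict[str, Any]:
    """Build search arguments for a full-text match on the indexed LLM output."""
    return {
        "query": {"match": {"parsed_llm_output": text}},
        "size": size,
    }


def search_cached_outputs(
    es_client: Any, index: str, text: str, size: int = 5
) -> List[List[str]]:
    """Return the parsed LLM outputs of the cached documents that match `text`."""
    response = es_client.search(index=index, **build_output_query(text, size))
    # Each hit holds one cached document; the field is a list of generated texts.
    return [hit["_source"]["parsed_llm_output"] for hit in response["hits"]["hits"]]
```

For example, search_cached_outputs(es_client, "llm-langchain-cache", "elasticsearch") would return the generated texts of matching cache entries. Only documents written after the field was added contain parsed_llm_output, so older entries will not match.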
