Caching
LangChain provides an optional caching layer for chat models. This is useful for two reasons:
It can save you money by reducing the number of API calls you make to the LLM provider, if you often request the same completion multiple times. It can also speed up your application by reducing the number of API calls you make to the LLM provider.
import langchain
from langchain.chat_models import ChatOpenAI
llm = ChatOpenAI()
In Memory Cache
from langchain.cache import InMemoryCache
langchain.llm_cache = InMemoryCache()
%%time
# The first time, the prompt is not yet in the cache, so the call should take longer
llm.predict("Tell me a joke")
CPU times: user 35.9 ms, sys: 28.6 ms, total: 64.6 ms
Wall time: 4.83 s
"\n\nWhy couldn't the bicycle stand up by itself? It was...two tired!"
%%time
# The second time the prompt is in the cache, so the call returns faster
llm.predict("Tell me a joke")
CPU times: user 238 µs, sys: 143 µs, total: 381 µs
Wall time: 1.76 ms
'\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'
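The cache is keyed on the exact prompt text together with the model's parameters, so a different prompt will not reuse the cached joke. A minimal sketch of that behavior, reusing the same `llm` and `InMemoryCache` configured above:

# A new prompt is a cache miss: the API is called and the result is stored
llm.predict("Tell me a different joke")
# Repeating that exact prompt is now a cache hit and returns almost instantly
llm.predict("Tell me a different joke")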
SQLite Cache
# Remove any existing cache database so the example starts from an empty cache
!rm .langchain.db
# We can do the same thing with a SQLite cache
from langchain.cache import SQLiteCache
langchain.llm_cache = SQLiteCache(database_path=".langchain.db")
%%time
# The first time, the prompt is not yet in the cache, so the call should take longer
llm.predict("Tell me a joke")
CPU times: user 17 ms, sys: 9.76 ms, total: 26.7 ms
Wall time: 825 ms
'\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'
%%time
# The second time the prompt is in the cache, so the call returns faster
llm.predict("Tell me a joke")
CPU times: user 2.46 ms, sys: 1.23 ms, total: 3.7 ms
Wall time: 2.67 ms
'\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'
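Because the SQLite cache lives in the .langchain.db file rather than in process memory, cached completions survive a restart of the Python interpreter. To switch caching off again, the global cache can be cleared; a minimal sketch, assuming the same module-level `langchain.llm_cache` attribute used above:

# Disable caching: subsequent calls go straight to the LLM provider's API
langchain.llm_cache = None
llm.predict("Tell me a joke")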