# chat-search

Chat with documents, search via natural language.

Supports hybrid language models to add chat capabilities to websites.

RAG built with LangChain, Redis, and various model providers (OpenAI, Ollama, vLLM, Hugging Face).

Demo: Chat about my blog
## Setup

```shell
cp .env.example .env
```

Populate the `.env` file with the required environment variables.
| Name | Value | Default |
|---|---|---|
| AUTH_TOKEN | auth token used for ingest | |
| CHAT_PROVIDER | model provider, `openai` or `ollama` | openai |
| DEBUG | enable debug mode, 1 or 0 | 0 |
| DIGEST_PREFIX | prefix for digest in Redis | digest |
| DOCUMENT_CONTENT_DESCRIPTION | document content description | Document content |
| EMBEDDING_DIM | embedding dimensions | 1536 |
| EMBEDDING_PROVIDER | embedding provider, `openai`, `ollama`, or `huggingface` | openai |
| ENABLE_FEEDBACK_ENDPOINT | enable feedback endpoint, 1 or 0 | 1 |
| ENABLE_PUBLIC_TRACE_LINK_ENDPOINT | enable public trace link endpoint, 1 or 0 | 1 |
| FULLTEXT_RETRIEVER_SEARCH_K | number of full-text retriever search results | 4 |
| FULLTEXT_RETRIEVER_SELF_QUERY | whether to enable full-text retriever self-query, 1 or 0 | 1 |
| FULLTEXT_RETRIEVER_WEIGHT | full-text retriever weight | 0.5 |
| HEADERS_TO_SPLIT_ON | HTML headers used to split text | h1,h2,h3 |
| HF_HUB_EMBEDDING_MODEL | Hugging Face Hub embedding model or Text Embeddings Inference URL | http://localhost:8080 |
| INDEX_NAME | index name | document |
| INDEX_SCHEMA_PATH | index schema path | (will use `app/schema.yaml`) |
| LANGCHAIN_API_KEY | LangChain API key for LangSmith | |
| LANGCHAIN_ENDPOINT | LangChain endpoint for LangSmith | https://api.smith.langchain.com |
| LANGCHAIN_PROJECT | LangChain project for LangSmith | default |
| LANGCHAIN_TRACING_V2 | enable LangChain tracing v2 | true |
| LLM_TEMPERATURE | temperature for the LLM | 0 |
| MERGE_SYSTEM_PROMPT | merge system prompt with user input, for models that do not support the system role, 1 or 0 | 0 |
| OLLAMA_CHAT_MODEL | Ollama chat model | llama3 |
| OLLAMA_EMBEDDING_MODEL | Ollama embedding model | nomic-embed-text |
| OLLAMA_URL | Ollama URL | http://localhost:11434 |
| OPENAI_API_BASE | OpenAI-compatible API base URL | https://api.openai.com/v1 |
| OPENAI_API_KEY | OpenAI API key | EMPTY |
| OPENAI_CHAT_MODEL | OpenAI chat model | gpt-4o-mini |
| OPENAI_EMBEDDING_MODEL | OpenAI embedding model | text-embedding-3-small |
| OTEL_SDK_DISABLED | disable OpenTelemetry, true or false | false |
| OTEL_SERVICE_NAME | OpenTelemetry service name, also used as the Pyroscope application name | chat-search |
| PYROSCOPE_BASIC_AUTH_PASSWORD | Pyroscope basic auth password | |
| PYROSCOPE_BASIC_AUTH_USERNAME | Pyroscope basic auth username | |
| PYROSCOPE_ENABLED | enable Pyroscope, 1 or 0 | 1 |
| PYROSCOPE_SERVER_ADDRESS | Pyroscope server address | http://localhost:4040 |
| REDIS_URL | Redis URL | redis://localhost:6379/ |
| REPHRASE_PROMPT | prompt for rephrasing | check config.py |
| RETRIEVAL_QA_CHAT_SYSTEM_PROMPT | prompt for retrieval | check config.py |
| RETRIEVER_SEARCH_K | number of retriever search results | 4 |
| RETRIEVER_SELF_QUERY_EXAMPLES | retriever self-query examples as JSON | check config.py |
| TEXT_SPLIT_CHUNK_OVERLAP | chunk overlap for text splitting | 200 |
| TEXT_SPLIT_CHUNK_SIZE | chunk size for text splitting | 4000 |
| VECTORSTORE_RETRIEVER_SEARCH_KWARGS | search kwargs for the Redis vectorstore retriever as JSON | check config.py |
| VECTORSTORE_RETRIEVER_SEARCH_TYPE | search type for the Redis vectorstore retriever | mmr |
| VECTORSTORE_RETRIEVER_SELF_QUERY | whether to enable vectorstore retriever self-query, 1 or 0 | 1 |
| VECTORSTORE_RETRIEVER_WEIGHT | vectorstore retriever weight | 0.5 |
| VERBOSE | enable verbose output, 1 or 0 | 0 |
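As a quick illustration, the numeric and flag-style variables in the table above can be read with plain `os.getenv` and the same defaults. This is only a sketch of the pattern, not the project's actual `config.py`:

```python
import os


def env_flag(name: str, default: str = "0") -> bool:
    # "1"/"0" style flags, as used by DEBUG, VERBOSE, PYROSCOPE_ENABLED, ...
    return os.getenv(name, default) == "1"


def load_settings() -> dict:
    # Defaults mirror the table above; the real settings live in config.py.
    return {
        "chat_provider": os.getenv("CHAT_PROVIDER", "openai"),
        "embedding_dim": int(os.getenv("EMBEDDING_DIM", "1536")),
        "fulltext_weight": float(os.getenv("FULLTEXT_RETRIEVER_WEIGHT", "0.5")),
        "vectorstore_weight": float(os.getenv("VECTORSTORE_RETRIEVER_WEIGHT", "0.5")),
        "debug": env_flag("DEBUG"),
        "redis_url": os.getenv("REDIS_URL", "redis://localhost:6379/"),
    }
```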
## Development

Follow Ollama instructions to install it, then start it and pull the models:

```shell
ollama serve
ollama pull llama3
ollama pull nomic-embed-text
```

Install dependencies:

```shell
pip install poetry==1.7.1
poetry shell
poetry install
```

Start Redis:

```shell
docker compose -f compose.redis.yaml up
```

Start the app:

```shell
langchain serve
```

Visit http://localhost:8000/
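Once the server is running, the chat route can also be exercised over HTTP. The route path and input schema below are assumptions based on a typical LangServe `/chat` chain; check the playground at http://localhost:8000/ for the actual schema:

```python
import json
import urllib.request


def build_chat_request(question: str, base_url: str = "http://localhost:8000"):
    # Assumed LangServe invoke route and input shape; verify in the playground.
    payload = {"input": {"question": question, "chat_history": []}}
    return urllib.request.Request(
        f"{base_url}/chat/invoke",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )


req = build_chat_request("What is this blog about?")
```

With the server up, `urllib.request.urlopen(req)` would then return the chain's JSON response.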
## Docker

There is a `compose.yml` file for running the app and all dependencies in containers, suitable for local end-to-end testing.

```shell
docker compose up --build
```

Visit http://localhost:8000/
## Kubernetes

There is a helm chart for deploying the app to Kubernetes.

```shell
cp values.example.yaml values.yaml
```

Then update `values.yaml` accordingly.

Add helm repos:

```shell
helm repo add chat-search https://hemslo.github.io/chat-search/
helm repo add redis-stack https://redis-stack.github.io/helm-redis-stack/
helm repo add ollama-helm https://otwld.github.io/ollama-helm/
```

Install/upgrade chat-search:

```shell
helm upgrade -i --wait my-chat-search chat-search/chat-search -f values.yaml
```

For local development against Kubernetes, skaffold can be used instead:

```shell
skaffold run --port-forward
```
## Ingest

Crawl a site and ingest its pages:

```shell
crawl --sitemap-url $SITEMAP_URL --auth-token $AUTH_TOKEN
```

Check `crawl.yml` for web crawling. For an example of auto ingest after a GitHub Pages deploy, see `jekyll.yml`.
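The `DIGEST_PREFIX` setting suggests the ingest path stores a per-document digest in Redis so unchanged pages can be skipped on re-crawl. A minimal sketch of that idea follows; the key layout and hash choice are assumptions, not the project's exact scheme, and a plain dict stands in for Redis:

```python
import hashlib


def digest_key(url: str, prefix: str = "digest") -> str:
    # Assumed key layout: <prefix>:<url>
    return f"{prefix}:{url}"


def content_digest(content: str) -> str:
    return hashlib.sha256(content.encode()).hexdigest()


def should_ingest(url: str, content: str, store: dict) -> bool:
    # `store` stands in for Redis GET/SET in this sketch.
    key = digest_key(url)
    new = content_digest(content)
    if store.get(key) == new:
        return False  # unchanged since last crawl, skip re-embedding
    store[key] = new
    return True
```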
## Architecture

Ingest flow:

```mermaid
flowchart LR
    A(Crawl) --> |doc| B(/ingest)
    B --> |metadata| C(Redis)
    B --> |doc| D(Text Splitter)
    D --> |docs| E(Embedding Model)
    E --> |docs with embeddings| F(Redis)
```
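The Text Splitter step is governed by `TEXT_SPLIT_CHUNK_SIZE` and `TEXT_SPLIT_CHUNK_OVERLAP`. A simplified character-window version of that behavior is shown below; the project presumably uses LangChain's splitters (plus `HEADERS_TO_SPLIT_ON` for HTML), so this is only an illustration of the two parameters:

```python
def split_text(text: str, chunk_size: int = 4000, chunk_overlap: int = 200):
    # Fixed-size windows that overlap by `chunk_overlap` characters.
    if chunk_size <= chunk_overlap:
        raise ValueError("chunk_size must exceed chunk_overlap")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

With the defaults, consecutive chunks share 200 characters of context so sentences cut at a boundary still appear whole in one chunk.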
Chat flow:

```mermaid
flowchart LR
    A((Request)) --> |messages| B(/chat)
    B --> |messages| C(LLM)
    C --> |question| D(Embedding Model)
    D --> |embeddings| E(Redis)
    E --> |relevant docs| F(LLM)
    B --> |messages| F
    F --> |answer| G((Response))
```
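The 0.5/0.5 retriever weights combine the full-text and vector result lists into one ranking before the relevant docs reach the LLM. A common mechanism for this (used by LangChain's `EnsembleRetriever`; sketched here over plain lists as an assumption about how the two retrievers are merged) is weighted reciprocal rank fusion:

```python
def weighted_rrf(result_lists, weights, c: int = 60):
    # Each result list is ordered best-first; a doc scores weight / (c + rank)
    # per list it appears in, so docs ranked well by both retrievers win.
    scores = {}
    for docs, weight in zip(result_lists, weights):
        for rank, doc in enumerate(docs, start=1):
            scores[doc] = scores.get(doc, 0.0) + weight / (c + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A doc returned by both retrievers ("b" below) outranks a doc that only one retriever found, even one with a better single-list rank.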
## Deployment

Check `cicd.yml` for Google Cloud Run deployment (`deploy-to-cloud-run`).