Backend server that connects LLMs (in my case Mistral-Instruct-7B and Falcon-40B) hosted on the AWS SageMaker cloud platform to the Continue.dev VSCode extension. It mocks an Ollama service connection for the Continue.dev GUI and streams the LLM response over SSE (Server-Sent Events).
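For illustration, a minimal sketch of the idea (not the repo's actual server.py): a FastAPI app that exposes Ollama's /api/generate route and relays a SageMaker response stream to the client. The endpoint name and payload shapes are assumptions; the real ones depend on the deployed model container.

import json
import boto3
from fastapi import FastAPI, Request
from fastapi.responses import StreamingResponse

app = FastAPI()
runtime = boto3.client("sagemaker-runtime")

def relay(prompt: str):
    # Hypothetical SageMaker invocation; the payload format depends on the container.
    stream = runtime.invoke_endpoint_with_response_stream(
        EndpointName="mistral-7b-instruct",  # assumption, not the repo's endpoint name
        ContentType="application/json",
        Body=json.dumps({"inputs": prompt}),
    )
    for event in stream["Body"]:
        token = event["PayloadPart"]["Bytes"].decode()
        # Wrap each chunk in the per-line JSON shape Ollama emits while streaming.
        yield json.dumps({"model": "mistral", "response": token, "done": False}) + "\n"
    yield json.dumps({"model": "mistral", "response": "", "done": True}) + "\n"

@app.post("/api/generate")
async def generate(request: Request):
    body = await request.json()
    return StreamingResponse(relay(body["prompt"]), media_type="application/x-ndjson")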
The setup scales to personalised RAG infrastructure by leveraging LangChain, making it a foundation for on-prem enterprise Continue.dev services.
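As a sketch of that direction (illustrative only, not code from this repo): the same SageMaker endpoint can back a LangChain retrieval chain. The endpoint name, region, index path, and response shape in the content handler are placeholder assumptions.

import json
from langchain.chains import RetrievalQA
from langchain_community.llms import SagemakerEndpoint
from langchain_community.llms.sagemaker_endpoint import LLMContentHandler
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceEmbeddings

class Handler(LLMContentHandler):
    content_type = "application/json"
    accepts = "application/json"

    def transform_input(self, prompt, model_kwargs):
        return json.dumps({"inputs": prompt, "parameters": model_kwargs}).encode()

    def transform_output(self, output):
        # Assumes a TGI-style response; adjust to your container's output schema.
        return json.loads(output.read().decode())[0]["generated_text"]

llm = SagemakerEndpoint(
    endpoint_name="mistral-7b-instruct",  # assumption
    region_name="us-east-1",              # assumption
    content_handler=Handler(),
)

# Load a pre-built FAISS index of enterprise documents (path is a placeholder).
store = FAISS.load_local("docs_index", HuggingFaceEmbeddings(),
                         allow_dangerous_deserialization=True)
qa = RetrievalQA.from_chain_type(llm=llm, retriever=store.as_retriever())
print(qa.invoke({"query": "How is the service deployed?"}))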
The following demo shows the Continue.dev server running on the Mistral-7B model on AWS:
demo.mov
Cloning the repository
git clone https://github.com/thisisadityapatel/continuedevSagemakerEndpoint.git
cd continuedevSagemakerEndpoint
Setting up the virtual environment
python3 -m venv env
source env/bin/activate
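With the virtual environment active, install the project's dependencies (assuming the repository ships a requirements.txt):

pip install -r requirements.txt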
Loading the environment variables
source .env
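For reference, a hypothetical .env layout; the actual variable names depend on server.py and your AWS setup:

export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export AWS_REGION=us-east-1
export SAGEMAKER_ENDPOINT_NAME=mistral-7b-instruct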
Initiating the service on port 11434 (the default local Ollama port)
python3 server.py
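Once the server is up, you can sanity-check the mocked Ollama route with a request like the one below (the exact payload shape depends on how server.py implements the route):

curl http://localhost:11434/api/generate -d '{"model": "mistral", "prompt": "Hello"}'

Continue.dev should then treat the server as a local Ollama instance. One way to wire that up is an Ollama provider entry in Continue's config.json; the model title and name here are placeholders:

{
  "models": [
    {
      "title": "Mistral on SageMaker",
      "provider": "ollama",
      "model": "mistral",
      "apiBase": "http://localhost:11434"
    }
  ]
}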