Using LiteLLM to support more models #64

Open
Greatz08 opened this issue Apr 5, 2024 · 14 comments
Labels
✨ enhancement (New feature or request) · prio: low (Low priority) · 🔧 refactor

Comments

@Greatz08 commented Apr 5, 2024

This project is pretty great, BUT we need more options to use different LLMs. You don't have to worry about creating a solution that supports 100+ LLMs yourself, as LiteLLM is another FOSS project that can do this task for you.

Project LiteLLM link - https://github.com/BerriAI/litellm

Adding LiteLLM would be a big win for the project: many more people would easily be able to use many more LLMs, which is what everyone wants. The project would only require three main parameters from the user (base URL, model name, and API key), and with the general OpenAI API structure it can send the query and return the result. Many big projects have started adding support for LiteLLM to make things more advanced in an easier way, so study it, and if you have any questions you can ask the maintainers; they are pretty responsive. If you want to know more about my personal experience using it with other great projects like Flowise, I can tell you that too.

@klieret added the ✨ enhancement label on Apr 5, 2024
@klieret (Member) commented Apr 5, 2024

Sounds nice, I agree that this makes sense. However, this would need some amount of refactoring (cost tracking, for example, needs to be set up differently).

@klieret changed the title from "Please add LiteLLM to this project" to "Using LiteLLM to support more models" on Apr 5, 2024
@ofirpress (Member)

This is a super low-priority issue right now. Most LMs would not really be able to achieve good performance on SWE-agent + SWE-bench... so this would be a waste of time right now, and it would probably be hard to maintain.

@erkinalp commented Apr 5, 2024

@ofirpress Not only that, there is another coding assistant that offers LiteLLM support: https://github.com/OpenDevin/
It is also under MIT licence.

@Greatz08 (Author) commented Apr 5, 2024

@ofirpress LiteLLM makes it easy for both devs and users to test and try 100+ LLMs. Implementing LiteLLM now may be a little challenging, but in my opinion it will make things much easier down the road, rather than implementing it later (with a low-priority mindset) when the project has become much more complex. The benefit is on both sides, and that is why many projects are adopting it too.
One more thing I would like to add: do not underestimate the various closed-source and open-source models compared to OpenAI's GPT variants, because many have matched or outperformed them and will keep outperforming them in the future, whether in one domain/segment (like coding or SQL, with task-specific open-source models) or in a general way. So I believe in giving users the option to test whatever they want, in the easiest way possible, so that they can someday show you more interesting things than you can imagine.
For the rest, I leave this matter to the team. All the best :-))

@Aaronminer1

Add support for LM Studio so we can test with local open-source models. This should be pretty easy since it uses the OpenAI library.

@xiechengmude

Supporting local models with LiteLLM or vLLM is much more necessary to boost the project with the power of open source.

@elsatch commented Apr 6, 2024

As a far simpler approach, could you consider exposing an OPEN_AI_BASE_URL environment variable? (A model name might also be required for calls to different models.)

This way, users could use any OpenAI-compatible endpoint to run these models. It would open up compatibility with backends such as Ollama, Text Generation WebUI, LM Studio, etc.

So, if a user wants to use Ollama, they would just set the base_url to http://127.0.0.1:11434/, use any string as the API key, and pick the chosen model. LM Studio users would use http://127.0.0.1:1234.

Note that the base_url parameter is supported natively by the OpenAI Python library, so it probably will not require any extra configuration down the road: https://github.com/openai/openai-python/blob/0470d1baa8ef1f64d0116f3f47683d5bf622cbef/src/openai/_base_client.py#L328
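
For illustration, a minimal sketch of that approach using the OpenAI Python client; the /v1 endpoint paths and the model name are assumptions about what a local backend exposes:

```python
from openai import OpenAI

# Point the standard OpenAI client at a local OpenAI-compatible backend.
# Ollama typically serves its OpenAI-compatible API under /v1;
# LM Studio under http://127.0.0.1:1234/v1.
client = OpenAI(
    base_url="http://127.0.0.1:11434/v1",
    api_key="not-needed",  # any non-empty string works for most local backends
)

response = client.chat.completions.create(
    model="llama3",  # whichever model the local server has loaded
    messages=[{"role": "user", "content": "Summarize this repository's purpose."}],
)
print(response.choices[0].message.content)
```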

@bvandorf (Contributor) commented Apr 7, 2024

#118

@ofirpress (Member)

Didn't #118 solve this? Should we close this issue?

@klieret (Member) commented Apr 10, 2024

I think LiteLLM might make sense eventually because it would allow us to get rid of a chunk of code with hard-coded values. Right now, every time OpenAI updates their costs or their models, we have to update the config as well. LiteLLM would handle a lot of these things for us and give us support for other models for free. But it's low priority.

If LiteLLM doesn't do it, projects like https://github.com/AgentOps-AI/tokencost might help with cost estimation.
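
For reference, a rough sketch of the kind of price lookup LiteLLM maintains (the model name and token counts below are made up for illustration):

```python
import litellm

# LiteLLM ships a maintained model-to-price map, so per-model costs don't have
# to be hard-coded in this repo and updated whenever a provider changes pricing.
prompt_cost, completion_cost = litellm.cost_per_token(
    model="gpt-4-turbo",  # illustrative model name
    prompt_tokens=1200,
    completion_tokens=300,
)
print(f"estimated call cost: ${prompt_cost + completion_cost:.6f}")
```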

@klieret (Member) commented Apr 22, 2024

I'd be open to including LiteLLM as long as we don't disrupt the current research that's using GPT-4, Claude 2, and Claude 3. A good way to start would be to add it as an alternative in models.py, and then we can start thinking about whether to move more things over. Having someone open a PR with a proof of concept of how this would look would also be helpful.

However, at the moment it's also not a big priority, because many of the cheaper models aren't good enough to perform well with SWE-agent.
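
A hypothetical sketch of what such an alternative model class could look like; the class shape, constructor, and query() signature are assumptions for illustration, not the actual interface of models.py:

```python
import litellm


class LiteLLMModel:
    """Hypothetical LiteLLM-backed model, living alongside the existing model classes."""

    def __init__(self, model_name: str, api_base: str | None = None):
        self.model_name = model_name  # e.g. "ollama/llama3" or "claude-3-opus-20240229"
        self.api_base = api_base      # optional custom endpoint for local backends
        self.total_cost = 0.0

    def query(self, history: list[dict]) -> str:
        # history is assumed to already be a list of {"role": ..., "content": ...} dicts
        response = litellm.completion(
            model=self.model_name,
            messages=history,
            api_base=self.api_base,
        )
        # LiteLLM's price map replaces hand-maintained per-model cost tables
        self.total_cost += litellm.completion_cost(completion_response=response)
        return response.choices[0].message.content
```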

@EwoutH commented Apr 22, 2024

While I agree on the problem and the motivation for solving it, I think it would be good to do a quick scan of the solution space to see what options are out there. If LiteLLM is indeed the best solution for this problem then that would only confirm that.

I think it's important to support multiple API inference providers, as it allows adapting quickly to new releases. Especially with Llama 3's strong performance and a 405B version on the horizon, we don't know yet which API providers are going to offer it and at what prices (if any).

Or the next model from Cohere, Mistral, Reka or someone else.

It would probably also be good for the robustness of the project to be independent of specific LLMs.

@klieret modified the milestone: 0.6.0 on May 28, 2024
@BradKML commented Jun 3, 2024

@erkinalp Nice observation. Are there other software development agents, like OpenDevin CodeAct, that do similar things?

@EwoutH commented Sep 1, 2024

Having Cerebras as a provider would be very interesting, since they currently have by far the fastest inference, at a competitive price. LiteLLM added support for it yesterday:
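
With LiteLLM in place, switching to a Cerebras-hosted model should mostly be a model-string change. A hedged sketch; the model id and environment variable follow LiteLLM's provider-prefix convention and should be verified against its docs:

```python
import litellm

# Assumes the Cerebras API key is available, e.g. via CEREBRAS_API_KEY.
response = litellm.completion(
    model="cerebras/llama3.1-70b",  # illustrative provider-prefixed model id
    messages=[{"role": "user", "content": "Write a one-line commit message."}],
)
print(response.choices[0].message.content)
```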

avis408 pushed a commit to emergentbase/SWE-agent-public that referenced this issue on Sep 7, 2024: "Fix outdated reference."