
Add support for gemini #525

Open · wants to merge 7 commits into main

Conversation

papayalabs

This is where I got stuck, @krschacht. I need to change app/jobs/get_next_ai_message_job.rb for Gemini to work, but that is not the idea. Sorry, I do not know how to go forward from this point. System instructions do not work with the gemini-ai gem. (I think it is a bug.)

@krschacht
Contributor

Hi @papayalabs, it’s totally fine to make changes to get_next_ai_message_job.rb if you needed to in order to make this work. What’s the specific issue you ran into? Are you saying that the issue you can’t resolve is getting the system message to work?

@krschacht
Contributor

Regarding system instructions, I looked at the gem docs and it appears they think it supports it:
https://github.com/gbaptista/gemini-ai?tab=readme-ov-file#system-instructions

If you try this example in the console, do system instructions work there? I’m curious whether the issue with system instructions is within the gem itself or an issue getting it to work within the HostedGPT app.

@papayalabs
Author

> Regarding system instructions, I looked at the gem docs and it appears they think it supports it: https://github.com/gbaptista/gemini-ai?tab=readme-ov-file#system-instructions
>
> If you try this example in the console, do system instructions work there? I’m curious whether the issue with system instructions is within the gem itself or an issue getting it to work within the HostedGPT app.

I have tried it in the console and always got this error: `on_complete': the server responded with status 400 (Faraday::BadRequestError). I think it is a problem within the gem, but I have not dug more deeply into it.
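For anyone reproducing this, the call was along these lines (a sketch based on the gem README's system-instructions example, not the exact console code; note that no version is passed, so the gem uses its default endpoint, which turns out to matter, see below):

require 'gemini-ai'

client = Gemini.new(
  credentials: { service: 'generative-language-api', api_key: ENV['API_KEY'] },
  options: { model: 'gemini-pro', server_sent_events: true }
)

# Including a system instruction here raises
# Faraday::BadRequestError (status 400) on the default endpoint.
client.generate_content({
  contents: { role: 'user', parts: { text: 'Hi, what is your name?' } },
  systemInstruction: { role: 'user', parts: { text: 'You are a helpful assistant named Samantha.' } }
})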

@krschacht
Contributor

@papayalabs I spent a little while on this system instructions issue today and I couldn’t figure it out. I validated that the docs also confirm system instructions are supported, although they say the feature is “in beta” and available “on some models”. To confirm it’s not an issue with the gem, I got the API working through curl: I can get normal replies back, but I get the same 400 error when I include a system message through curl. I believe the root cause is that we’re using a model that doesn’t support it, or an API authentication method that doesn’t support it. I was accessing the API through a key I got through AI Studio. I thought maybe the key must be obtained through Vertex AI instead, but after about 20 minutes of digging I couldn’t figure out how to get a key through there. So anyway, I didn’t solve it, but I wanted to capture what I learned since there may be clues.
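The failing curl was roughly the following (a sketch, not the exact command; note the v1 path, which turns out to be the relevant difference, per the resolution later in this thread):

curl \
  -X POST "https://generativelanguage.googleapis.com/v1/models/gemini-pro:generateContent?key=${API_KEY}" \
  -H 'Content-Type: application/json' \
  -d '{
    "contents": [{ "role": "user", "parts": [{ "text": "Hi, what is your name?" }] }],
    "systemInstruction": { "parts": [{ "text": "You are a helpful assistant named Samantha." }] }
  }'
# v1 rejects the unrecognized systemInstruction field with HTTP 400;
# the field is only available on the v1beta endpoint.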

@papayalabs
Author

@krschacht hey! When I first tested through an API key I have, I also checked all the models in the console (including the ones that are supported) and got the same result (error 400). When I have more time I'll dig a little deeper, but I haven't checked it recently.

@papayalabs
Author

@krschacht I also fixed the bug. The code as it is now works fine without system instructions. I need to add some tests and check for any functionality that may still be missing.

@krschacht
Contributor

Hi @papayalabs, I'm really excited you're getting this PR done. I gave it a skim, resolved a merge issue with main, and did a tiny bit of cleanup. Also, some other PRs related to API models have landed in main, so please merge main back into this branch to keep it up to date.
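For reference, keeping the branch current is just the usual flow (a sketch; the branch name is hypothetical):

git checkout gemini-support   # hypothetical name for this PR's branch
git fetch origin
git merge origin/main
# resolve any conflicts, commit, then:
git push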

And I had a thought... I know the system message isn't working for Gemini. I just ran into another model, OpenAI o1, which has not yet implemented support for system instructions, so system messages also need to be disabled for that model.

The Language Model table already has booleans for image support and function calling. I think we need another boolean for supports_system_message. That way this bit of metadata is associated directly with a language model, and each of our three backends can conditionalize whether to pass the system message in with its API calls or ignore it.
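A minimal sketch of that migration, assuming the table is named language_models and a recent Rails migration version (only the column name comes from the discussion above; everything else is a guess):

class AddSupportsSystemMessageToLanguageModels < ActiveRecord::Migration[7.1]
  def change
    # Most models accept a system message, so default to true;
    # Gemini (for now) and OpenAI o1 would be flagged false.
    add_column :language_models, :supports_system_message, :boolean, default: true, null: false
  end
end

Each backend could then branch on it, e.g. only include the system message in the request payload when language_model.supports_system_message? is true.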

Would you be willing to tee up a separate PR for this? Landing that would make it easier to land this PR #525 for Gemini, because then we'd just disable the system message for it.

@papayalabs
Author

Hi @krschacht! I already synced from main, so the PR is updated. I created a new PR (#542) for the new boolean in the language model. I just need to create the tests.

I was also thinking about why the system message does not work in this gem, and maybe it is the API key, as you mentioned earlier. Maybe this API key does not support the gemini-pro model. It is not my API key, it is a client's API key, so I need to ask them what plan they have. I came to this conclusion because Claude gives a 400 error when you do not have credits to run the model, even if you have an API key.

@krschacht
Contributor

krschacht commented Nov 13, 2024

@papayalabs I figured it out!! Let me document it:

  1. I created an API key through Google AI Studio: https://makersuite.google.com/
  2. I confirmed the system prompt works through their interactive developer tool, even with the model set to Gemini 1.5 Pro 002. I clicked the "Get code" button, which shows the cURL for whatever is entered in the tool. I tried it with curl and it works. The curl was:
curl \
  -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-pro-002:generateContent?key=${API_KEY}" \
  -H 'Content-Type: application/json' \
  -d @<(echo '{
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "text": "Hi, What'\''s your name?"
        }
      ]
    }
  ],
  "systemInstruction": {
    "role": "user",
    "parts": [
      {
        "text": "You are a helpful assistant named Samantha. When I ask your name you should answer with that."
      }
    ]
  },
  "generationConfig": {
    "temperature": 1,
    "topK": 40,
    "topP": 0.95,
    "maxOutputTokens": 8192,
    "responseMimeType": "text/plain"
  }
}')
  3. I then tried this same thing with the gemini-ai gem and noticed in the output that the URL was different: curl uses .../v1beta/..., and I figured out that the Gemini gem allows the version to be specified with the version: key. This is the code that works in the Rails console:
client = Gemini.new(
  credentials: {
    service: 'generative-language-api',
    api_key: ENV['API_KEY'],
    version: 'v1beta' # the key fix: systemInstruction requires the v1beta endpoint
  },
  options: { model: 'gemini-1.5-pro-002', server_sent_events: true }
)

result = client.generate_content({
  contents: { role: 'user', parts: { text: 'Hi, What is your name?' } },
  systemInstruction: { role: 'user', parts: { text: 'You are a helpful assistant named Samantha. When I ask your name you should answer with that.' } },
})

The takeaway is: update this PR to use version: 'v1beta' when it calls the API!
