Can I use Azure OpenAI? #29

Open
royrajjyoti12 opened this issue Jul 22, 2024 · 5 comments

Comments

@royrajjyoti12

I am trying to use Azure OpenAI, but I get this error:

raise OpenAIError( openai.OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable

Also, can we use multiple models instead of only the two (strong and weak) models?

@iojw
Collaborator

iojw commented Jul 23, 2024

Can you share the full code that you're using?

We currently still depend on OpenAI for embeddings if you're using the mf or sw_ranking routers, so that may be why. As a workaround, you can use the bert router instead and set a dummy value for the OpenAI key; a fix is incoming.
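For reference, a minimal sketch of that workaround (an assumption on my part: the bert router takes the same Controller arguments as mf, and the dummy key only needs to be non-empty since bert never calls the OpenAI embeddings API):

```python
# Sketch of the workaround: route with "bert" so no OpenAI embeddings are
# needed, and set a dummy key just to satisfy the OpenAI client check.
import os

os.environ["OPENAI_API_KEY"] = "dummy"  # never actually used by bert

from routellm.controller import Controller

client = Controller(
    routers=["bert"],                 # avoids the OpenAI embedding dependency
    strong_model="azure/gpt-4o",      # example models; substitute your own
    weak_model="groq/llama3-8b-8192",
)

# Model names follow the router-<name>-<threshold> convention seen below
# (the 0.5 threshold here is only an example value).
response = client.chat.completions.create(
    model="router-bert-0.5",
    messages=[{"role": "user", "content": "Hello!"}],
)
```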

We don't currently support >2 models! More research is required here.

@xXBlackMasterXx

That explains why it's asking for an OpenAI API key even though I've set the correct API keys for both the strong and weak models.

I have the same issue trying to route between an Azure OpenAI model (gpt-4o) and a Groq model (llama3-8b-8192). Is there a way to use an embedding model from Azure OpenAI instead?

Here is the code I'm using:

```python
import os

# Weak model secrets
os.environ["GROQ_API_KEY"] = "<groq-api-key>"

# Strong model secrets
os.environ["AZURE_API_KEY"] = "<azure-openai-api-key>"
os.environ["AZURE_API_BASE"] = "<azure-openai-endpoint>"
os.environ["AZURE_API_VERSION"] = "<azure-openai-api-version>"

# Import the controller
from routellm.controller import Controller

# Create the controller
# I've used the prefix azure/ according to the LiteLLM docs
# https://litellm.vercel.app/docs/providers/azure
client = Controller(
    routers = ["mf"],
    strong_model = "azure/gpt-4o",
    weak_model = "groq/llama3-8b-8192"
)

# Make a request
response = client.chat.completions.create(
    model = "router-mf-0.11593",
    messages = [
        {"role":"user", "content":"Hello!"}
    ]
)

# AI Message
message = response.choices[0].message.content
# Model used
model_used = response.model

print(f"Model used: {model_used}")
print(f"Response: {message}")

It throws this error:

```
{
	"name": "AuthenticationError",
	"message": "Error code: 401 - {'error': {'message': 'Incorrect API key provided: ********************. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}",
	"stack": "---------------------------------------------------------------------------
AuthenticationError                       Traceback (most recent call last)
Cell In[9], line 1
----> 1 response = client.chat.completions.create(
      2     model = \"router-mf-0.11593\",
      3     messages = [
      4         {\"role\":\"user\", \"content\":\"Hola\"}
      5     ]
      6 )
      8 message = response.choices[0].message.content
      9 used_model = response.model

File ~/.local/lib/python3.10/site-packages/routellm/controller.py:150, in Controller.completion(self, router, threshold, **kwargs)
    147     router, threshold = self._parse_model_name(kwargs[\"model\"])
    149 self._validate_router_threshold(router, threshold)
--> 150 kwargs[\"model\"] = self._get_routed_model_for_completion(
    151     kwargs[\"messages\"], router, threshold
    152 )
    153 return completion(api_base=self.api_base, api_key=self.api_key, **kwargs)

File ~/.local/lib/python3.10/site-packages/routellm/controller.py:111, in Controller._get_routed_model_for_completion(self, messages, router, threshold)
    105 def _get_routed_model_for_completion(
    106     self, messages: list, router: str, threshold: float
    107 ):
    108     # Look at the last turn for routing.
    109     # Our current routers were only trained on first turn data, so more research is required here.
    110     prompt = messages[-1][\"content\"]
--> 111     routed_model = self.routers[router].route(prompt, threshold, self.model_pair)
    113     self.model_counts[router][routed_model] += 1
    115     return routed_model

File ~/.local/lib/python3.10/site-packages/routellm/routers/routers.py:42, in Router.route(self, prompt, threshold, routed_pair)
     41 def route(self, prompt, threshold, routed_pair):
---> 42     if self.calculate_strong_win_rate(prompt) >= threshold:
     43         return routed_pair.strong
     44     else:

File ~/.local/lib/python3.10/site-packages/routellm/routers/routers.py:239, in MatrixFactorizationRouter.calculate_strong_win_rate(self, prompt)
    238 def calculate_strong_win_rate(self, prompt):
--> 239     winrate = self.model.pred_win_rate(
    240         self.strong_model_id, self.weak_model_id, prompt
    241     )
    242     return winrate

File ~/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py:116, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    113 @functools.wraps(func)
    114 def decorate_context(*args, **kwargs):
    115     with ctx_factory():
--> 116         return func(*args, **kwargs)

File ~/.local/lib/python3.10/site-packages/routellm/routers/matrix_factorization/model.py:124, in MFModel.pred_win_rate(self, model_a, model_b, prompt)
    122 @torch.no_grad()
    123 def pred_win_rate(self, model_a, model_b, prompt):
--> 124     logits = self.forward([model_a, model_b], prompt)
    125     winrate = torch.sigmoid(logits[0] - logits[1]).item()
    126     return winrate

File ~/.local/lib/python3.10/site-packages/routellm/routers/matrix_factorization/model.py:113, in MFModel.forward(self, model_id, prompt)
    109 model_embed = self.P(model_id)
    110 model_embed = torch.nn.functional.normalize(model_embed, p=2, dim=1)
    112 prompt_embed = (
--> 113     OPENAI_CLIENT.embeddings.create(input=[prompt], model=self.embedding_model)
    114     .data[0]
    115     .embedding
    116 )
    117 prompt_embed = torch.tensor(prompt_embed, device=self.get_device())
    118 prompt_embed = self.text_proj(prompt_embed)

File ~/.local/lib/python3.10/site-packages/openai/resources/embeddings.py:114, in Embeddings.create(self, input, model, dimensions, encoding_format, user, extra_headers, extra_query, extra_body, timeout)
    108         embedding.embedding = np.frombuffer(  # type: ignore[no-untyped-call]
    109             base64.b64decode(data), dtype=\"float32\"
    110         ).tolist()
    112     return obj
--> 114 return self._post(
    115     \"/embeddings\",
    116     body=maybe_transform(params, embedding_create_params.EmbeddingCreateParams),
    117     options=make_request_options(
    118         extra_headers=extra_headers,
    119         extra_query=extra_query,
    120         extra_body=extra_body,
    121         timeout=timeout,
    122         post_parser=parser,
    123     ),
    124     cast_to=CreateEmbeddingResponse,
    125 )

File ~/.local/lib/python3.10/site-packages/openai/_base_client.py:1259, in SyncAPIClient.post(self, path, cast_to, body, options, files, stream, stream_cls)
   1245 def post(
   1246     self,
   1247     path: str,
   (...)
   1254     stream_cls: type[_StreamT] | None = None,
   1255 ) -> ResponseT | _StreamT:
   1256     opts = FinalRequestOptions.construct(
   1257         method=\"post\", url=path, json_data=body, files=to_httpx_files(files), **options
   1258     )
-> 1259     return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))

File ~/.local/lib/python3.10/site-packages/openai/_base_client.py:936, in SyncAPIClient.request(self, cast_to, options, remaining_retries, stream, stream_cls)
    927 def request(
    928     self,
    929     cast_to: Type[ResponseT],
   (...)
    934     stream_cls: type[_StreamT] | None = None,
    935 ) -> ResponseT | _StreamT:
--> 936     return self._request(
    937         cast_to=cast_to,
    938         options=options,
    939         stream=stream,
    940         stream_cls=stream_cls,
    941         remaining_retries=remaining_retries,
    942     )

File ~/.local/lib/python3.10/site-packages/openai/_base_client.py:1040, in SyncAPIClient._request(self, cast_to, options, remaining_retries, stream, stream_cls)
   1037         err.response.read()
   1039     log.debug(\"Re-raising status error\")
-> 1040     raise self._make_status_error_from_response(err.response) from None
   1042 return self._process_response(
   1043     cast_to=cast_to,
   1044     options=options,
   (...)
   1048     retries_taken=options.get_max_retries(self.max_retries) - retries,
   1049 )

AuthenticationError: Error code: 401 - {'error': {'message': 'Incorrect API key provided: ********************. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}"
}
```

@praveennvr

I have the same error while using an Azure OpenAI URL and key. For the weak model, I am using an internal API and key. Is there any workaround to use custom base URLs and keys for the strong and weak models?
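Not an official answer, but the traceback earlier in this thread shows Controller.completion forwarding self.api_base and self.api_key into LiteLLM's completion call, so if your installed version exposes those as constructor arguments (an assumption worth checking against the source), a single shared custom endpoint might look like this:

```python
from routellm.controller import Controller

# Sketch only: assumes Controller(...) accepts api_base/api_key and forwards
# them to litellm.completion, as the traceback above suggests. Note that one
# api_base/api_key pair is applied to *every* completion, so this would not
# give separate credentials for the strong and weak models.
client = Controller(
    routers=["mf"],
    strong_model="azure/gpt-4o",
    weak_model="openai/internal-model",          # hypothetical internal model
    api_base="https://internal.example.com/v1",  # hypothetical base URL
    api_key="<internal-api-key>",
)
```

For per-model credentials, LiteLLM's provider-specific environment variables (like the AZURE_API_KEY / GROQ_API_KEY pair in the example above) are probably the cleaner route.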

@xXBlackMasterXx

I was digging into the source code and found that routellm/routers/routers.py uses a default OpenAI client.

For my use case, I've only changed it to AzureOpenAI:

[screenshot: the default OpenAI client in routellm/routers/routers.py replaced with an AzureOpenAI client]

and modified the embedding model name to text-embedding-ada-002 (this must match your Azure deployment name):

[screenshot: embedding model name changed to text-embedding-ada-002]

I still need to do some testing to see if this works (the router, that is; the embedding model already works fine), but I think it would be good to be able to choose an embedding model of our preference in the Controller.
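In case the screenshots are hard to read, the edit is roughly this (a sketch only; the exact file and variable names may differ between RouteLLM versions, and the environment variables are the same ones as in my earlier example):

```python
# routellm/routers/routers.py (sketch of the patch described above)
import os

from openai import AzureOpenAI  # instead of: from openai import OpenAI

# Was (roughly): OPENAI_CLIENT = OpenAI()
OPENAI_CLIENT = AzureOpenAI(
    api_key=os.environ["AZURE_API_KEY"],
    api_version=os.environ["AZURE_API_VERSION"],
    azure_endpoint=os.environ["AZURE_API_BASE"],
)

# The embedding model string must match your Azure *deployment* name,
# e.g. "text-embedding-ada-002" if that's what you named the deployment.
```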

@iojw
Collaborator

iojw commented Aug 11, 2024

Yes, this makes perfect sense. We are looking into other embedding models at the moment and will release an update soon!
