
Add the ability to use models deployed on Azure AI Studio #3902

Closed
rohanthacker opened this issue Oct 23, 2024 · 11 comments
@rohanthacker (Contributor) commented Oct 23, 2024

What feature would you like to be added?

I would like the ability to use a model that is deployed on Azure AI Studio and uses the Azure AI Model Inference API.
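For context, calling a deployment through the Azure AI Model Inference API with the azure-ai-inference package looks roughly like this (a minimal sketch; the endpoint and key environment variable names are placeholders, not part of any existing client):

```python
import os

from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

# Endpoint and key come from the deployment's page in Azure AI Studio;
# the environment variable names here are placeholders.
client = ChatCompletionsClient(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["AZURE_INFERENCE_KEY"]),
)

response = client.complete(
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="Summarize the Azure AI Model Inference API."),
    ]
)
print(response.choices[0].message.content)
```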


I would like to assist in creating this feature. However, I have a few questions and need some guidance on the best way to implement it.

Questions:

  1. Since the Azure AI Model Inference API is compatible with Azure OpenAI model deployments, can we extend the existing AzureOpenAIChatCompletionClient? I have already tried this, but the API produces an invalid URL and responds with a 404 error, because the endpoint created by Azure AI Studio and the one the client constructs are not the same.

  2. Or would it be preferable to extend ChatCompletionClient? I started with this initially, but noticed a fair bit of overlap with the OpenAI clients.

  3. Or perhaps there is a simpler way to integrate this API that I am not aware of?

Looking forward to discussing this further.

Why is this needed?

Azure AI Studio provides a large catalog of models along with various deployment options that make it easy for developers to access a wide variety of models. Given the nature of this project, integrating this diverse set of models out of the box will drive adoption and let developers bring their own model without having to write a new client for each one.

@jackgerrits (Member) commented:

Thanks for the issue! I think supporting the Azure AI Model Inference API would be great! It makes sense to have it as a separate model client that also implements the ChatCompletionClient protocol.

We'd love it if you're interested in helping build this!

@rohanthacker (Contributor, Author) commented:

@jackgerrits I'll be happy to implement these changes. Can this task be assigned to me, as I have already started work on it in my fork of this repository? I'll raise a draft pull request in a day or so for us to discuss.

@rysweet (Collaborator) commented Oct 25, 2024

@rohanthacker - this is supported in dotnet now with #3790

@edirgarcia commented:

Following this issue, since this will enable us to use Phi in some internal use cases.

@ekzhu (Collaborator) commented Nov 6, 2024

@rohanthacker any update on this one?

@ekzhu (Collaborator) commented Nov 6, 2024

@edirgarcia It depends on the API you are using. If you are using the Core API, you don't need to wait for this feature; you can use azure.ai.inference.models directly in your agent implementation (see the sketch below). If you are using the AgentChat API, you may need to wait for the wrapper, but you can also implement your own agent.
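For illustration, calling the azure.ai.inference client directly from your own agent could look roughly like this (a minimal sketch; MyAgent and handle_user_message are hypothetical names, not part of any autogen API):

```python
from azure.ai.inference.aio import ChatCompletionsClient
from azure.ai.inference.models import UserMessage
from azure.core.credentials import AzureKeyCredential


class MyAgent:  # hypothetical agent class; not an autogen base class
    """Owns an azure.ai.inference client and calls it directly."""

    def __init__(self, endpoint: str, key: str) -> None:
        self._client = ChatCompletionsClient(
            endpoint=endpoint, credential=AzureKeyCredential(key)
        )

    async def handle_user_message(self, text: str) -> str:
        # Forward the user's text to the deployed model and return its reply.
        response = await self._client.complete(
            messages=[UserMessage(content=text)]
        )
        return response.choices[0].message.content
```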

@rohanthacker (Contributor, Author) commented:

Hi @ekzhu,

I’ve completed the initial implementation of ChatCompletionClient using azure.ai.inference.

This is a draft, as I wanted to discuss the significant code duplication between this client and the existing OpenAIChatCompletionClient. If the team agrees with this approach, I’ll code/copy the rest of the implementation as needed.

Currently, the two clients are nearly identical, with the only differences being the type variations required by each library. Given that azure.ai.inference is OpenAI-compatible, I don't see the need for a separate concrete class. What are your thoughts?


Additionally, I was able to get OpenAIChatCompletionClient to work with models deployed on Azure AI Studio by setting the base_url and api_key (see the sketch after the list below). However, I encountered a few minor compatibility issues with specific models:

  • Cohere-command-r: Returns an invalid response_format error when json_output is set to True.
  • Mistral-Nemo: Responds with "extra inputs are not permitted" due to a message-formatting issue.

Both models work fine when connected directly to the OpenAI API.
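For reference, the workaround described above looks roughly like this (a sketch only; the base URL shape, key, and capability flags are assumptions that must be adjusted to the actual deployment, and the import path may differ between 0.4 preview releases):

```python
from autogen_ext.models import OpenAIChatCompletionClient  # path may vary by release

# Point the OpenAI-compatible client at an Azure AI Studio endpoint.
# The URL and key below are placeholders, and the capability flags must
# match what the deployed model actually supports.
client = OpenAIChatCompletionClient(
    model="Mistral-Nemo",
    base_url="https://<your-deployment>.<region>.models.ai.azure.com/v1",
    api_key="<your-api-key>",
    model_capabilities={
        "vision": False,
        "function_calling": True,
        "json_output": False,  # e.g. Cohere-command-r rejected response_format
    },
)
```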

Happy to keep working on this; just looking for team input on the code duplication and next steps.

@rohanthacker (Contributor, Author) commented:

@edirgarcia Phi-3.5 is working with the OpenAIChatCompletionClient. Please refer to this example from azureml-examples that I came across today.

@edirgarcia commented:

> @edirgarcia Phi-3.5 is working with the OpenAIChatCompletionClient. Please refer to this example from azureml-examples that I came across today.

Thank you I will test this out next week.

@ekzhu (Collaborator) commented Nov 8, 2024

> Currently, the two clients are nearly identical, with the only differences being the type variations required by each library. Given that azure.ai.inference is OpenAI-compatible, I don't see the need for a separate concrete class. What are your thoughts?

Thanks @rohanthacker!

For the Core API, the user can choose any client they want, so this is not a blocker.

I think there is still benefit in wrapping the azure.ai.inference client behind our ChatCompletionClient protocol. For AgentChat users it is useful because the built-in agents accept a ChatCompletionClient; see the sketch below. We can resolve the code duplication in the future.
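A wrapper along those lines could start roughly like this (a skeleton only, assuming the 0.4 preview module layout for CreateResult and RequestUsage; the message mapping and finish_reason handling are deliberately simplified):

```python
from autogen_core.components.models import CreateResult, RequestUsage  # 0.4 preview paths
from azure.ai.inference.aio import ChatCompletionsClient
from azure.ai.inference.models import UserMessage


class AzureAIChatCompletionClient:
    """Sketch of wrapping azure.ai.inference behind the ChatCompletionClient protocol."""

    def __init__(self, client: ChatCompletionsClient) -> None:
        self._client = client

    async def create(self, messages, **kwargs) -> CreateResult:
        # A real implementation must map every autogen LLMMessage variant to
        # its azure.ai.inference counterpart; only user messages are shown here.
        response = await self._client.complete(
            messages=[UserMessage(content=m.content) for m in messages]
        )
        choice = response.choices[0]
        return CreateResult(
            finish_reason="stop",  # map choice.finish_reason properly in practice
            content=choice.message.content,
            usage=RequestUsage(
                prompt_tokens=response.usage.prompt_tokens,
                completion_tokens=response.usage.completion_tokens,
            ),
            cached=False,
        )
```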

ekzhu modified the milestones: 0.4.1, 0.4.0 (Nov 8, 2024)
@ekzhu (Collaborator) commented Dec 12, 2024

@rohanthacker, it has been a while. I am closing this issue for now, unless you have already finished or are close to finishing the implementation; in that case, please let @yanivvak and me know under #4683, and submit a PR for your implementation.

ekzhu closed this as completed Dec 12, 2024