Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python: New Feature: more generic KernelFunctionFromMethod parameter parsing in Python #10008

Open
Druid-of-Luhn opened this issue Dec 18, 2024 · 6 comments
Assignees
Labels
python Pull requests for the Python Semantic Kernel

Comments

@Druid-of-Luhn
Copy link


name: Feature request
about: Suggest an idea for this project


When creating a KernelFunction from a method in Python, the KernelFunctionFromMethod._parse_parameter method is called to verify the values passed to the function against the function definition.

It performs a series of checks to handle different cases:

  • If the value matches the type and the type has a model_validate method (it's a pydantic type): validate the value's model; this does not allow base classes like BaseModel to be used as function arguments, because the model_validate method tries to instantiate the class, which fails for base classes, nor does it work for a Union of concrete pydantic classes.
  • If the type is a list: run the function again for all values in the list, for the first list item's type; no issue here.
  • If the value is a dictionary and its type is a class: instantiate the class with the exploded dictionary; this does not allow base classes like BaseModel to be used as function arguments.
  • Otherwise, try wrapping the value in its type constructor: this works for basic types like int and str; this does not work with Union or Any.

This seems to mean that function argument types for kernel functions must be specific: they can't be Unions, Any or pydantic base classes (like BaseModel or KernelBaseModel).

In the process framework, this prevents me from creating a generic step (which accepts all types of event payload, for example, or some subset of types). I would need to create some kind of non-abstract base class to replicate the desired behaviour. Ideally I could just specify the types as Any or Union[...] or BaseModel.

Would this be possible?

@markwallace-microsoft markwallace-microsoft added python Pull requests for the Python Semantic Kernel triage labels Dec 18, 2024
@github-actions github-actions bot changed the title New Feature: more generic KernelFunctionFromMethod parameter parsing in Python Python: New Feature: more generic KernelFunctionFromMethod parameter parsing in Python Dec 18, 2024
@eavanvalkenburg
Copy link
Member

@Druid-of-Luhn thanks for reaching out, I've looked a bit into this, and indeed when a Union (or | ) is used we don't do any parsing of the values since we can't know which one is meant and it becomes very expensive to do that for all cases, for instance what if one is a known type and the other is a forwardref, then we can't parse the second, so what would we do? I did check and BaseModel's work, when I took this function and run it with a dict as input, it does work (with this code):

async def test_kernel_function_invoke(kernel: Kernel):
    decorator_test = MiscClass()
    my_func = getattr(decorator_test, "func_input_object_annotated")
    input_obj = dict(arg1="test", arg2=1)
    kernel.add_function(plugin_name="test", function=my_func)
    result = await kernel.invoke(function_name="func_input_object_annotated", plugin_name="test", input=input_obj)
    assert result.value == input_obj

Also there is a escape for more complex scenario's and that is to set the arguments as a keyword, in that case the full set of arguments passed to the invoke is passed, this can then be parsed inside your function to handle more complex loading, together with custom parameters this can also be leveraged with function calling.

Let me know if this helps!

@Druid-of-Luhn
Copy link
Author

@eavanvalkenburg Thanks for the explanation and how to get around some cases.

I had spotted the potential for the arguments workaround, but I don't know whether that would work in the case of the process framework, where it only calls a step once all of its arguments have values (since each previous step can only fill a single argument, this can make it wait for concurrent steps to complete). When using arguments as a keyword, would this not invoke the step as soon as a single argument is received?

@eavanvalkenburg
Copy link
Member

hmm I haven't spent much time on the process side of things, @moonbox3 can you have a look?

@moonbox3
Copy link
Contributor

Hi @Druid-of-Luhn, could you give a more concrete example, please, of what you want to achieve in the process framework?

where it only calls a step once all of its arguments have values (since each previous step can only fill a single argument, this can make it wait for concurrent steps to complete). When using arguments as a keyword, would this not invoke the step as soon as a single argument is received?

This is currently true about how we handle the "fan-in" behavior from previous steps -- we are making sure that for the given kernel function signature, we have all of the required parameters, which could be coming from different (previous) steps.

In any case, I would love to see further details, if possible. Thanks.

@Druid-of-Luhn
Copy link
Author

The diagram below should illustrate the case that led to me raising this request.

One step can emit one of two events, which fan out. Depending on the event, it might skip one of the fanned-out steps. The step that it may skip outputs a different type of data in its event, meaning that the fan-in step doesn't know what type of data it is receiving (in my case, it is one of two types).

So in the diagram below, the fan-in step D would want the signature d_function(left: ClassX | ClassY, right: ClassZ), or if it were to be a generic fan-in step (although it needs to output a specific event type, so it probably can't be that generic), it could just be d_function(left: Any, right: Any).

flowchart LR
    START --> A
    A -->|Path1, data=ClassX| B
    A -->|Path1&2, data=ClassX| C
    A -->|Path2, data=ClassX| D
    B -->|Path1, data=ClassY| D
    C -->|Path1&2, data=ClassZ| D
    D --> END
Loading

At the moment, since the fan-in only happens at the end of the process, I have just removed it and used stop_process() on the outputs of A (path 2), B (path 1) and C (paths 1 & 2).

@Druid-of-Luhn
Copy link
Author

It sounds like the case where the variants are pydantic classes can be solved by emitting the values as dicts, as eavanvalkenburg said, so the cases that aren't covered by that workaround are probably less common.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
python Pull requests for the Python Semantic Kernel
Projects
None yet
Development

No branches or pull requests

5 participants