-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: TemplateExpression with parameters #394
base: master
Are you sure you want to change the base?
Conversation
This comment was marked as resolved.
This comment was marked as resolved.
Pull Request Test Coverage Report for Build 12531289252Details
💛 - Coveralls |
Apologies in advance for the delay Miles, I'm away over Christmas, and won't be able to test it out till the 28th. Happy holidays |
Hi Miles. Apologies for the later reply, I am also on and off these days. The API looks great to me, thanks so much for it! |
Hi Miles, Really like the syntax although I misunderstood that parameters = length(unique(i)) For full clarity: If batching is running, then length(unique(lens)) may be variable with each run. So lets say i has 4 categories, 1,2,3,4. To find the form p*sin(x) = y, I should set number of parameters to be 4 or 1? structure = TemplateStructure{(:f,)}(
((; f), (x1, x2, x3, x4, x5, x6, y, cat), c) -> begin
o = f(x1, x2, x3, x4, x5, x6, c[cat])
if !o.valid
return ValidVector(Vector{Float32}(undef, length(o.x)), false)
end
#Compute gradients for first 3 variables at once
for varidx in 1:3
grad_column = D(f, varidx)(x1, x2, x3, x4, x5, x6, c[cat]).x
if !(all(grad_column .>= 0) || all(grad_column .<= 0))
return ValidVector(fill(Float32(1e9), length(o.x)), true)
end
end
return o
end;
num_parameters = length(unique(cat)) #I understand that cat is not defined here, but how do I pass a variable number of parameters so that this code works with batching?
)
model = SRRegressor(
niterations=1000000,
binary_operators=[+,-,*,/],
maxsize=50,
bumper=true,
turbo=true,
populations=18,
expression_type = TemplateExpression,
expression_options = (; structure),
population_size=100
parsimony = 0.01,
batching=true,
)
mach = machine(model, X, y)
fit!(mach) |
And if In other words, Note that confusingly, The one downside is you do need to do a bit more work to set up a problem. But I feel like it's so explicit and flexible that it's worth it, and maybe even easier to understand? This also means you could have two category types: Do you think |
What about |
ecb3050
to
415fc5c
Compare
Oh and to try this PR, you need to do ]add SymbolicRegression#template-parametric-expression |
Thank you Miles, it works well now, I was half asleep from a very delayed flight and didn't realise it wasn't merged yet! structure = TemplateStructure{(:f,)}(
((; f), (x1, x2, x3, x4, x5, x6, y, cat), c) -> begin
o = f(x1, x2, x3, x4, x5, x6, c[cat])
...
return o
end;
num_parameters = 12
) Here if I set num_parameters to 12 (which it is for the whole dataset, but batching is true and num_parameters is only 1 or 2 for that particular batch which calls upon structure, does it matter if num_parameters > length(unique(cat)) in that particular batch? The code seems to be running ok otherwise.
Would it better to name it exactly what it is total_num_unique_cat although a bit unwieldy?
structure = TemplateStructure{(:f,:g)}(
((; f, g), (x1, x2, x3, x4, x5, x6, y, i1, i2), c) -> begin
o = f(x1, x2, x3, x4, x5, x6, c[i1]) + g(x1,c[i2])
...
return o
end;
num_parameters = 6
) if i1 had 4 unique categories and i2 had 2 unique categories? |
Agreed on parameter being an overloaded word here.
Well technically you could use the same category but have it correspond to multiple parameters. Like The way you would set it up for this would be: ((; f), (x1, x2, cat), p) -> let
p1 = p[cat] # gets value 1 to 4
p2 = p[cat + 4] # gets value 5 to 8
p3 = p[cat + 8] # gets value 9 to 12
# for example:
f(x1) + p1 * x2 + p2 * x2^2 + p3 * x2^3
end So the number of parameter scalar values is 12, even though there are 4 categories in the dataset. The other tricky thing is that this p vector doesn’t need to be used for this type of pattern. You could actually just use the p vector to learn a single constant for some sort of functional form — no categories needed. (As an alternative to giving a |
In your example it would need - o = f(x1, x2, x3, x4, x5, x6, c[i1]) + g(x1,c[i2])
+ o = f(x1, x2, x3, x4, x5, x6, c[i1]) + g(x1,c[i2+4])
Assuming that i1 is from 0 to 3 and i2 is from 0 to 1. But if you let i2 be from 4 to 5, then you wouldn’t need the offset. |
I can’t think of another API here that is as flexible and generic. But clearly this API is a bit confusing for this common type of pattern. So maybe try to brainstorm some alternatives if you have any ideas. |
Oh here’s one other idea: structure = TemplateStructure{(:f, :g)}(
parameters = (p1=4, p2=2), # Each parameter has its own range 1:n
template = ((; f, g), (; p1, p2), (x1, x2, i1, i2)) ->
f(x1) + p1[i1] + g(x2) + p2[i2]
) So basically you would provide the parameter groups you want (p1, p2) and then say how large you want each of them to be. It’s cleaner than needing to manually offset things. The implementation is probably easy too. |
And actually we could infer the number of parameters with this. If we write structure = TemplateStructure{(:f, :g), (:p1, :p2)}(
template = ((; f, g), (; p1, p2), (x1, x2, i1, i2)) ->
f(x1) + p1[i1] + g(x2, p2[i2])
) Then technically you could have the constructor figure out that p1 is indexed by i1, and compute the number of parameters needed in the forward pass. But I’m not sure if we should do that or not, since it’s a bit “spooky action at a distance”? If things are explicit it might be easier for the user to see how to modify it to an unusual use case. But maybe not, I’m not sure! |
Hi Miles, I have to say (with a large disclaimer) that the last suggestion makes much more intuitive sense for me. The disclaimer is this is not my profession and I just do this stuff in my spare time when I get the chance as it greatly adds to my research output. What might not be intuitive for me might make perfect sense to other users. Maybe this modification is best of both worlds? ((; f), (x1, x2, cat), p) -> let
p1 = p[cat] # gets value 1 to 4
p2 = p[cat + 4] # gets value 5 to 8
p3 = p[cat + 8] # gets value 9 to 12
# for example:
f(x1) + p1 * x2 + p2 * x2^2 + p3 * x2^3
end In this example, cat presumably will be a vector of I would have thought it makes more sense to have p take a tuple p[cat,<parameter_num>]. So that the input would be: ((; f), (x1, x2, cat), p) -> let
p1 = p[cat,1] # gets value 1 to 4
p2 = p[cat,2] # gets value 5 to 8
p3 = p[cat,3] # gets value 9 to 12
# for example:
f(x1) + p1 * x2 + p2 * x2^2 + p3 * x2^3
end Then the constructor would use a mod and length(unique(cat)) to convert it, something like (although I know it would need more tweaking). p1 = p[mod(1,length(unique(cat)) * length(unique(cat))] #which should return 4
p2 = p[mod(2,length(unique(cat)) * length(unique(cat))] #which should return 8
p3 = p[mod(3,length(unique(cat)) * length(unique(cat))] #which should return 12
Hopefully that would give you the same functionality but without the user having to keep track of the number of unique categories within each categorial variable, but only have to keep track of how many parameters they have called on for each categorial variable? |
@gm89uk @Andrea-gm here's an attempt at MilesCranmer/PySR#787. If you get a chance to try, I'd be very interested to hear what you think.
The API is like this:
You need to specify how many total parameters you ask for in advance. Usually this is just
length(unique(i))
for some vectori
that contains the categorization of each row. But you can have multiple categorizations too – just pass those as additional columns to the dataset. (They will get converted back to integers when indexingp
here)Those parameters can be used in any way you want. You can pass them to one of the subexpressions:
which basically let's you do what
ParametricExpression
does, but in a fixed functional form way.You can also get parameters manually, like this:
Where you can see we just ask for a single parameter in the structure.
Full example: