Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Normalize struct names #37

Open
jiegillet opened this issue Oct 31, 2021 · 12 comments
Open

Normalize struct names #37

jiegillet opened this issue Oct 31, 2021 · 12 comments
Labels
x:action/improve Improve existing functionality/content x:knowledge/elementary Little Exercism knowledge required x:module/representer Work on Representers x:size/large Large amount of work x:type/coding Write code that is not student-facing content (e.g. test-runners, generators, but not exercises)

Comments

@jiegillet
Copy link
Contributor

This is mentioned in #29.

Struct names should be replaced by placeholders as well.

Example input

defmodule Something do
  defstruct [:a, :b]
  def something(%Something{a: a}), do: %Something{a: 2 * a}
  def something_else(%__MODULE__{a: a}), do: %__MODULE__{a: 3 * a}
end

Desired output

defmodule(PLACEHOLDER_1) do
  defstruct([:a, :b])
  def(PLACEHOLDER_2(%PLACEHOLDER_1{a: PLACEHOLDER_3})) do
    %PLACEHOLDER_1{a: 2 * PLACEHOLDER_3}
  end
  def(PLACEHOLDER_4(%__MODULE__{a: PLACEHOLDER_3})) do
    %__MODULE__{a: 3 * PLACEHOLDER_3}
  end
end

I think __MODULE__ should not be replaced, because it means different things in different modules. It's already normalized, in a way...

Current output

defmodule(PLACEHOLDER_1) do
  defstruct([:a, :b])
  def(PLACEHOLDER_2(%Something{a: PLACEHOLDER_3})) do
    %Something{a: 2 * PLACEHOLDER_3}
  end
  def(PLACEHOLDER_4(%__MODULE__{a: PLACEHOLDER_3})) do
    %__MODULE__{a: 3 * PLACEHOLDER_3}
  end
end
@angelikatyborska angelikatyborska added x:action/improve Improve existing functionality/content x:knowledge/elementary Little Exercism knowledge required x:module/representer Work on Representers x:type/coding Write code that is not student-facing content (e.g. test-runners, generators, but not exercises) x:size/large Large amount of work labels Oct 31, 2021
@butterknight
Copy link
Contributor

Hi! 👋
I tried playing around with this in an effort to both learn Elixir and try to make something productive.

I created a new folder inside test_data with:

input.ex

defmodule Something do
  defstruct [:a, :b]
  def something(%Something{a: a}), do: %Something{a: 2 * a}
  def something_else(%__MODULE__{a: a}), do: %__MODULE__{a: 3 * a}
end

expected_representation.txt

defmodule Placeholder_1 do
  defstruct [:a, :b]

  def placeholder_2(%Placeholder_1{a: placeholder_3}) do
    %Placeholder_1{a: 2 * placeholder_3}
  end

  def placeholder_4(%__MODULE__{a: placeholder_3}) do
    %__MODULE__{a: 3 * placeholder_3}
  end
end

expected_mapping.json

{
  "Placeholder_1": "Something",
  "placeholder_2": "something",
  "placeholder_3": "a",
  "placeholder_4": "something_else"
}

And I wanted to see the current result so I ran the test I got this:
image

If I read this correctly, it would appear that the struct name is already properly normalised, but the key wasn't. Did I do something wrong or is this already partially implemented?

@angelikatyborska
Copy link
Member

Hmm... @jiegillet is it possible that we accidentally implemented this in #44 ?

@jiegillet
Copy link
Contributor Author

jiegillet commented Apr 20, 2022

Yes, it's entirely possible :)
It happened when we normalized module names, the struct names followed along. I didn't close this issue because of the key behavior.

I don't necessarily think that we should normalize all map keys indiscriminately, but we certainly can do it for the ones in defstruct because they are user defined. In that sense the current behavior is a step in the right direction and what is missing is

defstruct [:placeholder_1, :placeholder_2] # or whatever numbers

@angelikatyborska what do you think?
I also suspect that #44 may have introduced some undesirable behavior with key replacement, so I would suggest to test the representer on some larger codebase that uses maps in various ways...

Quick example: in import A, only: [function_a: 1] we don't want to touch only or function_a. Right now I am not confident that these are safe if we have a only variable somewhere else being normalized.

@angelikatyborska
Copy link
Member

angelikatyborska commented Sep 3, 2022

I don't necessarily think that we should normalize all map keys indiscriminately

I'm having a hard time coming up with a good rule about what to normalize when with regard to map, struct, and keyword list keys.

In general, it seems to me like key names are just like variable names and should be normalized. But then I start coming up with a bunch of exceptions: keys in structs that come from the standard library are always known and shouldn't be normalized. Keys in some special usage keyword lists like import A, only: ... or IO.inspect(x, label: "x"). I feel like trying to chase each exception will drive us crazy and I would much rather go all in or all out - normalize all of the keys or none of the keys. I'm also curious what other languages do with similar problems.

I'm currently writing a bunch of more tests that just pass to highlight the current behavior: https://github.com/exercism/elixir-representer/pull/73/files

For example struct names are only kept if they're not user defined, which is great. And you're right about this bug:

Right now I am not confident that these are safe if we have a only variable somewhere else being normalized.

@angelikatyborska
Copy link
Member

@iHiD advised on Slack:

About representers: if you have key-value data structures, should you replace key names with placeholders?

If they don't matter to the implementation of the exercise (which as far as I can tell they shouldn't otherwise the tests would fail), then yes.

@iHiD
Copy link
Member

iHiD commented Sep 4, 2022

Quick example: in import A, only: [function_a: 1] we don't want to touch only or function_a. Right now I am not confident that these are safe if we have a only variable somewhere else being normalized.

I know this isn't the point of this discussion, but removing import statements entirely is potentially quite a good idea. (F# does this: https://exercism.org/docs/tracks/fsharp/representer-normalizations#h-remove-import-declarations)

@angelikatyborska
Copy link
Member

@iHiD What's the reasoning behind removing imports? I suspect they might not apply in Elixir.

@iHiD
Copy link
Member

iHiD commented Sep 4, 2022

I'd sort of flip the question and ask what's the reason for keeping them?

The solution being normalised passes the tests, so it's importing whatever it needs to import. So the question becomes are they things you'd ever want to give feedback on?

  • If they are, then it's good to to keep them. (e.g. if you want to comment when someone imports more than needed (e.g. missing only: [...]) or imports something but doesn't use it).
  • If they're not, then getting rid of them is just less potential difference in the representations (e.g. if someone imports A then B, and another student imports B then A, the placeholders will end up entirely different and so the representations will not normalise).

My instinct is that they're not things that we need to give feedback on automatically, so increasingly the likelihood of normalisation feels better.

@angelikatyborska
Copy link
Member

So the question becomes are they things you'd ever want to give feedback on?

Most of the time, yes.

I can see those scenarios for imports in Exercism exercises:

  • Somebody is doing the one learning exercise where importing a module given in the boilerplate is expected and required. Here giving feedback on the wrong usage of :only vs :except would be good.
  • Somebody is trying to import something from the standard library in another exercise, where I would want to tell them to stop. The whole standard library is available in Elixir without doing any imports by referencing functions with module names, which is the expected way of using the standard library.
  • Somebody is trying to import their own module that they submitted as part of the solution. This means that their solution is already unique enough that the extra normalization won't help much - most solutions are a single module.

@jiegillet
Copy link
Contributor Author

e.g. if someone imports A then B, and another student imports B then A, the placeholders will end up entirely different and so the representations will not normalise

That's a good idea, let's order imports :)

@angelikatyborska
Copy link
Member

@jiegillet I take your comment to mean that you agree that imports should stay in the representations 😁 I created an issue about ordering them: #75

@iHiD
Copy link
Member

iHiD commented Sep 5, 2022

I can see those scenarios for imports in Exercism exercises:

While those are solid examples, all three of them feel like they could be really good analyzer checks that would cover all exercises and then allow the representations to be further normalised and then reduce the amount of feedback given manually in the automation UI.

But I'm entirely happy to defer to whatever you decide - you know the range of possibilities in Elixir much better than me! 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
x:action/improve Improve existing functionality/content x:knowledge/elementary Little Exercism knowledge required x:module/representer Work on Representers x:size/large Large amount of work x:type/coding Write code that is not student-facing content (e.g. test-runners, generators, but not exercises)
Projects
None yet
Development

No branches or pull requests

4 participants