Question: Fractions in nb and nn and their use of ordinals #688

Open
tarjeiba opened this issue Oct 27, 2022 · 4 comments

@tarjeiba

I am looking into how SRE treats fractions in Norwegian bokmål (nb) and nynorsk (nn).

Both numbers_nn.ts and numbers_nb.ts define numberToOrdinal(num: number, plural: boolean): string. What I don't really understand is the use of this function.

It looks to me like it's used through vulgarFraction(node: Element): Span[] in numbers_util.ts, where we say numberToOrdinal(denom, enum !== 1).

My issue is the following. If I now transform the fraction \frac{1}{3} in nb, I get "en tredje". While this is a literal translation of the English "one third", in bokmål it means "one third" only in the ordinal sense, as in "I've got one third place, two fourth places and one fifth place over my career as a racing driver". To get the fraction "one third", I'd say "en tredjedel" (one part of three).

What strikes me, though, is that the tests in SRE-tests say that the current behavior is the wanted behavior. Is the current implementation what is considered correct?

Further, I'd like to refactor numberToOrdinal to get my wanted behavior, but I am skeptical of doing so, as I only want to affect its behavior when used in fractions. What I'd really like to do is modify vulgarFraction to use a separate function for denom, so that numberToOrdinal could do just that: take a number and return an ordinal, without needing to know whether the number is to be used in a fraction or is "plural".
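
To make the idea concrete, here is a rough sketch of the split I have in mind. It is not SRE's actual code: `fractionDenominator`, `speakFraction` and the small `ORDINALS_NB` table are hypothetical names made up for illustration, and the simplified one-argument `numberToOrdinal` only covers 3, 4 and 5. It also assumes the bokmål denominator is simply the ordinal plus "del" (singular) or "deler" (plural).

```typescript
// Hypothetical sketch only: these helpers are not part of SRE.

// Tiny illustrative ordinal table for bokmål (assumption: a real
// implementation would reuse the locale's existing ordinal logic).
const ORDINALS_NB: Record<number, string> = {
  3: 'tredje',
  4: 'fjerde',
  5: 'femte'
};

// Plain ordinal: takes a number and returns an ordinal, with no
// knowledge of fractions or plurality.
function numberToOrdinal(num: number): string {
  const ordinal = ORDINALS_NB[num];
  if (ordinal === undefined) {
    throw new Error(`No ordinal in this sketch for ${num}`);
  }
  return ordinal;
}

// Fraction denominator: ordinal plus "del" (singular) or "deler" (plural).
function fractionDenominator(denom: number, plural: boolean): string {
  return numberToOrdinal(denom) + (plural ? 'deler' : 'del');
}

// The caller decides plurality from the numerator, mirroring the
// `!== 1` check that vulgarFraction passes along today.
function speakFraction(
  numeratorWord: string,
  numerator: number,
  denom: number
): string {
  return `${numeratorWord} ${fractionDenominator(denom, numerator !== 1)}`;
}
```

With this split, `speakFraction('en', 1, 3)` gives "en tredjedel" and `speakFraction('to', 2, 5)` gives "to femtedeler", while `numberToOrdinal(3)` stays a plain "tredje".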

Am I making any sense, or have I just not understood the implementation correctly? (The latter is highly likely.)

Thanks for all your work!

@tarjeiba tarjeiba changed the title Question: Fractions in nb and nn gives a mixture of numbers and ordinals Question: Fractions in nb and nn and their use of ordinals Oct 27, 2022
@zorkow
Member

zorkow commented Nov 9, 2022

Thanks for reporting this.

Firstly, please note that the current tests are effectively the results SRE produced after the initial localisation. (Think of them as galley proofs in typesetting.) The normal procedure is that the native speaker working on the localisation reads over these proofs and gives feedback, and I fix things in an iterative process.
Unfortunately, I never received any feedback from the translator on the initial output for the two Norwegian languages. This was also during Covid, which meant that we could not just spend a couple of days in the same room together, which usually helps ensure correctness. So there might still be plenty of skeletons in the closet.

If you could find the time to have a look over the tests and send me anything you think is wrong, I'll be happy to make corrections.

@tarjeiba
Author

I see. Thanks for your reply.

I might need some help getting the tests to compile locally. Right now I've just done a brute-force search-and-replace for e.g. "to femte" -> "to femtedeler", which seems to work ... but it feels like the wrong approach.

Or would you prefer that I submit a PR to the SRE repo directly, and you can use that to generate new tests for the affected parts?

@zorkow
Member

zorkow commented Nov 10, 2022

Firstly, reading the test code is probably too difficult. However, all the output that we usually use for proofreading is at:
https://speech-rule-engine.github.io/sre-tests/output/
That gives you the different test categories, with rendered math expressions and English speech for comparison.
I have not updated it in a while, but Bokmål should still be up to date.

If you want to fix up tests directly, a PR against the test repo https://github.com/Speech-Rule-Engine/sre-tests/ is of course always welcome. Again, doing this by hand might be a very tedious job. But since speech generation is rule based (or procedural for numbers), it's usually easier to spot the pattern that goes wrong, and I can then try to fix the rules and push new output.

If you want to actually fix code or rules up yourself, you are again very welcome. But as you said you might need a bit of help to get started. I'll be happy to answer questions here or by email. Alternatively, we can have an online chat.
I'll be travelling in the US for the next two weeks (starting tomorrow morning). But maybe we can schedule a call somewhere around 21-23 Nov, i.e., before Thanksgiving?

@tarjeiba
Author

Thanks for your reply. I sent an email to the address listed on your GitHub profile.
