Key Error in f2i Dict #105

zeinab-sheikhi · 2024-04-04T16:55:41Z

Hi,

I recently came across an issue while working with your library's Latin example. Upon inspection, I noticed that the test dataset used in the example is actually part of the training dataset.

Additionally, I attempted to apply your library to my custom language dataset, where the test dataset is distinctly separate from the training dataset. However, during implementation, I encountered a "KeyError in f2i" for the test dataset. This error indicates that some trigrams in the test dataset are not present in the f2i (feature-to-index) mapping.

Could you please provide guidance on how to handle this scenario? It seems crucial for the library to support cases where the test dataset contains trigrams not present in the f2i mapping.

Thank you for your attention to this matter.

MariaHei · 2024-04-08T14:45:07Z

Hi @zeinab-sheikhi,

thanks for raising this issue! I'll have a look into the Latin example.

Regarding your issue, there is a function that deals with exactly this case: make_combined_cue_matrix (see the documentation). You pass both the training and the validation data to the function, and it makes sure that the C matrix also has columns for any trigrams that occur only in the validation data.

Hope that helps!
Maria
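To illustrate the idea behind a combined cue matrix, here is a minimal sketch in plain Python. It is not the library's code and does not use its API: the helper names (`trigrams`, `build_f2i`, `cue_matrix`), the `#` boundary marker, and the toy word lists are all assumptions made for illustration. It first reproduces the KeyError with an f2i mapping built from training data only, then builds the mapping from training and validation data together so that both matrices share the same columns.

```python
def trigrams(word):
    """Split a word into letter trigrams, using '#' as a boundary marker (assumed)."""
    padded = f"#{word}#"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def build_f2i(words):
    """Assign a column index to every distinct trigram found in `words`."""
    f2i = {}
    for word in words:
        for tri in trigrams(word):
            f2i.setdefault(tri, len(f2i))
    return f2i


def cue_matrix(words, f2i):
    """Binary word-by-trigram matrix as nested lists.

    Raises KeyError if a word contains a trigram that is missing from f2i.
    """
    matrix = []
    for word in words:
        row = [0] * len(f2i)
        for tri in trigrams(word):
            row[f2i[tri]] = 1
        matrix.append(row)
    return matrix


# Hypothetical toy data: the validation word shares only "at#" with training.
train_words = ["vocat", "vocas"]
val_words = ["clamat"]

# Mapping built from training data only: the validation lookup fails.
f2i_train = build_f2i(train_words)
try:
    cue_matrix(val_words, f2i_train)
except KeyError as err:
    print("unseen trigram:", err)

# Mapping built from both datasets: every trigram has a column.
f2i_combined = build_f2i(train_words + val_words)
C_train = cue_matrix(train_words, f2i_combined)
C_val = cue_matrix(val_words, f2i_combined)
```

With the combined mapping, the columns for validation-only trigrams are simply all zeros in the training matrix, so the two matrices have the same width and can be used with the same learned mapping.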

MariaHei · 2024-04-08T16:09:09Z

The Latin issue is now fixed in both the README and the documentation.

MariaHei · 2024-07-03T15:54:01Z

Closing this because the issue seems to be fixed. Please let me know in case you still have any questions.

MariaHei added the bug (Something isn't working) and question (Further information is requested) labels on Apr 8, 2024

MariaHei closed this as completed Jul 3, 2024
