[FIX] Import documents: normalize imported text and file names #568

PrimozGodec · 2020-09-10T10:03:10Z

Issue

File names (and potentially text too) can contain characters written in decomposed form (for example, č encoded as the character c followed by a separate combining caron). This causes problems when filtering documents, because the user typically types č as a single precomposed Unicode character, which does not match the decomposed sequence.
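To illustrate the mismatch, here is a minimal sketch. The file name `koča.txt` is a made-up example and is not taken from the PR; the code only shows how the two encodings of č behave in plain Python string comparisons.

```python
# Two ways to encode the same visible character "č".
decomposed = "c\u030c"   # 'c' followed by COMBINING CARON (U+030C)
precomposed = "\u010d"   # LATIN SMALL LETTER C WITH CARON (single code point)

# They render identically but are different strings.
mismatch = decomposed != precomposed
lengths = (len(decomposed), len(precomposed))

# Hypothetical file name stored in decomposed form (e.g. as returned by some file systems).
file_name = "koc\u030ca.txt"

# A substring filter using the precomposed character typed by the user finds nothing.
found = precomposed in file_name
```

Here `mismatch` is `True`, `lengths` is `(2, 1)`, and `found` is `False`, which is the filtering failure described above.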

Description of changes

With this PR, imported text and file names are normalized (all decomposed characters are converted to their precomposed forms) before being output from the widget.
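Converting decomposed characters to precomposed ones corresponds to Unicode normalization form NFC. The sketch below assumes the standard library's `unicodedata.normalize` is used for this; the helper name `normalize_nfc` and the sample file names are hypothetical and do not reflect the actual widget code.

```python
import unicodedata


def normalize_nfc(text: str) -> str:
    """Return text with decomposed sequences composed into single code points (NFC)."""
    return unicodedata.normalize("NFC", text)


# Hypothetical file names; the first one stores "č" as 'c' + combining caron.
file_names = ["koc\u030ca.txt", "report.txt"]

# Normalize once on import, so later filtering sees precomposed characters.
normalized = [normalize_nfc(name) for name in file_names]

# The user types a precomposed "č" in the filter box.
query = "\u010d"
matches = [name for name in normalized if query in name]
```

After normalization, `matches` contains the first file name, now stored with the precomposed č, so the filter behaves as the user expects.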

Includes

Code changes
Tests
Documentation

codecov-commenter · 2020-09-10T10:12:43Z

Codecov Report

Merging #568 into master will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #568   +/-   ##
=======================================
  Coverage   73.80%   73.81%           
=======================================
  Files          66       66           
  Lines        7464     7465    +1     
  Branches     1000     1000           
=======================================
+ Hits         5509     5510    +1     
  Misses       1744     1744           
  Partials      211      211

Import documents: normalize imported text and file names

b12c4c8

ajdapretnar merged commit 3f7526b into biolab:master Sep 10, 2020

PrimozGodec deleted the unicode-normalize branch March 29, 2023 10:37
