cuts and x_bboxes #79

kba · 2016-10-22T20:58:25Z

Why have separate mechanisms for both relative and absolute positioning of code points within a word/cinfo?

Why not a bboxes attribute without the engine-specific prefix?

Related to #69

kba · 2016-10-26T13:58:59Z

The "cuts" attribute is for representing cuts. It exists as a compact,
pixel-accurate representation of a character segmentation. Cuts are not
bounding boxes, and, in fact, are not all that useful unless you have the
original page image available.

kba · 2016-10-26T15:06:15Z

#17 (comment)

Cuts are for pixel-accurate segmentation in the presence of kerning,
something bounding boxes can't represent.

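def decode_cuts(s, x=0, ymax=None):
    """Decode a cuts string into a list of polylines of (x, y) points."""
    print(repr(x))  # debug output
    cuts = []
    for path in s.split():
        # Each path is "dx[,d1,d2,...]": an x offset, then alternating deltas.
        turns = [int(p) for p in path.split(",")]
        print(repr(x), repr(turns))  # debug output
        x += turns[0]  # the first number is relative to the previous cut
        pos = [x, 0]
        cut = [tuple(pos)]
        for i, d in enumerate(turns[1:]):
            # Deltas alternate between y (even i) and x (odd i).
            pos[(i + 1) % 2] += d
            cut.append(tuple(pos))
        if ymax is not None:
            # Extend the cut to the far edge of the line.
            pos[1] = ymax
            cut.append(tuple(pos))
        cuts.append(cut)
    return cuts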

To convert these to tight bounding boxes, you need the original binary
image (it's another 10-20 lines to do that conversion).
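For illustration, here is a rough sketch of what that conversion could look like. It is not the original author's code, and it rests on several assumptions: the image is a list of rows of 0/1 pixels cropped to the line, image rows run in the same direction as the cut's y coordinate (flip the image first if they do not), cuts are polylines as returned by decode_cuts with ymax set to the image height, cut coordinates are pixel edges, and the left and right image edges act as implicit outer cuts. The names cut_x_per_row and tight_bboxes are hypothetical.

```python
def cut_x_per_row(cut, height):
    """Return the x position of a cut on each pixel row.

    Only vertical segments matter; horizontal segments just move the
    boundary from one row to the next.
    """
    xs = [cut[0][0]] * height
    for (x0, y0), (x1, y1) in zip(cut, cut[1:]):
        if x0 == x1:  # vertical segment
            for y in range(max(0, min(y0, y1)), min(height, max(y0, y1))):
                xs[y] = x0
    return xs


def tight_bboxes(image, cuts):
    """Tight (x0, y0, x1, y1) box of the foreground between adjacent cuts.

    Returns None for a segment that contains no foreground pixels.
    """
    height, width = len(image), len(image[0])
    # Assumption: the image edges serve as the outermost boundaries.
    bounds = [[0] * height]
    bounds += [cut_x_per_row(cut, height) for cut in cuts]
    bounds += [[width] * height]
    boxes = []
    for left, right in zip(bounds, bounds[1:]):
        pixels = [(x, y)
                  for y in range(height)
                  for x in range(left[y], right[y])
                  if image[y][x]]
        if not pixels:
            boxes.append(None)
            continue
        xs = [p[0] for p in pixels]
        ys = [p[1] for p in pixels]
        boxes.append((min(xs), min(ys), max(xs) + 1, max(ys) + 1))
    return boxes


# A "T" whose bar overhangs a smaller glyph tucked under it (kerning).
image = [
    [1, 1, 1, 1, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 1, 0, 1, 1, 0],
]
# What decode_cuts("4,1,-2", ymax=4) returns for the single cut "4,1,-2".
cuts = [[(4, 0), (4, 1), (2, 1), (2, 4)]]
boxes = tight_bboxes(image, cuts)
print(boxes)  # [(0, 0, 4, 4), (3, 2, 5, 4)]
```

The two resulting boxes overlap in x (columns 3 to 4), so no straight vertical split could separate the glyphs, while the stepped cut does.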

kba · 2016-10-26T15:51:27Z

@mttagessen in #17 (comment)

My point with the x_cuts, x_confs, x_* still stands even if you cut it down to a single engine and to re-encoding existing output. Without access to the particular model it is still impossible to align confidences/bboxes with code points, even when you can make sure that nobody "tampered" with the file by renormalizing it to another Unicode normalization. The fundamental reason is that there is no mapping between Unicode code points and recognition units. Formats like AbbyyXML actually allow this alignment by being designed bottom-up (glyph-first) instead of top-down like hOCR. I use "glyph" as the lowest level of label an engine may produce.

While per-character bounding boxes are indeed rather useless (and techniques like CTC layers may or may not produce them randomly), quite a few people seem keen on confidences for postprocessing.
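The renormalization problem mentioned in the quote can be seen in a small sketch. The word and the confidence values here are made up for illustration; only the standard library's unicodedata module is used.

```python
import unicodedata

# "café" with the accented letter as one precomposed code point (NFC).
word = "caf\u00e9"
# Hypothetical per-code-point confidences, one value per code point of `word`.
confs = [0.99, 0.98, 0.97, 0.60]

# Renormalizing to NFD splits the accented letter into "e" + combining accent.
nfd = unicodedata.normalize("NFD", word)

print(len(word), len(confs))  # 4 4
print(len(nfd), len(confs))   # 5 4
```

After renormalization there are five code points but still four confidences, and nothing in the file says which code points share a value. That is the same gap between code points and recognition units that the quote describes, here caused by normalization alone.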

kba · 2016-10-26T20:09:20Z

Kerning:
