(JS) Option to group checkboxes by proximity #183

Open
athewsey opened this issue Jun 12, 2024 · 0 comments
Labels
enhancement New feature or request javascript Relates to the JavaScript/TypeScript version of TRP

Comments

@athewsey
Contributor

In real-world forms, checkboxes / selection elements are usually grouped, similarly to the example below (from the sample document in the Textract try-it-out console):

(image: example of grouped checkboxes from the Textract console sample document)

In Textract Key-Value Forms results, these items generally appear as un-grouped K-V pairs like:

  • (Key) VA -> (Value) NOT_SELECTED
  • (Key) Conventional -> (Value) SELECTED
  • ...
  • (Key) Fixed Rate -> (Value) SELECTED
  • ...

As of today, Textract doesn't group these selection element fields, and doesn't give us any mapping to a predicted overall group label (e.g. Mortgage Applied for: versus Authorization Type:). I received a request from a customer for TRP (JS) to try to help more with this.

Since we don't really do ML within TRP itself, we can't get too fancy here... But I think it should be feasible to provide a way to access and iterate over "selection groups" of form fields whose values are selection elements, using basic proximity heuristics.

Something along these lines, for example:

for (const group of page.form.iterSelectionGroups({
  // Whatever *optional* heuristic grouping parameters make sense:
  vDistTol: 0.6,
  hDistTol: 2.4,
})) {
  // Can loop through the Form Fields:
  group.listFields();
  // Maybe some other convenience methods?:
  group.listSelectedNames(); // e.g. ["Conventional"]
  group.listUnselectedNames(); // e.g. ["VA", "Other (explain):", "FHA", "USDA/Rural Housing Service"]

  // This will *not* be feasible:
  // group.name == "Mortgage Applied For:"
}
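To make the proposed accessors a bit more concrete, here is a minimal sketch of what a group object could look like. Everything in it is hypothetical: `SelectionFieldLike` is a stand-in for illustration only (not the real TRP field class), and `SelectionGroup` with its method names simply mirrors the pseudo-API above rather than anything that exists in TRP today.

```typescript
// Hypothetical stand-in for a form field whose value is a selection element.
// The real TRP field objects look different; this is only for illustration.
interface SelectionFieldLike {
  name: string; // Key text, e.g. "Conventional"
  selected: boolean; // true for SELECTED, false for NOT_SELECTED
}

// Hypothetical group wrapper mirroring the accessors proposed above.
class SelectionGroup {
  readonly fields: SelectionFieldLike[];

  constructor(fields: SelectionFieldLike[]) {
    // Copy so later changes to the caller's array don't affect the group.
    this.fields = fields.slice();
  }

  // All fields in the group, in the order they were provided.
  listFields(): SelectionFieldLike[] {
    return this.fields.slice();
  }

  // Key names of the fields whose selection element is SELECTED.
  listSelectedNames(): string[] {
    return this.fields.filter((f) => f.selected).map((f) => f.name);
  }

  // Key names of the fields whose selection element is NOT_SELECTED.
  listUnselectedNames(): string[] {
    return this.fields.filter((f) => !f.selected).map((f) => f.name);
  }
}
```

Returning plain arrays of names keeps the common "which option was ticked?" question to a one-liner, while `listFields()` still gives access to the underlying fields for anything more detailed.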

Tagging the label/name of the group wouldn't really be possible without an ML model, which I don't think we're looking to introduce in TRP at this time. While I think we could get okay performance on grouping the checkboxes from heuristics alone, identifying the label would be much less likely to work well.
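As a rough illustration of the kind of heuristic I have in mind, the sketch below links any two fields whose bounding boxes are close enough both vertically and horizontally, then takes the connected clusters as groups (single-linkage, via a small union-find). The `Box` and `PositionedField` shapes and the `groupByProximity` function are hypothetical names for this example only. It also assumes that `vDistTol` and `hDistTol` are multiples of the median field height, which is just one possible interpretation of those parameters.

```typescript
// Hypothetical shapes for illustration; not the real TRP classes.
// Coordinates are assumed to be page-relative, like Textract bounding boxes.
interface Box { top: number; left: number; width: number; height: number; }
interface PositionedField { name: string; box: Box; }

// Gap between two 1-D intervals; 0 when they overlap or touch.
function intervalGap(aStart: number, aEnd: number, bStart: number, bEnd: number): number {
  return Math.max(0, Math.max(aStart, bStart) - Math.min(aEnd, bEnd));
}

function median(values: number[]): number {
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Assumption: tolerances are expressed as multiples of the median field height.
function groupByProximity(
  fields: PositionedField[],
  { vDistTol = 0.6, hDistTol = 2.4 }: { vDistTol?: number; hDistTol?: number } = {},
): PositionedField[][] {
  if (fields.length === 0) return [];
  const unit = median(fields.map((f) => f.box.height));

  // Union-find over field indexes, with path halving in find().
  const parent = fields.map((_, i) => i);
  const find = (i: number): number => {
    while (parent[i] !== i) {
      parent[i] = parent[parent[i]];
      i = parent[i];
    }
    return i;
  };

  // Link every pair of fields that is within both tolerances.
  for (let i = 0; i < fields.length; i++) {
    for (let j = i + 1; j < fields.length; j++) {
      const a = fields[i].box;
      const b = fields[j].box;
      const vGap = intervalGap(a.top, a.top + a.height, b.top, b.top + b.height);
      const hGap = intervalGap(a.left, a.left + a.width, b.left, b.left + b.width);
      if (vGap <= vDistTol * unit && hGap <= hDistTol * unit) {
        parent[find(i)] = find(j);
      }
    }
  }

  // Collect the fields of each connected cluster.
  const groups = new Map<number, PositionedField[]>();
  fields.forEach((field, i) => {
    const root = find(i);
    const members = groups.get(root);
    if (members) members.push(field);
    else groups.set(root, [field]);
  });
  return Array.from(groups.values());
}
```

Because the linking is transitive, a long row of closely spaced checkboxes ends up in one group even if its two ends are far apart. The flip side is that two separate groups placed close together on the page would merge, so the tolerances would need tuning per form layout. The pairwise loop is quadratic in the number of fields, which should be fine for the handful of selection elements on a typical page.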

Interested to hear feedback from others on what kind of API & accessors you'd find most helpful for this feature.

@athewsey athewsey added enhancement New feature or request javascript Relates to the JavaScript/TypeScript version of TRP labels Jun 12, 2024