Recent versions of Watson Discovery have made undocumented changes to the format of the output of the Table Understanding enrichment. The old column names are documented at https://cloud.ibm.com/docs/discovery-data?topic=discovery-data-understanding_tables#table-output-schema
Rough mapping from the new field names to the old (documented) ones:
new_name_to_old = {
    "row_min": "row_index_begin",
    "row_max": "row_index_end",
    "column_min": "column_index_begin",
    "column_max": "column_index_end",
    "cell_text": "text",
    "id": "cell_id",
}

Also, the `location` field at the top level of the table record now appears to be optional.
Our conversion to Pandas needs to be updated to cover both the old schema and the new schema.
I recommend that we first determine which schema is the canonical one and convert non-canonical schemas to the canonical one as a preprocessing step.
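A minimal sketch of what that preprocessing step could look like, assuming for illustration that the old (documented) schema is chosen as the canonical one. The helper names `to_canonical_cell` and `to_canonical_table` are hypothetical, and the list of keys that hold cell records (`body_cells`, `row_headers`, `column_headers`) is an assumption that should be checked against real output from both schema versions. Filling a missing `location` with `None` is likewise only one possible way to handle the now-optional field.

```python
from typing import Any, Dict, Sequence

# Mapping from this issue: new field name -> old (documented) field name.
NEW_NAME_TO_OLD = {
    "row_min": "row_index_begin",
    "row_max": "row_index_end",
    "column_min": "column_index_begin",
    "column_max": "column_index_end",
    "cell_text": "text",
    "id": "cell_id",
}

# ASSUMPTION: keys of the table record whose values are lists of cell dicts.
CELL_LIST_KEYS = ("body_cells", "row_headers", "column_headers")


def to_canonical_cell(cell: Dict[str, Any]) -> Dict[str, Any]:
    """Rename new-style keys to old-style keys; other keys pass through."""
    return {NEW_NAME_TO_OLD.get(key, key): value for key, value in cell.items()}


def to_canonical_table(
    table: Dict[str, Any],
    cell_list_keys: Sequence[str] = CELL_LIST_KEYS,
) -> Dict[str, Any]:
    """Return a shallow copy of a table record in the (assumed) canonical schema."""
    out = dict(table)
    # "location" appears to be optional in the new output, so make its
    # absence explicit instead of letting downstream code raise KeyError.
    out.setdefault("location", None)
    for key in cell_list_keys:
        if key in out:
            out[key] = [to_canonical_cell(cell) for cell in out[key]]
    return out
```

Records that already use the old names pass through unchanged, so the existing Pandas conversion could call `to_canonical_table` on every table record without first detecting which schema it is in.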