Open
Description
In the context of OERSI (see https://gitlab.com/oersi/oersi-etl/-/issues/360), we need a fix function that strips all HTML tags from a text and replaces HTML-encoded special characters (entities such as `&amp;`) with their plain-text equivalents.
One possible implementation:
html_to_text {
    @Override
    public void apply(final Metafix metafix, final Record record, final List<String> params, final Map<String, String> options) {
        record.transform(params.get(0), s -> Jsoup.parse(s).wholeText());
    }
},
Based on the idea here: https://stackoverflow.com/questions/3607965/how-to-convert-html-text-to-plain-text
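To pin down the expected behaviour (for example as a starting point for test cases), here is a minimal, dependency-free sketch. It is not the Jsoup-based approach proposed above, only a naive regex/replace stand-in that handles a handful of named entities. The class `HtmlToTextSketch` and the method `htmlToText` are hypothetical names and are not part of Metafix or Jsoup.

```java
import java.util.regex.Pattern;

public class HtmlToTextSketch {

    // Matches anything that looks like a tag, e.g. <p>, </b>, <br/>, <a href="...">.
    // Assumption: no '>' inside attribute values; a real HTML parser handles that case.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    /** Hypothetical helper illustrating the intended input/output only. */
    public static String htmlToText(final String html) {
        // 1. Drop tags first, so encoded brackets (&lt; / &gt;) are never mistaken for markup.
        final String withoutTags = TAG.matcher(html).replaceAll("");
        // 2. Decode a few named entities; &amp; goes last to avoid double decoding
        //    (otherwise "&amp;lt;" would end up as "<" instead of "&lt;").
        return withoutTags
                .replace("&nbsp;", "\u00A0")
                .replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&amp;", "&");
    }

    public static void main(final String[] args) {
        System.out.println(htmlToText("<p>Math &amp; <b>Physics</b></p>")); // prints "Math & Physics"
        System.out.println(htmlToText("a &lt; b"));                         // prints "a < b"
    }
}
```

A parser-based solution such as the Jsoup call above would additionally cope with malformed markup, numeric entities and the full set of named entities, which is why this sketch is only meant to document the expected results.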
Not sure whether we should wait for #706.
Status
Ready