Fixed the GetUTF8Text method to return nullptr instead of asserting #4451
Description
This PR fixes a crash in the `GetUTF8Text` method by returning `nullptr` instead of asserting when `best_choice` is `nullptr`.
Background
The issue was originally reported in sirfz/tesserocr#324, where a crash/assertion failure occurred when passing an empty image to tesserocr/Tesseract. The error message was:
While reproducing the issue with a test C++ program, I encountered a similar crash in another location:
Changes
Updated `GetUTF8Text` to safely return `nullptr` if `best_choice` is `nullptr`, instead of triggering an assertion failure.
Additional Notes
I tested this with Warp2, which suggested updating multiple instances of `ASSERT_HOST(best_choice != nullptr);`. This PR addresses the crash only in the context of `GetUTF8Text`.
There are still 3 other occurrences of this assertion in the following files:
- `src/ccmain/recogtraining.cpp`
- `src/ccstruct/pageres.cpp`
It's unclear whether those should also be replaced, as they may be used in different contexts (e.g., training or layout analysis). Further review is recommended before changing them.
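For reviewers, the guard described under Changes can be sketched roughly as below. This is not the actual diff: `Choice`, `WordResult`, and `GetTextOrNull` are hypothetical stand-ins for the Tesseract types and method involved, and the sketch only illustrates replacing a hard assertion with an early `nullptr` return.

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a recognized word choice.
struct Choice {
  std::string text;
};

// Hypothetical stand-in for a per-word result; best_choice may be null,
// for example when recognition ran on an empty image.
struct WordResult {
  std::unique_ptr<Choice> best_choice;
};

// Returns a heap-allocated, NUL-terminated copy of the text (caller frees it
// with delete[]), or nullptr if any word has no best_choice.
char *GetTextOrNull(const std::vector<WordResult> &words) {
  std::string out;
  for (const auto &word : words) {
    if (word.best_choice == nullptr) {
      // Previously an assertion would abort the process here.
      return nullptr;
    }
    out += word.best_choice->text;
    out += ' ';
  }
  char *result = new char[out.size() + 1];
  std::memcpy(result, out.c_str(), out.size() + 1);
  return result;
}
```

With this shape, callers are expected to check the returned pointer for `nullptr` before using it, which the assertion-based version never required because it aborted instead.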
Reproduction
Here is a minimal test program that reproduces the crash (when using an empty image):