FEAT: Image recognition in PDF files #1318
base: main
enabled image recognition in PDF files
@onebottlekick please read the following Contributor License Agreement (CLA). If you agree with the CLA, please reply with the following information.
Contributor License Agreement
Contribution License Agreement
This Contribution License Agreement (“Agreement”) is agreed to by the party signing below (“You”),
There is already a PR addressing this. I'm not sure which implementation is better, but I'd rather reuse the existing LLM caption helper than create a new one for each converter. That helper has already been refactored in PR #1254. Your LLM implementation is also missing the `llm_prompt` argument that the image converter and the `llm_caption` helper accept. That argument was not being passed down correctly, and I opened a PR myself to fix it, so it really would be better to have a single implementation rather than several that take different arguments.
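To make the suggestion concrete, here is a minimal sketch of the single-helper approach, not the project's actual code. Every name in it (`caption_image`, `convert_image`, `convert_pdf`, `llm_call`, `DEFAULT_PROMPT`) is hypothetical, and the LLM client is replaced by a plain callable so the example runs on its own. The point it illustrates is that both converters go through one caption helper and forward the same `llm_prompt` keyword.

```python
from typing import Callable, List, Optional

# Assumed default prompt; the real project's wording may differ.
DEFAULT_PROMPT = "Write a detailed caption for this image."

# Hypothetical stand-in for an LLM client: (model, prompt, image_bytes) -> text.
LlmCall = Callable[[str, str, bytes], str]


def caption_image(
    image: bytes,
    *,
    llm_call: Optional[LlmCall],
    model: Optional[str],
    prompt: Optional[str] = None,
) -> Optional[str]:
    """Shared helper: the only place that talks to the LLM."""
    if llm_call is None or model is None:
        return None  # captioning is optional; skip when no client is configured
    return llm_call(model, prompt or DEFAULT_PROMPT, image)


def convert_image(image: bytes, **kwargs) -> str:
    """Image converter: forwards llm_prompt to the shared helper."""
    caption = caption_image(
        image,
        llm_call=kwargs.get("llm_call"),
        model=kwargs.get("llm_model"),
        prompt=kwargs.get("llm_prompt"),
    )
    return f"# Description:\n{caption}" if caption else ""


def convert_pdf(page_text: str, images: List[bytes], **kwargs) -> str:
    """PDF converter: reuses the image path instead of its own LLM code."""
    parts = [page_text]
    for image in images:
        described = convert_image(image, **kwargs)
        if described:
            parts.append(described)
    return "\n\n".join(parts)


def fake_llm(model: str, prompt: str, image: bytes) -> str:
    """Fake client used only for demonstration."""
    return f"[{model}] {prompt} ({len(image)} bytes)"


result = convert_pdf(
    "Hello",
    [b"abc"],
    llm_call=fake_llm,
    llm_model="demo-model",
    llm_prompt="Describe the chart.",
)
print(result)
```

Because the PDF path delegates to the image path, a change to how `llm_prompt` is handled only has to be made in one place.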
added llm prompt for image recognition
enabled image recognition in PDF files