This project is a library for extracting data from PDF files and generating a summary from the extracted content.
- Extracts data from objects embedded in PDF documents (images, tables, charts)
- Generates a concise summary from the extracted information
- Requires Python 3.12.10
Clone the repository:
git clone https://github.com/techflare641/pdf-parsing-with-llm.git
cd pdf-parsing-with-llm

Install dependencies (if applicable):
python -m venv venv
./venv/scripts/activate
pip install -r requirements.txt

Run test scripts:
./test.bat
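
The setup commands in this README assume a Windows shell (the `scripts` activation path and `test.bat`). As a minimal sketch of the same environment setup on any platform, the helper below uses only the Python standard library. The function names `venv_python` and `setup` are hypothetical and are not part of this repository. The sketch assumes it is run from the repository root, where `requirements.txt` lives.

```python
"""Hypothetical setup helper; not part of the repository."""
import os
import subprocess
import venv
from pathlib import Path


def venv_python(env_dir: Path, os_name: str = os.name) -> Path:
    """Return the interpreter path inside a virtual environment.

    Windows ("nt") places executables in Scripts/, other platforms in bin/.
    """
    if os_name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def setup(env_dir: Path = Path("venv"),
          requirements: Path = Path("requirements.txt")) -> None:
    """Create the virtual environment and install the dependencies into it."""
    # with_pip=True bootstraps pip inside the new environment.
    venv.create(env_dir, with_pip=True)
    python = venv_python(env_dir)
    # Calling the environment's own interpreter installs into that
    # environment without needing to activate it first.
    subprocess.run(
        [str(python), "-m", "pip", "install", "-r", str(requirements)],
        check=True,
    )
```

Calling `setup()` creates `venv` and installs the requirements into it. Running the tests still requires `test.bat` or an equivalent script for the platform in use.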