A Sanic-based web service that processes PDF files with OCR text detection and insertion. Users upload a PDF, the service processes it asynchronously with PaddleOCR, and it returns the modified PDF with the detected text embedded at the detected positions. The UI is built with Vue.js for simplicity.
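
Embedding text "at the detected positions" involves a change of coordinate system: OCR boxes are usually reported in image pixels with the origin at the top-left, while PDF pages are measured in points with the origin at the bottom-left. The sketch below shows one way to do that conversion. It assumes the page was rendered at a uniform `scale` (pixels per point) and that each box is a list of four `[x, y]` corners. The function name and parameters are hypothetical and are not taken from this repository. Page rotation and crop-box offsets are ignored.

```python
from typing import Sequence, Tuple

Point = Sequence[float]


def pixel_box_to_pdf_rect(
    box: Sequence[Point], scale: float, page_height_pt: float
) -> Tuple[float, float, float, float]:
    """Map a 4-corner pixel box to (left, bottom, right, top) in PDF points.

    Hypothetical helper: assumes the image was rendered at `scale` pixels
    per point and that the pixel origin is the top-left corner of the page.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    # Undo the render scale, then flip the y axis (PDF origin is bottom-left).
    xs = [x / scale for x, _ in box]
    ys = [page_height_pt - y / scale for _, y in box]
    return min(xs), min(ys), max(xs), max(ys)
```

For example, on a 792 pt tall page rendered at `scale=2.0`, a box spanning pixels (100, 200) to (300, 240) maps to the rectangle (50, 672, 150, 692) in points.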
- Clone the repository:

      git clone https://github.com/[your-repo]/pdf-ocr-service.git
      cd pdf-ocr-service

- Install dependencies (create a virtual environment first if preferred):

      pip install -r requirements.txt
      # Ensure requirements.txt includes:
      # sanic sanic-jinja2 sanic-session paddleocr numpy pypdfium2

- Install PaddleOCR models:

      # The Latin language model is required (ensure your environment meets PaddleOCR's prerequisites)
      pip install paddleocr[extra]

- Start the Sanic application:

      sanic app --port 5000

  The server runs on http://localhost:5000.
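
If the server fails to start with an import error, a quick check of the packages named in the install step can narrow down which one is missing. This is a minimal sketch using only the standard library. The import names in `REQUIRED_MODULES` are assumptions derived from the package names (for example `sanic_jinja2` for `sanic-jinja2`), and `missing_modules` is a hypothetical helper, not part of this project.

```python
from importlib.util import find_spec

# Import names assumed from the package list in the install step.
REQUIRED_MODULES = [
    "sanic",
    "sanic_jinja2",
    "sanic_session",
    "paddleocr",
    "numpy",
    "pypdfium2",
]


def missing_modules(names):
    """Return the names that cannot be located on the current import path."""
    return [name for name in names if find_spec(name) is None]
```

Calling `missing_modules(REQUIRED_MODULES)` inside the project's virtual environment returns an empty list when everything is installed.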
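
Alternatively, build and run the service with Docker. The volume mount keeps the contents of `/root/.paddleocr` in a local `.paddleocr` directory between runs:

    docker build -t ocr-pdf-tool .
    docker run -it --rm -p 5000:5000 -v .paddleocr:/root/.paddleocr ocr-pdf-tool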

- Open the browser and go to http://localhost:5000.
- Select a PDF file from your computer and click "Process PDF". The output PDF will be downloaded automatically.
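
For scripted use, the upload can also be done from Python. The sketch below mirrors the request shape of the curl command in this section (a multipart POST with a `file` field sent to `/process-pdf`) and uses only the standard library. The helper names `build_multipart` and `process_pdf` are hypothetical, and it assumes the service replies with the processed PDF as the response body.

```python
import uuid
import urllib.request
from pathlib import Path


def build_multipart(field, filename, payload, content_type="application/pdf"):
    """Encode one file as multipart/form-data.

    Returns (body, header), where header is the Content-Type value to send.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def process_pdf(src, dst, url="http://localhost:5000/process-pdf"):
    """POST the PDF at src to the service and write the response body to dst."""
    pdf = Path(src)
    body, header = build_multipart("file", pdf.name, pdf.read_bytes())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST"
    )
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(request) as response:
        Path(dst).write_bytes(response.read())
```

With the server running, `process_pdf("path/to/your/file.pdf", "output.pdf")` would save the returned file as `output.pdf`.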

To process a PDF from the command line instead:

    curl -X POST -F 'file=@path/to/your/file.pdf' http://localhost:5000/process-pdf -o output.pdf

| Package | Description |
|---|---|
| Sanic | Fast Python web server |
| PaddleOCR | OCR engine from PaddlePaddle |
| pypdfium2 | PDF rendering and modification library |
| numpy | Array processing |
| Vue.js (template) | Lightweight frontend framework |

This project is licensed under the Apache License 2.0. See LICENSE for details.

Contributions are welcome! For issues or feature requests, please open an issue.