A few thingz


Joseph Basquin


21/12/2024

Working on PDF files with Python

There are many solutions to work on PDF files with Python. Depending on whether you need to read, parse data, extract tables, modify (split, merge, crop...), or create a new PDF, you will need different tools.

Here is a quick diagram of some common tools I have used:

If you need to extract data from image PDF files, it's a whole different story, and you might need to use OCR libraries like (Py)Tesseract or other tools.

Have some specific data conversion / extraction needs? Please contact me for consulting - a little script can probably automate hours of manual processing in a few seconds!

← Other articles

My blog – Joseph Basquin

twitter
email
github
linkedin
freelancing

Available for freelancing: Python expert / R&D / Automation / Embedded / Audio / Data / UX

I create products such as SamplerBox, YellowNoiseAudio, Jeux d'orgues, this blogging engine, etc.

Articles about: #all, #music, #opensource, #python.