A few thingz


Joseph Basquin


29/03/2024

Working on PDF files with Python

There are many ways to work with PDF files in Python. Depending on whether you need to read a file, parse its data, extract tables, modify it (split, merge, crop...), or create a new PDF, you will need different tools.
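As an illustration of the "modify" case, here is a minimal sketch of splitting and merging, assuming the pypdf library is installed (`pip install pypdf`). pypdf is just one possible choice, and the helper names `split_pdf` and `merge_pdfs` are hypothetical, made up for this example.

```python
from pathlib import Path


def split_pdf(src, out_dir):
    """Write each page of src to its own PDF and return the new paths."""
    # Assumption: pypdf is installed; imported here so the helpers can be
    # defined even on a machine without it.
    from pypdf import PdfReader, PdfWriter

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, page in enumerate(PdfReader(src).pages, start=1):
        writer = PdfWriter()
        writer.add_page(page)
        path = out_dir / f"page_{number:03d}.pdf"
        with open(path, "wb") as f:
            writer.write(f)
        paths.append(path)
    return paths


def merge_pdfs(sources, dest):
    """Concatenate the pages of all source PDFs, in order, into dest."""
    from pypdf import PdfReader, PdfWriter

    writer = PdfWriter()
    for src in sources:
        for page in PdfReader(src).pages:
            writer.add_page(page)
    with open(dest, "wb") as f:
        writer.write(f)
```

Typical usage would be `parts = split_pdf("input.pdf", "parts")` followed by `merge_pdfs(parts[:2], "first_two_pages.pdf")`, where the file names are placeholders.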

Here is a quick diagram of some common tools I have used:

If you need to extract data from scanned (image-only) PDF files, it's a whole different story: you will probably need an OCR library such as (Py)Tesseract, or other tools.
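A rough sketch of the OCR route, under several assumptions: the pages are rendered to images with pdf2image (which needs Poppler installed), and the text is recognized with pytesseract (which needs the Tesseract binary). The function name `ocr_pdf` and its `render` / `recognize` parameters are hypothetical; they only exist so that either step can be swapped for another tool.

```python
def ocr_pdf(path, dpi=300, lang="eng", render=None, recognize=None):
    """Return one recognized text string per page of a scanned PDF."""
    if render is None:
        # Assumption: pdf2image and Poppler are installed.
        from pdf2image import convert_from_path

        def render(p):
            return convert_from_path(p, dpi=dpi)

    if recognize is None:
        # Assumption: pytesseract and the Tesseract binary are installed.
        import pytesseract

        def recognize(image):
            return pytesseract.image_to_string(image, lang=lang)

    # Render every page to an image, then run OCR on each image in order.
    return [recognize(image) for image in render(path)]
```

Calling `ocr_pdf("scan.pdf")` (a placeholder file name) would return a list of strings, one per page. OCR quality depends heavily on the scan, so the output usually needs some cleanup before any data can be parsed from it.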

Have some specific data conversion / extraction needs? Please contact me for consulting - a little script can probably automate hours of manual processing in a few seconds!
