pdf-processing – Shekhar Gulati

I am working on a regulatory intelligence product. We are using LLMs in multiple document processing tasks. One of the tasks that I was working on today required me to extract text from a PDF document. The PDF in question is a circular from SAMA (The Saudi Central Bank). The 2-page long PDF is in Arabic language and looks like a natively digital document with consistent formatting rather than a scanned document.

The task that I wanted to perform was to extract the Arabic text and translate it to English. This looks like an easy task that most LLMs should be able to do since they all support PDF processing and they can read and write different languages.

Below is the screenshot of the first page of the document.

I tried four models on this PDF: gpt-4o via https://chatgpt.com/, Grok 3 via https://grok.com/, Gemini 2.0 Flash via https://gemini.google.com/, and Claude 3.7 Sonnet via https://claude.ai/. Below are my findings.

For all the models, I just uploaded the PDF and prompted Extract text as English.

Tag: pdf-processing

Claude 3.7 Sonnet is good at PDF processing