For the last couple of days I’ve been working on optimizing the Docker image size of a PDF-processing microservice. The service uses Docling, an open-source library developed by IBM Research that internally uses PyTorch. Docling can extract text from PDFs and various other document types. Here’s a simplified version of our FastAPI microservice that wraps Docling’s functionality:
```python
import os
import shutil
from pathlib import Path

from docling.document_converter import DocumentConverter
from fastapi import FastAPI, UploadFile

app = FastAPI()

UPLOAD_DIR = "uploads"
os.makedirs(UPLOAD_DIR, exist_ok=True)

# Created once at startup; the converter loads its models on first use.
converter = DocumentConverter()


@app.post("/")
async def root(file: UploadFile):
    # Persist the uploaded file so Docling can read it from disk.
    file_location = os.path.join(UPLOAD_DIR, file.filename)
    with open(file_location, "wb") as buffer:
        shutil.copyfileobj(file.file, buffer)

    # Convert the document and export the result as Markdown.
    result = converter.convert(Path(file_location))
    md = result.document.export_to_markdown()
    return {"filename": file.filename, "text": md}
```
The microservice workflow is straightforward:

- Files are uploaded to the `uploads` directory
- The Docling converter processes the uploaded file and converts it to Markdown
- The Markdown content is returned in the response
Here are the dependencies listed in `requirements.txt`:

```
fastapi==0.115.8
uvicorn==0.34.0
python-multipart==0.0.20
docling==2.18.0
```
You can test the service using this cURL command:

```
curl --request POST \
  --url http://localhost:8000/ \
  --header 'content-type: multipart/form-data' \
  --form file=@/Users/shekhargulati/Downloads/example.pdf
```
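If you’d rather test from Python, here is an equivalent request using the `requests` library (not in `requirements.txt`, so install it separately; the file path is just a placeholder):

```python
# Upload a PDF to the running service and print part of the extracted Markdown.
import requests

with open("example.pdf", "rb") as f:  # placeholder path; use any local PDF
    resp = requests.post(
        "http://localhost:8000/",
        files={"file": ("example.pdf", f, "application/pdf")},
    )

print(resp.json()["text"][:500])  # first 500 characters of the Markdown output
```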
On the first request, Docling downloads the required models from Hugging Face and stores them locally. On my Intel Mac, the initial request for a 4-page PDF took 137 seconds, while subsequent requests took less than 5 seconds. For production environments, a GPU-enabled machine is recommended for better performance.
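To avoid paying that download cost on the first request, you can pre-fetch the models when the image is built or at startup. Here’s a minimal sketch assuming your Docling version ships the `docling.utils.model_downloader` utility (recent 2.x releases do, but check yours):

```python
# prefetch_models.py -- hypothetical helper to warm Docling's model cache.
# Assumes docling.utils.model_downloader is available in your docling version.
from docling.utils.model_downloader import download_models

if __name__ == "__main__":
    # Downloads the default model artifacts into Docling's local cache,
    # so the first HTTP request doesn't trigger the download.
    download_models()
```

You could run this as a `RUN python prefetch_models.py` step in the Dockerfile, trading a bigger image for a fast first request.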
The Docker Image Size Problem
Initially, building the Docker image with this basic Dockerfile resulted in a massive 9.74GB image:
```dockerfile
FROM python:3.12-slim

RUN apt-get update \
    && apt-get install -y

WORKDIR /app

COPY . .

RUN pip install -r requirements.txt

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```
```
docling-blog   v1   51d223c334ea   22 minutes ago   9.74GB
```
The image is so large because PyTorch’s default pip installation pulls in CUDA packages and other GPU-related dependencies, which aren’t necessary for CPU-only deployments.
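You can see this for yourself: on Linux, the default `torch` wheel declares a long list of `nvidia-*` CUDA wheels as dependencies, and those account for most of the bulk. A quick check, run inside the v1 image (or any Linux environment with the default torch install):

```python
# List the nvidia-* CUDA wheels that the default torch install depends on.
from importlib.metadata import requires

for dep in requires("torch") or []:
    if dep.startswith("nvidia"):
        print(dep)
```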
The Solution
To optimize the image size, modify the pip installation command to download only CPU-related packages using PyTorch’s CPU-specific package index. Here’s the optimized Dockerfile:
```dockerfile
FROM python:3.12-slim

RUN apt-get update \
    && apt-get install -y \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

COPY . .

RUN pip install --no-cache-dir -r requirements.txt \
    --extra-index-url https://download.pytorch.org/whl/cpu

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```
Building with this optimized Dockerfile reduces the image size significantly:
```
docling-blog   v2   ac40f5cd0a01   4 hours ago   1.74GB
```
The key changes that enabled this optimization:

- Added `--no-cache-dir` to prevent pip from caching downloaded packages inside the image
- Used `--extra-index-url https://download.pytorch.org/whl/cpu` so pip resolves the CPU-only PyTorch wheels (published with a `+cpu` version suffix) instead of the default CUDA-enabled ones; this accounts for most of the savings
- Added `rm -rf /var/lib/apt/lists/*` to clean up the apt package lists
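A quick way to confirm that the CPU-only build actually made it into the optimized image is to check PyTorch from inside a running container:

```python
# Inside the v2 container: confirm the CPU-only PyTorch wheel is installed.
import torch

print(torch.__version__)          # CPU wheels report a "+cpu" suffix, e.g. "2.x.x+cpu"
print(torch.cuda.is_available())  # False on the CPU-only build
```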
This optimization reduces the Docker image size by approximately 82%, making it more practical for deployment and distribution.