Processing PDFs with Mathpix Python SDK
The Mathpix Python SDK allows you to process entire PDF files and extract content such as text, LaTeX, Mathpix Markdown (MMD), and more.
Code Example
from mpxpy.mathpix_client import MathpixClient

client = MathpixClient(
    app_id="your-app-id",
    app_key="your-app-key"
)

# Process a PDF file
pdf_file = client.pdf_new(
    file_url="http://cs229.stanford.edu/notes2020spring/cs229-notes1.pdf",
    conversion_formats={
        "md": True
    }
)

# Wait until the processing is complete
pdf_file.wait_until_complete(timeout=60)

# Download the converted files to a local folder
pdf_file.download_output_to_local_path("md", "./output")
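After the download step, it can help to confirm what actually landed in the output folder. The following is a minimal sketch using only the Python standard library; it does not call the SDK. The helper name find_markdown_files is hypothetical, and because the filenames the SDK writes are not stated here, the sketch assumes the Markdown output can be recognized by a .md or .mmd extension.

```python
from pathlib import Path


def find_markdown_files(folder):
    """Return Markdown files under `folder` (searched recursively), sorted by path."""
    root = Path(folder)
    if not root.is_dir():
        # Nothing was downloaded, or the folder path is wrong.
        return []
    # Assumption: output files are identified by extension, not by a known name.
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in {".md", ".mmd"}
    )


if __name__ == "__main__":
    # "./output" matches the folder used in the download call above.
    for path in find_markdown_files("./output"):
        print(path, path.stat().st_size, "bytes")
```

An empty result suggests that processing did not finish within the timeout or that the files were written somewhere else.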
Rendered Output Example: From PDF to MMD
This example shows how the Mathpix Python SDK can process a PDF and convert its content into structured Markdown with math formatting.

The converted result includes structured text, LaTeX math, and even tables extracted from the original document.
This makes it practical to turn scientific PDFs into clean, editable Markdown that you can post-process or publish.
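As one illustration of post-processing, the sketch below pulls the display equations out of a converted Markdown string. It uses only the standard library, and the helper name extract_display_math is hypothetical. It assumes display math is delimited by $$ ... $$ or \[ ... \]; inline math and LaTeX environments such as equation are not handled, and the sample text is made up for the example.

```python
import re

# Assumed display-math delimiters: $$ ... $$ and \[ ... \].
# DOTALL lets an equation span several lines.
DISPLAY_MATH = re.compile(r"\$\$(.+?)\$\$|\\\[(.+?)\\\]", re.DOTALL)


def extract_display_math(markdown_text):
    """Return the LaTeX body of each display equation, in document order."""
    equations = []
    for match in DISPLAY_MATH.finditer(markdown_text):
        # Exactly one of the two groups is set, depending on the delimiter.
        body = match.group(1) or match.group(2)
        equations.append(body.strip())
    return equations


# Illustrative input, not real SDK output.
sample = r"""Cost function:
$$
J(\theta) = \frac{1}{2} \sum_i (h(x_i) - y_i)^2
$$
Update rule: \[ \theta := \theta - \alpha \nabla J(\theta) \]
"""

for equation in extract_display_math(sample):
    print(equation)
```

In practice the input would be the text of a downloaded Markdown file, and the extracted equations could be checked, indexed, or rendered separately.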