September 5, 2023

Deployed new version (v3/pdf*), RSK-P101.

Overview of changes

This update adds basic support for text in the footnote section of the page. Instead of breaking the main flow, especially in multi-column documents,
the text will be wrapped inside \footnotetext{ ... }.
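Clients that want to handle footnotes separately from the main flow could pull them out of the returned text. The Python sketch below is only an illustration: the helper name split_footnotes is hypothetical and not part of the API, and it assumes the braces inside each footnote are balanced.

```python
def split_footnotes(text: str) -> tuple[str, list[str]]:
    """Return (main_text, footnotes) for a string containing \\footnotetext{...}.

    Assumes braces inside each footnote are balanced; escaped braces
    (\\{ and \\}) are treated as ordinary characters.
    """
    marker = "\\footnotetext{"
    main_parts: list[str] = []
    footnotes: list[str] = []
    pos = 0
    while True:
        start = text.find(marker, pos)
        if start == -1:
            main_parts.append(text[pos:])
            break
        main_parts.append(text[pos:start])
        i = start + len(marker)
        body_start = i
        depth = 1
        while i < len(text) and depth > 0:
            ch = text[i]
            if ch == "\\" and i + 1 < len(text):
                i += 2  # skip an escaped character such as \{ or \}
                continue
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
            i += 1
        if depth != 0:
            # Unbalanced braces: keep the remainder as main text, untouched.
            main_parts.append(text[start:])
            break
        footnotes.append(text[body_start:i - 1].strip())
        pos = i
    return "".join(main_parts), footnotes
```

Calling split_footnotes on a page string returns the main text with the footnote commands removed, plus a list of footnote bodies in document order.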

May 25, 2023

Deployed new version, RSK-115.

Overview of changes

This update improves the quality of printed and handwritten Chinese recognition.

April 12, 2023

Deployed new version, RSK-113.

Overview of changes

This update is focused on chemistry recognition.
As a reminder, when "include_smiles": true is part of the request, Mathpix can recognize chemistry diagrams and return their SMILES representation.
The list of improvements:
  • support for stereochemistry
    • for example, the image stereochemistry_1 is transcribed to: <smiles>O=S(=O)(c1ccc(F)cc1)N1C[C@@H](O)[C@H](N2CCCC2)C1</smiles>
  • support for Markush structures
    • for example, the image markush_structure_0 is transcribed to <smiles>[Z2]Nc1c(CC([R10])CSC)ncn1CC#C</smiles>
  • basic support for superatoms
    • more superatoms will be supported in the future
  • significant recognition accuracy improvement for both handwritten and printed chemistry diagrams
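As a rough sketch of how a client might use this, the Python below builds a request body with "include_smiles": true and extracts the <smiles> ... </smiles> spans from a returned string. The "src" field name and both helper names are assumptions made for illustration; only the include_smiles flag and the <smiles> tags come from the notes above.

```python
import re

# Matches the <smiles> ... </smiles> wrapper shown in the examples above.
SMILES_TAG = re.compile(r"<smiles>(.*?)</smiles>", re.DOTALL)


def build_smiles_request(image_url: str) -> dict:
    """Request body asking for SMILES output ("src" is an assumed field name)."""
    return {"src": image_url, "include_smiles": True}


def extract_smiles(text: str) -> list[str]:
    """Return every SMILES string found in the recognized text, in order."""
    return [match.strip() for match in SMILES_TAG.findall(text)]
```

The extracted strings can then be passed to a cheminformatics toolkit of your choice for validation or rendering.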

February 22, 2023

Deployed new version, RSK-111.

Overview of changes

  • Added support for the new table recognition algorithm
  • Minor general changes needed to properly support the algorithm

Table recognition - new algorithm

A new algorithm is available in the v3/text and v3/pdf endpoints. It can be enabled by specifying "enable_tables_fallback": true as one of the request arguments.
We care deeply about backwards compatibility. The new algorithm will only be used if both of the following conditions are fulfilled:
  • Our standard algorithm failed to recognize a table.
  • "enable_tables_fallback": true is specified as a request argument.
We have ensured that there will be no computational overhead for customers who do not specify this option, so response times will not be affected.
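A minimal sketch of opting in from client code is shown below. The helper name and the "src" field are assumptions for illustration; only the "enable_tables_fallback" argument is taken from this note. The function only builds the JSON body and does not send anything.

```python
import json


def build_text_request(image_url: str, tables_fallback: bool = False) -> str:
    """Serialize a request body, adding the fallback flag only when asked.

    "src" is an assumed field name; "enable_tables_fallback" is from the note.
    """
    body: dict = {"src": image_url}
    if tables_fallback:
        # Opt in to the new table algorithm when the standard one fails.
        body["enable_tables_fallback"] = True
    return json.dumps(body)
```

Leaving the key out by default mirrors the opt-in design described above: existing requests stay byte-for-byte the same unless the caller asks for the fallback.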
We have invested in a hybrid approach that will be able to tackle complicated cases like:
  • extremely large tables (e.g. tables with hundreds of cells)
  • tables with very complex structures (e.g. tables with many \multirow and \multicolumn cells)
  • tables featuring table cells with complex content like:
    • table cells with complex math like large matrices or several aligned equations
    • table cells containing whole paragraphs of text
  • tables containing text in languages that are more challenging to recognize properly compared to English:
    • this includes languages with rich alphabets like Chinese, Japanese, Hindi, Hebrew, Arabic, and others
  • tables containing cells with rotated text
  • tables containing diagrams inside cells like:
    • table cells with chemistry diagrams (note that these can be converted to SMILES)
    • table cells with natural images or similar:
      • table will still be recognized and contain the image link for the diagram inside its cells
We will also support all combinations of the above cases.
The algorithm we are releasing now might still struggle with:
  • tables with cells containing complex math like large matrices or several aligned equations
  • tables containing diagrams inside cells
  • empty grids of cells without textual content
  • tables with rotated text are partially supported:
    • in v3/pdf cells containing rotated text will be embedded as images
We will shortly add improvements that cover the cases listed above.
Some differences in output produced by the new algorithm compared to the standard one:
  • Column alignment is always central.
  • All cells have all borders (top, bottom, right, and left).

February 1, 2023

Deployed new version, RSK-110.

Overview of changes

  • Added support for new LaTeX commands: \measuredangle, \grave, \bumpeq, and \amalg.
  • Improved recognition of constructs expressed with \lceil, \rceil, \lfloor, and \rfloor in combination with \left and \right.
  • Improved formatting of equations in text mode (see below for details).
  • Improved recognition of equations that contain large sub-equations in subscripts.
  • Improved recognition of handwritten French.
  • Improved recognition of handwritten German.
  • Improved recognition of handwritten Chinese.
  • Improved recognition of handwritten Japanese.
  • General improvements (new data iteration).

Formatting changes

The default “text” output has changed. For example, see the following equation:

from the current “text” output:
\( y=mx+b \)\n\( x=y^{2}-1 \)
and 2 asciimath equation outputs, to:
\( \begin{array}{l} y = mx + b \\ x=y^{2}-1 \end{array} \)
with 1 single asciimath equation output:
which will make the “text” derived formats more consistent with what is currently returned for equations with a left brace:
which currently yields this “text” output:
\( \left\{\begin{array}{l}2 x+8 y=21 \\ 6 x-4 y=14\end{array}\right. \)
and this “asciimath” output:
Since we already emit, in certain cases (e.g., equations with no braces that are aligned around the “=” sign instead of being left-aligned), asciimath for v3/text that looks like:
we consider this update to be an inconsistency bugfix instead of a new feature with the potential to break backwards compatibility.
In general, it is desirable for small changes in input to our API to result in small changes in output. For example, removing the left brace from equation 2 will simply change the v3/text asciimath from:
which is a smaller change than the previous behavior, in which removing a left brace resulted in two equations instead of one.
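To make the relationship between the old and new “text” forms concrete, here is a hedged Python sketch that rewrites old-style output (one \( ... \) per line) into the single left-aligned array form. The helper merge_inline_equations is hypothetical, not something the API provides, and it assumes the input consists only of inline equations.

```python
import re

# Matches one inline equation: \( ... \), capturing the trimmed contents.
INLINE_MATH = re.compile(r"\\\(\s*(.*?)\s*\\\)", re.DOTALL)


def merge_inline_equations(text: str) -> str:
    """Combine consecutive '\\( ... \\)' equations into one left-aligned array.

    Assumes the input holds nothing but inline equations; a single equation
    (or none) is returned unchanged.
    """
    equations = INLINE_MATH.findall(text)
    if len(equations) < 2:
        return text
    rows = " \\\\ ".join(equations)
    return "\\( \\begin{array}{l} " + rows + " \\end{array} \\)"
```

A helper like this might be useful when comparing stored results from before the update with fresh results, since whitespace inside each equation is carried over as-is.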

January 30, 2023

Deployed new version, RSK-109.
Improved recognition of isolated symbols.

January 12, 2023

Deployed new version, RSK-108.
Improved handling of images that contain mixed math and text in Russian.

December 2, 2022

Deployed new version, RSK-107p2.
This update features two changes:
  • changes in the spacing of arrays, aligned arrays, and similar environments: & and \\ now always have spaces around them (even with rm_spaces in the request)
  • visually unpleasant blocks of equations are being converted to left alignment instead of keeping the wrong alignment

November 25, 2022

Deployed new version, RSK-107.
Improvements related to worksheet crops, small images with a strong or dashed border near the content.

November 22, 2022

Deployed new version, RSK-106.
Improvements related to formatting of references in PDF pages, especially pages with green/red link boxes.

November 15, 2022

Deployed new version, RSK-105.
General improvements to handling zoomed out and zoomed in images. No changes to output formatting or error characteristics.

November 14, 2022

Deploying new version RSK-104p1.
Formatting of block math is fixed in certain cases where the equations were wrongly kept in the text mode.

November 11, 2022

Deploying new version RSK-104.
Incremental improvement of image parsing module. Includes fixes for images with many lines of text. Accuracy improvements on handwritten data. No changes to output formatting.

November 3, 2022

Deploying new version RSK-103p1 which fixes string post processing issues.
In this version, we have changed the default Markdown / LaTeX for the following character:
# -> \# 
While # works fine in Markdown and has the same behavior as \#, the former causes LaTeX compilation issues, whereas \# succeeds in LaTeX without any problem. We chose to always emit LaTeX \# instead of # so that our output would be more compatible and less likely to cause issues. The updated character \# is compatible with Markdown as well as LaTeX.
Unescaped # will simply no longer appear in OCR Markdown / LaTeX outputs.
Alternative math formats such as Asciimath are not affected by the change; this is a Markdown / LaTeX change only.
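If you keep Markdown / LaTeX strings produced before this version, a small normalization step can bring them in line with the new output. The sketch below is an assumption-laden illustration (escape_hash is a hypothetical helper, not part of the API): it escapes every # that is not already preceded by a backslash.

```python
import re

# A '#' that is not immediately preceded by a backslash.
UNESCAPED_HASH = re.compile(r"(?<!\\)#")


def escape_hash(text: str) -> str:
    """Replace each unescaped '#' with '\\#'; already escaped ones are kept."""
    return UNESCAPED_HASH.sub(r"\\#", text)
```

The function is idempotent, so it is safe to run on strings that already use \#. One known limitation of this simple lookbehind: a # that follows an escaped backslash (\\#) is left alone.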

October 21, 2022

  • New enable_spell_check option to the v3/text and v3/pdfs endpoints greatly improves handwriting OCR for English (other languages coming soon)

June 6, 2022

  • Resolved a critical bug that impacted PDF processing of 2 column PDFs
  • You can now request that only certain subsets of pages are processed in a PDF, via the new page_ranges field
  • Pushed latency improvements that benefit all endpoints, reducing processing time by 30% on average
  • You can now query hour by hour API usage using the following endpoint
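As an illustration of the page_ranges field, the Python sketch below collapses a list of page numbers into a compact range string and places it in a request body. The string format (comma-separated pages and hyphenated ranges, such as 2,4-6), the "url" field name, and both helper names are assumptions; check the API reference for the exact syntax.

```python
def to_page_ranges(pages: list[int]) -> str:
    """Collapse page numbers into a compact string such as '1-3,5'."""
    ordered = sorted(set(pages))
    if not ordered:
        return ""
    parts: list[str] = []
    start = prev = ordered[0]
    for page in ordered[1:]:
        if page == prev + 1:
            prev = page  # still inside a consecutive run
            continue
        parts.append(str(start) if start == prev else f"{start}-{prev}")
        start = prev = page
    parts.append(str(start) if start == prev else f"{start}-{prev}")
    return ",".join(parts)


def build_pdf_request(pdf_url: str, pages: list[int]) -> dict:
    """Request body limited to the given pages ("url" is an assumed field)."""
    return {"url": pdf_url, "page_ranges": to_page_ranges(pages)}
```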

April 27, 2022

  • Updated how line data is represented for PDFs from using rectangular regions to polygonal contours (this is helpful for handwritten PDFs where text lines are generally not rectangular)
  • Added page dimensions to the line-by-line data structures
  • There are two available data structures for line-by-line data:
    • Raw PDF lines data: this is the ideal data structure for searching; does not contain contextual annotations for titles, abstracts, etc.
    • Context enhanced PDF mmd lines data: you can use this to re-create the full document, including contextual annotations for titles, abstracts, etc. (see here for syntax)
  • Published a Github repo which contains client-side code for live drawing with the Mathpix digital ink API containing a fully working example of leveraging user actions like scribbling and strikethrough to delete content
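Client code written against the earlier rectangular regions can recover an axis-aligned rectangle from a polygonal contour. The sketch below assumes a contour is a list of [x, y] points; that representation and the helper name bounding_box are assumptions for illustration, not a description of the exact response schema.

```python
def bounding_box(contour: list[list[float]]) -> tuple[float, float, float, float]:
    """Return (x, y, width, height) of the axis-aligned box around a contour."""
    if not contour:
        raise ValueError("contour must contain at least one point")
    xs = [point[0] for point in contour]
    ys = [point[1] for point in contour]
    x_min, y_min = min(xs), min(ys)
    return x_min, y_min, max(xs) - x_min, max(ys) - y_min
```

For a slanted handwritten line the box is larger than the contour itself, which is the imprecision the polygonal representation avoids.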

April 18, 2022

  • Added an EU server region (AWS region eu-central-1) to decrease latencies for European customers and also for adherence to GDPR
  • You can now use app-tokens for authenticating requests inside client side app code
  • The new app-tokens route provides an include_strokes_session_id flag, which when true, returns a strokes_session_id string that can be used inside calls to v3/strokes, enabling digital ink sessions with live updates
    • Pricing for the strokes endpoint when using session_ids can be found here
  • Add OCR support for basic handwritten PDFs

March 28, 2022

  • You can now get detailed line-by-line data for PDFs, including geometric coordinates, via the new GET v3/pdf/<pdf_id>.lines.json endpoint.
  • Better robustness for our v3/text endpoint:
    • Our ability to correctly interpret complex layouts involving math and text has improved, with much-improved edge case handling and handling of line text for skewed images and other image distortions that occur frequently in consumer photo search applications.

March 14, 2022

  • Our new OCR models feature stringent guarantees of syntactic correctness, resolving a rare but long-standing problem of occasionally malformed LaTeX strings, resulting in rendering errors due to double subscripts, double superscripts, malformed tables, and other syntax issues. This has been fixed at a fundamental level. Syntax issues are essentially completely fixed.
  • Deprecated \atop command in favor of \substack

February 8, 2022

We have recently switched to a new, faster database to save image and PDF data. Next week, we will decommission our old database. This will result in OCR API image results log data from before December 1st, 2021 becoming unavailable via the GET v3/ocr-results endpoint. Note that we have already migrated all PDF data to the new database, so there will be no data loss for PDF data.

November 15, 2021

  • Deployed an incremental update to our OCR engine, resulting in:
    • significantly improved handwriting recognition, including disambiguating symbols based on context
    • improved table parsing accuracy
    • notably fewer errors

September 2, 2021

  • Deployed a core algorithm update for our image parsing module, resulting in significantly better accuracy and edge case behavior for all endpoints
  • Updated CLI with additional conversion types
  • Significant improvements to chemical diagram OCR
  • Support for asynchronous image processing

July 27, 2021

  • Added support for Tamil, Telugu, Gujarati, and Bengali
  • Updated our OCR to use a more effective representation of Chinese characters, leading to higher accuracy and better coverage.
  • Added support for \bigcirc

July 19, 2021

  • Support for sending image binaries for lower image upload latencies
  • Support for tags which allow you to associate an attribute with your requests and subsequently retrieve the associated requests by using tags in a /v3/ocr-results query
  • PDF processing updates:
    • Fixed a bug where pages were getting skipped
    • Improved processing of PDFs with foreign languages
    • Added support for configurable math delimiters

April 12, 2021

  • Servers in Singapore for faster latencies for API customers in Asia
  • Triangle diagram OCR now supported for diagrams commonly found in trigonometry textbooks
  • Added InChI option for chemical diagram OCR

April 2, 2021

  • Added an include_word_data parameter to the v3/text endpoint, which when set to true, returns word-by-word information, with separate results, confidence values, and contour coordinates for each word
  • Beta printed chemical diagram OCR to return SMILES format
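One possible use of the word-by-word data is flagging words that may need review. The Python sketch below assumes the response carries a "word_data" list whose items have "text" and "confidence" keys; those field names, the default threshold, and the helper name are all assumptions for illustration.

```python
def low_confidence_words(response: dict, threshold: float = 0.9) -> list[str]:
    """Return the text of every word whose confidence is below the threshold.

    Assumed shape: {"word_data": [{"text": str, "confidence": float}, ...]}.
    Words without a confidence value are treated as confidence 0.0.
    """
    words = response.get("word_data", [])
    return [w["text"] for w in words if w.get("confidence", 0.0) < threshold]
```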

March 10, 2021

  • New v3/pdf API endpoint (beta)
  • PDF conversion CLI tool
  • Fixed miscellaneous bugs in v3/text processing for messy images
  • Incremental improvements to handwriting recognition and printed table recognition for all endpoints
  • Added support for the following printed characters:

February 7, 2021

January 5, 2021

  • Improved math handwriting recognition
  • Improved printed Romanian, Polish, Serbian, Ukrainian recognition
  • Added support for the following LaTeX characters:

December 1, 2020

November 12, 2020

  • added autorotation for v3/text
Images like this now work in v3/text:
The goal of automatic rotation is to pick the correct orientation for received images before any processing is done. The result of auto rotation looks like:
We will soon add these features to v3/latex and v3/batch as well. We implemented a very conservative rotation confidence threshold, meaning you should still try to call the API with a properly oriented image if possible!
API docs on autorotation:

November 9, 2020

  • v3/text general improvements
First of all, we trained our models on a larger dataset, resulting in a general accuracy increase.
Secondly, we improved the precision of predictions, at the potential cost of slightly decreasing prediction recall in some circumstances. Here is an example of an image, where previously our v3/text tried to read the bottom, cut off parts of the image:
Now, v3/text ignores these sections, resulting in a much cleaner output than before. The endpoint will still try to read everything in an image (vs the v3/latex endpoint which tries to read the main equation), but will be slightly less aggressive in reading unusual image sections in order to avoid garbage outputs.
  • chemistry diagram detection
We have added a new field in our LineData object, subtype, so that we can return more information about diagrams to API clients. Currently subtype can only be chemistry, but more diagram subtypes are coming soon.
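A client could use the new field to pick out chemistry diagrams from the returned line data. The sketch below assumes the line data arrives as a list of dictionaries; only the subtype field and its chemistry value come from this note, while the helper name and the other keys in the usage comment are hypothetical.

```python
def chemistry_lines(line_data: list[dict]) -> list[dict]:
    """Return the line entries whose subtype is "chemistry".

    Entries without a subtype key are skipped, so older responses still work.
    """
    return [line for line in line_data if line.get("subtype") == "chemistry"]


# Hypothetical usage with made-up entries:
# chemistry_lines([{"type": "diagram", "subtype": "chemistry"}, {"type": "text"}])
```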
See an example request with chemistry:
  • added ability to create and disable new API keys to OCR dashboard
  • added ability to invite users (to have access to API keys, usage statistics, image results dashboard) to OCR dashboard

October 14, 2020

  • Add support for the following characters:
  • Improved accuracy on:
    • Handwriting (math and text)
    • Hindi language recognition (printed)
    • Tables and matrices
  • We now support backgroundless PNGs with alpha channels

August 14, 2020

August 1, 2020

July 13, 2020

  • Replace \dots with either \ldots or \cdots
  • Predict empty braces when appropriate, like in Chemistry images (eg {}^)
  • Fix v3/text bug where very wide lines of text were getting skipped
  • Improved accuracy on handwritten math
  • Improved accuracy on table predictions
  • Improved accuracy on photo images of printed Hindi and Chinese text
  • Add support for \mid predictions inside set notation
  • Add support for the following languages: Czech, Turkish, Danish

July 9, 2020

July 1, 2020

  • Skip diagrams in v3/text which caused garbage results
Yet another reason to use v3/text over v3/latex! v3/text intelligently skips diagrams!