Scribe OCR - Free Web OCR That You Can Self-Host
Scribe OCR is a free, web-based application designed to help you extract text from images, proofread OCR data, and create fully digital documents. Whether you’re dealing with scanned PDFs, books, or any other document, Scribe OCR makes the process smoother and more accurate.
You can try it out live at scribeocr.com.
What Can You Do with Scribe OCR?
Scribe OCR is built for three main tasks:
- Add Searchable Text Layers to PDFs
If you’ve used tools like Adobe Acrobat to recognize text in a PDF, you know it can be frustrating when errors sneak in. Scribe OCR lets you easily correct those mistakes, ensuring your PDFs have accurate, searchable text layers. It’s an easy-to-use alternative that puts control back in your hands.
- Proofread Existing OCR Data
Got OCR data from another tool, like Tesseract or Abbyy, that needs cleaning up? Scribe OCR makes proofreading a breeze. By accurately aligning text with the original image, you can quickly spot and correct errors, much faster than with traditional methods.
- Create Fully Digital Documents and Books
Unlike many OCR tools that simply layer roughly positioned invisible text over images, Scribe OCR goes a step further. It allows you to create true digital versions of documents: text-native, ebook-style PDFs that mirror the original layout perfectly.
Install and Run
There is currently no standalone desktop application, so running Scribe OCR locally means serving its files from a local HTTP server. To set up a local copy, run the following commands (requires git and npm):
git clone --recursive https://github.com/scribeocr/scribeocr.git
cd scribeocr
npm i
npx http-server
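If any of the commands above fails with "command not found", a quick prerequisite check can help. The snippet below is a minimal sketch, not part of Scribe OCR itself: `missing_tools` is a hypothetical helper name, and it only checks that git, npm, and npx are on your PATH.

```shell
# Hypothetical helper (not part of Scribe OCR): print each given
# command name that cannot be found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the tools used by the install commands above.
missing=$(missing_tools git npm npx)
if [ -n "$missing" ]; then
  echo "Install these before continuing:" >&2
  echo "$missing" >&2
else
  echo "git, npm, and npx are available"
fi
```

Once the check passes and the install commands have run, http-server should print the local address it is listening on; open that address in a browser to use the app.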
License
AGPL-3.0 License