Tag Archives: ocr

Capture full webpage and convert to PDF with OCR.

1. Capture the webpage using the Awesome Screenshot add-on in Firefox.
2. You’ll get a very big image at 72 ppi. Adobe Acrobat Professional can’t perform OCR on such a big image, so you need to resize the image to a smaller size, …
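The resize step comes down to picking new pixel dimensions that keep the aspect ratio. Here is a minimal sketch of that calculation in Python. The excerpt does not say what size the OCR step can handle, so `max_side` is a hypothetical limit you would have to find by trial, and `scaled_size` is just an illustrative name.

```python
def scaled_size(width, height, max_side):
    """Return (new_width, new_height) that fits within max_side pixels
    on the longest edge while keeping the aspect ratio.

    max_side is an assumed limit; the original post does not state one.
    """
    if width <= 0 or height <= 0 or max_side <= 0:
        raise ValueError("width, height and max_side must be positive")
    longest = max(width, height)
    if longest <= max_side:
        # Already small enough: leave the image untouched.
        return width, height
    scale = max_side / longest
    # Never let a dimension collapse to zero on extremely tall captures.
    return max(1, round(width * scale)), max(1, round(height * scale))


# Example: a tall full-page capture, shrunk so the long edge is 5000 px.
new_width, new_height = scaled_size(1280, 20000, 5000)
```

The resulting pair can then be passed to whatever image tool does the actual resizing (for example the `resize` method of a Pillow image, if Pillow is what you use).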

Posted in Uncategorized

Lios: very interesting OCR software for Linux

I haven’t tried it yet, but it seems great.
http://www.tiflolinux.org/node/455
http://code.google.com/p/linux-intelligent-ocr-solution/

OCR from terminal

http://www.webupd8.org/2010/02/how-to-extract-all-text-from-pdfs.html

Yagf

Yagf is another graphical front end for OCR. It can use either tesseract or cuneiform as its engine. To install it from http://www.getdeb.net you must first add the GetDeb repositories. In Precise Pangolin: http://www.ubuntuupdates.org/ppa/getdeb_apps
wget -q -O - http://archive.getdeb.net/getdeb-archive.key | …

gimagereader, lios

gimagereader and lios are two other good OCR programs for Linux.
http://code.google.com/p/linux-intelligent-ocr-solution/

OCRFeeder in Spanish

http://hatteras.wordpress.com/2011/10/28/escanear-con-ocr-reconocimiento-optico-de-caracteres-ocrfeeder/
By default the program runs OCR in English (that is, with the tesseract-ocr-eng package), even if your whole system is in Spanish and you have installed the tesseract-ocr-spa package. To “force” it to use Spanish, in the Arguments of the …
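The excerpt is cut off before it names the exact OCRFeeder setting, but on the tesseract command line itself the language is chosen with the `-l` option. A minimal Python sketch, assuming tesseract and the Spanish data package (tesseract-ocr-spa) are installed; the function name and the file names are hypothetical.

```python
import subprocess


def tesseract_command(image_path, output_base, lang="spa"):
    """Build the tesseract argument list with an explicit OCR language.

    output_base is the output name without extension; tesseract
    appends ".txt" itself.
    """
    return ["tesseract", image_path, output_base, "-l", lang]


def run_ocr(image_path, output_base, lang="spa"):
    """Run tesseract; raises CalledProcessError if it exits with an error."""
    subprocess.run(tesseract_command(image_path, output_base, lang), check=True)


# Hypothetical usage: run_ocr("pagina.png", "pagina") would write pagina.txt
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.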

Studying how to put an OCR layer in a PDF.

http://www.tobias-elze.de/pdfsandwich/index.html
http://tpeelen.wordpress.com/2010/12/17/alfresco-using-tesseract-ocr-on-ubuntu-linux/
http://superuser.com/questions/28426/how-to-extract-text-with-ocr-from-a-pdf-on-linux
http://blog.konradvoelkel.de/2010/01/linux-ocr-and-pdf-problem-solved/
http://tfischernet.wordpress.com/2008/11/26/searchable-pdfs-with-linux/
http://www.watchocr.com/forum/viewtopic.php?f=7&t=56
http://ubuntuforums.org/showthread.php?t=1456756
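One possible route to a PDF with a text layer, not necessarily the one the pages listed here describe, is tesseract's own `pdf` output mode. This is a sketch under assumptions: it needs a tesseract build recent enough to ship the `pdf` config (3.03 or later, as far as I know), and the function and file names are hypothetical.

```python
import subprocess


def searchable_pdf_command(image_path, output_base, lang="eng"):
    """Build a tesseract call that writes output_base + ".pdf".

    The trailing "pdf" is a tesseract config name and must come after
    the options; the PDF contains the page image plus an invisible
    text layer. Assumes a tesseract version that includes this config.
    """
    return ["tesseract", image_path, output_base, "-l", lang, "pdf"]


def make_searchable_pdf(image_path, output_base, lang="eng"):
    """Run the command; raises CalledProcessError if tesseract fails."""
    subprocess.run(searchable_pdf_command(image_path, output_base, lang), check=True)


# Hypothetical usage: make_searchable_pdf("page.tif", "page") would write page.pdf
```

This works on single page images; an existing multi-page PDF would first have to be split into images, which is the part tools such as pdfsandwich automate.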
