Optical Character Recognition (OCR) Explained

Text that you can’t select with your cursor can’t be searched by a computer until the OCR process converts it to recognized characters.


First, a paper document is digitized using a scanner or camera to create an "image" of the contents.

The document is then visible on a computer screen, but the letters and words are not yet individually selectable or searchable. The whole page is one image.

The OCR process helps the computer recognize text characters in the image and convert them to a selectable/searchable format (text).

Once the OCR process is complete, the text that was previously a flat image is selectable and searchable. This process is critical to performing redaction on scanned documents, because functions such as text search and pattern matching can only find text the computer recognizes.
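
As a rough illustration of the recognition step described above, here is a toy sketch in Python. It is not how any production OCR engine (or GovQA) works; the 3x5 "font", the GLYPHS table, and the render and recognize functions are all hypothetical and exist only to show the idea: a scanned page is just pixels until each shape is matched to a known character.

```python
# Toy 3x5 pixel bitmaps standing in for scanned letters (hypothetical font).
GLYPHS = {
    "F": ("###", "#..", "##.", "#..", "#.."),
    "O": ("###", "#.#", "#.#", "#.#", "###"),
    "I": ("###", ".#.", ".#.", ".#.", "###"),
    "A": (".#.", "#.#", "###", "#.#", "#.#"),
}


def render(text):
    """Simulate a scan: draw text as 5 pixel rows, one blank column between letters."""
    return [".".join(GLYPHS[ch][row] for ch in text) for row in range(5)]


def recognize(image):
    """Cut the image into 3-pixel-wide cells and match each cell to a known glyph."""
    chars = []
    for x in range(0, len(image[0]), 4):  # 3 glyph columns + 1 spacing column
        cell = tuple(row[x:x + 3] for row in image)
        match = next((ch for ch, bitmap in GLYPHS.items() if bitmap == cell), "?")
        chars.append(match)
    return "".join(chars)


page_image = render("FOIA")      # only pixels: nothing to select or search
text = recognize(page_image)     # recognized characters
print(text)                      # FOIA
print("FOIA" in text)            # True: the text is now searchable
```

Real OCR engines have to cope with skewed pages, varying fonts, and noise, so they use far more sophisticated matching than this exact-bitmap lookup.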

Do More with GovQA’s In-Tool Redaction™

  • In-Tool Redaction™ from GovQA allows the redaction process to happen within the GovQA software, which increases security and reduces the number of steps to a successful redaction
  • Easily collaborate and generate an audit log
  • Retain attachments on emails imported for redaction (most file types compatible)
  • Batch collate all email attachments into PDFs along with the emails themselves and redact en masse (as a packet)
  • Redaction functions include:
    • Bulk Redact
    • Text Search
    • Pattern Matching
    • Redact Similar
    • Exemption Tracking
    • Responsive Records Packeting (combine multiple attachments into one responsive packet)
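
To make the Pattern Matching idea in the list above more concrete, here is a minimal sketch in Python of how pattern-based redaction can work on text that has already been through OCR. It is an assumption-laden illustration, not GovQA’s implementation: the PATTERNS table, the two example patterns, and the redact function are hypothetical, and real exemption rules vary by jurisdiction.

```python
import re

# Hypothetical patterns for illustration only.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def redact(text):
    """Replace every pattern match with X characters and log what was redacted."""
    log = []
    for label, pattern in PATTERNS.items():
        def replace(match, label=label):
            # Record the label and position so the redaction can be audited later.
            log.append((label, match.start(), match.end()))
            # Same-length replacement keeps all other positions unchanged.
            return "X" * len(match.group())
        text = pattern.sub(replace, text)
    return text, log


redacted, log = redact("Call 555-123-4567 re: SSN 123-45-6789.")
print(redacted)  # Call XXXXXXXXXXXX re: SSN XXXXXXXXXXX.
for label, start, end in log:
    print(label, start, end)
```

The log kept alongside the redacted text hints at why an audit trail and exemption tracking fit naturally next to pattern matching: every redaction can carry a record of what rule triggered it and where.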

DID YOU KNOW?  Some FOIA software companies take shortcuts with their redaction tools by linking to out-of-the-box Adobe redaction tools instead of going the extra mile to create their own tool set. Why should you care?

Using Adobe redaction (instead of an in-tool solution like GovQA) breaks the audit trail the moment users open Adobe, and it opens a security hole when users upload and download files to and from Adobe.

