
Configure OCR Settings

OCR does not require you to configure any settings before use. But you can use the following settings to tailor OCR to the data and workflow needs of your project.

Set Denoiser and Spellcheck

Denoiser improves the legibility of documents that are hard to read, such as documents that were copied or faxed multiple times. Because turning on Denoiser increases the time required to process an uploaded file, OCR also offers a Smart denoiser option. Smart denoiser lets the system apply the process only to documents that would benefit, which reduces overall processing time.

Spellcheck helps you identify terms that may have been incorrectly entered in the original document.

  1. Under the OCR tab on the left side menu, click Settings.
  2. Under Denoiser Status, click Denoiser on to improve all documents or Smart denoiser to have the system determine which documents require the service.
  3. Under Use Spellcheck?, click Yes.
  4. Click Update Settings.

Enable Document Classification

A faxed patient record can contain multiple document types in a single file. Classifying these sections creates a linked table of contents for the document. You can also configure OCR to attempt to classify the document on import.

  1. Under the OCR tab on the left side menu, click Settings.
  2. Under Documentation Classification, click Classification on, or Advanced Classification.
  3. Click Update Settings.
  4. Import a document or open an existing document.
  5. To confirm OCR's classification or set the classification manually, click the pencil icon next to the greyed-out text Class: at the top of the open document to reveal the Classification panel.
  6. On the panel, if your document type for the current page is listed under the Recent tab, click the document type.
  7. On the panel, if your document type for the current page is not listed under the Recent tab, click on the Custom tab.
  8. In the Search field, enter a document type to locate an existing type or create a new document type.
    1. If you create a new document type, click Add New Classification to confirm the creation.
  9. Click Update Page for each classification entry.
  10. Page through the document and repeat the classification process whenever you encounter a new document type.
  11. To display the created table of contents, at the top of the screen, click the Data icon, click Document Details, and click Document Classification.
  12. Under Document Classification, click any Classification section to navigate to that page.
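
To illustrate how per-page classifications can turn into a table of contents, here is a minimal Python sketch. It is not how OCR builds the table internally; the function name `build_table_of_contents` and the sample class names are assumptions made for illustration. It groups consecutive pages that share a classification into one section.

```python
from itertools import groupby


def build_table_of_contents(page_classes):
    """Collapse per-page classifications into (class, first_page, last_page) sections."""
    sections = []
    # Number pages from 1 so the output matches what a reader sees.
    numbered = list(enumerate(page_classes, start=1))
    # groupby only merges adjacent pages, so a class that reappears later
    # in the file starts a new section.
    for doc_class, run in groupby(numbered, key=lambda item: item[1]):
        pages = [page for page, _ in run]
        sections.append((doc_class, pages[0], pages[-1]))
    return sections


# Hypothetical classifications for a six-page faxed record.
pages = ["Referral", "Referral", "Lab Report", "Lab Report", "Lab Report", "Discharge Summary"]
toc = build_table_of_contents(pages)
for doc_class, first, last in toc:
    print(f"{doc_class}: pages {first}-{last}")
```

Each tuple corresponds to one clickable section in a table of contents: the class name plus the page range it covers.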

Enable Report Extraction

For common structured report types, such as a CBC panel, OCR can employ specific configurations to more accurately extract data. Any loaded extractors are automatically run against uploaded documents when you enable report extraction.

To obtain a report extractor customized for your workflow, you can contact your LifeOmic representative or create your own extractor. Creating your own extractor requires a technical understanding of JSON, regular expressions, and HL7 FHIR. LifeOmic provides instructions and an extractor builder tool; see Building OCR Report Extractors.

  1. Under the OCR tab on the left side menu, click Settings.

  2. Under Report Extraction, click Report extraction on.

  3. Click Update Settings.

  4. To verify that report extractors are operating, view a document and, at the top of the screen, click the Data Tables icon and click the Report Extractors tab.

  5. The Report Extractors page displays the configured extractors or the message There are no extractors in this project.
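
To give a feel for the regular-expression side of extractor work, the following Python sketch pulls values out of CBC-style text. It does not use the LifeOmic extractor format, which is documented separately; the line layout, the analyte list, and the names `CBC_LINE` and `extract_cbc` are all assumptions made for illustration.

```python
import re

# Assumed line layout: analyte name, numeric value, optional unit.
# [ \t] is used instead of \s so a match never spills onto the next line.
CBC_LINE = re.compile(
    r"^(?P<name>WBC|RBC|Hemoglobin|Hematocrit|Platelets)[ \t]+"
    r"(?P<value>\d+(?:\.\d+)?)"
    r"(?:[ \t]+(?P<unit>\S+))?[ \t]*$",
    re.MULTILINE,
)


def extract_cbc(text):
    """Return one dict per recognized CBC line in the text."""
    return [
        {
            "name": match.group("name"),
            "value": float(match.group("value")),
            "unit": match.group("unit"),  # None when the line has no unit
        }
        for match in CBC_LINE.finditer(text)
    ]


# Made-up sample text, not taken from a real report.
sample = """CBC PANEL
WBC 6.2 K/uL
Hemoglobin 13.5 g/dL
Platelets 250 K/uL
"""
rows = extract_cbc(sample)
for row in rows:
    print(row)
```

A real extractor would also map each value to a FHIR resource and code, which is why the documentation lists HL7 FHIR knowledge as a prerequisite.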

Additional Report Extraction Operations

All loaded extractors run automatically against uploaded documents. You may still want to run the extractors again, for example when additional extractors are loaded after a document was originally processed.

  1. View a document and at the top of the screen, click the Data Tables icon and click on the Report Extractors tab.
  2. To remove any data in the current document generated by an extractor, click Delete All Extractor Data.

  3. To run all extractors against a document, click Rerun All Extractors.

Run Automated Analysis

OCR uses automated analysis to identify and highlight phrases with data potential. You can configure either full analysis or project data based analysis. Project data based analysis uses your PHC project data and any ontologies you loaded for that specific project; it does not employ Amazon Comprehend Medical. Full analysis includes all the operations of project data based analysis and also triggers Amazon Comprehend Medical processes, which draw on large, standard medical databases such as ICD-10-CM. Both analyses also look at your project's previously extracted data and infer similarities.

Tip

If you are unsure of the type of analysis needed, configure project data based analysis. Project data based analysis is faster and consumes fewer resources.

  1. Under the OCR tab on the left side menu, click Settings.
  2. Under Automated Analysis, click Project data based analysis or Full Analysis.
  3. Click Update Settings.
  4. Import a document or view an existing document.
    1. If you are viewing an existing document that was imported before you configured automated analysis, click the Data Tables icon and click Reanalyze Document.
    2. After automated analysis is configured, importing a document automatically triggers the analysis.
  5. At the top of the screen, click the Annotations icon and click Automated Analysis.

  6. Hold down your mouse button and select a highlighted term.

    Tip: You can hit the select icon to extend the selection to the entire line.

  7. From the menu that appears, click the Analyze Selection icon.

    The Analyze Contents menu appears with suggestions for FHIR data.

  8. Click on the most accurate suggestion. Your choice populates other fields.

  9. Go through the drop-down menus and select the most accurate options, then confirm the suggested information or enter more accurate information in the fields.

  10. Once you are satisfied with the information, click Create ....

  11. To view the created data, at the top of the screen, click the Data icon and then the Analyzed Data icon.

  12. To view tables of the suggested data, click the Data Tables icon and click the Analyzed Suggestions tab.
  13. To download a CSV file of the extracted data, click the Data Tables icon, click the Extracted Data tab, and click Download Extracted Data.

Analysis Confidence Threshold

You can choose a level of certainty and configure OCR to only display analyzed suggestions that are above that threshold. The default setting is for OCR to show all analyzed suggestions.

Understanding the Analysis Confidence Threshold

When OCR analyzes a document and generates suggestions, the system assigns different levels of certainty to the results. For example, the system may assign a very high level of certainty to the correlation between the term CBC in the source text and the code 58410-2 for a CBC panel, but it may assign a lower level of certainty to the correlation between the term Heart in the source text and the code 18142-0 for a Heart chambers study observation.

In our first example, the user sets the confidence threshold for the OCR analyzed suggestions to Very High Confidence. The Data Tables page displays only the 75 analyzed suggestions that meet or exceed that threshold. In the Procedures section, only five suggestions are shown, including CBC. The Heart suggestion is not displayed, since its degree of certainty falls below the Very High Confidence threshold set by the user.

In our second example, the user sets the confidence threshold for the OCR analyzed suggestions to Low Confidence. The Data Tables page now displays the 150 analyzed suggestions that meet or exceed this lowered threshold. In the Procedures section, the number of suggestions increases to 22. The Heart suggestion is now displayed, since its same degree of certainty now meets or exceeds the new Low Confidence threshold set by the user.
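
The two examples can be sketched in a few lines of Python. This is an illustration of threshold filtering, not OCR's implementation. Only the Low Confidence and Very High Confidence levels are named above; the intermediate levels, the level assigned to Heart, and the names `Confidence` and `visible_suggestions` are assumptions.

```python
from enum import IntEnum


class Confidence(IntEnum):
    # LOW and VERY_HIGH mirror levels named in the text.
    # MEDIUM and HIGH are assumed placeholders for the levels in between.
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    VERY_HIGH = 4


def visible_suggestions(suggestions, threshold):
    """Keep only the suggestions whose confidence meets or exceeds the threshold."""
    return [s for s in suggestions if s["confidence"] >= threshold]


# Terms and codes come from the example above; the Heart level is illustrative.
suggestions = [
    {"term": "CBC", "code": "58410-2", "confidence": Confidence.VERY_HIGH},
    {"term": "Heart", "code": "18142-0", "confidence": Confidence.LOW},
]

strict = visible_suggestions(suggestions, Confidence.VERY_HIGH)
relaxed = visible_suggestions(suggestions, Confidence.LOW)
print([s["term"] for s in strict])
print([s["term"] for s in relaxed])
```

The suggestion's own certainty never changes between the two runs. Only the threshold moves, which is why Heart is hidden in the first example and shown in the second.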

Set an Analysis Confidence Threshold

  1. Under the OCR tab on the left side menu, click Settings.
  2. Under Analysis Confidence Threshold, expand the menu and click a confidence level.
  3. Click Update Settings.
  4. To see the Analyzed Suggestions, view a document and, at the top of the screen, click the Data Tables icon.

Configure Review Stages

Review stages help you keep track of where OCR documents are in your workflow by allowing you to assign a label to the document. The label allows anyone who views the document to identify what operations are completed or pending. You can set the number and names of stages based on how your organization operates.

  1. Under the OCR tab on the left side menu, click Settings.
  2. Under Review Stages, click Configure Review Stages.
  3. Enter the stages as a comma-separated list, for example, not started, in progress, completed.
  4. Click the check icon.
  5. Click Update Settings.
  6. To set a review stage for a document, view a document and click the Data icon and then the Document Details icon.
  7. Under Document Details, expand the Review Stage menu, then under Document Review Stage expand the menu and click the desired stage.
  8. To view the review stage of a document, look under the Document Details section of an open document or in the OCR Documents section of the Subjects viewer.
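
If you prepare stage lists outside the app, for example in a script that documents your workflow, a small Python helper can normalize the comma-separated text first. This is a sketch under assumptions: the function name `parse_review_stages` is hypothetical, and whether PHC itself trims whitespace or drops duplicates is not stated here.

```python
def parse_review_stages(raw):
    """Split a comma-separated stage list into clean, ordered stage names."""
    stages = []
    for part in raw.split(","):
        name = part.strip()
        # Skip empty entries (for example, from a trailing comma) and repeats.
        if name and name not in stages:
            stages.append(name)
    return stages


stages = parse_review_stages("not started, in progress, completed")
print(stages)
```

Keeping the stages in the order entered preserves the progression of the workflow from first stage to last.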

Suggest or Automate Subject Assignments to Documents

You can configure PHC either to suggest subjects for you to choose from or to assign a subject to a document automatically.

  1. Under the OCR tab on the left side menu, click Settings.
  2. Scroll down to the Subject Assignment section and click the pulldown menu.
  3. Click Suggest Subject Assignment if you want to pick a subject from the PHC subject suggestions.
    1. On an open document, at the top of the screen, click the Data icon and click Document Details .
    2. On the Document Details pane, under The Subject of this Document, examine the Patient Assignment Suggestions. The matched criteria are highlighted in green. The location in the document for the match is also shown.
    3. Find the correct subject and click Assign.
  4. Click Automatically Perform Subject Assignment if you want PHC to automatically assign a subject from the PHC subject suggestions.

    For a subject to automatically be assigned to a document, it must match at least one of the following assignment criteria groups. A subject matches an assignment group when all criteria in a group are found in the document and their values match those of the subject.

    1. Click Add Group.
    2. Click your chosen criteria, such as DOB. A best practice is to use multiple criteria.
    3. Click Save.
    4. To add additional criteria groups, click Add Group again and repeat the process. Each criteria group functions independently.
  5. Click Update Settings.

Note: To confirm automatic assignment, upload a document and, after it processes, click the Data icon, click Document Details, and click The Subject of this Document. Confirm that a name is assigned.
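
The matching rule for criteria groups (any one group may match, but every criterion inside that group must match) can be sketched in Python as follows. This illustrates the logic only and is not PHC's implementation. The field names other than DOB, the function names, and the choice to skip assignment when more than one subject matches are all assumptions.

```python
def matches_group(group, subject, document_values):
    """True when every criterion in the group is found in the document
    and its value equals the subject's value."""
    return bool(group) and all(
        name in document_values and document_values[name] == subject.get(name)
        for name in group
    )


def auto_assign(groups, subjects, document_values):
    """Return the single subject that matches at least one group, else None."""
    matched = [
        subject
        for subject in subjects
        if any(matches_group(group, subject, document_values) for group in groups)
    ]
    # Assumption: an ambiguous match (several subjects) is left unassigned.
    return matched[0] if len(matched) == 1 else None


# Hypothetical subjects and values found in a document.
subjects = [
    {"id": "s1", "dob": "1980-02-14", "last_name": "Rivera", "mrn": "A100"},
    {"id": "s2", "dob": "1980-02-14", "last_name": "Chen", "mrn": "B200"},
]
document_values = {"dob": "1980-02-14", "last_name": "Chen"}

# Two independent groups: (DOB and last name) or (MRN alone).
groups = [["dob", "last_name"], ["mrn"]]
assigned = auto_assign(groups, subjects, document_values)
print(assigned["id"])

# A group with DOB alone matches both subjects, so nothing is assigned.
print(auto_assign([["dob"]], subjects, document_values))
```

The second call shows why using multiple criteria per group is a best practice: a single criterion such as DOB can match more than one subject.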

Streamline Suggested Subject Assignments

If you have a large number of documents that need subject assignments, the streamlined workflow saves you from having to open each document.

  1. Under the OCR tab on the left side menu, click All Files.
  2. Click the Pending Subject Assignment tab.
  3. Hover over a document row to reveal hidden icons and click the Assign subject to document icon.
  4. On the dialog that appears, find the correct suggestion and click Assign.

Configure the Search List

In addition to a standard document search function, OCR lets you create a search list of multiple terms that you can access from any document within the project.

  1. Under the OCR tab on the left side menu, click Settings.
  2. Scroll down to Document Search Lists.
  3. Click Add Search List.
  4. Enter a descriptive name in the List Title field.
  5. Enter a search term or multiple search terms separated by commas.
  6. Click Add Search Term.
  7. Click Update Settings.
  8. To access the search list, view a document.
  9. Click the list icon to the right of the search field and click your choice of search lists.

    All the terms on the search list are highlighted in the document.

  10. Click the list icon to the right of the search field to display a list of search results.
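
Conceptually, a search list is a multi-term search run in one pass. The Python sketch below shows one way to locate every term from a comma-separated list in a block of text. It is not the OCR search implementation; the function names are hypothetical, and case-insensitive matching is an assumption.

```python
import re


def parse_search_terms(raw):
    """Split comma-separated search terms and drop empty entries."""
    return [term.strip() for term in raw.split(",") if term.strip()]


def find_terms(text, terms):
    """Return (start, end, matched_text) for every occurrence of any term."""
    if not terms:
        return []
    # Try longer terms first so a long term wins over a shorter one it contains.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(term) for term in ordered), re.IGNORECASE)
    return [(m.start(), m.end(), m.group(0)) for m in pattern.finditer(text)]


# Made-up list and text for illustration.
terms = parse_search_terms("chest pain, diabetes, hypertension")
text = "Patient reports chest pain. No history of diabetes."
hits = find_terms(text, terms)
for start, end, matched in hits:
    print(f"{matched!r} at {start}-{end}")
```

The start and end offsets are what a viewer would need in order to highlight each term in place.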

Add a Classification Field for Document Details

Classification fields allow you to add additional data to organized document sections. You can customize the field and enter any value for the field. The classification information is contained in a manifest.pdf file that is included in the zip file you get when you Download a Split File. In PHC, you can view the information under the Organize tab in the open document.

  1. Under the OCR tab on the left side menu, click Settings.
  2. Scroll down to Classification Fields.
  3. Click the plus symbol next to the classification where you want to add a field.
  4. In the Add Field... field that displays, enter a name.
  5. Click Add.
  6. Click Update Settings.
  7. To use the new field, open a document and click the Organize tab at the top of the screen.
  8. Select a page and click the classification with the added field.
  9. Under the Details section, enter the expected data.
  10. Click Submit.

OCR Ontology

In addition to the machine learning analysis driven by Amazon Comprehend Medical and its general, public databases, such as ICD-10-CM, OCR can analyze documents using a custom ontology created by your organization. The ontology-based analysis is driven by your needs and a set of key terms and details you provide. This targeted focus allows the ontology-based analysis to be very accurate and efficient.

Creating an OCR ontology is simple. You download the spreadsheet template from OCR and add your desired terms and additional information, such as medical coding. You then upload the completed spreadsheet and configure OCR automated analysis.

When you configure the automated analysis full analysis option, OCR automatically runs both types of analysis during document ingestion. When you configure the project data based analysis option, OCR runs only the ontology-based analysis. Both the resulting Ontologies Suggestions from the ontology analysis and the Smart Suggestions from the Amazon Comprehend analysis display on a document's Data Tables page, which also allows you to filter the sources.

Note

OCR ontology is one of two OCR tools that search a document for a set of terms you provide. The search list is a basic tool that is simple to configure in the app and highlights a limited number of terms. OCR ontology is a sophisticated tool that uses a versioned spreadsheet of potentially thousands of terms and provides detailed suggestions to help you create recorded values.

Create and Upload an OCR Ontology

  1. Under the OCR tab on the left side menu, click Settings and scroll down to the OCR Ontologies section.
  2. Click Download Template.
  3. Open the template spreadsheet with Microsoft Excel or any spreadsheet program compatible with the Microsoft Excel Open XML Format (.xlsx).
  4. Fill in the spreadsheet cells with the appropriate information using the example and chart below:

    Ontology Example: OCR Ontologies Template

    Ontology Cell Description Chart

    Cell Type: Description
    Category: This is a user-defined subset of information that includes the display term. For example, the display term heart attack might fall under the category cardiac. You can include multiple terms and separate them with a pipe delimiter (|) (required).
    Display: OCR searches for this term and displays it in the analysis suggestions (required).
    Code: Any code, such as an ICD-10-CM code or laboratory testing company code, that you want associated with the display term. If you do not have a code, you can add the display term to this field (required).
    System: The source for medical coding. If you are not concerned with proper medical coding, you can use http://lifeomic.com/temp (required).
    Synonyms: OCR searches for words or abbreviations that have the same meaning as the display term but uses the display term in the suggestion. You can include multiple terms and separate them with a pipe delimiter (|) (optional).
    IsObservation: Put Yes if the display term falls under the FHIR Observation resource. Leave blank if it does not. (It is required that at least one of the four resources is marked Yes.)
    IsMedication: Put Yes if the display term falls under the FHIR Medication resource. Leave blank if it does not.
    IsCondition: Put Yes if the display term falls under the FHIR Condition resource. Leave blank if it does not.
    IsProcedure: Put Yes if the display term falls under the FHIR Procedure resource. Leave blank if it does not.
  5. Delete row 2 (the example row) in the spreadsheet and save the ontology spreadsheet to your computer with a useful name.

  6. Click Upload New Ontology Version.
  7. Navigate to your saved ontology .xlsx file and click Open.
  8. Complete the Run Automated Analysis procedure.
  9. To view tables of the ontology suggestions, from an open document, click the Data Tables icon and click the Analyzed Suggestions tab.
  10. Confirm the Ontologies box is checked to display the ontology-based suggestions. If Smart Suggestions is checked, click it to deselect Smart Suggestions and hide any Smart Suggestions. Smart Suggestions are generated from the Amazon Comprehend Medical analysis. Note: You can filter Ontologies results by category in the Data Tables view or the Document view with suggestions turned on.
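
Before uploading a large ontology, it can help to check each row against the rules in the cell description chart. The Python sketch below works on rows represented as plain dictionaries, so reading the .xlsx file itself is out of scope. The function names, the case-insensitive check for Yes, and the sample row are assumptions made for illustration.

```python
REQUIRED = ("Category", "Display", "Code", "System")
RESOURCE_FLAGS = ("IsObservation", "IsMedication", "IsCondition", "IsProcedure")


def validate_row(row):
    """Return a list of problems for one ontology row (empty when the row is valid)."""
    problems = [f"missing {name}" for name in REQUIRED if not row.get(name, "").strip()]
    # At least one of the four FHIR resource columns must be marked Yes.
    if not any(row.get(flag, "").strip().lower() == "yes" for flag in RESOURCE_FLAGS):
        problems.append("no resource flag set to Yes")
    return problems


def build_lookup(rows):
    """Map the display term and each pipe-delimited synonym to the display term."""
    lookup = {}
    for row in rows:
        names = [row["Display"]] + row.get("Synonyms", "").split("|")
        for name in names:
            if name.strip():
                lookup[name.strip().lower()] = row["Display"]
    return lookup


# Sample row using the placeholder system and the display term as the code.
row = {
    "Category": "cardiac",
    "Display": "heart attack",
    "Code": "heart attack",
    "System": "http://lifeomic.com/temp",
    "Synonyms": "myocardial infarction|MI",
    "IsCondition": "Yes",
}

print(validate_row(row))
lookup = build_lookup([row])
print(lookup["mi"])
```

The lookup mirrors the described behavior of the Synonyms column: a document may contain the synonym, but the suggestion uses the display term.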

Index a Project with an Ontology

Indexing a project with an ontology searches for all of the terms in the ontology in all of the documents in a project. It creates an index document that highlights the location of all of the ontology terms in the documents along with other relevant information. You can download the index document in CSV format.

  1. Under the OCR tab on the left side menu, click Settings.
  2. Scroll down to the OCR Ontologies section and click Index Project or Reindex Project. The indexing can take several minutes for larger projects.
  3. Click Download Index CSV. PHC generates the index file and starts a download to your PC.
  4. Find the index file in the Downloads folder or other location on your PC. The file name is ocr_ocr-ontology-index.csv or something similar.
  5. Open the file with a spreadsheet program, such as Microsoft Excel, and use the chart below to help read the data.
    Column Name: Description
    id: This is the document ID of the indexed document.
    subjectid: A subject ID appears if the document was imported under a specific patient in the subject viewer or a subject was manually assigned under Document Details.
    page: The document page that contains the term.
    term: The indexed term from the ontology.
    lineNumbers: The line on the page that contains the term. Multiple lineNumbers appear for multiple occurrences of a term.
    categories: The root of the ontology term if available.

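As an alternative to a spreadsheet program, the index file can be summarized with a short Python script. The sketch below assumes the file has a header row with the column names listed above. The sample rows, their value formats, and the function name `pages_by_term` are made up for illustration, so check them against your own download.

```python
import csv
import io
from collections import defaultdict


def pages_by_term(handle):
    """Group (document id, page) pairs by ontology term from an index CSV."""
    grouped = defaultdict(list)
    for row in csv.DictReader(handle):
        grouped[row["term"]].append((row["id"], int(row["page"])))
    return dict(grouped)


# Made-up sample in place of a real download.
sample = io.StringIO(
    "id,subjectid,page,term,lineNumbers,categories\n"
    "doc-1,subj-9,2,heart attack,14,cardiac\n"
    "doc-1,subj-9,5,heart attack,3,cardiac\n"
    "doc-2,,1,CBC,7,labs\n"
)

index = pages_by_term(sample)
for term, locations in index.items():
    print(term, locations)
```

To run this against a downloaded file, replace `sample` with an opened file, for example `open("ocr_ocr-ontology-index.csv", newline="")`, adjusting the name to match your download.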


Last update: 2021-08-19