Use regex extract action in validation rule

The regex extract action applies a custom regular expression (regex) to a value extracted from a document and keeps only the sub-string that matches the specified pattern.

When a document is processed and a value is extracted, you can apply the regex extract action to that value. The regex pattern you define identifies the part of the extracted value to keep.
Note: Only the first match of the regex pattern in the value is extracted.

Prerequisites

  • When you apply the regex extract action, the system sets a field value with the first regex match during extraction.
  • If the specified regex has no match in the field, the action returns an empty field value.
  • If the learning instance is not connected to an updated (v.31) package, a warning message indicates that this rule might not work as expected.
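
The first-match and empty-value behavior described above can be approximated outside the product with a short Python sketch. The helper name regex_extract is hypothetical, and Python's re module is only a stand-in: the product's regex engine may differ in dialect details.

```python
import re


def regex_extract(value: str, pattern: str) -> str:
    """Return the first match of pattern in value, or an empty string if there is none.

    Hypothetical helper that mimics the documented behavior; not the product's implementation.
    """
    match = re.search(pattern, value)
    return match.group(0) if match else ""


# Only the first match is returned, even when the value contains several.
first = regex_extract("Ref 1001 and 2002", r"\d{4}")
print(first)  # 1001

# A pattern with no match yields an empty value.
empty = regex_extract("no digits here", r"\d{4}")
print(repr(empty))  # ''
```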

Example

This feature is particularly useful when only a portion of the extracted data is needed. Typical cases include extracting a specific set of numbers from a larger string, isolating part of an address, or retrieving a specific fragment from a table description.

In the following example, the goal is to extract the Vendor Code from the Description column of the document. Without the regex extract action, the regular extraction process retrieves all the text in the Description column.

The following image shows the Vendor Code values with the regular extraction process.

Before applying regex extract action

The following procedure shows how to extract only the Vendor Code from the Description column of the document.

Procedure

  1. On the Field Rules tab, click Add Rule.
  2. Specify the is not empty condition for the Vendor Code field.
  3. Select the regex extract action type.
  4. Specify the regex pattern. For example, Vendor Code: \d{6}
  5. Test the regex pattern by entering a value that should match it, for example Vendor Code: 381823, and then click Update.
    Using regex extract action
  6. Click Process to process the document.
    Based on the specified regex pattern, only the Vendor Code value is extracted from the Description column.

    The following image shows Vendor Code values after applying the regex extract action.

    Vendor code extraction using regex extract action
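
    As a rough illustration of this result, the sketch below applies the pattern from step 4 to a made-up Description value using Python's re module. The surrounding sample text is hypothetical. In Python, the full match includes the "Vendor Code: " label, so the sketch also adds a capture group (an assumption, not part of the documented pattern) to isolate the six digits.

```python
import re

# Hypothetical Description cell; only "Vendor Code: 381823" comes from the example in step 5.
description = "Office chairs, ergonomic. Vendor Code: 381823. Net 30 days."

# Pattern from step 4, with the digits wrapped in a capture group for this sketch.
pattern = r"Vendor Code: (\d{6})"

match = re.search(pattern, description)
full_match = match.group(0) if match else ""   # label plus digits
digits_only = match.group(1) if match else ""  # digits from the capture group

print(full_match)   # Vendor Code: 381823
print(digits_only)  # 381823
```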

    The following sample regex patterns can be used for extraction:

    Data type        Regex pattern                                       Examples
    Text or address  \b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b  test@gmail.com
    Text or address  \b\d{3}[-.]?\d{3}[-.]?\d{4}\b                       123.456.7890 or 123-456-7890
    Number           ^\d{2}$                                             12, 23, or 99
    Number           ^[0-9]+$                                            123 or 12434
    Date             \b\d{1,2}[/-]\d{1,2}[/-]\d{4}\b                     12/31/2022 or 02/07/2012
    Date             ^\d{2}/\d{2}/\d{4}$                                 28/02/2222

    Note: These regex patterns are examples only and might need adjusting for your use case.
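
    Before entering a pattern in a rule, it can help to try it against sample values. The sketch below does this with Python's re module, which is an assumption: the product's regex engine may behave differently in edge cases. The helper name first_match is hypothetical, and the email pattern here uses an escaped dot (\.) and [A-Za-z] for the top-level domain.

```python
import re

# (pattern, sample value) pairs taken from the sample patterns above.
SAMPLES = [
    (r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b", "test@gmail.com"),
    (r"\b\d{3}[-.]?\d{3}[-.]?\d{4}\b", "123-456-7890"),
    (r"^\d{2}$", "12"),
    (r"^[0-9]+$", "12434"),
    (r"\b\d{1,2}[/-]\d{1,2}[/-]\d{4}\b", "12/31/2022"),
    (r"^\d{2}/\d{2}/\d{4}$", "28/02/2222"),
]


def first_match(pattern: str, value: str) -> str:
    """Return the first match of pattern in value, or an empty string if none."""
    m = re.search(pattern, value)
    return m.group(0) if m else ""


results = [first_match(pattern, value) for pattern, value in SAMPLES]
for (pattern, value), result in zip(SAMPLES, results):
    print(f"{value!r} -> {result!r}")

# Patterns anchored with ^ and $ match only when the whole value has that shape.
anchored = first_match(r"^\d{2}$", "Invoice 12")
print(repr(anchored))  # ''
```

    Patterns that start with ^ and end with $ return a match only when the entire extracted value fits the pattern, so they suit fields that already contain just the target text. The \b-delimited patterns can pick the target out of a longer value.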