Prompting for document data transformation
- Updated: 2026/06/13
Use Co-Pilot for Automators to simplify post-processing of data extracted by a Document Automation learning instance.
Why prompt for data transformation
Once Document Automation extracts data from a document, the real power comes from shaping that data exactly the way you need it. Transforming, normalizing, and enriching extracted data is a natural next step, and with Co-Pilot for Automators, it's never been more straightforward.
Co-Pilot removes the traditional complexity from data transformation. Instead of navigating multiple command packages, memorizing internal data structures, or writing Python or JavaScript by hand, developers can simply describe what they want in plain English. Co-Pilot handles the rest, automatically generating a ready-to-run task bot from a single prompt.
Use Cases
| Tier | Description | Examples |
|---|---|---|
| Simple | Single string value converted to another value based on defined logic. | |
| Medium | Multiple manipulations, or logic that depends on values from more than one field. | |
| Complex | Transformations involving external data sources or fully custom logic. | |
Prompt Template
Use the following template when invoking Co-Pilot for Automators. Only plain English instructions are needed; no special syntax is required.
Get fields: [<field name>, <field name>, ...]
Transformation description: <describe the transformation in plain English>
Example: [optional — provide a before/after example if helpful]
Update fields: [<field name>, <field name>, ...]
Use the following example as a model when entering your own data for the transformation.
Example Prompt
Get fields: [Total Amount]
Transformation description: If the field has 4 decimal digits, round it to 2 decimal digits.
Example: "11,435.0000" → "11,435.00"
Update fields: [Total Amount]
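As an illustration of the kind of logic this prompt describes, the following is a minimal Python sketch, not the code Co-Pilot actually generates. The function name round_total_amount is hypothetical, and the choice of half-up rounding is an assumption, since the prompt does not specify a rounding mode.

```python
from decimal import Decimal, InvalidOperation, ROUND_HALF_UP


def round_total_amount(value: str) -> str:
    """Round an amount with exactly 4 decimal digits to 2; return other inputs unchanged."""
    _, dot, fraction = value.partition(".")
    # The prompt's condition: act only when there are exactly 4 decimal digits.
    if not dot or len(fraction) != 4 or not fraction.isdigit():
        return value
    try:
        number = Decimal(value.replace(",", ""))
    except InvalidOperation:
        return value  # not a plain number (for example, a currency symbol is present)
    # Assumption: round half up; the prompt does not name a rounding mode.
    rounded = number.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    # Keep thousands separators only if the input used them.
    return f"{rounded:,.2f}" if "," in value else f"{rounded:.2f}"


print(round_total_amount("11,435.0000"))
```

Using Decimal rather than float avoids binary rounding surprises on currency values. Values that do not match the condition, such as "11,435.00", pass through untouched.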
Building the task bot
- Get Document Data: Uses your variable to retrieve the full document data and stores the output in a variable as a Recordset (example: $DocumentData$). Note: The Python package cannot operate on Recordset objects directly. The next step converts the Recordset to a JSON string.
- Convert Record to JSON String: Uses the String: Assign action to serialize $DocumentData$ into a plain JSON string. This is the representation that Python will read and modify. Stores the result in a variable (example: $DocumentJson$).
- Open and Apply Python Logic: Opens a Python session and passes $DocumentJson$ as input. The session contains a function (example: <normalize data>) that applies the transformation described in the prompt (see the example JSON under JSON Reference). Result: The function returns the same JSON structure as the input, with the updated field values. Stores the return value in a variable (example: $UpdatedJson$).
- Update Document Data: Passes $UpdatedJson$ to the Update Data action, which sends the transformed values back to the server. Note: Ensure the output (example: $UpdateOutput$) is configured as an output variable so that it carries the document status, which a parent process uses to determine whether the document should be routed to a validation queue.
| Variable | Purpose |
|---|---|
| $DocumentData$ | Recordset output from the Get Document Data action. |
| $DocumentJson$ | JSON string serialized from DocumentData; passed into Python. |
| $UpdatedJson$ | JSON string returned from Python with transformed field values. |
| $UpdateOutput$ | Status result from the Update Data action; used as output variable at the process level. |
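The Python step in this flow can be pictured with a short sketch. It assumes the function receives the JSON string as its single argument and returns a JSON string, and that fields sit under a "fields" object with a "value" key, as in the sample under JSON Reference. The name normalize_data, the UPDATE_FIELDS list, and the whitespace-trimming transformation are placeholders, not what Co-Pilot generates.

```python
import json

# Hypothetical "Update fields" list; replace with the fields named in your prompt.
UPDATE_FIELDS = ["invoice_number"]


def transform_value(value: str) -> str:
    """Placeholder transformation: trim surrounding whitespace."""
    return value.strip()


def normalize_data(document_json: str) -> str:
    """Parse the JSON string, update the listed fields, and return a JSON string."""
    document = json.loads(document_json)
    fields = document.get("fields", {})
    for name in UPDATE_FIELDS:
        if name in fields:
            # Only the "value" entry changes; "bounds" and all keys stay as they were.
            fields[name]["value"] = transform_value(fields[name]["value"])
    return json.dumps(document)


sample = json.dumps(
    {"fields": {"invoice_number": {"value": " 10280 ", "bounds": "1446,444,72,20"}}}
)
print(normalize_data(sample))
```

Because the function takes a string and returns a string, its input and output correspond directly to $DocumentJson$ and $UpdatedJson$ in the table above.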
End-to-End Flow of the task
The following table shows the complete sequence, starting with the Document Data from the prerequisite.
| # | Action | Output Variable |
|---|---|---|
| 1 | Document Extraction: Extract Data | |
| 2 | Document Extraction: Get Document Data | DocumentData (Recordset) |
| 3 | String: Assign | DocumentJson (string) |
| 4 | Python Script: Open (custom logic) & Python Script: Execute function <normalize data> | UpdatedJson (string) |
| 5 | Document Extraction: Update document data | UpdateOutput (output variable) |
JSON Reference
The DocumentJson variable holds the full document record as a JSON object. The Python function receives this object, applies the requested transformations to the relevant fields, and returns the same structure with updated values. Field names and the overall schema must remain unchanged.
- Use the prompt template provided. Deviating from the four-line structure can produce incomplete output.
- Keep field names in the prompt consistent with the field names as they appear in the document extraction output.
- For Medium and Complex tasks, include an example in the prompt to reduce ambiguity.
- For Complex tasks, review the generated scaffold carefully. Update the Python script to integrate your external data source (CSV, database, API) before running.
- Your $UpdateOutput$ must always be an output variable at the process level. Do not discard it.
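For a Complex task, wiring an external data source into the generated script might look like the following sketch. It fills an empty vendor_name field by looking up VendorID in a CSV. The CSV column names (vendor_id, vendor_name), the sample vendor, and both function names are hypothetical; an in-memory string stands in for a real file so the example runs as written.

```python
import csv
import io
import json


def load_vendor_names(csv_text: str) -> dict:
    """Build a VendorID -> vendor name lookup from CSV text (hypothetical columns)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["vendor_id"]: row["vendor_name"] for row in reader}


def fill_vendor_name(document_json: str, vendors: dict) -> str:
    """Set vendor_name from the lookup when the VendorID is known."""
    document = json.loads(document_json)
    fields = document["fields"]
    vendor_id = fields["VendorID"]["value"]
    if vendor_id in vendors:
        fields["vendor_name"]["value"] = vendors[vendor_id]
    # Unknown IDs leave the field untouched so a validator can review it.
    return json.dumps(document)


# Illustrative data only; in practice, read the CSV from your own source.
vendors = load_vendor_names("vendor_id,vendor_name\n10001,Example Vendor\n")
document_json = json.dumps(
    {
        "fields": {
            "VendorID": {"value": "10001", "bounds": "0,0,0,0"},
            "vendor_name": {"value": "", "bounds": "0,0,0,0"},
        }
    }
)
print(fill_vendor_name(document_json, vendors))
```

Loading the lookup once and passing it in keeps the transformation itself free of file or network access, which makes it easier to test before running it against live documents.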
Example of the JSON structure held in DocumentJson:
{
  "pages": [
    {
      "width": 1700,
      "height": 2200
    }
  ],
  "fields": {
    "VendorID": {
      "value": "10001",
      "bounds": "0,0,0,0"
    },
    "vendor_name": {
      "value": "",
      "bounds": "0,0,0,0"
    },
    "invoice_number": {
      "value": "10280",
      "bounds": "1446,444,72,20"
    }
  },
  "tables": {
    "table": [
      {
        "quantity": {
          "value": "1",
          "bounds": "188,783,13,17"
        },
        "total_price": {
          "value": "22.00",
          "bounds": "1506,784,69,18"
        }
      },
      {
        "quantity": {
          "value": "1",
          "bounds": "188,819,13,17"
        },
        "total_price": {
          "value": "36.75",
          "bounds": "1508,817,65,20"
        }
      }
    ]
  }
}
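Since the returned JSON must keep the same field names and overall schema, a quick structural comparison before the update step can catch accidental changes. The sketch below is one possible check, not part of the product. The function name schema_matches is hypothetical, and treating everything except "value" entries (for example, "bounds") as fixed is an assumption.

```python
import json


def schema_matches(original, updated, key=None) -> bool:
    """Return True if both structures share keys and list lengths; only "value" leaves may differ."""
    if isinstance(original, dict) and isinstance(updated, dict):
        return original.keys() == updated.keys() and all(
            schema_matches(original[k], updated[k], k) for k in original
        )
    if isinstance(original, list) and isinstance(updated, list):
        return len(original) == len(updated) and all(
            schema_matches(o, u, key) for o, u in zip(original, updated)
        )
    if type(original) is not type(updated):
        return False
    # Assumption: leaves other than "value" (such as "bounds") must stay identical.
    return key == "value" or original == updated


before = {
    "fields": {"invoice_number": {"value": "10280", "bounds": "1446,444,72,20"}},
    "tables": {"table": [{"total_price": {"value": "22.00", "bounds": "1506,784,69,18"}}]},
}
after = json.loads(json.dumps(before))  # deep copy via a JSON round trip
after["tables"]["table"][0]["total_price"]["value"] = "22"
print(schema_matches(before, after))
```

A check like this could run at the end of the Python function, returning the original JSON unchanged when the comparison fails so that no malformed structure reaches the update step.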