Using the Extract text action
Use the Extract text action to extract text from a PDF file and save it as a text file.
Important: If the correct fonts are not embedded in the PDF file, the Extract text action will not extract the text correctly.
Procedure
To extract text from a PDF file, follow these steps:
- In the Actions palette, double-click or drag the Extract text action from the PDF package.
- In the PDF path field, select one of the following options to specify the location of the PDF:
- Control Room file: Enables you to select a PDF file that is available in a folder in the Control Room.
- Desktop profile: Enables you to select a PDF file that is available on your device.
- Variable: Enables you to specify the file variable that contains the location of the PDF file.
- Optional: In the User password or Owner password field, enter the password required to access the encrypted PDF file.
- User password: The password that allows users to open the encrypted PDF file.
- Owner password: The password that allows users to perform specific operations on the encrypted PDF file.
- In the Text type field, select one of the following options:
- Plain text: Enables you to extract the text and copy it to a text file. This is similar to copying and pasting text from a PDF file to a text file.
- Structured text: Enables you to preserve the original formatting of the extracted text from the PDF file.
- In the Page range field, select one of the following options:
- All pages: Enables you to extract text from all the pages in the PDF file.
- Pages: Enables you to enter the page numbers of the pages from which you want to extract text.
- In the Export data to text file field, specify a name and location for the text file.
Note: You must include the .txt extension in the name of the text file. For example, if the file name is June_Quarter_report, enter June_Quarter_report.txt.
- Select the Overwrite files with the same name check box to overwrite existing files with the same name.
Note: If this option is not selected and the bot encounters a file with the same name at the specified location, the bot will fail.
- Optional: From the Assign PDF properties to a dictionary variable list, select a dictionary variable to hold the file properties.
For more information, see Using a dictionary variable for PDF properties.
- Click Save.
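
The output-file rules in the procedure (the .txt extension requirement and the Overwrite files with the same name check box) can be illustrated with a small Python sketch. This is not how the Extract text action is implemented; the function names `resolve_output_path` and `export_text` are hypothetical, and rejecting a name without the .txt extension is an assumption made here for illustration.

```python
from pathlib import Path


def resolve_output_path(name: str, overwrite: bool) -> Path:
    """Validate an output file name against the rules described above.

    Hypothetical helper for illustration only.
    """
    path = Path(name)
    # The file name must include the .txt extension (assumed to be enforced).
    if path.suffix.lower() != ".txt":
        raise ValueError(f"{name!r} must end with .txt")
    # With overwrite off, an existing file with the same name is a failure.
    if path.exists() and not overwrite:
        raise FileExistsError(f"{path} already exists and overwrite is off")
    return path


def export_text(text: str, name: str, overwrite: bool = False) -> Path:
    """Write already-extracted text to the validated output path."""
    path = resolve_output_path(name, overwrite)
    path.write_text(text, encoding="utf-8")
    return path
```

In this sketch, calling `export_text` twice with the same file name raises `FileExistsError` on the second call unless `overwrite=True` is passed, which mirrors the bot failing when the check box is not selected.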