Extract text from a PDF file and save it as a text file by using the Extract text action.
Important:
- If the correct fonts are not embedded in the PDF file, the Extract text action does not extract the text correctly.
- If data that belongs on a single line is split across two lines in the PDF file,
the extracted text might also show that data on two lines.
Note: When you extract fields from a PDF that contains 20 form
fields, processing might take 30% to 40% longer than for a PDF without form
fields.
Procedure
To extract text from a PDF file, perform the following steps:
-
In the Actions palette, double-click or drag the Extract text action from the PDF package.
-
In the PDF path field, select one of the following options to
specify the location of the PDF:
-
Control Room file: Enables you
to select a PDF file that is available in a folder in the Control Room.
-
Desktop profile: Enables
you to select a PDF file that is available on your device.
-
Variable: Enables you to
specify the file variable that contains the location of the PDF
file.
- Optional:
In the User password or Owner
password field, enter the password required to access the
encrypted PDF file.
-
User password: The password that allows users to
open the encrypted PDF file.
-
Owner password: The password that allows users to perform
specific operations on the encrypted PDF file.
-
In the Text type field, select one of the following
options:
-
In the Page range field, select one of the following
options:
-
All pages: Enables you to extract text from all the pages in
the PDF file.
-
Pages: Enables you to enter the page numbers of
the pages from which you want to extract text.
-
In the Export data to text file field, specify a name
and location for the text file.
Note: You must include the .txt extension in the name of the text file. For
example, if the file name is June_Quarter_report, enter
June_Quarter_report.txt.
-
Select the Overwrite files with the same name check box
to overwrite existing files with the same name.
Note: If this option is not selected and the bot encounters a
file with the same name at the specified location, the bot
will fail.
- Optional:
From the Assign PDF properties to a dictionary variable
list, select a dictionary variable to hold the file properties.
-
Click Save.
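
The output-file rules in the procedure above (the .txt extension and the Overwrite files with the same name check box) can be sketched in Python as follows. This is only an illustration of the described behavior, not the product's implementation. The function names resolve_output_path and save_extracted_text are hypothetical, and appending a missing .txt extension automatically is an assumption made for convenience; the action itself expects you to type the extension.

```python
from pathlib import Path


def resolve_output_path(name: str) -> Path:
    """Return the output path, appending .txt when the name lacks it.

    Assumption: the real action expects the user to type the extension;
    this helper only illustrates the naming rule.
    """
    path = Path(name)
    if path.suffix.lower() != ".txt":
        path = path.with_name(path.name + ".txt")
    return path


def save_extracted_text(text: str, name: str, overwrite: bool = False) -> Path:
    """Write text to the resolved path.

    Mirrors the documented behavior: if a file with the same name exists
    and overwriting is not enabled, the operation fails.
    """
    path = resolve_output_path(name)
    if path.exists() and not overwrite:
        raise FileExistsError(f"{path} already exists and overwrite is disabled")
    path.write_text(text, encoding="utf-8")
    return path
```

Calling save_extracted_text twice with the same name and overwrite=False raises FileExistsError on the second call, which corresponds to the bot failing when the check box is not selected.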