The action Capture text

Modified on Thu, 14 Aug at 6:26 PM

Purpose  

Extract Text reads the content of a document and exports the extracted text as raw text. Use this action to produce a plain-text file from PDFs, Office documents, emails, HTML, etc., and optionally insert the extracted text into a text template for downstream processing (CRM, ERP, indexing, archival, OCR verification, etc.).


Where to add  

Add the Extract Text action in any scenario step that receives documents (PDF, Office files, EML/MSG, HTML, text files). Place it before actions that consume the extracted text (Save to, Upload, Send, AI processing, etc.).


Extract text from any file


Main settings

  • Choose your model
    Select the extraction model from the drop-down list (for example, Default.txt). Click Edit models to create a new model or adjust an existing one if you need a custom text output.


Supported file types  

  • This action can be applied to: .pdf, .txt, .md, .afc, .doc, .docx, .odt, .ott, .xls, .xlsx, .eml, .msg, .html, .htm, .ppt, .pptx (and other formats shown in the UI). If an unsupported file reaches the action, the action will skip or fail depending on scenario failure settings.
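If you prepare files outside the scenario, it can help to check extensions before they reach the action. The following is a minimal Python sketch under stated assumptions: the extension set is copied from the list above (the UI may show more), and the function names `is_supported` and `split_by_support` are hypothetical helpers, not part of the product.

```python
from pathlib import Path

# Extensions listed in this article; the UI may show additional formats.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".txt", ".md", ".afc", ".doc", ".docx", ".odt", ".ott",
    ".xls", ".xlsx", ".eml", ".msg", ".html", ".htm", ".ppt", ".pptx",
}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is in the list above (case-insensitive)."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def split_by_support(filenames):
    """Split file names into (supported, unsupported), preserving order."""
    supported = [name for name in filenames if is_supported(name)]
    unsupported = [name for name in filenames if not is_supported(name)]
    return supported, unsupported
```

Routing the unsupported names to a separate folder beforehand avoids relying on the scenario's failure settings to handle them.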


Insert into a template (text file)  

  • You can map the extracted raw text into a template (a .txt file) so the output follows a fixed layout or contains additional fields. This is useful when integrating with downstream systems that expect particular formats.  
  • Use fields in the template to include metadata (filename, processing date, source folder) together with the extracted text.
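To illustrate the idea of a text template, here is a minimal Python sketch using the standard library's `string.Template`. The placeholder names (`$filename`, `$processing_date`, `$source_folder`, `$extracted_text`) and the layout are assumptions made for this example; the actual field syntax available in the product's template editor may differ.

```python
from datetime import date
from string import Template

# Hypothetical template layout: metadata header followed by the raw text.
TEMPLATE = Template(
    "File: $filename\n"
    "Processed: $processing_date\n"
    "Source folder: $source_folder\n"
    "---\n"
    "$extracted_text\n"
)


def fill_template(extracted_text: str, filename: str,
                  source_folder: str, processing_date: date) -> str:
    """Insert the extracted text and its metadata into the fixed layout."""
    return TEMPLATE.substitute(
        filename=filename,
        processing_date=processing_date.isoformat(),
        source_folder=source_folder,
        extracted_text=extracted_text,
    )
```

A fixed header like this lets a downstream system read the metadata from known line positions and treat everything after the separator as the document text.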


Output behavior  

The action produces a text output that becomes the current item in the scenario pipeline. Use subsequent actions to save it (Save to), attach it to emails, upload it, or use it as input for AI or parsing steps.


How to configure (steps)

  1. Add Extract Text to the scenario.  
  2. Choose the extraction model from the list (Default.txt or a custom model).  
  3. Click OK to save the action.


Examples

  • Extract text from invoices (PDF) and save it to a .txt file for ingestion into an accounting system. (To extract specific data from a file, we recommend the Summarize with AI action, whose prompts let you specify exactly what to extract.)
  • Read email bodies (EML/MSG) and store raw text in a template that includes the sender and date tokens.  
  • Convert a multi-page PDF to a single text file for full-text indexing or search.
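For readers who want to see what the email example amounts to, the sketch below reproduces it outside the product with Python's standard `email` package. It is an illustration only, not how the action is implemented: the function name `email_to_text` and the output layout are assumptions, and it handles EML content only (MSG files need a separate parser).

```python
from email import message_from_string
from email.policy import default


def email_to_text(raw_eml: str) -> str:
    """Return the sender, date, and plain-text body of an EML message."""
    msg = message_from_string(raw_eml, policy=default)
    # Prefer the plain-text part; fall back to an empty body if none exists.
    body_part = msg.get_body(preferencelist=("plain",))
    body = body_part.get_content() if body_part is not None else ""
    return f"From: {msg['From']}\nDate: {msg['Date']}\n\n{body}"
```

The returned string corresponds to a template that places the sender and date above the raw body text.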


Tips and best practices

  • To extract specific data from a file, we recommend the Summarize with AI action; its prompts let you specify exactly what to extract.



This action provides a simple way to convert many document types into raw text and to insert that text into templates for automated downstream workflows such as CRM/ERP ingestion, indexing, or archival.
