Extraction Actions

Extraction actions capture data from web pages during automation. They are essential for scraping, validation, and feeding dynamic values into subsequent actions.

Extraction Types

Type	Purpose	Best For
`llm`	AI-powered structured data extraction	Tables, forms, text content
`network_call`	Capture API/AJAX responses	API data, JSON responses
`screenshot`	Save visual snapshot	Receipts, proofs, visual records
`state`	Capture page URL and title	Navigation validation
`two_fa_action`	Wait for and extract 2FA code	2FA codes

LLM Extraction

The most powerful extraction method. Uses AI to parse page content into structured data.

{
  "extraction_action": {
    "llm": {
      "source": ["axtree"],
      "extraction_format": {
        "product_name": "str",
        "price": "str",
        "availability": "str"
      },
      "extraction_instructions": "Extract product details from the product page"
    }
  }
}

Properties

Property	Type	Default	Description
`source`	`list["axtree" | "screenshot"]`	`["axtree"]`	Data sources to analyze
`extraction_format`	`dict`	Required	Expected output structure
`extraction_instructions`	`str`	Required	What to extract
`output_variable_names`	`list[str]`	`None`	Store values as variables
`llm_model_name`	`str`	`"gemini-2.5-flash"`	LLM model to use
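
When actions are generated programmatically, the properties above can be assembled in code. The following is a minimal Python sketch, assuming the action is ultimately serialized to the JSON shape shown earlier; the helper name `llm_extraction` is hypothetical and not part of the product.

```python
import json


def llm_extraction(extraction_format, extraction_instructions, source=None,
                   output_variable_names=None, llm_model_name=None):
    """Build an `llm` extraction action as a plain dict (hypothetical helper)."""
    if not extraction_format or not extraction_instructions:
        # Both properties are marked Required in the table above.
        raise ValueError("extraction_format and extraction_instructions are required")
    llm = {
        # The documented default source is ["axtree"].
        "source": list(source) if source else ["axtree"],
        "extraction_format": dict(extraction_format),
        "extraction_instructions": extraction_instructions,
    }
    # Optional properties are omitted so the documented defaults apply.
    if output_variable_names is not None:
        llm["output_variable_names"] = list(output_variable_names)
    if llm_model_name is not None:
        llm["llm_model_name"] = llm_model_name
    return {"extraction_action": {"llm": llm}}


action = llm_extraction(
    {"product_name": "str", "price": "str"},
    "Extract the product name and price from the product page",
    output_variable_names=["price"],
)
print(json.dumps(action, indent=2))
```

Leaving `llm_model_name` out of the dict keeps the action on whatever default model is configured, rather than pinning one in every generated action.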

Source Selection

Source	Best For
`["axtree"]`	Text, tables, forms (default, fastest)
`["screenshot"]`	Charts, images, visual layouts
`["axtree", "screenshot"]`	Complex pages needing both

Extraction Format

Define output structure with type hints:

{
  "extraction_format": {
    "title": "str",
    "items": "List[str]",
    "count": "str"
  }
}

Only `str` and `List[str]` are supported types.
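
Because only two type hints are accepted, a format can be checked before it is sent. A small Python sketch of such a check follows; `unsupported_fields` is a hypothetical helper, and how the platform itself reports an unsupported type is not covered here.

```python
SUPPORTED_TYPES = {"str", "List[str]"}


def unsupported_fields(extraction_format):
    """Return the fields whose type hint is not one of the supported types."""
    return {
        field: hint
        for field, hint in extraction_format.items()
        if hint not in SUPPORTED_TYPES
    }


fmt = {"title": "str", "items": "List[str]", "count": "int"}
problems = unsupported_fields(fmt)
if problems:
    print("Unsupported type hints:", problems)
```

Numeric values such as counts or prices are therefore declared as `"str"` and converted after extraction.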

Storing as Variables

Use `output_variable_names` to make extracted values available for subsequent actions:

{
  "extraction_action": {
    "llm": {
      "extraction_format": {
        "order_ids": "List[str]",
        "total": "str"
      },
      "extraction_instructions": "Extract order IDs from the table",
      "output_variable_names": ["order_ids"]
    }
  }
}

After this action, use `{order_ids[0]}`, `{order_ids[index]}`, or iterate with `for_loop_node`.
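
To make the reference syntax concrete, here is an illustrative Python sketch of what such placeholders mean. The platform resolves them itself; the `resolve` function, the regular expression, and the assumption that `index` is supplied as an ordinary variable (for example, by a loop) are all hypothetical.

```python
import re

# Matches {name} and {name[key]}, where key is a literal position or a variable name.
PLACEHOLDER = re.compile(r"\{(\w+)(?:\[(\w+)\])?\}")


def resolve(template, variables):
    """Replace each placeholder in template with its value from variables."""
    def substitute(match):
        name, key = match.group(1), match.group(2)
        value = variables[name]
        if key is None:
            return str(value)
        # A numeric key is a literal position; otherwise look the key up by name.
        position = int(key) if key.isdigit() else variables[key]
        return str(value[position])

    return PLACEHOLDER.sub(substitute, template)


variables = {"order_ids": ["A-100", "B-200"], "index": 1}
first = resolve("Open order {order_ids[0]}", variables)
current = resolve("Open order {order_ids[index]}", variables)
```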

Writing Good Instructions

Good examples:

{"extraction_instructions": "Extract all authorization numbers from the Auth Nbr column in the Authorizations table"}

{"extraction_instructions": "From the patient info section, extract: name (shown as 'Name:'), DOB, and member ID"}

Poor examples:

{"extraction_instructions": "Get the data"}

{"extraction_instructions": "Extract the numbers"}

Be specific about where data appears, what it looks like, and expected format.

Network Call Extraction

Capture data from API requests and responses:

{
  "extraction_action": {
    "network_call": {
      "url_pattern": "https://api.example.com/orders"
    }
  }
}

Properties

Property	Type	Default	Description
`url_pattern`	`str | None`	`None`	URL substring to match
`extract_from`	`"request" | "response"`	`None`	Extract from request or response
`download_from`	`"request" | "response"`	`None`	Download as file
`download_filename`	`str | None`	Auto-generated	Filename for download
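
Since `url_pattern` is matched as a substring, a short pattern such as `/orders` also matches URLs that carry query strings. The Python sketch below models that selection; the `CapturedCall` record and `select_payloads` function are hypothetical stand-ins, not the platform's internals.

```python
from dataclasses import dataclass


@dataclass
class CapturedCall:
    """Hypothetical record of one intercepted network call."""
    url: str
    request_body: str
    response_body: str


def select_payloads(calls, network_call):
    """Return the chosen side of every call whose URL contains url_pattern."""
    pattern = network_call["url_pattern"]
    side = network_call["extract_from"]  # "request" or "response"
    matched = [call for call in calls if pattern in call.url]
    return [
        call.request_body if side == "request" else call.response_body
        for call in matched
    ]


config = {"url_pattern": "/orders", "extract_from": "response"}
calls = [
    CapturedCall("https://api.example.com/orders?page=1", "", '{"orders": [1, 2]}'),
    CapturedCall("https://api.example.com/profile", "", '{"name": "Ada"}'),
]
payloads = select_payloads(calls, config)
```

A pattern that is too short can match unrelated endpoints, so it helps to include enough of the path to identify the call uniquely.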

Screenshot Extraction

Save a screenshot for later analysis:

{
  "extraction_action": {
    "screenshot": {
      "filename": "confirmation.png",
      "full_page": true
    }
  }
}

Property	Type	Default	Description
`filename`	`str`	Required	Output filename
`full_page`	`bool`	`True`	Capture the entire page (`True`) or only the viewport (`False`)

State Extraction

Capture current page URL and title:

{
  "extraction_action": {
    "state": {}
  }
}

Two-Factor Authentication Extraction

Wait for and extract a 2FA code:

{
  "extraction_action": {
    "two_fa_action": {
      "action": "email_two_fa_action",
      "output_variable_name": "two_fa_code"
    }
  }
}

Properties

Property	Type	Default	Description
`action`	`"email_two_fa_action" | "slack_two_fa_action"`	Required	The type of 2FA action to use
`output_variable_name`	`str`	Required	The name of the variable to store the 2FA code in
`instructions`	`str`	`None`	Optional custom instructions for code extraction
`max_wait_time`	`float`	`300.0`	The maximum time to wait for the 2FA code
`check_interval`	`float`	`10.0`	The interval to check for the 2FA code
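
The interaction between `max_wait_time` and `check_interval` can be pictured as a polling loop. The Python sketch below is one plausible reading, assuming both values are in seconds (the table does not state units); `wait_for_code` and `fetch_code` are hypothetical names, and the platform's actual loop may differ.

```python
import time


def wait_for_code(fetch_code, max_wait_time=300.0, check_interval=10.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Poll fetch_code until it returns a code or max_wait_time elapses.

    fetch_code is a hypothetical callable that returns the code as a string,
    or None if no code has arrived yet. Times are assumed to be in seconds.
    """
    deadline = clock() + max_wait_time
    while True:
        code = fetch_code()
        if code is not None:
            return code
        # Stop if the next check would fall after the deadline.
        if clock() + check_interval > deadline:
            return None
        sleep(check_interval)
```

With the documented defaults, this reading checks roughly every 10 seconds for up to 5 minutes. The `clock` and `sleep` parameters exist only so the loop can be exercised without real waiting.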

Action Types

Action Type	Description
`email_two_fa_action`	Wait for and extract 2FA code from email
`slack_two_fa_action`	Wait for and extract 2FA code from Slack

For more information on how to use the 2FA code in your automation, please refer to the Two-Factor Authentication Integration documentation.

Timing

Extraction actions use their own timing defaults, which give pages time to load fully before data is captured:

Property	Default for Extractions
`before_sleep_time`	`3.0` seconds
`end_sleep_time`	`0.0` seconds

Override if needed:

{
  "type": "action_node",
  "extraction_action": {
    "llm": { ... }
  },
  "before_sleep_time": 5.0
}
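
The override behaviour can be sketched in Python as a simple merge of node-level values over the extraction defaults listed above. The helper `effective_timing` is hypothetical, and it assumes an omitted property falls back to its default.

```python
# Defaults for extraction actions, taken from the table above (seconds).
EXTRACTION_TIMING_DEFAULTS = {"before_sleep_time": 3.0, "end_sleep_time": 0.0}


def effective_timing(node):
    """Return the timing values a node would use, applying defaults for omissions."""
    return {
        key: node.get(key, default)
        for key, default in EXTRACTION_TIMING_DEFAULTS.items()
    }


node = {
    "type": "action_node",
    "extraction_action": {
        "llm": {
            "extraction_format": {"price": "str"},
            "extraction_instructions": "Extract the price from the product page",
        }
    },
    "before_sleep_time": 5.0,  # override for a slow-loading page
}
timing = effective_timing(node)
```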

When to Use Each Type

Scenario	Recommended
Extract text/tables from page	`llm` with `axtree`
Charts, images, visual content	`llm` with `screenshot`
Intercept API data	`network_call`
Visual proof/documentation	`screenshot`
Validate navigation	`state`
