Extraction actions capture data from web pages during automation—essential for scraping, validation, and feeding dynamic values into subsequent actions.
| Type | Purpose | Best For |
|---|
llm | AI-powered structured data extraction | Tables, forms, text content |
network_call | Capture API/AJAX responses | API data, JSON responses |
screenshot | Save visual snapshot | Receipts, proofs, visual records |
state | Capture page URL and title | Navigation validation |
two_fa_action | Wait for and extract 2FA code | 2FA codes |
The most powerful extraction method. Uses AI to parse page content into structured data.
{
"extraction_action": {
"llm": {
"source": ["axtree"],
"extraction_format": {
"product_name": "str",
"price": "str",
"availability": "str"
},
"extraction_instructions": "Extract product details from the product page"
}
}
}
Properties
| Property | Type | Default | Description |
|---|
source | list["axtree" | "screenshot"] | ["axtree"] | Data sources to analyze |
extraction_format | dict | Required | Expected output structure |
extraction_instructions | str | Required | What to extract |
output_variable_names | list[str] | None | Store values as variables |
llm_model_name | str | "gemini-2.5-flash" | LLM model to use |
Source Selection
| Source | Best For |
|---|
["axtree"] | Text, tables, forms (default, fastest) |
["screenshot"] | Charts, images, visual layouts |
["axtree", "screenshot"] | Complex pages needing both |
Define output structure with type hints:
{
"extraction_format": {
"title": "str",
"items": "List[str]",
"count": "str"
}
}
Only str and List[str] are supported types.
Storing as Variables
Use output_variable_names to make extracted values available for subsequent actions:
{
"extraction_action": {
"llm": {
"extraction_format": {
"order_ids": "List[str]",
"total": "str"
},
"extraction_instructions": "Extract order IDs from the table",
"output_variable_names": ["order_ids"]
}
}
}
After this action, use {order_ids[0]}, {order_ids[index]}, or iterate with for_loop_node.
Writing Good Instructions
Good examples:
{"extraction_instructions": "Extract all authorization numbers from the Auth Nbr column in the Authorizations table"}
{"extraction_instructions": "From the patient info section, extract: name (shown as 'Name:'), DOB, and member ID"}
Poor examples:
{"extraction_instructions": "Get the data"}
{"extraction_instructions": "Extract the numbers"}
Be specific about where data appears, what it looks like, and expected format.
Capture data from API requests and responses:
{
"extraction_action": {
"network_call": {
"url_pattern": "https://api.example.com/orders"
}
}
}
Properties
| Property | Type | Default | Description |
|---|
url_pattern | str | None | None | URL substring to match |
extract_from | "request" | "response" | None | Extract from request or response |
download_from | "request" | "response" | None | Download as file |
download_filename | str | None | Auto-generated | Filename for download |
Save a screenshot for later analysis:
{
"extraction_action": {
"screenshot": {
"filename": "confirmation.png",
"full_page": true
}
}
}
| Property | Type | Default | Description |
|---|
filename | str | Required | Output filename |
full_page | bool | True | Entire page or viewport only |
Capture current page URL and title:
{
"extraction_action": {
"state": {}
}
}
Wait for and extract 2FA code:
{
"extraction_action": {
"two_fa_action": {
"action": "email_two_fa_action",
"output_variable_name": "two_fa_code"
}
}
}
Properties
| Property | Type | Default | Description |
|---|
action | "email_two_fa_action" | "slack_two_fa_action" | Required | The type of 2FA action to use |
output_variable_name | str | Required | The name of the variable to store the 2FA code in |
instructions | str | None | Optional Custom instructions for code extraction |
max_wait_time | float | 300.0 | The maximum time to wait for the 2FA code |
check_interval | float | 10.0 | The interval to check for the 2FA code |
Action Types
| Action Type | Description |
|---|
email_two_fa_action | Wait for and extract 2FA code from email |
slack_two_fa_action | Wait for and extract 2FA code from Slack |
For more information on how to use the 2FA code in your automation, please refer to the Two-Factor Authentication Integration documentation.
Timing
Extraction actions have different timing defaults to allow pages to fully load:
| Property | Default for Extractions |
|---|
before_sleep_time | 3.0 seconds |
end_sleep_time | 0.0 seconds |
Override if needed:
{
"type": "action_node",
"extraction_action": {
"llm": { ... }
},
"before_sleep_time": 5.0
}
When to Use Each Type
| Scenario | Recommended |
|---|
| Extract text/tables from page | llm with axtree |
| Charts, images, visual content | llm with screenshot |
| Intercept API data | network_call |
| Visual proof/documentation | screenshot |
| Validate navigation | state |