Extraction actions capture data from web pages during automation—essential for scraping, validation, and feeding dynamic values into subsequent actions.

Extraction Types

| Type | Purpose | Best For |
| --- | --- | --- |
| llm | AI-powered structured data extraction | Tables, forms, text content |
| network_call | Capture API/AJAX responses | API data, JSON responses |
| screenshot | Save visual snapshot | Receipts, proofs, visual records |
| state | Capture page URL and title | Navigation validation |
| two_fa_action | Wait for and extract 2FA code | 2FA codes |

LLM Extraction

The most powerful extraction method. Uses AI to parse page content into structured data.
{
  "extraction_action": {
    "llm": {
      "source": ["axtree"],
      "extraction_format": {
        "product_name": "str",
        "price": "str",
        "availability": "str"
      },
      "extraction_instructions": "Extract product details from the product page"
    }
  }
}

Properties

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| source | list["axtree" \| "screenshot"] | ["axtree"] | Data sources to analyze |
| extraction_format | dict | Required | Expected output structure |
| extraction_instructions | str | Required | What to extract |
| output_variable_names | list[str] | None | Store values as variables |
| llm_model_name | str | "gemini-2.5-flash" | LLM model to use |

Source Selection

| Source | Best For |
| --- | --- |
| ["axtree"] | Text, tables, forms (default, fastest) |
| ["screenshot"] | Charts, images, visual layouts |
| ["axtree", "screenshot"] | Complex pages needing both |
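
The source choice can be made explicit when building the action programmatically. The following is a minimal sketch, not part of the platform: `llm_extraction` and `VALID_SOURCES` are hypothetical names, and the chart-related format and instructions are made-up sample values. It only assembles the documented JSON structure and rejects source values other than the two listed above.

```python
import json

# Hypothetical helper: the two allowed values come from the source table above.
VALID_SOURCES = {"axtree", "screenshot"}

def llm_extraction(extraction_format, extraction_instructions, source=("axtree",)):
    """Build an llm extraction_action dict, rejecting unknown sources."""
    unknown = set(source) - VALID_SOURCES
    if unknown:
        raise ValueError(f"unsupported source(s): {sorted(unknown)}")
    return {
        "extraction_action": {
            "llm": {
                "source": list(source),
                "extraction_format": extraction_format,
                "extraction_instructions": extraction_instructions,
            }
        }
    }

# A chart is visual content, so this example selects the screenshot source.
action = llm_extraction(
    {"revenue_trend": "str"},
    "Describe the trend shown in the quarterly revenue chart",
    source=["screenshot"],
)
print(json.dumps(action, indent=2))
```

Omitting `source` in this sketch falls back to `["axtree"]`, mirroring the documented default.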

Extraction Format

Define output structure with type hints:
{
  "extraction_format": {
    "title": "str",
    "items": "List[str]",
    "count": "str"
  }
}
Only str and List[str] are supported types.
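
Because only two type hints are accepted, it can be worth checking a format before submitting a workflow. The sketch below is an illustrative pre-flight check, assuming the type hints are passed as the exact strings "str" and "List[str]"; `invalid_format_fields` is a hypothetical helper, not a platform function.

```python
# Hypothetical pre-flight check; the supported set is taken from the note above.
SUPPORTED_TYPES = {"str", "List[str]"}

def invalid_format_fields(extraction_format: dict) -> dict:
    """Return {field: type_hint} for every field whose type hint is unsupported."""
    return {
        field: hint
        for field, hint in extraction_format.items()
        if hint not in SUPPORTED_TYPES
    }

fmt = {"title": "str", "items": "List[str]", "count": "int"}
print(invalid_format_fields(fmt))  # {'count': 'int'}
```

A numeric field such as `count` would be declared as "str" and converted after extraction.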

Storing as Variables

Use output_variable_names to make extracted values available for subsequent actions:
{
  "extraction_action": {
    "llm": {
      "extraction_format": {
        "order_ids": "List[str]",
        "total": "str"
      },
      "extraction_instructions": "Extract order IDs and the order total from the table",
      "output_variable_names": ["order_ids"]
    }
  }
}
After this action, use {order_ids[0]}, {order_ids[index]}, or iterate with for_loop_node.
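
To make the placeholder syntax concrete, here is a toy model of how an indexed reference such as {order_ids[0]} could resolve against stored variables. This is an illustration only: the platform performs the real substitution internally, and `resolve`, the regular expression, and the sample URL are all assumptions made for this sketch. It handles only plain names and literal integer indices.

```python
import re

# Toy model only: matches {name} and {name[<digits>]}.
PLACEHOLDER = re.compile(r"\{(\w+)(?:\[(\d+)\])?\}")

def resolve(template: str, variables: dict) -> str:
    """Replace {name} and {name[i]} placeholders with values from `variables`."""
    def substitute(match):
        name, index = match.group(1), match.group(2)
        value = variables[name]
        if index is not None:
            value = value[int(index)]  # pick one element of a List[str] variable
        return str(value)
    return PLACEHOLDER.sub(substitute, template)

variables = {"order_ids": ["A-100", "A-101"]}  # made-up extracted values
print(resolve("https://example.com/orders/{order_ids[0]}", variables))
# https://example.com/orders/A-100
```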

Writing Good Instructions

Good examples:
{"extraction_instructions": "Extract all authorization numbers from the Auth Nbr column in the Authorizations table"}
{"extraction_instructions": "From the patient info section, extract: name (shown as 'Name:'), DOB, and member ID"}
Poor examples:
{"extraction_instructions": "Get the data"}
{"extraction_instructions": "Extract the numbers"}
Be specific about where data appears, what it looks like, and expected format.

Network Call Extraction

Capture data from API requests and responses:
{
  "extraction_action": {
    "network_call": {
      "url_pattern": "https://api.example.com/orders"
    }
  }
}

Properties

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| url_pattern | str \| None | None | URL substring to match |
| extract_from | "request" \| "response" | None | Extract from request or response |
| download_from | "request" \| "response" | None | Download as file |
| download_filename | str \| None | Auto-generated | Filename for download |
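
Since url_pattern is described as a substring match, a pattern also matches URLs that carry extra path segments or query parameters. The sketch below illustrates that behavior on made-up captured calls; `matching_calls` is a hypothetical function, and treating a None pattern as "match everything" is an assumption rather than documented behavior.

```python
from typing import List, Optional

def matching_calls(calls: List[dict], url_pattern: Optional[str]) -> List[dict]:
    """Return calls whose URL contains url_pattern.

    Assumption: a pattern of None places no restriction on the URL.
    """
    if url_pattern is None:
        return list(calls)
    return [call for call in calls if url_pattern in call["url"]]

# Made-up sample of captured network calls.
captured = [
    {"url": "https://api.example.com/orders?page=2"},
    {"url": "https://cdn.example.com/app.js"},
]
matches = matching_calls(captured, "https://api.example.com/orders")
print([call["url"] for call in matches])
# ['https://api.example.com/orders?page=2']
```

A short pattern can match more calls than intended, so a more specific substring is usually the safer choice.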

Screenshot Extraction

Save a screenshot for later analysis:
{
  "extraction_action": {
    "screenshot": {
      "filename": "confirmation.png",
      "full_page": true
    }
  }
}

Properties

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| filename | str | Required | Output filename |
| full_page | bool | True | Entire page or viewport only |

State Extraction

Capture current page URL and title:
{
  "extraction_action": {
    "state": {}
  }
}

Two-Factor Authentication Extraction

Wait for and extract a 2FA code:
{
  "extraction_action": {
    "two_fa_action": {
      "action": "email_two_fa_action",
      "output_variable_name": "two_fa_code"
    }
  }
}

Properties

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| action | "email_two_fa_action" \| "slack_two_fa_action" | Required | The type of 2FA action to use |
| output_variable_name | str | Required | The name of the variable to store the 2FA code in |
| instructions | str | None | Optional custom instructions for code extraction |
| max_wait_time | float | 300.0 | The maximum time to wait for the 2FA code |
| check_interval | float | 10.0 | The interval to check for the 2FA code |

Action Types

| Action Type | Description |
| --- | --- |
| email_two_fa_action | Wait for and extract 2FA code from email |
| slack_two_fa_action | Wait for and extract 2FA code from Slack |

For more information on how to use the 2FA code in your automation, please refer to the Two-Factor Authentication Integration documentation.
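
The interplay of max_wait_time and check_interval can be pictured as a polling loop. The sketch below is an assumption about the general pattern, not the platform's implementation: `wait_for_code` and `fetch_code` are hypothetical names, and both values are assumed to be in seconds. The sleep and clock functions are injectable so the loop can be exercised without real waiting.

```python
import time
from typing import Callable, Optional

def wait_for_code(
    fetch_code: Callable[[], Optional[str]],
    max_wait_time: float = 300.0,
    check_interval: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Poll fetch_code until it returns a code or max_wait_time elapses."""
    deadline = clock() + max_wait_time
    while True:
        code = fetch_code()
        if code is not None:
            return code
        if clock() + check_interval > deadline:
            return None  # the next check would fall past the deadline
        sleep(check_interval)

# Simulated inbox: the code arrives on the second check.
inbox = iter([None, "482913"])
print(wait_for_code(lambda: next(inbox), check_interval=0.01))  # 482913
```

With the documented defaults, this pattern would check at most about 30 times before giving up.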

Timing

Extraction actions use their own timing defaults, which give pages time to load fully before extraction runs:

| Property | Default for Extractions |
| --- | --- |
| before_sleep_time | 3.0 seconds |
| end_sleep_time | 0.0 seconds |

Override if needed:
{
  "type": "action_node",
  "extraction_action": {
    "llm": { ... }
  },
  "before_sleep_time": 5.0
}
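
The override semantics can be sketched as a simple merge: a value set on the node wins, otherwise the extraction default applies. This is an illustrative model under that assumption; `effective_timing` and `EXTRACTION_TIMING_DEFAULTS` are hypothetical names, with the default values copied from the table in this section.

```python
# Defaults as listed for extraction actions in this section.
EXTRACTION_TIMING_DEFAULTS = {"before_sleep_time": 3.0, "end_sleep_time": 0.0}

def effective_timing(node: dict) -> dict:
    """Return the timing values for a node, preferring values set on the node."""
    return {
        key: node.get(key, default)
        for key, default in EXTRACTION_TIMING_DEFAULTS.items()
    }

node = {
    "type": "action_node",
    "extraction_action": {"state": {}},
    "before_sleep_time": 5.0,  # override; end_sleep_time keeps its default
}
print(effective_timing(node))
# {'before_sleep_time': 5.0, 'end_sleep_time': 0.0}
```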

When to Use Each Type

| Scenario | Recommended |
| --- | --- |
| Extract text/tables from page | llm with axtree |
| Charts, images, visual content | llm with screenshot |
| Intercept API data | network_call |
| Visual proof/documentation | screenshot |
| Validate navigation | state |