For interactions too complex or unpredictable for static automations, use AI agents that autonomously navigate and interact based on goals.
When to Use
| Use Agentic For | Use Static Actions For |
|---|---|
| Unpredictable UI layouts | Known, stable elements |
| Complex navigation paths | Simple click/type actions |
| Handling popups/modals | Performance-critical paths |
| Sites with frequent changes | Cost-sensitive automations |
| CAPTCHAs and verification | Deterministic workflows |
AgenticTask
Use agentic_task when an AI agent should accomplish a goal autonomously:
{
  "interaction_action": {
    "agentic_task": {
      "task": "Navigate to settings and enable two-factor authentication",
      "max_steps": 15,
      "backend": "browser_use",
      "use_vision": false,
      "keep_alive": true
    }
  }
}
Properties
| Property | Type | Default | Description |
|---|---|---|---|
| task | str | Required | Natural language goal description |
| max_steps | int | Required | Maximum actions the agent can take |
| backend | "browser_use" \| "browserbase" | Required | Agent backend |
| use_vision | bool | False | Include screenshots for agent |
| keep_alive | bool | True | Keep browser session after task |
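The properties above lend themselves to a small pre-flight check before a payload is submitted. The following is a minimal Python sketch, not part of the product: `build_agentic_task` is a hypothetical helper name, and it assumes the field names, required fields, and defaults are exactly as listed in the table.

```python
import json

# Backend names as listed in the Properties table.
ALLOWED_BACKENDS = {"browser_use", "browserbase"}


def build_agentic_task(task, max_steps, backend, use_vision=False, keep_alive=True):
    """Hypothetical helper: validate fields, then wrap them in an interaction_action."""
    if not isinstance(task, str) or not task.strip():
        raise ValueError("task must be a non-empty string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_steps, bool) or not isinstance(max_steps, int) or max_steps < 1:
        raise ValueError("max_steps must be a positive integer")
    if backend not in ALLOWED_BACKENDS:
        raise ValueError(f"backend must be one of {sorted(ALLOWED_BACKENDS)}")
    return {
        "interaction_action": {
            "agentic_task": {
                "task": task,
                "max_steps": max_steps,
                "backend": backend,
                "use_vision": bool(use_vision),
                "keep_alive": bool(keep_alive),
            }
        }
    }


payload = build_agentic_task(
    "Navigate to settings and enable two-factor authentication",
    max_steps=15,
    backend="browser_use",
)
print(json.dumps(payload, indent=2))
```

Validating locally like this only catches shape errors early; the platform remains the authority on which values it accepts.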
max_steps Guidelines
| Task Complexity | Suggested max_steps |
|---|---|
| Simple (1-2 clicks) | 3-5 |
| Medium (navigate + fill form) | 10-15 |
| Complex (multi-page workflow) | 20-30 |
A higher max_steps value lets the agent run longer, which can increase both execution time and LLM costs.
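One way to apply these guidelines consistently is to keep the ranges in a lookup table. This is an illustrative Python sketch: the keys `simple`, `medium`, and `complex` and the function `suggest_max_steps` are hypothetical names, and the numbers are copied from the table above.

```python
# (low, high) ranges copied from the guidelines table; the keys are hypothetical labels.
SUGGESTED_MAX_STEPS = {
    "simple": (3, 5),     # 1-2 clicks
    "medium": (10, 15),   # navigate + fill form
    "complex": (20, 30),  # multi-page workflow
}


def suggest_max_steps(complexity, conservative=True):
    """Return the low end of the range by default, the high end otherwise."""
    if complexity not in SUGGESTED_MAX_STEPS:
        raise ValueError(f"unknown complexity: {complexity!r}")
    low, high = SUGGESTED_MAX_STEPS[complexity]
    return low if conservative else high


print(suggest_max_steps("medium"))
print(suggest_max_steps("complex", conservative=False))
```

Defaulting to the low end of each range keeps cost down on the first attempt.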
Writing Good Task Descriptions
Good:
{"task": "Click 'Account Settings' in the sidebar, scroll down, click 'Security'"}
{"task": "1. Close any popups 2. Click search 3. Search 'laptop' 4. Click first result"}
Poor:
{"task": "Complete the form"}
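The numbered style in the second good example can be generated from a list of steps, which keeps long task descriptions easy to edit. A minimal Python sketch follows; `numbered_task` is a hypothetical helper, not a platform function.

```python
def numbered_task(steps):
    """Join ordered steps into a single numbered task description."""
    cleaned = [step.strip() for step in steps if step.strip()]
    if not cleaned:
        raise ValueError("at least one step is required")
    return " ".join(f"{i}. {step}" for i, step in enumerate(cleaned, start=1))


task = numbered_task([
    "Close any popups",
    "Click search",
    "Search 'laptop'",
    "Click first result",
])
print({"task": task})
```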
Vision Mode
Enable use_vision when the target elements are visual and lack reliable text labels:
{
  "agentic_task": {
    "task": "Click on the red 'Sale' banner",
    "max_steps": 5,
    "use_vision": true
  }
}
| Use Vision For | Avoid Vision For |
|---|---|
| Image-based navigation | Text-based navigation |
| Visual verification | Speed-critical tasks |
| Elements without ARIA labels | Cost minimization |
CloseOverlayPopup
close_overlay_popup is a specialized action for dismissing popups, modals, and overlays:
{
  "interaction_action": {
    "close_overlay_popup": {
      "max_steps": 5
    }
  }
}
Default Behavior
| Property | Default |
|---|---|
| task | Comprehensive popup dismissal prompt |
| max_steps | 5 |
| use_vision | True |
| keep_alive | True |
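To see which settings apply after an override, the defaults can be merged with the values supplied. This Python sketch is illustrative only: `effective_overlay_settings` is a hypothetical name, it assumes the action accepts exactly the properties listed in the table, and it leaves out the default task prompt because its text is not given here.

```python
# Defaults copied from the Default Behavior table. The default task prompt is
# omitted because its wording is not documented in this section.
CLOSE_OVERLAY_DEFAULTS = {"max_steps": 5, "use_vision": True, "keep_alive": True}


def effective_overlay_settings(**overrides):
    """Return the defaults with any caller-supplied overrides applied."""
    unknown = set(overrides) - set(CLOSE_OVERLAY_DEFAULTS) - {"task"}
    if unknown:
        raise ValueError(f"unknown properties: {sorted(unknown)}")
    return {**CLOSE_OVERLAY_DEFAULTS, **overrides}


settings = effective_overlay_settings(max_steps=8)
print(settings)
```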
What It Handles
- Cookie consent banners
- Privacy policy notices
- Newsletter signup prompts
- Age verification gates
- Promotional popups
- Blocking overlays
Variables in Agentic Tasks
Use parameter substitution in task descriptions:
{
  "agentic_task": {
    "task": "Search for '{search_query[0]}' and filter to items under ${price_max[0]}",
    "max_steps": 10
  }
}
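To preview what the agent will receive, the placeholders can be expanded locally. The sketch below is an assumption about how `{name[index]}` placeholders resolve (a list lookup by name and index); the platform performs the real substitution, and `preview_task` is a hypothetical helper. Note that the `$` in the example is a literal currency sign, not part of the placeholder.

```python
import re

# Matches placeholders of the form {name[index]}, e.g. {search_query[0]}.
PLACEHOLDER = re.compile(r"\{(\w+)\[(\d+)\]\}")


def preview_task(template, params):
    """Expand {name[index]} placeholders from a dict of lists (assumed semantics)."""
    def replace(match):
        name, index = match.group(1), int(match.group(2))
        return str(params[name][index])

    return PLACEHOLDER.sub(replace, template)


template = "Search for '{search_query[0]}' and filter to items under ${price_max[0]}"
preview = preview_task(template, {"search_query": ["laptop"], "price_max": [500]})
print(preview)
```

A missing parameter name raises `KeyError` and an out-of-range index raises `IndexError`, which surfaces typos before a run is started.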
Combining with Static Actions
Recommended pattern: use static actions for the predictable steps and an agentic task only for the uncertain ones:
[
  {
    "type": "action_node",
    "interaction_action": {
      "input_text": {
        "command": "get_by_label(\"Email\")",
        "input_text": "{email[0]}"
      }
    }
  },
  {
    "type": "action_node",
    "interaction_action": {
      "click_element": {
        "command": "get_by_role(\"button\", name=\"Sign In\")"
      }
    }
  },
  {
    "type": "action_node",
    "interaction_action": {
      "agentic_task": {
        "task": "Navigate to Reports and find the Monthly Summary",
        "max_steps": 10
      }
    }
  },
  {
    "type": "action_node",
    "interaction_action": {
      "click_element": {
        "command": "get_by_role(\"button\", name=\"Download\")",
        "expect_download": true
      }
    }
  }
]
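The same mixed workflow can be assembled programmatically, which makes it easy to check how much of it is agentic. This is a minimal Python sketch: `action_node` and `agentic_step_budget` are hypothetical helpers, and the node shape is taken from the JSON above.

```python
import json


def action_node(action_name, **properties):
    """Wrap one action in the action_node shape used in the example above."""
    return {"type": "action_node", "interaction_action": {action_name: properties}}


workflow = [
    action_node("input_text", command='get_by_label("Email")', input_text="{email[0]}"),
    action_node("click_element", command='get_by_role("button", name="Sign In")'),
    action_node(
        "agentic_task",
        task="Navigate to Reports and find the Monthly Summary",
        max_steps=10,
    ),
    action_node(
        "click_element",
        command='get_by_role("button", name="Download")',
        expect_download=True,
    ),
]


def agentic_step_budget(nodes):
    """Sum max_steps across agentic_task nodes as a rough upper bound on agent actions."""
    return sum(
        props.get("max_steps", 0)
        for node in nodes
        for name, props in node["interaction_action"].items()
        if name == "agentic_task"
    )


print(json.dumps(workflow, indent=2))
print("agentic step budget:", agentic_step_budget(workflow))
```

Keeping the agentic budget small relative to the number of static nodes is one way to confirm the workflow follows the static-first pattern.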
Best Practices
| Practice | Recommendation |
|---|---|
| Start with static | Use agentic only where needed |
| Keep tasks focused | Break complex goals into smaller tasks |
| Start low on max_steps | Increase if the agent can't complete the task |
| Review execution logs | Refine task descriptions based on results |
| Use vision selectively | Only when visual context is necessary |
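The "start low on max_steps" practice can be sketched as a retry loop that raises the budget only after a failure. Everything here is illustrative: `run_task` stands in for whatever call actually executes an agentic task (it is supplied by the caller and is not a platform API), and the budgets are example values drawn from the guideline ranges.

```python
def run_with_escalation(run_task, task, step_budgets=(5, 10, 20)):
    """Try the task with increasing max_steps; return the first budget that succeeds.

    run_task is a caller-supplied callable taking (task, max_steps) and
    returning True on success. Returns None if every budget fails.
    """
    for max_steps in step_budgets:
        if run_task(task, max_steps):
            return max_steps
    return None


# Stand-in runner for demonstration only: pretends the task needs at least 10 steps.
def fake_runner(task, max_steps):
    return max_steps >= 10


used = run_with_escalation(fake_runner, "Navigate to Reports and find the Monthly Summary")
print("succeeded with max_steps =", used)
```

If no budget succeeds, refining the task description is usually a better next step than raising the limit further.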