LaVague Architecture
Core Components
User Objective: "Book the cheapest flight from NYC to London"
  |
  +-- World Model (LLM)
  |     Understands the page and plans next action
  |
  +-- Action Engine
  |     Generates Selenium code to execute the action
  |
  +-- Selenium Driver
        Executes actions in the browser
How It Works
- Observe: the agent takes a screenshot and reads the page DOM.
- Think: the World Model (LLM) decides the next action based on the objective.
- Act: the Action Engine generates Selenium code, which the Selenium Driver executes in the browser.
- Repeat: the loop continues until the objective is achieved or the maximum number of steps is reached.
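The loop above can be sketched in plain Python. This is a minimal illustration of the control flow only, not LaVague's implementation: `run_loop`, the three callables, and the toy counter "page" are all hypothetical names introduced here, and no browser or LLM is involved.

```python
from typing import Callable, List, Tuple


def run_loop(
    observe: Callable[[], str],
    think: Callable[[str, str], str],
    act: Callable[[str], bool],
    objective: str,
    max_steps: int = 10,
) -> Tuple[bool, List[str]]:
    """Run observe/think/act until act() reports success or max_steps is hit."""
    actions: List[str] = []
    for _ in range(max_steps):
        observation = observe()                 # Observe: capture the page state
        action = think(objective, observation)  # Think: choose the next action
        actions.append(action)
        if act(action):                         # Act: True means objective reached
            return True, actions
    return False, actions                       # Step budget exhausted


# Toy stand-ins (assumptions): the "page" is a counter, done at 3 clicks.
state = {"clicks": 0}


def observe() -> str:
    return f"clicks={state['clicks']}"


def think(objective: str, observation: str) -> str:
    return "click"


def act(action: str) -> bool:
    state["clicks"] += 1
    return state["clicks"] >= 3


done, actions = run_loop(observe, think, act, "reach 3 clicks", max_steps=5)
print(done, actions)
```

The `max_steps` bound is what keeps an agent from looping forever when the model never reaches the objective; the function returns the actions taken either way so the caller can inspect them.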
Step-by-Step Observability
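Per-step output like the kind printed in this section is often more useful when it is also saved for later debugging. The sketch below writes one JSON object per step. `StepRecord` and `write_trace` are hypothetical names, the field names simply mirror the attributes printed in this section, and the sample steps are made up for illustration; none of this is LaVague's actual type or API.

```python
import io
import json
from dataclasses import asdict, dataclass
from typing import Iterable, TextIO


@dataclass
class StepRecord:
    # Hypothetical record; fields mirror the attributes printed in this section.
    number: int
    observation: str
    thought: str
    action_code: str
    result: str


def write_trace(steps: Iterable[StepRecord], out: TextIO) -> int:
    """Write one JSON object per line and return the number of steps written."""
    count = 0
    for step in steps:
        out.write(json.dumps(asdict(step)) + "\n")
        count += 1
    return count


# Made-up sample steps, written to an in-memory buffer instead of a file.
buffer = io.StringIO()
written = write_trace(
    [
        StepRecord(1, "home page", "open pricing link", "click('Pricing')", "ok"),
        StepRecord(2, "pricing page", "read plan table", "read('table')", "ok"),
    ],
    buffer,
)
lines = buffer.getvalue().splitlines()
print(written, lines[0])
```

A line-per-step format keeps the trace appendable and easy to filter with ordinary text tools, which suits long-running agents.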
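from lavague.core import WorldModel, ActionEngine
from lavague.core.agents import WebAgent
from lavague.drivers.selenium import SeleniumDriver
driver = SeleniumDriver(headless=True)
agent = WebAgent(WorldModel(), ActionEngine(driver))
agent.get("https://example.com")
# Print what the agent observed, thought, and did at each step
for step in agent.run_step_by_step("Find pricing information"):
    print(f"Step {step.number}:")
    print(f"  Observation: {step.observation}")
    print(f"  Thought: {step.thought}")
    print(f"  Action: {step.action_code}")
    print(f"  Result: {step.result}")
Use Cases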
| Use Case | Example |
|---|---|
| Web scraping | "Extract all product prices from this catalog" |
| Form filling | "Fill out this job application with my resume data" |
| QA testing | "Test the checkout flow and verify the order total" |
| Research | "Find the latest papers on RAG from arXiv" |
| Monitoring | "Check if the deployment status page shows all green" |
Multi-Step Workflows
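When objectives are chained, a later one usually only makes sense if the earlier ones succeeded. The sketch below stops at the first failure and reports what is left. It assumes each call reports success as a boolean, which may not match what `agent.run` actually returns; `run_chain` and `fake_run` are hypothetical helpers that stand in for a real agent.

```python
from typing import Callable, List, Tuple


def run_chain(
    run: Callable[[str], bool], objectives: List[str]
) -> Tuple[List[str], List[str]]:
    """Run objectives in order; stop at the first one that fails.

    Returns (completed, remaining), where remaining starts at the failed one.
    """
    completed: List[str] = []
    for index, objective in enumerate(objectives):
        if not run(objective):
            return completed, objectives[index:]
        completed.append(objective)
    return completed, []


# Stand-in for an agent call (assumption): pretend the sorting step fails.
def fake_run(objective: str) -> bool:
    return not objective.startswith("Sort")


completed, remaining = run_chain(
    fake_run,
    [
        "Search for wireless headphones under $50",
        "Sort by customer rating",
        "Extract the top 5 results with names and prices",
    ],
)
print(completed, remaining)
```

Returning the remaining objectives makes it straightforward to retry from the failed step instead of restarting the whole workflow.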
# Chain multiple objectives
agent.get("https://shopping-site.com")
agent.run("Search for wireless headphones under $50")
agent.run("Sort by customer rating")
agent.run("Extract the top 5 results with names and prices")
results = agent.result
Configuration
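API keys are usually better read from the environment than written into source, and CI runs typically want headless mode. A minimal sketch using only the standard library follows. `load_api_key` and `headless_from_env` are hypothetical helpers, and the choice of the `ANTHROPIC_API_KEY` and `CI` variable names is an assumption about your setup.

```python
import os


def load_api_key(var_name: str = "ANTHROPIC_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} before starting the agent")
    return key


def headless_from_env(var_name: str = "CI") -> bool:
    """Treat a non-empty variable (many CI systems set CI) as headless mode."""
    return bool(os.environ.get(var_name, "").strip())
```

The returned values could then be passed as the `api_key` and `headless` arguments used in this section, so the same script works locally and in a pipeline without edits.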
from lavague.core import WorldModel
from lavague.drivers.selenium import SeleniumDriver
# Use Claude instead of GPT
world_model = WorldModel(
    model_name="anthropic/claude-sonnet-4-6",
    api_key="sk-ant-..."
)
# Headless mode for CI/CD
driver = SeleniumDriver(headless=True)
FAQ
Q: What is LaVague? A: LaVague is a Large Action Model framework with 6,300+ GitHub stars for building AI web agents that automate browser tasks using natural language objectives, with full step-by-step observability.
Q: How is LaVague different from Browser Use or Stagehand? A: LaVague focuses on objective-driven automation: you state what you want to achieve, not the individual steps. Browser Use is a Python agent framework, and Stagehand provides three TypeScript primitives. LaVague emphasizes observability and debugging for production web automation.
Q: Is LaVague free? A: Yes, open-source under Apache-2.0. You bring your own LLM API keys.