Introducing Navigator n1.5

The Most Capable Computer Use Model for the Web

By the Yutori team on May 06, 2026

Today, we're releasing Navigator n1.5.

Navigator n1 drove a browser via human-like keyboard and mouse actions (click, type, scroll).

Navigator n1.5 expands the action space: it can now operate directly on the webpage DOM through JavaScript and other code-based tools. As a result, Navigator n1.5 still excels at interacting with interfaces in a human-like way, but can take a more efficient programmatic path when one is available.

This hybrid action space (of human-like actions and direct DOM manipulation) is particularly useful for tasks that are tedious to execute through the UI. When filling out multi-page forms, the model can complete multiple fields in a single step instead of progressing one at a time. When extracting information from dense interfaces, it can read the page state to gather all the data at once, rather than clicking through elements individually.

This combines the strengths of vision-based and DOM-based approaches. Programmatic access reduces redundant steps and surfaces structured data, while vision provides generality and gold-standard visual verification, because UIs were designed for visual consumption. Together, the hybrid approach Pareto-dominates either one alone in accuracy, latency, and cost.

Navigator n1.5 also improves accuracy across all web-agent benchmarks¹ we track, with real-world usability gains from structured outputs and increased robustness to in-the-wild agent harness failures.

State-of-the-Art Performance

Figure 1. Navigator n1.5 sets a new state of the art on Online-Mind2Web, Navi-Bench v2, and Westworld. (Bar charts compare Navigator n1.5 and Navigator n1 accuracy with GPT, Claude, and Gemini baselines.)

Online-Mind2Web. Navigator n1.5 achieves a new state-of-the-art accuracy of 94.5%² on Online-Mind2Web, a benchmark that evaluates web agents on 300 tasks across 136 popular live websites. This surpasses prior results from GPT-5.4 (92.8%)³ and Gemini 2.5 Pro (69.0%). (GPT-5.5 and Gemini 3 Flash results had not been reported at the time of writing.)

Navi-Bench v2. We introduced Navi-Bench v1 with the goal of evaluating web agents on everyday consumer tasks using verifiable rewards — such as checking availability on OpenTable, searching for tickets on Google Flights, and finding deals on Craigslist. To advance this goal, in collaboration with Encord and Vibrant Labs, we are now expanding coverage to more everyday consumer tasks with new domains including Redfin, Zillow, Gymshark, Trip, and Allbirds. These include new shopping environments from Vibrant that read privileged Shopify cart states to verify success.⁴

We refer to this new expanded benchmark as Navi-Bench v2⁵. Navigator n1.5 achieves 88.0% accuracy on Navi-Bench v2, outperforming Claude Opus 4.7 (80.5%), GPT-5.5 (75.0%), and Gemini 3 Flash (64.0%).

Westworld. Navigator n1.5 also reaches 93.0% accuracy on Westworld⁶, a benchmark featuring highly realistic simulators for the safe evaluation of high-stakes write tasks on e-commerce and travel websites.

Model	Online-Mind2Web⁷	Navi-Bench v2	Westworld
Navigator n1.5	94.5%	88.0%	93.0%
Navigator n1	78.7%	72.0%	92.0%
Claude Opus 4.7	—	80.5%	89.0%
Claude Opus 4.6	—	83.5%	90.0%
GPT-5.5	—	75.0%	88.0%
GPT-5.4	92.8%	68.0%	86.0%
Gemini 3 Flash	—	64.0%	63.0%
Gemini 2.5 Pro	69.0%	41.5%	55.0%

Table 1. Detailed accuracy comparison across Online-Mind2Web, Navi-Bench v2, and Westworld benchmarks.

New Capabilities

Hybrid Vision-DOM Interaction

Previously, Navigator n1 covered the core primitives of web interaction. Navigator n1.5 expands this space with tools for direct DOM inspection and JavaScript execution. The model autonomously chooses when to invoke these capabilities, so simple tasks stay simple while more complex workflows become much more tractable.

Below are videos of two tasks, multi-page form filling and dense information extraction, each run with the new tools disabled (core vision-based tools only) and enabled (expanded hybrid tools).

Multi-page form filling

Core vision-based tools only—2 minutes

Expanded hybrid tools (UI + DOM + JS)—1 minute

With its hybrid vision-DOM setup, Navigator n1.5 completes multiple inputs in a single step instead of advancing field by field, combining fast JavaScript execution with verification against the rendered interface.
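
To make the single-step idea concrete, here is a minimal Python sketch that builds one JavaScript snippet setting several form fields at once. It is illustrative only and does not describe Navigator n1.5's actual tooling: the helper name `build_fill_script`, the CSS selectors, and the field values are all hypothetical, and a caller would still need its own way to execute the returned script in the page.

```python
import json


def build_fill_script(values: dict[str, str]) -> str:
    """Return one JavaScript snippet that fills every field in `values`.

    `values` maps a CSS selector to the text that field should hold.
    Hypothetical helper for illustration; not part of any Yutori API.
    """
    # json.dumps produces a valid JavaScript object literal with safe quoting.
    payload = json.dumps(values)
    return (
        "(() => {\n"
        f"  const values = {payload};\n"
        "  const missing = [];\n"
        "  for (const [selector, value] of Object.entries(values)) {\n"
        "    const el = document.querySelector(selector);\n"
        "    if (!el) { missing.push(selector); continue; }\n"
        "    el.value = value;\n"
        "    // Fire the events a page usually listens for after typing.\n"
        "    el.dispatchEvent(new Event('input', { bubbles: true }));\n"
        "    el.dispatchEvent(new Event('change', { bubbles: true }));\n"
        "  }\n"
        "  return JSON.stringify({ missing });\n"
        "})()"
    )


# Hypothetical selectors and values, for illustration only.
script = build_fill_script({"#first-name": "Ada", "#email": "ada@example.com"})
print(script)
```

Because the generated script reports any selectors it could not find, a caller could fall back to ordinary click-and-type actions for those fields and then confirm the result against the rendered page.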

Dense information extraction

Core vision-based tools only—22 steps

Expanded hybrid tools (UI + DOM + JS)—5 steps

DOM-based extraction gathers the full data set from one read of the page state, avoiding click-by-click traversal.

JavaScript Coding & Execution

Due to the diversity and dynamism of the web, some tasks may require or benefit from additional solutions beyond the pre-defined tools. In these situations, Navigator n1.5 can generate JavaScript to directly interact with webpages after extracting and understanding the DOM.

Navigator n1.5 extracting size availability from a Nike product page by writing and executing JavaScript directly in the browser.

Given a task to extract all available sizes for each shoe color, Navigator n1.5 writes custom JavaScript code after understanding the website layout.

Example generated JavaScript:

(() => {
  const labels = document.querySelectorAll('label[for*="grid-selector"]');
  const sizes = [];
  labels.forEach(l => {
    const btn = l.closest('button') || l;
    const computedStyle = getComputedStyle(btn);
    sizes.push({
      label: l.textContent.trim(),
      soldOut: computedStyle.textDecorationLine === 'line-through' || computedStyle.color === 'rgb(158, 158, 160)',
      available: computedStyle.textDecorationLine === 'none' && computedStyle.color === 'rgb(17, 17, 17)' && computedStyle.opacity >= 0.5
    });
  });
  return JSON.stringify(sizes);
})()

After executing the JavaScript in the browser, Navigator n1.5 can extract structured data and interact with web pages at the code level.

Example JavaScript execution output:

[
{ "label": "M 6 / W 7.5",   "soldOut": true,  "available": false },
{ "label": "M 6.5 / W 8",   "soldOut": true,  "available": false },
{ "label": "M 7 / W 8.5",   "soldOut": true,  "available": false },
{ "label": "M 7.5 / W 9",   "soldOut": false, "available": true  },
{ "label": "M 8 / W 9.5",   "soldOut": false, "available": true  },
{ "label": "M 8.5 / W 10",  "soldOut": false, "available": true  },
{ "label": "M 9 / W 10.5",  "soldOut": false, "available": true  },
{ "label": "M 9.5 / W 11",  "soldOut": false, "available": true  },
{ "label": "M 10 / W 11.5", "soldOut": false, "available": true  },
{ "label": "M 10.5 / W 12", "soldOut": false, "available": true  },
{ "label": "M 11 / W 12.5", "soldOut": false, "available": true  },
{ "label": "M 11.5 / W 13", "soldOut": false, "available": true  },
{ "label": "M 12 / W 13.5", "soldOut": false, "available": true  },
{ "label": "M 12.5 / W 14", "soldOut": false, "available": true  },
{ "label": "M 13 / W 14.5", "soldOut": false, "available": true  },
{ "label": "M 14 / W 15.5", "soldOut": false, "available": true  },
{ "label": "M 15 / W 16.5", "soldOut": false, "available": true  }
]
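
As a rough illustration of why structured output like this is convenient downstream, the Python sketch below parses a few of the records above and keeps only the sizes that are in stock. The helper name `available_sizes` is hypothetical, and the input is a four-record subset of the output shown above rather than the full list.

```python
import json

# A subset of the execution output above, as the JSON string the script returns.
raw_output = """
[
  {"label": "M 6 / W 7.5", "soldOut": true,  "available": false},
  {"label": "M 7 / W 8.5", "soldOut": true,  "available": false},
  {"label": "M 7.5 / W 9", "soldOut": false, "available": true},
  {"label": "M 8 / W 9.5", "soldOut": false, "available": true}
]
"""


def available_sizes(raw: str) -> list[str]:
    """Return the labels of sizes flagged available and not sold out."""
    records = json.loads(raw)
    return [r["label"] for r in records if r["available"] and not r["soldOut"]]


print(available_sizes(raw_output))  # ['M 7.5 / W 9', 'M 8 / W 9.5']
```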

Structured JSON Outputs

Navigator n1.5 can now return structured data that adheres to a caller-provided JSON schema. This lets developers extract typed information such as search results, form values, table contents, and product attributes without writing custom parsers. Pass a schema in the request, and Navigator n1.5 will produce output that conforms to it.

# Assumes `client` is an already-configured API client (setup not shown here).
response = client.chat.completions.create(
    model="n1.5-latest",
    messages=[...],
    extra_body={
        "json_schema": {
            "type": "object",
            "properties": {
                "product_name": {"type": "string"},
                "price": {"type": "number"}
            },
            "required": ["product_name", "price"]
        }
    }
)

# Access the parsed result
parsed = response.parsed_json  # {"product_name": "Widget Pro", "price": 29.99}
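
Even when the model is asked to conform to a schema, callers may want a defensive check before passing the result downstream. The sketch below is one minimal way to do that using only the standard library. The helper `schema_errors` is hypothetical and handles only the `type`, `properties`, and `required` keywords used in the request above; a full validator, such as the third-party `jsonschema` package, would cover far more of JSON Schema.

```python
# Stand-in for `response.parsed_json`, using the example value shown above.
parsed = {"product_name": "Widget Pro", "price": 29.99}

schema = {
    "type": "object",
    "properties": {
        "product_name": {"type": "string"},
        "price": {"type": "number"},
    },
    "required": ["product_name", "price"],
}

# Map the JSON Schema type names used here to Python types.
JSON_TYPES = {"string": str, "number": (int, float), "object": dict}


def schema_errors(data, schema):
    """Return a list of problems; an empty list means the data passed."""
    if not isinstance(data, dict):
        return ["top-level value is not an object"]
    errors = [
        f"missing required field: {name}"
        for name in schema.get("required", [])
        if name not in data
    ]
    for name, rule in schema.get("properties", {}).items():
        if name not in data:
            continue
        value = data[name]
        # bool is a subclass of int in Python, so reject it explicitly
        # (this sketch has no boolean type).
        if isinstance(value, bool) or not isinstance(value, JSON_TYPES[rule["type"]]):
            errors.append(f"{name}: expected {rule['type']}")
    return errors


print(schema_errors(parsed, schema))  # []
```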

Pricing

Navigator n1.5 continues to be the most cost-effective computer-use model.

Figure 2. Navigator n1.5 sits on the Pareto frontier of accuracy vs. cost, beating larger models on Navi-Bench v2 at a fraction of the input price. (Scatter plot of Navi-Bench v2 accuracy vs. input price per 1M tokens, showing Navigator n1.5 at 88.0% / $1.50 alongside Navigator n1, Claude Opus 4.6/4.7, GPT-5.5, and GPT-5.4.)
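
For readers unfamiliar with the term, a model is on the Pareto frontier if no other model is at least as accurate and at least as cheap while being strictly better on one of the two. The sketch below checks this for the three models that have both a Navi-Bench v2 accuracy in Table 1 and an input price in Table 2. It is a simplified illustration: the function names are hypothetical, and it leaves out the models in the figure whose prices are not listed in this post.

```python
# (Navi-Bench v2 accuracy in %, input price in USD per 1M tokens),
# taken from Table 1 and Table 2 for models that appear in both.
POINTS = {
    "Navigator n1.5": (88.0, 1.50),
    "Claude Opus 4.7": (80.5, 5.00),
    "GPT-5.5": (75.0, 5.00),
}


def dominates(a, b):
    """True if `a` is no worse than `b` on both axes and strictly better on one."""
    acc_a, price_a = a
    acc_b, price_b = b
    no_worse = acc_a >= acc_b and price_a <= price_b
    strictly_better = acc_a > acc_b or price_a < price_b
    return no_worse and strictly_better


def pareto_frontier(points):
    """Return the names of the points that no other point dominates."""
    return [
        name
        for name, point in points.items()
        if not any(
            dominates(other, point)
            for other_name, other in points.items()
            if other_name != name
        )
    ]


print(pareto_frontier(POINTS))  # ['Navigator n1.5']
```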

Model	Input (per 1M tokens)	Output (per 1M tokens)
OpenAI GPT-5.5	$5.00	$30.00
Claude Opus 4.7	$5.00	$25.00
Claude Sonnet 4.6	$3.00	$15.00
Gemini 3.1 Pro Preview⁸	$2.00	$12.00
Navigator n1.5	$1.50	$5.00

Table 2. Published per-token pricing for computer-use models of similar caliber.
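
To see what these rates mean in practice, the sketch below prices one workload under each row of Table 2. The token counts are hypothetical and chosen only to make the arithmetic easy to follow. Real runs will differ, since models use different tokenizers and take different numbers of steps for the same task.

```python
# Per-1M-token prices in USD, copied from Table 2: (input, output).
PRICES = {
    "OpenAI GPT-5.5": (5.00, 30.00),
    "Claude Opus 4.7": (5.00, 25.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Gemini 3.1 Pro Preview": (2.00, 12.00),
    "Navigator n1.5": (1.50, 5.00),
}

# Hypothetical workload, identical for every model (an assumption).
INPUT_TOKENS = 1_000_000
OUTPUT_TOKENS = 100_000


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload at the model's published rates."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


for model in PRICES:
    cost = workload_cost(model, INPUT_TOKENS, OUTPUT_TOKENS)
    print(f"{model}: ${cost:.2f}")
```

Under these assumed token counts, the workload costs $2.00 with Navigator n1.5 and $8.00 with GPT-5.5.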

Get Started

Navigator n1.5 is available today through the Yutori API. Check out the official Yutori SDK for examples of how to integrate Navigator n1.5 into your agent loop and browser automation pipelines.

We're excited to see what you build with it.