Original Research

Can AI agents use German car configurators?

BMW, Mercedes, Porsche & VW, tested.

Hyperize

AI agents can read a car configurator's price only when the page is built to expose it. A customer opens ChatGPT and types: "Base Golf, no extras, what does it cost?" No browser, no clicking. The agent reads whatever the carmaker's HTML hands back, and answers from that. So we ran it for real: five kinds of AI agent, four German configurators, one job each, reach the price. One brand read itself out to every agent that ran. Two returned nothing to the simplest reader. One served the wrong car. The price an agent quotes your customer is only as good as the surface behind it. [S1] [S2]

Brands measured in this wave

BMW Mercedes-Benz Porsche Volkswagen

The test.

Five agent classes, one job per brand, one wave.

The five classes span the full spectrum a German car buyer's AI agent might run on: a plain reader doing an HTTP fetch with no JavaScript, a search assistant routing the question through an AI search engine, a coding agent driving Playwright over the Chrome DevTools Protocol, a full browser used by computer-use agents, and an autonomous operator that completes the booking or quote on its own. [S1] The autonomous-operator row is queued for the next enrichment pass and is marked pending in each brand's agent matrix.

One job per brand, anchored on the ground-truth prices captured in the Wave 4 measurement:

Brand	Model	Ground-truth price	Konfigurator steps
Volkswagen	Golf 8 Basis	€29,395	9
Mercedes-Benz	C 180 Limousine	€42,427.31	8
BMW	318i Limousine	€47,000	12
Porsche	911 Carrera	€136,300	10

The discovery probe is the same German sentence across the cluster, so the AI Visibility numbers read against each other on the same basis. [S5]

One task per brand, three AI providers, Confidence C. Enough to ground each surface on a clear job, not to judge a brand's whole catalogue. Every score below carries that one scope.

Which agent could read which car.

Before the four brand stories, the whole result on one grid. Read each column top to bottom: the simpler the agent, the fewer cars it can read.

Which agent class can read the configured price?

The three classes that extract a price directly, ordered simplest to most capable. Search-assistant discovery is scored separately as AI Visibility; the autonomous-operator pass is still pending.

Agent class	Porsche	Mercedes	BMW	Volkswagen
Plain reader (HTTP fetch, no JavaScript)	Yes	No	No	Partial¹
Coding agent (Playwright via CDP)	Yes	Not reported	No	Wrong trim²
Computer-use browser (full browser, clicks and forms)	Yes	Yes	1 of 3³	Yes

¹ Volkswagen serves T-Cross and T-Roc prices in raw HTML, but not the Golf 8 Basis the test asked for.

² The coding agent reached trim selection, then extracted the Golf 2025 at €33,465 instead of the Golf 8 Basis at €29,395.

³ BMW does not expose its accessibility API over CDP; two of three browser runs fail.

Source: Hyperize Fleet, 2026-03-29. Per-breed access profile across four brands.

Only Porsche is filled top to bottom. The outlined cells carry the subtler story: Volkswagen's coding agent navigated the configurator and still quoted the wrong car. Almost-right is the more dangerous failure. An agent that confidently returns €33,465 for a €29,395 Golf erodes more trust than one that admits it could not finish.
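The difference between a wrong answer and no answer can be made explicit as a three-way check. The sketch below is an illustration only, not Hyperize's scoring logic: the Outcome enum and the classify function are hypothetical names, and exact equality with the ground-truth price is an assumed matching rule.

```python
from decimal import Decimal
from enum import Enum
from typing import Optional


class Outcome(Enum):
    MATCH = "match"              # agent returned the ground-truth price
    WRONG_PRICE = "wrong price"  # agent returned a price for a different car or trim
    NO_PRICE = "no price"        # agent could not finish and said so


def classify(extracted: Optional[Decimal], ground_truth: Decimal) -> Outcome:
    """Compare an agent's extracted price with the ground truth (hypothetical helper)."""
    if extracted is None:
        return Outcome.NO_PRICE
    if extracted == ground_truth:
        return Outcome.MATCH
    return Outcome.WRONG_PRICE


# Golf 8 Basis figures taken from the article.
golf_ground_truth = Decimal("29395")
coding_agent_result = classify(Decimal("33465"), golf_ground_truth)
browser_result = classify(Decimal("29395"), golf_ground_truth)
plain_reader_result = classify(None, golf_ground_truth)
price_gap = Decimal("33465") - golf_ground_truth
```

Keeping WRONG_PRICE separate from NO_PRICE matters because only the first reaches the customer as a confident quote.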

Porsche: the most expensive car, the easiest to read.

The €136,300 911 Carrera is the cleanest agent surface in this wave. A plain reader fetches porsche.com and finds the formatted price 136.300,00 € in the server-rendered HTML beside the embedded JSON state. A coding agent extracts it from the rendered body in one regex pass. Three browser runs each completed a 10-step configuration with zero price delta from the ground truth. [S3]
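A one-pass extraction of this kind can be sketched in Python. This is a minimal illustration, not the fleet's actual code: the regex, the function name extract_prices, and the sample HTML fragment are all assumptions. It parses German-formatted euro amounts (dot as thousands separator, comma before the cents) and compares the result with the ground truth.

```python
import re
from decimal import Decimal

# German-formatted euro amount, e.g. "136.300,00 €" or "29.395 €".
# The lookbehind stops the pattern from starting in the middle of a longer number.
PRICE_RE = re.compile(r"(?<![\d.,])(\d{1,3}(?:\.\d{3})*)(?:,(\d{2}))?\s*€")


def extract_prices(html: str) -> list[Decimal]:
    """Return every German-formatted euro amount found in the text."""
    prices = []
    for whole, cents in PRICE_RE.findall(html):
        # Drop thousands dots and use a decimal point for the cents.
        normalized = whole.replace(".", "") + "." + (cents or "00")
        prices.append(Decimal(normalized))
    return prices


# Hypothetical fragment standing in for the server-rendered page.
sample_html = '<span class="price">136.300,00 €</span>'
ground_truth = Decimal("136300")

prices = extract_prices(sample_html)
delta = prices[0] - ground_truth  # zero when the page matches the ground truth
```

Decimal is used instead of float so that a cent-exact price such as €42,427.31 compares exactly.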

The most expensive German car in the wave is also the easiest one for an AI agent to actually buy.

The architecture choice, server-rendered React with the JSON state in the HTML, is exactly what a machine reader wants. No special agent-readable layer was added. The configurator itself is the agent surface, because the configurator itself is server-rendered.
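Reading JSON state out of server-rendered HTML needs nothing beyond a standard HTML parser. The sketch below assumes the state sits in a script tag of type application/json; the tag id, the key names, and the sample markup are hypothetical and do not describe Porsche's actual page.

```python
import json
from html.parser import HTMLParser


class JsonStateParser(HTMLParser):
    """Collect the parsed contents of <script type="application/json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_json_script = False
        self._chunks = []
        self.states = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/json":
            self._in_json_script = True
            self._chunks = []

    def handle_data(self, data):
        if self._in_json_script:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_script:
            self._in_json_script = False
            self.states.append(json.loads("".join(self._chunks)))


# Hypothetical server-rendered page with embedded state.
sample_html = (
    "<html><body><h1>911 Carrera</h1>"
    '<script type="application/json" id="state">'
    '{"model": "911 Carrera", "price": {"amount": 136300, "currency": "EUR"}}'
    "</script></body></html>"
)

parser = JsonStateParser()
parser.feed(sample_html)
state = parser.states[0]
```

No JavaScript runs at any point, which is why this path is open to the plain reader as well as to the more capable agent classes.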

Score: AI Visibility 43.9 · AI Usability ~50 · Composite 5.2 / 10.

Mercedes-Benz: read to the cent, then a door only humans open.

Mercedes-Benz was scored in an earlier wave and serves as the automotive anchor for the cluster. [S4]

A browser agent built the full C 180 Limousine configuration, read the exact price of €42,427.31, and reached the test-drive booking from the configurator. But a plain request never got a single byte: bot protection killed the connection before the page loaded. And the booking, once reached, is a form built only for humans.

The most capable agents get through. The simplest ones never see the car.

The bottleneck is not discovery. Mercedes is found and parsed cleanly on the unbranded probe. The bottleneck is the last mile: a transactional close built only for humans plus a bot wall that blocks the simplest agent classes before the page even loads.

Score: AI Visibility 43.61 · AI Usability ~34 · Composite 4.0 / 10. [S4]

BMW: the car only one agent in five could read.

BMW's 318i configurator is a beautifully isolated browser app, and that is exactly the problem.

A plain reader fetches bmw.de and gets a blank shell: only JavaScript bootstrap code and font URLs in the raw HTML. The coding agent loads the configurator, the page title says "Konfigurator", and document.body.innerText returns an empty string. BMW renders inside Shadow DOM web components (con-island-navigation, con-swatch, con-product-card), and no standard DOM query reaches them. The coding agent finds 227 interactive elements and reads none of them. [S3]
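For context, text inside open shadow roots can in principle be collected by walking each element's shadowRoot explicitly, since innerText does not cross that boundary. The sketch below is an assumption-laden illustration: it expects a Playwright-style page object with an evaluate method, the helper name read_shadow_text is hypothetical, and it was not run against BMW's configurator. Components attached in closed mode expose no shadowRoot and would stay unreadable.

```python
# JavaScript evaluated in the page: a recursive walk that descends into
# open shadow roots. Raw string so "\n" reaches the browser unchanged.
SHADOW_TEXT_JS = r"""
() => {
  const parts = [];
  const walk = (node) => {
    if (node.nodeType === Node.TEXT_NODE) {
      const text = node.textContent.trim();
      if (text) parts.push(text);
      return;
    }
    // Skip code and styles, which are not visible text.
    if (node.nodeName === "SCRIPT" || node.nodeName === "STYLE") return;
    // Open shadow roots are reachable here; closed ones return null.
    if (node.shadowRoot) walk(node.shadowRoot);
    for (const child of node.childNodes) walk(child);
  };
  walk(document.body);
  return parts.join("\n");
}
"""


def read_shadow_text(page) -> str:
    """Return visible text including open shadow roots.

    `page` is assumed to offer a Playwright-style evaluate(expression) method.
    """
    return page.evaluate(SHADOW_TEXT_JS)
```

With Playwright's sync API, a call would look like read_shadow_text(page) after page.goto(...) has loaded the configurator.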

Even the full browser succeeds only once in three runs. The other two runs fail because BMW's configurator does not expose its accessibility API over the Chrome DevTools Protocol, so an agent that depends on screen-reader semantics is locked out.

The €47,000 318i is configurable. Only one of five agent classes proves it.

Score: AI Visibility 37.6 · AI Usability ~23 · Composite 3.0 / 10.

Volkswagen: the right car, behind the wrong pointer.

Volkswagen's catalogue drifted between the ground-truth capture and the Wave 4 audit.

A plain reader fetches volkswagen.de and finds prices for T-Cross at €24,960 and T-Roc at €30,845 embedded server-side, but no Golf 8 Basis price at all. A coding agent navigates the configurator, clicks the Golf tile, reaches trim selection, and extracts €33,465. That is the new Golf 2025, not the Golf 8 Basis the test asked for. [S3]

Only the full browser, going deep, finds the legacy Golf 8 Basis configurator intact and pulls €29,395 with the full configuration verified: Uranograu uni, 1.5 TSI OPF 116 PS, 15-inch Stahlräder. Three of three runs, exact match.

The car still exists. The discovery surface no longer points at it.

This is a catalogue-drift story rather than a broken surface. The brand is correct. The trim is wrong.

Score: AI Visibility 49.7 · AI Usability ~34 · Composite 4.1 / 10.

Not all agents are browser agents. Most aren't.

The grid tests the agents that drive a configurator. But the loud version of the AI-agent story, a browser clicking through a site, is the smallest slice of the traffic. Three kinds of agent are reading German car brands right now.

1. Text and search agents: the mass market. ChatGPT answering a question, Perplexity composing a result, Google's AI overviews. No browser. They read the HTML the server hands back, nothing more. Two of the four configurators in this test returned nothing to that reader.
2. Browser agents: the power users. ChatGPT's agent mode, Perplexity Comet, Claude computer use. They open a real browser, click, and fill forms. Lower volume, but they finish whole tasks: configure, read the price, reach the booking.
3. Structured-tool agents: next. Emerging web standards let a site declare what it can do instead of making an agent scrape for it. The brands already serving clean, server-rendered data are the ones that will plug in first. Empty shells will not.

The fix for the mass-market reader is the same fix that prepares a brand for the structured-tool wave: put the price and the path in server-rendered HTML. Porsche already does. Volkswagen does for some models (the T-Cross and T-Roc), though not for the Golf 8 Basis tested here. A missing or human-only next step blocks every wave equally.
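One possible way to put a price and a path into server-rendered HTML is a schema.org Product and Offer block in JSON-LD. The sketch below is an assumption, not something the measured brands are known to ship and not a method this report prescribes: the function name price_jsonld and the example.com URL are hypothetical.

```python
import json


def price_jsonld(name: str, price: str, currency: str, url: str) -> str:
    """Build a JSON-LD script tag describing one configured car (hypothetical helper)."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "offers": {
            "@type": "Offer",
            "price": price,             # the configured price
            "priceCurrency": currency,  # ISO 4217 code
            "url": url,                 # the path to the next step
        },
    }
    # Escape "</" so the payload cannot close the script tag early.
    payload = json.dumps(data).replace("</", "<\\/")
    return '<script type="application/ld+json">' + payload + "</script>"


snippet = price_jsonld(
    "Golf 8 Basis",
    "29395.00",
    "EUR",
    "https://example.com/configurator/golf-8-basis",  # placeholder URL
)
```

Rendered on the server, a block like this reaches an agent that only fetches HTML, with no JavaScript executed.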

Visible, not yet usable: what the four tell us.

Place the four on the two axes that build the score, how well an agent finds the brand against how far it gets once there, and the pattern is hard to miss. Everyone is somewhat visible. Nobody is yet usable.

Visible, not yet usable.

Each brand on the two axes of the Agent Success Score: AI Visibility (can an agent find it) against AI Usability (how far it gets once there). The top-right zone, found and fully usable, is empty.

Source: Hyperize Wave 4 (Q2 2026). AI Visibility audit-derived (Cody Gate-1); AI Usability derived from the fleet access profile. Composite scored 0 to 10.

The same job, four surfaces:

Porsche built the surface most others aspire to: SSR React with JSON state in the HTML. Architecturally generous to agents.
Mercedes-Benz built a configurator that browsers parse cleanly, then placed the next step behind a human-only form and a bot wall. Generous to capable agents, exclusive to simple ones.
BMW built the configurator entirely inside Shadow DOM web components. Generous to humans, opaque to almost every agent class.
Volkswagen has a healthy server-rendered surface, but the discovery layer points at a different model than the deep configurator serves. Generous to depth, misleading at the top.

The shared lesson is not that German cars are unbuyable by AI agents. Three of the four have a working configurator path for at least one agent class, and one of the four is reference-grade across every breed that ran.

Every one of these surfaces was a build decision. The empty corner of the chart is the one no brand has chosen yet.

The brand selling the most exclusive cars built the most inclusive surface. The brand with the strongest legacy reach built the most exclusive surface. The brand with the broadest middle-market reach has a healthy surface and a stale pointer.

A price in the server-rendered HTML and one next step that is not a human-only form are what separate an agent that reads your car from one that guesses, or gives up. In this wave, Porsche put its price in server-rendered HTML and every agent that ran read it, no special agent layer added.

What comes next

Wave 4 closes with four DAX automakers scored against the same job. The next automotive pass will run the act phase to fill the autonomous-operator row for BMW, Porsche, and Volkswagen, so that the full breed spectrum is captured for each. It will also measure used-car intent as a separate task, addressing the channel-displacement story that this measurement deliberately set aside.

Sources

Evidence and provenance.

[S1]

Automotive ground-truth wave (Wave 4)

Hyperize Internal — Fleet · 2026-03-29 · Internal

Ground-truth prices (€47,000 BMW 318i, €42,427.31 Mercedes C 180, €136,300 Porsche 911, €29,395 VW Golf 8 Basis), model trims, konfigurator step counts.

[S2]

Wave 4 Gate-1 audit (BMW + Porsche + Volkswagen)

Hyperize Internal — Audit · 2026-05-24 · Internal

AI Visibility scores per brand (18 datapoints, 3 providers openai/perplexity/anthropic, DE language, single config-ready task per brand).

[S3]

Per-brand phase 2 + phase 3 results (Wave 4)

Hyperize Internal — Fleet · 2026-03-29 · Internal

Per-breed access profile (text/code/browser observations), Shadow DOM finding (BMW Custom Web Components, accessibility API unavailable), SSR React + embedded JSON state finding (Porsche), catalogue-drift finding (VW Golf 2025 supersedes Golf 8 Basis in discovery layer).

[S4]

Mercedes-Benz BrandScore — DAX 40 Agent Success Index

Hyperize · 2026-05-22 · External

Mercedes-Benz Wave 3 re-audit data points used as the automotive anchor (D 43.61 unbranded informational basis, AI Usability ~34, Composite 4.0, Confidence C).

https://www.hyperize.ai/en/dax40-index/brands/mercedes-benz

[S5]

Task Selection Doctrine — Hyperize Methodology

Hyperize · 2026-05-18 · External

Fairness declaration (why one task per brand), the branding rule (informational unbranded / comparative branded / transactional branded), the Third-Party Interception framing.

https://www.hyperize.ai/en/methodology/task-selection

Last updated · 2026-05-24
Next review · 2026-08-24
Tier · B (audit-derived + ground-truth)
Confidence · C (single task per brand, 3-provider track)
Wave · Q2 2026 · Automotive cluster