Can AI agents build a production website?

Yes, and it is measured: hyperize.ai and minglabs.com are live, built by AI agents with one human making the decisions. 17 days from first commit to launch, Lighthouse mobile performance 95, and AI engines began citing the pages eight days after launch.

What did it cost to build two websites with AI agents?

About €1,300 in design-tool spend for both sites, plus flat agent subscriptions, plus one founder's decision time. The structural saving is the team that was never hired: no design team, no dev team, no CMS license.

Why not ship the React app the design tool generates?

Because AI crawlers fetch HTML and do not execute JavaScript. A client-side React app hands them an empty shell. We treat the design tool's code as a living specification and port it into server-rendered HTML, where every page is complete in the first response.

How do you keep quality without a dev team?

Quality is enforced by gates, not by review meetings: 12 page types with machine-readable rules, 34 automated checks per page, and a content gate on every push. A page ships only when the gate exits 0. One human holds taste, approvals, and the veto.

Do AI engines actually cite an agent-built website?

Yes, with the measurement frame declared: eight days after launch, a logged-out ChatGPT with no account returned the Hyperize index as its first source for a plain buyer question, above two arXiv benchmarks. Queries carrying our published vocabulary were cited the same day by ChatGPT above Wikipedia and six times inline by Perplexity.

How one founder and AI agents built hyperize.ai and minglabs.com in 17 days

What this article covers

The full build pipeline behind hyperize.ai and minglabs.com: the design sandbox, the port into agent-readable HTML, the gate system that replaced code review, the operating model, the receipts, and the one thing that silently failed.

The bet: the website is the product.

Hyperize sells one thing: making brands findable and usable for AI systems. A website that fails its own methodology would refute the product on contact.

So we set the bar accordingly. Every page on hyperize.ai passes the same agent-readability standard we sell, including the homepage, the legal pages, and this article. There is no marketing section that gets a pass.

The website is not the brochure for the product. The website is the product.

That bet shaped every build decision that follows. It also raised the stakes beyond Hyperize: minglabs.com, the agency site, shipped through the same pipeline and went live first, on May 28 [S1]. Two sites, one process, repeated. A process that works once is an anecdote. A process that works twice is a pipeline.

Design in a sandbox, production in a pipeline.

The design never touched production, and production never touched the design tool. That separation is the first load-bearing decision.

Design happened in Lovable, a prompt-based design tool. One founder, a designer by trade, iterated visually until the layout held: brand tokens, typography, 15 homepage sections, every state. Hours per iteration, not sprints. The output is a React codebase, and we treat it as a drawing, not as software.

From there the port is mechanical, not creative. A build agent, Claude Code, reads the design source and reproduces it 1:1 in the production stack, component by component, animation by animation. The design repo stays read-only for production. Interpretation is banned, because interpretation is where agents invent errors nobody ordered.

Note what is not the advantage here: the tools. Lovable is public, Astro is open source, the agents run on subscriptions anyone can buy. And anyone who has prompted a design tool for an afternoon knows its ceiling: results drift, consistency breaks, every session starts from zero. What removed that ceiling was not a better tool.

The cost structure is the quiet headline. Design-tool spend for both websites: about €1,300. The build agents ran on flat subscriptions [S1]. What never appeared in the budget: a design team, a dev team, or an agency retainer.

The prettiest page an AI cannot read.

The design tool's output had one disqualifying property: AI systems see it as an empty page.

Lovable ships a client-side React app. The content assembles in the browser, after JavaScript runs. Fetch the raw HTML, the way most AI crawlers do, and you get a shell. Vercel's crawler study measured exactly this: GPTBot, ClaudeBot and PerplexityBot fetch HTML and do not execute JavaScript [S5]. Our own index keeps finding the same failure on brand sites: a coding agent reads BMW's configurator page body as an empty string, because the content hides in Shadow DOM components [S4].

The design tool's output, fetched raw

“<div id="root"></div>. The content arrives only after JavaScript runs. To a crawler that fetches and leaves, the page is empty.”

The production page, fetched raw

“Every headline, fact, source and schema entity sits in the first HTML response. No JavaScript required to read any of it.”

That is why production is Astro: server-rendered, plain HTML per route, complete in the first response. Each page also ships a machine-readable JSON twin and an entry in llms.txt, so an agent can read the site without parsing the human layer at all.

If a crawler cannot read your homepage, your homepage does not exist. The stack choice was never a tech preference. It was the strategy.

Quality is an exit code, not an opinion.

Between agents, "looks good to me" is not a handoff signal. Pass or fail is.

We defined 12 page types, each with machine-readable rules: what a page of that type must contain, what it may contain, and what is forbidden on it. 34 automated checks run against every page, covering canonicals, schema, language alternates, source discipline and share cards. A page counts as done if and only if the gate exits 0 [S6]. A second gate checks content quality on every push and blocks what fails.

The human role concentrated where it earns the most: decisions. One person sets taste, approves direction, and holds the veto. The shared memory lives in three documents, a vision, a backlog, and a decision log that has ratified 26 decisions so far [S6]. What used to live in Jira tickets and in people's heads lives in files every agent reads before it works.

An agent cannot be embarrassed into quality. A gate can refuse to pass it.

Nothing about this is trust in machines. It is the opposite: the system assumes agents drift, and makes drift fail loudly before it ships.

What this article does not contain

This article names every tool and every count: 12 page types, 34 automated checks, 26 ratified decisions. What the counts stand on is the working asset: the rules, the check definitions, and the editorial gates encode a year of measuring what AI engines actually retrieve and cite. They are what makes the pipeline converge instead of drift. The tools are downloadable. The rule layer is the product.

Live in 17 days, cited in 8 more.

May 15: first commit of the production repo. June 1: hyperize.ai live on the new domain, 61 pages in the launch sitemap [S1]. Lighthouse rates the mobile build at 95 performance, 2.0 seconds to largest paint, zero layout shift; run it through PageSpeed yourself [S7].

Then the part we could not gate: would AI engines cite a two-week-old domain? On June 9, the day Google indexed the page, ChatGPT cited our Siemens Energy analysis above Wikipedia, and Perplexity quoted it six times inline. Those queries carried vocabulary from our own published analysis, so we log them as vocabulary-led wins, not organic ones [S3]. The control run is the one that matters: a logged-out ChatGPT, no account, asked a plain buyer question, returned the Hyperize index as its first source, above two arXiv benchmarks [S2].

ChatGPT, logged out, June 9

“One example is Hyperize's DAX 40 Agent Success Index, which publishes scores for individual DAX companies such as SAP and BMW. The methodology combines AI visibility with agent task completion on company websites.”

And one confession, because a making-of without the failure is marketing. For four days after launch, the homepage lead form validated input, showed a polished success message, and sent nothing [S8]. The 1:1 port had faithfully replicated a demo form, fake confirmation included. Nobody saw it, because it looked perfect. Every lead in that window was lost.

The fix took an evening. The lesson became doctrine: a 1:1 port replicates bugs as faithfully as it replicates design, so every interactive element now gets tested for what it transmits, not for how it looks.

A website is not a project anymore.

The build taught us a category change. A website used to be a project: scoped, staffed, shipped, then left to age until the next redesign. What we run now is a system: measured, gated, and regenerated as the measurements come in.

The economics moved with it. For this build: no agency project, no CMS license, no maintenance team. The remaining inputs are decisions and compute. Across 26 ratified decisions, the scarce input was judgment: which layout holds, which claim ships.

The visitors are changing too. The DAX 40 index we publish measures what AI agents experience on Europe's biggest brand sites, and the recurring finding is that the sites were built for the previous generation of visitors [S4]. Agents are not a future audience. They are reading your site now, and they cannot act on what they cannot read.

AI agents built our websites. The decisions stayed human. That division of labor, not the tools, is what we would carry into any build that follows.

AI agents built our websites. One human made the decisions.

The bet: the website is the product.

Design in a sandbox, production in a pipeline.

The prettiest page an AI cannot read.

Quality is an exit code, not an opinion.

Live in 17 days, cited in 8 more.

A website is not a project anymore.

Check the receipts. Then check your own site.

Read the discipline. Browse the receipts. Engage.

Agent Surface.

DAX 40 Agent Success Index.

Founding Program.

Frequently asked questions

Can AI agents build a production website?

What did it cost to build two websites with AI agents?

Why not ship the React app the design tool generates?

How do you keep quality without a dev team?

Do AI engines actually cite an agent-built website?

Evidence and provenance.