Case Study

Agentic Shopper

Autonomous AI agent that shops e-commerce stores and reports what's broken

2025·E-Commerce·AI Agent
The Problem

Online stores break in ways nobody notices.

Third-party apps and widgets update silently, and when they do, they can break parts of your store without anyone knowing: a reviews widget adds 3 seconds to checkout, the cookie consent banner starts reappearing on every page because the add-on pushed a config change, and a geolocation popup covers the search bar on Android but looks fine on desktop.

Silent breakage: Button blocked by sticky header
[Illustration: a "FREE SHIPPING OVER $50" bar obscuring 40% of the Add to Cart button on mobile]
A theme update pushes a promotional bar over the Add to Cart button — but only on mobile. The button is there, it's just not clickable. Desktop QA passes. Analytics show nothing. Customers silently leave.

Third-party update: Cookie banner keeps reappearing
[Illustration: a "We use cookies" consent dialog with Accept and Decline buttons that reappears on every page transition]
A cookie consent add-on pushes an update that breaks session persistence. The consent dialog now fires on every single page transition — six times during a single purchase flow. The add-on changelog mentions nothing.

Dev/prod gap: Checkout loads in 4+ seconds on mobile
[Illustration: 4.2s LCP under 4G throttling]
Pages that feel instant on desktop broadband take over 4 seconds on a real 4G mobile connection. The dev team never tested under network throttling. Layout issues that were solved locally resurface under real-world conditions.

It gets worse when code that worked perfectly in development hits production. A theme update shifts the Add to Cart button behind a sticky header — but only on mobile. Layout issues that the dev team solved locally resurface under real-world conditions: different devices, network throttling, CDN caching, third-party scripts loading in a different order.

These aren't theoretical — they're things we found the moment we ran the system against production stores.

The standard industry answers are synthetic monitoring (Datadog, Pingdom, New Relic), which tells you whether an endpoint is up and how fast it responds, and session replay tools (FullStory, Contentsquare), which show you what happened to real users after the fact. Neither actually walks through a purchase journey on your live store every day and tells you concretely what went wrong and where.

No tool on the market actually walks through your store like a customer, every day, on real devices, and catches the broken Add to Cart button before your customers do. That gap is what we went after.
What We Built

An AI agent that shops like a customer.

We built an agent that navigates e-commerce sites the way a customer would — browsing products, adding to cart, filling checkout forms — and evaluates the experience at every step along the way.

The AI agent drives a headless Chromium instance with realistic device profiles to simulate the experience of a real user. For example, the mobile profile gets a Pixel 5 viewport with 4G network throttling and CPU constraints, while the desktop profile runs Chrome on Windows at broadband speeds.
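
The two device profiles could be described with a small configuration structure like the one below. This is a minimal sketch: the class, the field names, and every number (viewport size, bandwidth, latency, CPU slowdown) are illustrative assumptions rather than the values the system actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceProfile:
    name: str
    width: int                  # viewport width in CSS pixels
    height: int                 # viewport height in CSS pixels
    device_scale_factor: float
    is_mobile: bool
    download_kbps: int          # throttled downstream bandwidth
    latency_ms: int             # added round-trip latency
    cpu_slowdown: float         # 1.0 means no CPU throttling


# Every number below is an assumption chosen for illustration.
MOBILE = DeviceProfile("pixel5-4g", 393, 727, 2.75, True, 9_000, 170, 4.0)
DESKTOP = DeviceProfile("windows-chrome-broadband", 1920, 1080, 1.0, False, 100_000, 20, 1.0)


def transfer_estimate_ms(profile: DeviceProfile, payload_kb: float) -> float:
    """Rough fetch time for payload_kb kilobytes: one round trip plus transfer."""
    return profile.latency_ms + payload_kb * 8 / profile.download_kbps * 1000
```

Under these assumed numbers, the same 900 KB payload takes roughly ten times longer on the mobile profile than on desktop, which is the kind of gap that stays hidden when testing only on broadband.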

The navigation is agentic. The system uses computer vision and DOM analysis to identify interactive elements — search inputs, product cards, Add to Cart buttons, form fields — and decides how to interact with them. Platform-specific knowledge (Shopify, WooCommerce) provides structural hints, but the agent handles actual page interaction through dynamic element resolution rather than brittle CSS paths. When a Shopify theme renames its classes after an update, the agent adapts because it resolves elements by intent — find the primary call-to-action, find the checkout button — not by hardcoded selectors.
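
One way to picture intent-based resolution is a scorer that ranks candidate elements by what they say and how they look, and never consults class names. The sketch below is an assumption-laden simplification: `Candidate`, `INTENT_KEYWORDS`, and the scoring weights are hypothetical, and the real system also uses computer vision, which is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    tag: str
    text: str
    aria_label: str
    width: float
    height: float
    visible: bool


# Hypothetical keyword lists; a real agent would need far broader coverage.
INTENT_KEYWORDS = {
    "add_to_cart": ("add to cart", "add to bag", "add to basket"),
    "checkout": ("checkout", "check out"),
}


def score(candidate: Candidate, intent: str) -> float:
    """Return 0 for non-matches; otherwise rank by tag and on-screen size."""
    if not candidate.visible or candidate.width * candidate.height == 0:
        return 0.0
    label = f"{candidate.text} {candidate.aria_label}".lower()
    if not any(keyword in label for keyword in INTENT_KEYWORDS[intent]):
        return 0.0
    total = 1.0
    if candidate.tag in ("button", "a", "input"):
        total += 0.5  # natively interactive elements are preferred
    total += min(candidate.width * candidate.height / 20_000, 0.5)  # larger CTAs rank higher
    return total


def resolve(candidates: list, intent: str):
    """Pick the best-scoring candidate for an intent, or None if nothing matches."""
    best = max(candidates, key=lambda c: score(c, intent), default=None)
    if best is None or score(best, intent) == 0.0:
        return None
    return best
```

Because nothing in the scorer depends on CSS classes, a theme update that renames them leaves the result unchanged as long as the visible label and role stay recognizable.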

Every run executes full purchase flows (search → product page → cart → checkout) on both device profiles and produces a full evidence package: step-by-step screenshots, video recording, HAR traces, console logs, and a structured issue report. Blockers trigger instant email alerts.

How the Agent Works

Goal: "Find and purchase a pair of black boots"
Observe (vision + DOM analysis): "I see a search bar, a cookie banner, and a navigation menu"
Decide (goal-driven reasoning): "The cookie banner is blocking the search. Dismiss it first, then type the query"
Act (browser interaction): "Dismissing cookie banner... Typing 'black boots' into search bar..."

But at any point, the agent may encounter:

Popup covers the page: "This is a dismissable overlay, not part of the purchase flow. I should close it and continue." → Dismiss and continue
Button is hidden: "The Add to Cart button is 60% obscured by a sticky header. I cannot click it reliably." → Screenshot, flag as blocker
Page loads too slowly: "LCP is 4.8s and CLS is 0.4, both above acceptable thresholds for this step." → Log metrics, flag performance issue

↻ Repeat until goal is complete
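
The observe, decide, act cycle above can be sketched as a small loop. In this sketch, scripted `Observation` objects stand in for what the browser would report, and the 50% obstruction and 4000 ms LCP cut-offs are assumptions made for illustration; none of the names come from the actual system.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Observation:
    overlay_present: bool
    cta_obscured_fraction: float
    lcp_ms: float
    goal_reached: bool = False


@dataclass
class RunLog:
    actions: list = field(default_factory=list)
    issues: list = field(default_factory=list)


def decide(obs: Observation) -> str:
    """Choose the next action from the current observation."""
    if obs.overlay_present:
        return "dismiss_overlay"
    if obs.cta_obscured_fraction >= 0.5:   # assumed blocker cut-off
        return "flag_blocker"
    return "advance"


def run(observations, max_steps: int = 20) -> RunLog:
    """Loop over observations until the goal is reached or a blocker stops the flow."""
    log = RunLog()
    for step, obs in enumerate(observations):
        if step >= max_steps:
            break
        if obs.lcp_ms > 4000:              # assumed performance cut-off
            log.issues.append(("performance", step))
        action = decide(obs)
        log.actions.append(action)
        if action == "flag_blocker":
            log.issues.append(("blocker", step))
            break
        if obs.goal_reached:
            break
    return log
```

Feeding it an overlay, then a slow page, then an obscured button yields the actions dismiss_overlay, advance, flag_blocker, with a performance issue logged at the second step and a blocker at the third.
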
The Detection Layer

Identifying what's wrong.

Driving and navigating a site like a human is one part of the story; identifying what's wrong is the other. We built a set of detectors that evaluate the experience at every interaction point during a flow:

CTA visibility and obstruction

Before every click, the agent verifies the target is visible, has real dimensions, and isn't covered by fixed or sticky elements. A header sitting on top of 60% of the Add to Cart button on mobile — that gets flagged with a screenshot showing the exact overlap.

Dead click detection

After every click, we classify what happened: page navigation, SPA route change, DOM mutation, or nothing. "Nothing" means the click had no observable effect. The cause could be a broken handler, an intercepted event, or an element that looks clickable but isn't wired up; all of these are nearly impossible to catch from server logs.
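
The four outcomes can be told apart by comparing a snapshot taken before the click with one taken after. The sketch below assumes the agent can record a per-document identifier that changes on a full page load, plus a count of DOM mutations since the click; both fields are hypothetical stand-ins for whatever signals the real system collects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    url: str
    document_id: str    # hypothetical: changes only when a new document loads
    dom_mutations: int  # mutations observed since the click


def classify_click(before: Snapshot, after: Snapshot) -> str:
    """Classify a click by the strongest observable effect, checked in order."""
    if after.document_id != before.document_id:
        return "navigation"
    if after.url != before.url:
        return "spa_route"
    if after.dom_mutations > 0:
        return "dom_mutation"
    return "dead_click"
```

The order matters: a full navigation also changes the URL and the DOM, so the checks run from the strongest signal down to the weakest.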

Per-step performance

LCP, CLS, INP, FCP, TTFB, long task count — measured at each step. Your site might be "fast" on average, but if search loads in 1.2s and checkout takes 4.8s, that matters.
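
To show why per-step numbers matter more than an average, the sketch below rates each step's metrics separately. The cut-offs follow the publicly documented Web Vitals "good" and "poor" boundaries; whether the system uses these exact thresholds is an assumption, and the function names are hypothetical.

```python
# (good_upper_bound, poor_lower_bound) per metric, per the public Web Vitals guidance.
THRESHOLDS = {
    "lcp_ms": (2500, 4000),
    "cls": (0.1, 0.25),
    "inp_ms": (200, 500),
    "fcp_ms": (1800, 3000),
    "ttfb_ms": (800, 1800),
}


def rate(metric: str, value: float) -> str:
    """Rate one metric value as good, needs-improvement, or poor."""
    good, poor = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= poor:
        return "needs-improvement"
    return "poor"


def rate_steps(steps: dict) -> dict:
    """Rate every metric of every step, keeping steps separate."""
    return {
        step: {metric: rate(metric, value) for metric, value in metrics.items()}
        for step, metrics in steps.items()
    }
```

For instance, a search step with LCP 1200 ms rates as good while a checkout step with LCP 4800 ms rates as poor, even though their average of 3000 ms would only read as needing improvement.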

Layout shift at interaction time

CLS measured across each user action, not just initial page load. Product added to cart causes a 0.3 shift? We show exactly which step triggered it.
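
Attributing layout shift to a step comes down to bucketing shift events by the action that preceded them. The sketch below is a simplification under stated assumptions: it sums raw shift values per step, whereas the standard CLS metric groups shifts into session windows, and the input shapes are hypothetical.

```python
from bisect import bisect_right


def shifts_per_step(step_starts: list, shifts: list) -> dict:
    """Sum layout-shift values per step.

    step_starts: (start_ms, step_name) pairs sorted by start time.
    shifts: (time_ms, shift_value) pairs in any order.
    """
    starts = [start for start, _ in step_starts]
    totals = {name: 0.0 for _, name in step_starts}
    for time_ms, value in shifts:
        index = bisect_right(starts, time_ms) - 1  # last step started at or before the shift
        if index >= 0:
            totals[step_starts[index][1]] += value
    return totals
```

With steps starting at 0 ms (open_pdp) and 5000 ms (add_to_cart), shifts recorded at 5300 ms and 5600 ms are both charged to add_to_cart, which is how a report can name the step that triggered the movement.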

Overlay friction

The system handles known overlays (cookies, geo banners, newsletter popups, chat widgets) automatically but tracks cumulative time consumed. A cookie dialog reappearing on every navigation? That shows up as measurable friction.
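
Cumulative overlay friction can be tracked by logging each dismissal and aggregating per overlay type. This is a minimal sketch; `OverlayEvent`, the field names, and the rule that two or more dismissals count as recurring are assumptions for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class OverlayEvent:
    kind: str         # e.g. "cookie", "geo", "newsletter", "chat"
    step: str         # flow step where the overlay appeared
    handling_ms: int  # time spent detecting and dismissing it


def friction_report(events: list, repeat_threshold: int = 2) -> dict:
    """Aggregate dismissal count and time per overlay kind."""
    counts = defaultdict(int)
    total_ms = defaultdict(int)
    for event in events:
        counts[event.kind] += 1
        total_ms[event.kind] += event.handling_ms
    return {
        kind: {
            "dismissals": counts[kind],
            "total_ms": total_ms[kind],
            "recurring": counts[kind] >= repeat_threshold,  # assumed rule
        }
        for kind in counts
    }
```

A cookie dialog dismissed on six consecutive steps would surface here as one recurring overlay with its total handling time, rather than six unrelated events.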

JS and network errors

Console errors, failed requests, 4xx/5xx responses: all captured throughout the flow. A checkout with 40 JS errors and 12 failed API calls is a blocker; a single console warning is not.

Challenges

Things that were harder than expected.

Figuring out when a page is "done"
On modern SPAs, there's no clean signal for "the page finished loading", so we built a multi-signal classifier: URL change detection, a MutationObserver to detect DOM quiet, network tracking via patched fetch/XHR, and requestAnimationFrame for paint confirmation. Each post-action state gets classified (navigation, SPA route, DOM mutation, idle) and settled with a different strategy.
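
The settling logic might combine those signals roughly as follows. This sketch works on sampled signal values rather than a live browser, and the per-outcome quiet windows, the timeout, and all names are assumptions, not the system's actual tuning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signals:
    now_ms: float
    last_mutation_ms: float          # from a DOM mutation observer
    last_network_activity_ms: float  # from patched fetch/XHR tracking
    inflight_requests: int
    painted_since_action: bool       # confirmed via an animation frame


# Assumed quiet window per classified outcome.
QUIET_MS = {"navigation": 500, "spa_route": 400, "dom_mutation": 200, "idle": 0}


def is_settled(outcome: str, s: Signals) -> bool:
    """A page is settled when nothing is in flight and all signals have been quiet."""
    if s.inflight_requests > 0:
        return False
    if outcome != "idle" and not s.painted_since_action:
        return False  # an idle outcome produces no paint to wait for
    quiet = QUIET_MS[outcome]
    return (s.now_ms - s.last_mutation_ms >= quiet
            and s.now_ms - s.last_network_activity_ms >= quiet)


def wait_until_settled(outcome: str, samples, timeout_ms: float = 10_000):
    """Return the timestamp at which the page settled, or None on timeout."""
    start = None
    for s in samples:
        if start is None:
            start = s.now_ms
        if is_settled(outcome, s):
            return s.now_ms
        if s.now_ms - start > timeout_ms:
            return None
    return None
```

Requiring every signal to agree trades a little speed for fewer false "done" verdicts, and the timeout keeps a page that never goes quiet from stalling the run.
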
Dev-prod parity
Production stores have bot protection (Cloudflare challenges, fingerprinting, rate limiting) that adds delays or blocks requests from cloud IPs entirely. We built device emulation and browser fingerprint alignment to reduce detection, but this remains one of the harder operational challenges of running autonomous agents from cloud infrastructure.
Detecting invisible obstructions
Knowing a button exists on the page isn't enough — you need to know if it's actually clickable. A sticky header might cover 40% of the Add to Cart button. A semi-transparent promotional overlay might block clicks but still let you see the element underneath. A navigation bar with pointer-events: none looks like it's blocking but isn't. We built a grid-sampling system that probes a matrix of points across each target element, traces through the DOM to find fixed or sticky ancestors at each point, computes effective opacity through the full ancestor chain, and classifies the obstruction as hit-blocking or visual-only — with different severity thresholds for each.
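
The grid-sampling idea can be illustrated with plain rectangles. In a browser, each probe would likely rely on something like `document.elementFromPoint` plus computed styles; here `Layer` objects stand in for fixed or sticky elements, and the grid size and opacity cut-off are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    w: float
    h: float

    def contains(self, px: float, py: float) -> bool:
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h


@dataclass(frozen=True)
class Layer:
    rect: Rect
    opacity: float = 1.0         # effective opacity through the ancestor chain
    pointer_events: bool = True  # False models pointer-events: none


def probe(target: Rect, layers: list, cols: int = 5, rows: int = 5,
          min_opacity: float = 0.1) -> dict:
    """Sample a cols x rows grid of cell centers and classify each point."""
    hit_blocked = visual_only = 0
    for row in range(rows):
        for col in range(cols):
            px = target.x + (col + 0.5) * target.w / cols
            py = target.y + (row + 0.5) * target.h / rows
            covering = [layer for layer in layers if layer.rect.contains(px, py)]
            if any(layer.pointer_events for layer in covering):
                hit_blocked += 1   # a click at this point would land on the overlay
            elif any(layer.opacity >= min_opacity for layer in covering):
                visual_only += 1   # visible cover, but clicks pass through
    total = cols * rows
    return {"hit_blocked": hit_blocked / total, "visual_only": visual_only / total}
```

With a 200x50 button whose top 20 pixels sit under a header, two of the five sample rows are covered, so the probe reports 40% hit-blocked; the same header with pointer events disabled reports 40% visual-only instead.
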
Results

What it found.

The agent runs against live stores, given only a basic purchase goal.

On the first run against a mid-size fashion retailer, it flagged a sticky header covering 40% of the Add to Cart button on mobile, a cookie banner re-triggering on every page navigation, and checkout load times above 4 seconds under 4G conditions. None of this was visible in their analytics.

report.mirrora.ai / fashion-retailer / 2025-11-12
Agentic Shopper Report
Fashion Retailer · 2025/11/12
✗ Flow Blocked · Issues: 3 (BLOCKER 1, MAJOR 1, MINOR 1)
⚠️ Issues

CTA blocked by sticky header

BLOCKER
Add to Cart button 40% obscured by sticky header on mobile.
Flow: search to cart · Step: add_to_cart · Mobile

Cookie banner re-triggering

MAJOR
Cookie consent reappears every page transition. Dismissed 6 times.
Flow: search to cart · Overlay friction · Mobile

Checkout loading extremely slow

MINOR
LCP 4,200ms under 4G. Page feels stuck during checkout.
Flow: cart to checkout · Step: go_checkout · Mobile
📊 Execution Summary

Step             Time    LCP (ms)   CLS
Search → Cart (38s)
  open_home      6.6s    1508       0.00
  submit_search  8.3s    3912       0.40
  open_pdp       5.0s    904        0.00
  add_to_cart    4.0s    -          0.00
Cart → Checkout (33s)
  go_checkout    8.1s    4200       0.01
  fill_email     2.3s    -          0.00
