Walkthrough

Your crew, one shop: how an AI workforce actually runs your store week-by-week.

The Day-1-to-Month-3 arc of an AI workforce on one real shop — every worker, every cross-team handoff, every refusal that earned its place in the team.

Kamil Buczek · 26 May 2026 · 15 min read

Meet Anna and her jewelry shop.

Anna runs a handmade silver jewelry shop on PrestaShop 8: fifty SKUs, around two hundred orders a month, with listings on Etsy and Amazon Handmade as well. She is solo. This post follows her first three months with a Crewmerce AI workforce.

Anna is the whole team. Daily, the friction is the same as it is for thousands of small ecommerce operators: roughly thirty customer messages across three languages, five to eight stuck orders a week, photos that need a refresh for marketplace specs she has trouble keeping up with, product descriptions she keeps meaning to translate, and an inbox that piles up because she is also the one packing parcels. SEO and ads sit at the bottom of the list because they are the work she never gets to. This is the kind of shop where the gap between "useful AI help" and "yet another tool to manage" is the difference between actually shipping more next month and adding another tab to your browser. The walkthrough below is how Crewmerce reads on a shop like Anna's, not on a hypothetical Fortune 500.

Day 1: Onboarding Specialist reads your shop and your voice.

Anna pastes her shop URL. Onboarding Specialist fingerprints PrestaShop 8 from the URL alone, walks her through the API key setup with the exact admin screen referenced, validates the connection with six probes, and reads her public pages to learn her brand voice. The whole flow takes about eight minutes.

Onboarding Specialist starts from one signal, the URL, and uses HTTP headers, embedded JavaScript, admin path patterns, and the favicon hash to recognise PrestaShop 8.1.x with high confidence. It walks Anna to the exact screen in her PrestaShop admin ("Advanced Parameters → Webservice") and tells her which six resource permissions to enable. She pastes the key. The validation pass runs in front of her: products read, orders read, customers read, webhooks (absent on PrestaShop 8, so polling only), shop metadata, image upload. In parallel, and silently, the brand research pass reads her homepage, about page, shipping and returns pages, and a handful of sample product pages. It produces about six proposals with confidence in the 0.3-0.7 range: her tone reads as "warm, poetic, but concrete", her positioning is "thoughtful artisan, minimalist aesthetic", and her category vocabulary uses "crafted" and "silver-forged", not "luxury" or "premium". Anna confirms four, corrects one, skips one. Those confirmed proposals become the brand voice memory every other worker reads before drafting anything.

Day 1, hour 2: Customer Care sends its first reply in your voice.

Onboarding suggests Customer Care or Order Manager as a first hire. Anna picks Customer Care. Within minutes, the worker is reading her inbox in three languages and drafting the first reply.

Customer Care reads inbox threads from her PrestaShop customer messages plus her connected mail. Each thread's language is detected with a fast language identifier and stays sticky to that thread — when Anna's German customer writes in German, the reply stays in German, never code-switches mid-conversation. The first inbound is a where-is-my-order question. Customer Care pulls order context from Order Manager in one handoff — the order state across all four axes, the carrier scan, the tracking number, the projected delivery date — and drafts a reply in Anna's voice (warm, concrete, the specific delivery context already in it, not generic chatbot fluff). The draft lands in her queue. Anna reads, taps approve, the reply sends, and the Receipt records what was found, what was sent, why, when, and in which language.
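As a rough illustration of that single handoff, the sketch below models the order context as one structured envelope. The class, field, and function names are hypothetical, not Crewmerce's actual schema; only the listed contents (four state axes, carrier scan, tracking number, projected delivery date) come from the walkthrough.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class OrderContextEnvelope:
    """Hypothetical payload Order Manager hands to Customer Care in one call."""
    order_id: str
    payment_state: str
    fulfillment_state: str
    communication_state: str
    lifecycle_state: str
    last_carrier_scan: date
    tracking_number: str
    projected_delivery: date


def reply_facts(envelope: OrderContextEnvelope, thread_language: str) -> dict:
    """Collect the concrete facts a where-is-my-order draft needs.

    The thread language is passed through unchanged, so the reply stays
    in the language the thread started in.
    """
    return {
        "language": thread_language,
        "tracking_number": envelope.tracking_number,
        "last_scan": envelope.last_carrier_scan.isoformat(),
        "projected_delivery": envelope.projected_delivery.isoformat(),
        "already_shipped": envelope.fulfillment_state == "shipped",
    }
```

Because everything arrives in one immutable object, the drafting step never has to query the shop again for the same order.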

Week 1: Order Manager runs your orders across four axes.

Order Manager does not collapse your orders to a single status. It tracks payment, fulfillment, communication, and lifecycle as four independent axes so a refund-pending order can still be shipping while waiting on the bank.

A stuck order on a real shop is rarely "stuck" in just one sense. The customer paid by card, the card cleared, fulfillment was triggered, the parcel was dispatched, but the carrier's last scan is three days old. That order is in fulfillment_state: shipped, payment_state: paid, communication_state: customer_waiting, lifecycle_state: post-sale. Collapsing those four axes into status: shipped loses three of them — and that is what almost every legacy commerce platform does. Order Manager keeps them separate. By the end of Week 1, Anna sees: stuck orders surfaced with carrier exception context already attached, fraud signals (address mismatch plus payment-method velocity) flagged for her review (never auto-cancelled — precision over recall, the Stripe Radar discipline), and one sale campaign about to expire without an end-date set (EU Omnibus discipline — Order Manager refuses to ship a perpetual-sale label). Every mutation stays approval-first; every action lands inside a 24-hour rollback window if she changes her mind.
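A minimal sketch of the four-axis idea, assuming each axis is its own enum. The state names `paid`, `shipped`, `customer_waiting`, and `post-sale` come from the example above; every other value, the `days_since_last_scan` field, and the two-day staleness threshold are illustrative assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Payment(Enum):
    PENDING = "pending"
    PAID = "paid"
    REFUND_PENDING = "refund_pending"


class Fulfillment(Enum):
    UNFULFILLED = "unfulfilled"
    SHIPPED = "shipped"
    DELIVERED = "delivered"


class Communication(Enum):
    NONE = "none"
    CUSTOMER_WAITING = "customer_waiting"


class Lifecycle(Enum):
    PRE_SALE = "pre-sale"
    POST_SALE = "post-sale"


@dataclass(frozen=True)
class OrderState:
    """Four independent axes instead of one collapsed status field."""
    payment: Payment
    fulfillment: Fulfillment
    communication: Communication
    lifecycle: Lifecycle
    days_since_last_scan: int = 0


def is_stuck(state: OrderState, max_scan_age_days: int = 2) -> bool:
    """Flag a shipped order whose last carrier scan is older than the threshold."""
    return (
        state.fulfillment is Fulfillment.SHIPPED
        and state.days_since_last_scan > max_scan_age_days
    )
```

With separate axes, an order can be `REFUND_PENDING` on payment and `SHIPPED` on fulfillment at the same time, which a single status field cannot express.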

Week 2: Studio Photographer refuses to regenerate photos already on spec.

Anna uploads ten new product photos. Studio Photographer audits each against her per-channel preset (Amazon clinical / Etsy lifestyle / Google Merchant Center clean). Seven of the ten are already on spec. Studio refuses to regenerate them — and tells her how much budget that saved.

The audit runs a seven-dimension rubric per photo: resolution, aspect ratio, background quality, overlays or text, product centering, Lab colour delta-E against her brand palette, and provenance status. For seven of the ten, every dimension passes for at least one target channel. Studio reports back: "Your photos are already on Etsy spec. I am not regenerating them. Cost saved: $0.84 across seven shots." For the three that miss, it counter-proposes specific fixes — for one, a clinical Amazon variant with pure white background; for another, an Etsy lifestyle variant with the wooden surface kept; for the third, a Meta square crop. Each generated image carries C2PA Content Credentials and a SynthID watermark where the underlying model supports it. Provenance is default-on, never billed separately, ahead of the EU AI Act Article 50 transparency expectations binding August 2, 2026. Anna taps approve on the three regenerations. The new photos hand off to Product Specialist via a typed envelope for multi-channel publish coordination.
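The keep-or-regenerate decision can be sketched as follows, assuming each photo's audit is a mapping from channel to per-dimension pass/fail results. The dimension keys and function names are hypothetical, and the 12-cent cost per regeneration is only inferred from the $0.84 across seven shots quoted above.

```python
# The seven rubric dimensions from the walkthrough, as hypothetical keys.
DIMENSIONS = (
    "resolution", "aspect_ratio", "background", "overlays_or_text",
    "centering", "colour_delta_e", "provenance",
)


def channels_on_spec(audit: dict) -> list:
    """Return the channels for which every rubric dimension passes."""
    return [
        channel for channel, checks in audit.items()
        if all(checks.get(dim, False) for dim in DIMENSIONS)
    ]


def plan_regeneration(photos: dict, cost_per_regen_cents: int = 12):
    """Split photos into keep/regenerate and report the budget not spent.

    A photo is kept when it is fully on spec for at least one channel.
    Costs are integer cents to avoid floating-point rounding.
    """
    keep, regenerate = [], []
    for name, audit in photos.items():
        if channels_on_spec(audit):
            keep.append(name)
        else:
            regenerate.append(name)
    return keep, regenerate, len(keep) * cost_per_regen_cents
```

Seven kept photos at the assumed 12 cents each give 84 cents, matching the figure Studio reports.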

Week 2 continued: Product Specialist catches typos and refuses mass-AI rewrites.

Product Specialist audits Anna's fifty SKUs across Google Merchant Center, Meta Catalog, TikTok Catalog, and Amazon Seller Central. It finds twelve missing GTINs and four typos. When she asks it to rewrite all fifty descriptions overnight, it refuses.

The audit is per-channel because each channel rejects on its own rules: Google MC requires GTIN on branded products since the November 2023 policy update; Amazon Seller Central rejects images that are not pure white background; TikTok Shop wants `parent_product_id` grouping for variants. Product Specialist catches the gaps and proposes per-SKU fixes Anna can tap through. Then the moment that earns the worker its place in the team: Anna says "rewrite all fifty descriptions, my voice, by tomorrow." Product Specialist refuses. Over fifteen AI-drafted descriptions in one batch without Anna's edit-pass commitment is the threshold that keeps small-shop domains off the next Helpful Content audit. It counter-proposes fifteen descriptions per batch, three batches across three sessions, an edit-pass after each. Anna accepts. By the end of Week 2 her descriptions are SKU-by-SKU updated with her own edits on every single one — readable, true to her voice, and the kind of content that does not get demoted six months later when a Google update hits.
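The refusal rule itself is simple enough to sketch. This is an illustration under stated assumptions, not Crewmerce's implementation: the `Decision` type and function name are hypothetical, and only the threshold of fifteen and the edit-pass condition come from the text.

```python
from dataclasses import dataclass

BATCH_LIMIT = 15  # threshold named in the walkthrough


@dataclass(frozen=True)
class Decision:
    accepted: bool
    reason: str
    counter_batch_size: int = 0  # 0 means no counter-proposal


def review_rewrite_request(n_descriptions: int, edit_pass_committed: bool) -> Decision:
    """Refuse oversized AI rewrite batches that have no edit-pass commitment."""
    if n_descriptions > BATCH_LIMIT and not edit_pass_committed:
        return Decision(
            accepted=False,
            reason="more than 15 AI-drafted descriptions without an edit pass",
            counter_batch_size=BATCH_LIMIT,
        )
    return Decision(accepted=True, reason="within limit or edit pass committed")
```

The refusal always carries a counter-proposal, so the owner gets a next step rather than a dead end.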

Week 3: SEO and GEO start the audit. Shipping Coordinator picks the carrier before cutoff.

Three more workers come online. SEO Specialist audits Search Console weekly. GEO Specialist tracks AI-search citations. Shipping Coordinator compares rates across the carriers she has connected and refuses to lock a single one without comparison.

SEO Specialist verifies her Search Console property (every shop with organic ambitions starts here, and Anna's was sitting unverified) and audits her fifty SKU pages. Twelve are missing offer-block schema. Three have poor Core Web Vitals scores on the LCP element (her hero photos are heavy and not optimised for the marketplace channel). SEO drafts rewrite briefs and hands them to Product Specialist with an explicit edit-pass-required flag. SEO does not rewrite the copy itself; that crosses into Copywriter territory, which is a future hire.

GEO Specialist reads her schema graph and her llms.txt, and tracks her citations across ChatGPT, Claude, Perplexity, Google AI Overviews, and Bing Copilot. It drafts entity-graph extensions with stable @id URIs so the AI crawlers see one coherent entity across the site, not five fragmented mentions.

Shipping Coordinator reads the dispatch queue across the six pre-built carriers (InPost, DHL Express, DPD, Poczta Polska, UPS, FedEx). When Anna has two or more carriers connected, it refuses to lock a single carrier without comparing rates: the rate comparison is the default, not an opt-in. The cutoff window is enforced too. Labels printed after the carrier cutoff roll to tomorrow's pickup rather than failing silently.
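The two Shipping Coordinator rules can be sketched in a few lines. The function names and the sample rates are hypothetical, and the sketch deliberately ignores weekends and holidays, which a real pickup calendar would need.

```python
from datetime import date, datetime, time, timedelta


def pickup_date(printed_at: datetime, cutoff: time) -> date:
    """Labels printed after the carrier cutoff roll to the next day's pickup."""
    if printed_at.time() <= cutoff:
        return printed_at.date()
    return printed_at.date() + timedelta(days=1)


def pick_carrier(quotes: dict) -> str:
    """Compare every connected carrier's quote and return the cheapest.

    `quotes` maps carrier name to price; with a single carrier connected,
    that carrier is returned by default.
    """
    if not quotes:
        raise ValueError("no carriers connected")
    return min(quotes, key=quotes.get)
```

Making the comparison the only code path is one way to guarantee that no carrier is ever locked in without a look at the alternatives.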

Week 4: Ads Specialist refuses Performance Max without brand exclusions. Analytics Specialist caps the digest at three findings.

Ads Specialist connects Anna's Google Ads and Meta Ads. When she asks for a Performance Max launch, it refuses to launch without brand exclusions and explains why. Analytics Specialist runs read-only, never mutates anything, and refuses to ship the weekly digest if it has more than three findings without her permission.

Performance Max without brand exclusions is the industry-standard cost-centre mistake — your brand search terms get cannibalised by your own Performance Max budget. Ads Specialist refuses the launch and counter-proposes: a brand exclusion list plus a separate brand search campaign with Target Impression Share at 100%. Anna approves. The campaign launches with the right defaults from minute one. Analytics Specialist comes online next — read-only by design, never mutates platform state, never deploys a GTM container, never edits a GA4 setting. It reads through GA4 and Microsoft Clarity (always-on Tier 1) and produces a Friday digest. The digest format is hard-capped at three findings per week with severity tags (attention, monitor, signal). The discipline behind that cap: a wall of charts is not an analysis. Analytics refuses to ship the easy AI-summary that hides what is actually moving — every finding has a point of view, a why, and a recommended next action with which worker should run it.
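A small sketch of the three-finding cap, assuming findings arrive as `(severity, text)` pairs. The severity tags come from the text; ranking them as attention, then monitor, then signal is an assumption, as are the names used here.

```python
# Assumed priority order: lower rank is more urgent.
SEVERITY_RANK = {"attention": 0, "monitor": 1, "signal": 2}
MAX_FINDINGS = 3


def build_digest(findings: list) -> list:
    """Keep at most three findings, most severe first.

    Ties keep their original order because Python's sort is stable.
    """
    for severity, _ in findings:
        if severity not in SEVERITY_RANK:
            raise ValueError(f"unknown severity tag: {severity}")
    ranked = sorted(findings, key=lambda finding: SEVERITY_RANK[finding[0]])
    return ranked[:MAX_FINDINGS]
```

Anything that does not make the cut stays out of the Friday digest unless the owner explicitly asks for more.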

Week 4 continued: API Integrator wakes up for an exotic platform.

Anna's daughter has a Wix shop and wants Anna's help. There is no pre-built Wix connector in the Crewmerce roster. API Integrator wakes up.

API Integrator handles every platform that does not yet have a pre-built connector. It is the team's universal-platform fallback. For Anna's daughter's Wix shop, it runs Phase 1 dynamic synthesis: reads the Wix API documentation from a public OpenAPI spec, proposes a connector plan with a cost preview ($1.20 of one-time AI work), opens an authenticated connection on Anna's approval, validates response shapes against fifty common ecommerce operations, and saves the connector as a workspace-private definition. Anna's daughter now has a working Wix integration. Crucially, this is not a one-shop hack — once the Wix pattern is used across three different shops over the following months and stays stable, the connector graduates from workspace-private (Phase 2) to pre-built (Phase 4), at which point every future Wix shop on Crewmerce gets the deterministic pre-built path at $0 LLM cost. The pipeline is how exotic platforms become first-class over time.

Month 2: Business Advisor reads everyone and tells you what matters this week.

Business Advisor reads every other worker's weekly digest and synthesises the cross-system picture in plain English. Read-only by design — it never mutates anything; it just tells you what is moving and what is worth thinking about.

By Month 2, Anna has eleven workers in motion. Business Advisor is the twelfth, and the only one that reads the outputs of every other worker. It sees Order Manager's weekly orders summary, Customer Care's inbox digest, Analytics' top-three findings, Ads' spend signal, Product Specialist's catalog moves, Studio Photographer's audit summary, Shipping's exception rate, SEO's Core Web Vitals trend, GEO's citation count, Onboarding's brand voice memory drift, and API Integrator's connector usage. It structures the synthesis on the pyramid principle (situation, complication, question, answer) and hard-caps the Friday brief at three recommendations. A concrete example from Anna's Month 2: "Pierścionki series sales down 18% week-over-week. Three signals converge: Studio Photographer flagged four SKUs whose photos drifted from your other product line's aesthetic; Customer Care surfaced two complaints about size accuracy; Ads Specialist's Performance Max creative for this series has a fatigue ratio over your threshold. Three options ranked by reversibility…" Business Advisor never executes. Anna decides, and the recommended worker runs the chosen action.

Month 3: Earned autonomy in action.

After ninety days, Anna has approved enough simple delivery acknowledgments for Customer Care to handle that exact pattern on its own under her rules. Refunds, complaints, and brand-risk threads still wait for her tap.

This is the trust ladder maturing. Customer Care has earned Ring 2 for the where-is-my-order pattern after fifty clean approvals with the same shape. Now those replies send automatically, in any of five supported languages, with order context already pulled, within fifteen minutes of the customer messaging. Anna's queue for that exact pattern drops to near-zero. She still sees the Receipts roll in and can still revoke the pattern at any moment, but her active attention is free for the work that needs her. Meanwhile: refunds still wait for her tap. Complaints still wait. VIP customers still wait. Chargeback threats still wait. Brand-risk signals still wait. The Hard-Ring-1 perpetual lock list has not changed, and it will not, no matter how much trust the team accumulates. Trust is granted per narrow pattern, not as a general permission.
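The per-pattern trust rule can be sketched like this. The pattern identifiers and the function name are hypothetical; the fifty-approval threshold, the owner's ability to revoke, and the categories that always wait come from the paragraph above.

```python
# Categories that never automate, regardless of accumulated trust.
HARD_RING_1 = frozenset({
    "refund", "complaint", "vip_customer", "chargeback_threat", "brand_risk",
})
CLEAN_APPROVALS_REQUIRED = 50


def can_auto_send(pattern: str, clean_approvals: int, revoked: bool = False) -> bool:
    """Allow automatic sending only for an earned, unrevoked, unlocked pattern."""
    if pattern in HARD_RING_1 or revoked:
        return False
    return clean_approvals >= CLEAN_APPROVALS_REQUIRED
```

The lock list is checked first, so no approval count can override it, and trust earned for one pattern says nothing about any other.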

What stays approval-first forever.

Some things never automate, no matter how much trust the team has earned. The hard list is not theoretical — every worker carries its own.

  • Refunds above the threshold Anna set (Order Manager).
  • GDPR data exports and customer PII batch operations (Customer Care).
  • AI-generated content publication without an edit-pass (Product Specialist).
  • Hero-image overwrites on any live channel (Studio Photographer).
  • Bulk operations affecting more than ten percent of catalog (every commerce worker).
  • Performance Max launches without brand exclusions (Ads Specialist).
  • Sitemap, canonical, robots.txt, JSON-LD injection, hreflang, noindex, redirect maps, disavow and Search Console change-of-address mutations (SEO Specialist).
  • Schema graph restructures and llms.txt mutations that change crawler scope (GEO Specialist).
  • International shipments without HS tariff codes plus commercial invoice (Shipping Coordinator).
  • Connector graduation from Phase 1 to Phase 4 (API Integrator).
  • Brand research extraction publication without Welcome Ceremony confirmation (Onboarding Specialist).
  • Strategic recommendations without at least three sources of evidence (Business Advisor).

The shape is consistent: blast radius too high; cost of asking too low. Approval-first is not a phase; it is a posture.

Your crew, one shop — and not one of them runs unless you have said yes, or you have already said yes to that exact narrow kind of work.

Frequently asked questions

  • Do I have to hire the whole crew at once?

    No. Most owners start with one — usually Customer Care or Order Manager — and add more as patterns emerge. The team grows at your pace, not the other way around. Hiring is one tap; firing is one tap.

  • Will the workers do anything without my approval?

    Not on day one. Every worker starts read-only or approval-first. Specific narrow patterns can earn the right to run under your rules, after you have approved enough of the same kind of work — but Hard-Ring-1 actions (refunds above your threshold, GDPR exports, bulk catalog overwrites, hero-image changes, AI-generated content publication) never automate, no matter how much trust the team has earned.

  • What if a worker refuses to do something I asked?

    It tells you why and proposes a counter-pattern. Studio Photographer refuses to regenerate a photo that already meets the marketplace spec — and reports the budget saved. Product Specialist refuses to ship more than fifteen AI-drafted descriptions per batch without your edit pass. Ads Specialist refuses Performance Max without brand exclusions. Refusal is a feature: it is the team protecting you from work that would hurt your shop.

  • How do the workers actually communicate with each other?

    Through typed handoff envelopes — each worker emits structured data the next worker can act on without re-querying. When Customer Care needs order context, it asks Order Manager once and gets everything (order state across all four axes, carrier scan, payment status, fulfillment) in one call. When Studio Photographer generates a new image, it pushes a typed notification to Product Specialist with the per-channel format coverage already validated. The team works as one crew, not as a stack of separate tools.

  • Is this ready for the EU AI Act transparency obligations that become binding on August 2, 2026?

    Yes. Every image Studio Photographer generates carries C2PA Content Credentials plus a SynthID watermark where the model supports it. Every AI-drafted description and reply is labelled in the action record. Five content-generating workers carry refusal thresholds that fire before mass-AI publication. The Receipt records the AI involvement on every action. Provenance is default-on, never billed separately.

An AI workforce is not a pile of chatbots. It is a team that reads your shop, runs the small jobs you would never get to, refuses the ones that would put you at risk, and leaves a Receipt on everything it did — so you stay in charge of a store you understand.
