Name: StoreMend Shopify Audit Cohort, Q2 2026
Creator: StoreMend
Published: 2026-06-25
License: https://creativecommons.org/licenses/by/4.0/

The headline finding

Of the 1,091 Shopify storefronts audited in Q2 2026, 778 shipped without the structured-data surface required for AI shopping assistants to cite them. Phrased another way: 72.2% of the 1,077 stores classified to a tier are functionally invisible to ChatGPT, Perplexity, and Google AI Overviews when a buyer asks for a recommendation in the relevant vertical.

Zero stores in the corpus met the full ready bar: complete Product JSON-LD with offers.availability, aggregateRating, FAQ schema on the product page, llms.txt at the root, and an AI crawler policy in robots.txt. The top tier was empty across the entire 1,091-store cohort.

AI shoppability tier mix across 1,077 stores classified to cohort cells. Invisible means no Product or Organization JSON-LD detected on homepage or product page. Transitional means partial signal: Product schema with three or fewer fields, or Organization schema without Product, or llms.txt without robots.txt policy. Ready means full Product + FAQ + llms.txt + crawler policy.
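The tier definitions above can be read as a small decision rule. The following is a minimal sketch, not the audit's actual classifier: the StoreSignals fields are hypothetical names, and the precedence order (ready first, then invisible, everything else transitional) is an assumption, as is treating an llms.txt-only store as transitional.

```python
from dataclasses import dataclass


@dataclass
class StoreSignals:
    """Hypothetical per-store signals; field names are illustrative only."""
    product_fields: int = 0           # Product JSON-LD fields found (0 = no Product schema)
    product_complete: bool = False    # includes offers.availability and aggregateRating
    has_organization: bool = False    # Organization JSON-LD present
    has_faq: bool = False             # FAQ schema on the product page
    has_llms_txt: bool = False        # llms.txt at the root
    has_crawler_policy: bool = False  # AI crawler policy in robots.txt


def classify_tier(s: StoreSignals) -> str:
    # Ready: full Product + FAQ + llms.txt + crawler policy.
    if s.product_complete and s.has_faq and s.has_llms_txt and s.has_crawler_policy:
        return "ready"
    # Invisible: no Product or Organization JSON-LD (and, by assumption, no llms.txt).
    has_jsonld = s.product_fields > 0 or s.has_organization
    if not has_jsonld and not s.has_llms_txt:
        return "invisible"
    # Anything with a partial signal falls through to transitional.
    return "transitional"
```

For example, classify_tier(StoreSignals(product_fields=3)) returns "transitional", which matches the "Product schema with three or fewer fields" case.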

The schema gap is the single biggest pattern

The largest failure cluster across the cohort is Product and Organization schema absence. 733 of 1,077 stores classified to a primary cluster ship without JSON-LD on either the homepage or the product page. That is 68.1% of the classified cohort, and it dwarfs every other pattern by a factor of five or more.

The pattern is category-wide. Every vertical sits between 61% and 80% schema-absent. Outdoor stores carry the highest gap rate at 80%, beauty at 74%, food and CPG at 70%. The lowest gap rate, pet at 61%, is still high enough that more than half the vertical is missing the surface entirely.

Share of stores per vertical missing Product or Organization JSON-LD. The cohort mean of 68% is included as the dark bar at the bottom for comparison. Sample sizes per vertical: apparel 205, beauty 171, other 162, food and CPG 146, home 122, electronics 108, pet 87, supplements 46, outdoor 30.

The cost of the gap is concrete. Google rich results require Product schema with offers, aggregateRating, and an availability state. AI shopping assistants treat the same JSON-LD as the primary input for citation. A storefront without Product JSON-LD is not just less visible in Google Search; it is structurally absent from the AI-shopping recommendation channel, regardless of how good the product or how loud the brand.

The top failure patterns across 1,077 classified stores

Each storefront in the cohort was assigned a primary cluster based on the highest-severity finding observed. The clusters listed below account for 96% of classified stores. The long tail of individual findings is large (8,409 total surfaced), but the primary-cluster shape is concentrated.

Stores classified by their dominant cluster. Percent of the classified cohort (n = 1,077) shown in parentheses. The remaining clusters (P3 render-blocking head, P4 duplicate vendor loads, P5 font weight sprawl) collectively account for 3.6% of classified stores.

What each cluster means

S1_S2_schema_absent
Product and Organization schema absent: 733 stores (68.1%)
No JSON-LD on homepage or product page. Google rich results and AI shopping assistants cannot read the catalog.
P2_page_builder_overhead
Page-builder JavaScript overhead: 138 stores (12.8%)
Pagefly, GemPages, Shogun, or similar shipping render-blocking JS that overrides theme-level speed work.
S4_og_defects
Open Graph defects: 111 stores (10.3%)
Missing og:title, og:description, or og:url. Social shares render a stripped card; LinkedIn and Slack previews break.
R1_reviews_absent
Reviews surface absent: 29 stores (2.7%)
No Yotpo, Judge.me, or native review app on the product page. Buyers see zero social proof above the buy button.
G1_no_ai_crawler_rules
No AI crawler policy in robots.txt: 27 stores (2.5%)
No rule for GPTBot, ClaudeBot, PerplexityBot, or Google-Extended. Either silent permission or silent block, never deliberate.
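
The G1 check can be approximated by scanning robots.txt for user-agent groups that name the four crawlers listed above. This is a rough sketch under the assumption that "has a policy" means "has an explicit User-agent line for that bot"; the function name is hypothetical, and the audit's real rule may be stricter.

```python
AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended")


def ai_crawler_rules(robots_txt: str) -> dict[str, bool]:
    """Report which AI crawlers are named explicitly in a robots.txt body."""
    declared = set()
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if line.lower().startswith("user-agent:"):
            declared.add(line.split(":", 1)[1].strip().lower())
    # A wildcard group ("*") does not count: it is not a deliberate AI policy.
    return {bot: bot.lower() in declared for bot in AI_CRAWLERS}


sample = """
User-agent: *
Disallow: /checkout

User-agent: GPTBot
Disallow: /
"""
rules = ai_crawler_rules(sample)
print(rules)
```

A store where every value is False would fall into the "silent permission or silent block" case described above.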

The 8-vertical breakdown

The audited cohort spans 8 verticals plus an “other” catch-all. Apparel and beauty dominate; supplements and outdoor are the long tail. The table below pairs each vertical with its dominant failure pattern.

Number of stores audited per vertical. Apparel is the largest cohort at 205 stores; outdoor is the smallest at 30 stores. The “other” bucket captures stores that did not fit cleanly into one of the eight named verticals.

Vertical	Top failure pattern	Stores affected	Share of vertical
Outdoor	Product and Organization schema absent	24 of 30	80%
Beauty	Product and Organization schema absent	126 of 171	74%
Food and CPG	Product and Organization schema absent	102 of 146	70%
Home	Product and Organization schema absent	83 of 122	68%
Supplements	Product and Organization schema absent	31 of 46	67%
Apparel	Product and Organization schema absent	138 of 205	67%
Electronics	Product and Organization schema absent	70 of 108	65%
Other	Product and Organization schema absent	106 of 162	65%
Pet	Product and Organization schema absent	53 of 87	61%

The dominant cluster is identical across every vertical: schema absence. The variation is in the rate, not the pattern. This is the strongest single signal in the cohort: the schema gap is not a vertical-specific problem to be solved by vertical-specific tooling. It is a platform-wide gap with a platform-wide fix.
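
The shares in the table follow directly from the per-vertical counts. A short sketch that recomputes them, using only the numbers reported in the table (the variable names are illustrative):

```python
# (stores affected, stores audited) per vertical, copied from the table above.
SCHEMA_ABSENT = {
    "Outdoor": (24, 30),
    "Beauty": (126, 171),
    "Food and CPG": (102, 146),
    "Home": (83, 122),
    "Supplements": (31, 46),
    "Apparel": (138, 205),
    "Electronics": (70, 108),
    "Other": (106, 162),
    "Pet": (53, 87),
}

# Share of each vertical, rounded to a whole percent.
shares = {v: round(100 * hit / n) for v, (hit, n) in SCHEMA_ABSENT.items()}

total_affected = sum(hit for hit, _ in SCHEMA_ABSENT.values())
total_stores = sum(n for _, n in SCHEMA_ABSENT.values())
cohort_share = 100 * total_affected / total_stores

for vertical, pct in shares.items():
    print(f"{vertical}: {pct}%")
print(f"Cohort: {total_affected} of {total_stores} ({cohort_share:.1f}%)")
```

The per-vertical counts sum to 733 of 1,077, which is the cohort-level 68.1% reported for the schema-absent cluster.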

The shape of 8,409 findings

Across the cohort, the audit surfaced 8,409 findings. The severity mix is roughly one-quarter high, one-half medium, one-quarter low. The high-severity bucket is where Google Search indexability, AI shopping citation, and checkout integrity sit. The medium bucket is mostly schema-completeness and trust-signal density. The low bucket is cosmetic and copy-quality findings.

Severity distribution across all 8,409 findings surfaced in the cohort. High-severity findings are the load-bearing ones for Google Search and AI shopping visibility. Medium covers schema completeness and trust signal density. Low covers cosmetic and copy-level findings.

What this means for Shopify operators

The shape of the cohort is a single dominant problem with a concentrated fix path. Two-thirds of stores ship without the structured data that determines whether they get cited by Google rich results or by AI shopping assistants. The fix is well-documented, mostly free, and almost always under a day of work for a single operator who knows what to add.

The hierarchy of next actions is consistent across verticals. First, ship Product JSON-LD with the six required fields on every product page (name, image, description, sku, offers.price, offers.priceCurrency), plus offers.availability and aggregateRating when reviews exist. Second, ship Organization JSON-LD on the homepage. Third, decide an AI crawler policy and write it into robots.txt. Fourth, fix Open Graph for social shares. Fifth, audit the page-builder JavaScript load if the store ships Pagefly, GemPages, or Shogun.
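
As an illustration of the first step, here is a minimal sketch that builds a Product JSON-LD object with the six fields named above, plus the two optional ones. The helper names and the sample product are hypothetical; the property names follow the schema.org Product, Offer, and AggregateRating vocabulary. In a real Shopify theme this would normally be emitted from the product template rather than from Python.

```python
import json


def product_jsonld(name, image, description, sku, price, currency,
                   availability=None, rating=None, review_count=None):
    """Build a schema.org Product dict; optional parts are added only when supplied."""
    offer = {"@type": "Offer", "price": f"{price:.2f}", "priceCurrency": currency}
    if availability:
        # e.g. "InStock" or "OutOfStock" from the schema.org ItemAvailability enumeration
        offer["availability"] = f"https://schema.org/{availability}"
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "image": image,
        "description": description,
        "sku": sku,
        "offers": offer,
    }
    if rating is not None and review_count:
        # Only emit aggregateRating when reviews actually exist.
        data["aggregateRating"] = {
            "@type": "AggregateRating",
            "ratingValue": rating,
            "reviewCount": review_count,
        }
    return data


def to_script_tag(data):
    # Escape "<" so a string value cannot close the script element early.
    body = json.dumps(data).replace("<", "\\u003c")
    return f'<script type="application/ld+json">{body}</script>'


example = product_jsonld("Trail Mug", "https://example.com/mug.jpg",
                         "Enamel camping mug", "MUG-1", 18, "USD",
                         availability="InStock")
print(to_script_tag(example))
```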

The data also shows what is not the dominant issue. The clusters classified as “performance-only” (P2 through P5) account for 16% of stores; the “structured data and AI shoppability” clusters (S1, S4, G1, R1) account for 84%. Operators investing weekly in pagespeed work while shipping a 3-field Product schema are working the smaller surface. The structural fix addresses the larger one.
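
The split between the two groups can be recomputed from the primary-cluster counts reported earlier. In this sketch the P3 to P5 count is derived as the remainder of the 1,077 classified stores, on the assumption (stated in the cluster caption) that those three clusters make up everything not listed individually.

```python
CLASSIFIED = 1077

# Primary-cluster counts as reported in the cluster list.
COUNTS = {
    "S1_S2_schema_absent": 733,
    "P2_page_builder_overhead": 138,
    "S4_og_defects": 111,
    "R1_reviews_absent": 29,
    "G1_no_ai_crawler_rules": 27,
}

# Assumption: the stores not listed above all belong to P3, P4, or P5.
p3_to_p5 = CLASSIFIED - sum(COUNTS.values())

performance = COUNTS["P2_page_builder_overhead"] + p3_to_p5
structured = CLASSIFIED - performance

perf_share = 100 * performance / CLASSIFIED
structured_share = 100 * structured / CLASSIFIED
print(f"performance-only: {performance} stores ({perf_share:.1f}%)")   # 177 stores (16.4%)
print(f"structured data:  {structured} stores ({structured_share:.1f}%)")  # 900 stores (83.6%)
```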

The other observation worth banking: the corpus contains zero stores in the ready tier. There is no “mature AI shopping” cohort to learn from yet. The first cohort that gets there will not be the largest brands; it will be the brands that treat Q3 and Q4 2026 as the window to ship the structured-data surface before the competition figures it out.

Methodology and sources

The corpus is the StoreMend audit cohort icp-1000-2026-Q2, a curated sample of 1,091 Shopify storefronts audited between April and June 2026. Stores were sampled across 8 verticals (apparel, beauty, food and CPG, home, electronics, pet, supplements, outdoor) with a long-tail “other” bucket. Revenue band is SME (small to medium, roughly $50K to $5M annual). All stores are publicly accessible at the time of audit.

Each storefront was audited via a deterministic-layer fetch and parse. The homepage and a representative product page were fetched from the live cdn.shopify.com surface and inspected for JSON-LD presence, Open Graph metadata, robots.txt directives, llms.txt presence, and rendered head content. Findings were classified into 11 clusters and 3 AI-shoppability tiers using the cohort-aggregates v1.0 schema. The full per-store finding stream is 8,409 individual observations (2,334 high-severity, 4,244 medium, 1,831 low).
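
The parse step can be pictured with a small standard-library example. This is a sketch of the general idea rather than the audit's implementation: it takes already-fetched HTML, collects the @type of each top-level JSON-LD object, and records Open Graph meta tags. The class name is hypothetical, and nested structures such as @graph are deliberately ignored.

```python
import json
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collect JSON-LD @type values and og:* meta tags from an HTML string."""

    def __init__(self):
        super().__init__()
        self.jsonld_types = set()
        self.og = {}
        self._in_jsonld = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "script" and (a.get("type") or "").lower() == "application/ld+json":
            self._in_jsonld = True
            self._buf = []
        elif tag == "meta" and (a.get("property") or "").startswith("og:"):
            self.og[a["property"]] = a.get("content") or ""

    def handle_data(self, data):
        if self._in_jsonld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag != "script" or not self._in_jsonld:
            return
        self._in_jsonld = False
        try:
            doc = json.loads("".join(self._buf))
        except json.JSONDecodeError:
            return  # malformed JSON-LD is treated as absent
        for node in doc if isinstance(doc, list) else [doc]:
            if not isinstance(node, dict):
                continue
            types = node.get("@type") or []
            self.jsonld_types.update(types if isinstance(types, list) else [types])


page = ('<html><head><meta property="og:title" content="Mug">'
        '<script type="application/ld+json">{"@type": "Product", "name": "Mug"}</script>'
        '</head><body></body></html>')
signals = HeadSignals()
signals.feed(page)
missing_og = {"og:title", "og:description", "og:url"} - set(signals.og)
print(signals.jsonld_types, sorted(missing_og))
```

On the sample page, Product JSON-LD is detected, while og:description and og:url are reported missing, which would count as an Open Graph defect under the S4 definition.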

The corpus has 90 cohort cells (vertical x cluster x AI-shoppability tier). Of those, 9 cells contain 30 or more stores and are the load-bearing cells for statistical claims. Smaller cells are reported for completeness but should be read as directional rather than definitive.
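
One way to picture the cell structure: each store contributes one (vertical, cluster, tier) tuple, and a cell counts as load-bearing once it holds 30 or more stores. The sketch below uses made-up records and a hypothetical record layout, since the per-store corpus is not part of this report.

```python
from collections import Counter


def load_bearing_cells(records, min_size=30):
    """Count stores per (vertical, cluster, tier) cell and keep cells with >= min_size."""
    cells = Counter((r["vertical"], r["cluster"], r["tier"]) for r in records)
    return {cell: n for cell, n in cells.items() if n >= min_size}


# Illustrative records only; these are not rows from the actual corpus.
demo = (
    [{"vertical": "apparel", "cluster": "S1_S2_schema_absent", "tier": "invisible"}] * 30
    + [{"vertical": "pet", "cluster": "S4_og_defects", "tier": "transitional"}] * 5
)
print(load_bearing_cells(demo))
```

In this toy input only the 30-store apparel cell survives the threshold; the 5-store pet cell is the kind of cell the report treats as directional.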

The dataset version baked into this report is 2026-06-19. The next refresh lands with the Q3 2026 cohort cut. All percentages in the report are calculated against the 1,077 stores classified to a primary cluster; the 14-store gap between the cohort total (1,091) and the classified total (1,077) reflects stores with no dominant cluster after deduplication, expected at the schema-version v1.0 cut.

The full machine-readable corpus is available on request for research purposes. The next report covers Q3 2026 and ships with the same cadence and the same dataset structure, so quarter-over-quarter comparison is in scope from the start.