
The Ghost Score: A Methodology for Measuring AI Brand Visibility

Published by GhostedByAI · Version 1.0 · February 2026

As AI systems increasingly mediate product discovery, brand evaluation, and purchasing decisions, a critical gap has emerged in marketing measurement: no standard metric captures whether a brand is visible to AI. Traditional search metrics — keyword rankings, domain authority, click-through rates — measure visibility to humans navigating search engine results pages. They do not measure whether an AI system knows a brand exists, can accurately describe it, or will recommend it when a user asks for help.

The Ghost Score is a standardized metric (0–100) that quantifies a brand's visibility across the four major AI platforms consumers use for product discovery: ChatGPT, Claude, Gemini, and Perplexity. This document defines the methodology, explains the measurement framework, establishes scoring bands with business-context definitions, and provides preliminary industry benchmarks.

Our goal is not to replace existing SEO and brand measurement tools. It is to define the metric that captures the thing they cannot: whether your brand exists in the AI layer that sits between your customer and your product.

1. Why AI Brand Visibility Requires a New Metric

1.1 The Shift from Search Results to Synthesized Answers

For two decades, digital brand visibility has been synonymous with search engine visibility. Brands invested in SEO to appear on the first page of Google results. The measurement infrastructure — keyword rank tracking, domain authority scores, SERP analysis — was built entirely around this paradigm.

That paradigm is fracturing. Gartner projects a 50% decline in traditional organic search traffic by 2028. Capgemini research indicates that 58% of consumers have already replaced traditional search with generative AI tools for product research and recommendations. Google itself has acknowledged this shift by deploying AI Overviews, which synthesize answers directly rather than presenting a list of links.

When a consumer asks ChatGPT “What's the best project management tool for a 20-person startup?” the AI does not return ten blue links. It returns a synthesized recommendation — typically naming two to four products, explaining their strengths, and sometimes making a direct suggestion. The brand that appears in that answer captures the consumer's attention. The brand that doesn't appear doesn't just rank lower — it functionally doesn't exist.

This is a qualitatively different visibility problem than traditional search. A brand on page two of Google results is disadvantaged. A brand absent from an AI's synthesized answer is invisible.

1.2 Why Traditional Metrics Fail

Existing measurement tools were designed for a link-based discovery model. This framework breaks down in AI-mediated discovery for several reasons:

  • There are no ranked positions. AI systems produce prose responses, not ranked lists. A brand is either mentioned or it isn't. SparkToro's research on AI search consistency demonstrated that positional ranking within AI responses shows less than 1% consistency across repeated identical queries — effectively making 'AI rank position' a random variable rather than a measurable signal.
  • Visibility is binary at the query level, but probabilistic at scale. For any single query, a brand either appears or it doesn't. But across hundreds of relevant queries, the frequency of appearance produces a stable, measurable signal. This frequency is the foundation of the Ghost Score.
  • Different AI systems have fundamentally different knowledge architectures. A brand can be well-known to one AI and invisible to another. A single-platform measurement misses this entirely.
  • The knowledge source is opaque. In traditional search, visibility is a function of indexable content and backlinks — all observable. In AI systems, visibility is a function of training data, retrieval behavior, and model-internal representations — none directly observable. This opacity makes measurement more important, not less.

1.3 The Business Case

The business consequence of AI invisibility is straightforward: if an AI system mediates a purchasing decision and your brand is not mentioned, you have zero probability of being selected through that channel. As AI-mediated discovery grows, the cost of invisibility compounds.

Early evidence suggests this effect is already material. Brands that appear consistently in AI recommendations report increased direct traffic and higher conversion rates from AI-referred visitors compared to traditional search visitors — likely because AI recommendations carry an implicit endorsement that a ranked search result does not.

The Ghost Score exists to make this risk measurable, trackable, and actionable.

2. Defining AI Brand Visibility

2.1 A Working Definition

AI Brand Visibility is the degree to which a brand is recognized, accurately described, and recommended by AI systems when users ask questions relevant to that brand's category, use cases, or competitive set.

This definition has three components, each of which the Ghost Score captures:

  • Recognition: Does the AI system know the brand exists? Can it name the brand when asked about the relevant category? This is the most basic layer: a brand that is not recognized by an AI system is, by definition, invisible to it.
  • Accuracy: When the AI system does mention the brand, is the information correct? Inaccurate representation can be worse than invisibility — a brand described incorrectly is actively misrepresented to potential customers.
  • Recommendation: Does the AI system recommend the brand when asked for suggestions? There is a meaningful difference between an AI that can define a brand when asked directly and one that proactively suggests it when a user describes a need. The latter represents the highest-value form of AI visibility.

2.2 What AI Brand Visibility Is Not

AI brand visibility is distinct from several adjacent concepts:

  • It is not AI-generated content about a brand. The volume of AI-generated copy that mentions a brand is irrelevant to whether AI systems themselves know and recommend it.
  • It is not traditional SEO performance. A brand can rank #1 on Google and still be completely unknown to ChatGPT or Claude. The correlation exists but is weaker than most marketers assume.
  • It is not social media sentiment. The Ghost Score measures the output — what AI systems actually say — rather than the inputs.

3. The Measurement Problem

3.1 Nondeterminism in AI Outputs

AI language models are fundamentally nondeterministic. The same query submitted to the same model twice can produce different responses. Any credible measurement methodology must account for this variance rather than ignoring it.

3.2 Why Frequency Is the Right Metric

Given nondeterministic outputs, the only stable measurement approach is frequency across a sufficient sample of queries. The Ghost Score asks: across N relevant queries, what proportion of responses mention the brand?

This approach works because while individual responses are noisy, the aggregate signal is stable. A brand that appears in 70% of relevant queries today will appear in approximately 65–75% tomorrow, even as individual responses vary. The frequency metric is robust to the nondeterminism that makes positional ranking meaningless.

3.3 Why Official APIs Beat UI Scraping

The Ghost Score methodology uses official APIs exclusively:

  • Reproducibility. Fixed parameters (model version, temperature, system prompt) produce the most reproducible results possible.
  • Terms of service compliance. Automated scraping of consumer chat interfaces violates the ToS of every major AI platform. A methodology built on ToS violations is not one enterprises can rely on.
  • Versioning. APIs expose specific model versions. Consumer interfaces may silently swap between versions.
  • Parameterization. APIs allow control over temperature and system prompts, reducing measurement noise.
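To make the reproducibility and parameterization points concrete, here is a minimal sketch of a per-cycle measurement configuration. The structure is illustrative, and the model-version strings are placeholders, not actual API identifiers:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MeasurementConfig:
    """Parameters held fixed and logged for one measurement cycle.
    All field values shown below are illustrative placeholders."""
    platform: str
    model_version: str        # the exact versioned identifier the API exposes
    temperature: float = 0.7  # pinned across cycles to reduce measurement noise
    system_prompt: str = ""   # empty string: measure the platform's default behavior

configs = [
    MeasurementConfig("chatgpt", "example-model-2026-01"),
    MeasurementConfig("perplexity", "example-model-2026-02"),
]
```

Freezing the dataclass makes a logged configuration immutable, so a recorded cycle can later be re-run with byte-identical parameters.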

3.4 Sample Size and Statistical Validity

The Ghost Score methodology uses a minimum of 10 queries per platform, per measurement cycle.

  • At 5 queries per platform: variance between runs exceeds ±15 points — too noisy for reliable tracking.
  • At 10 queries: variance drops to ±5–7 points — sufficient for detecting meaningful changes.
  • At 20 queries: variance drops to ±3–4 points, but with diminishing returns relative to increased cost and latency.
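These ranges are broadly consistent with simple binomial sampling error. As a rough sanity check — under the simplifying assumption that each query is an independent Bernoulli trial with a fixed mention probability — the run-to-run standard deviation of the score falls with the square root of the total query count:

```python
import math

def score_sd(p: float, total_queries: int) -> float:
    """One standard deviation of the measured Ghost Score (0-100 scale)
    under a simple binomial model: total_queries independent trials,
    each mentioning the brand with probability p."""
    return math.sqrt(p * (1 - p) / total_queries) * 100

# With four platforms, n queries per platform means 4n total queries.
for per_platform in (5, 10, 20):
    print(per_platform, round(score_sd(0.5, per_platform * 4), 1))
# prints: 5 11.2 / 10 7.9 / 20 5.6
```

Real responses within one platform are correlated (same model, same day), so observed variance can deviate from this idealized bound; the point is only that the reported ranges are in the right ballpark for binomial noise.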

Queries are drawn from three intent types:

  • Category queries — “What are the best [category] tools?”
  • Use-case queries — “I need a tool that does [specific use case]”
  • Comparative queries — “[Category] alternatives to [known brand]”
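The three intent types lend themselves to template expansion. A hypothetical query-set builder is sketched below; the category, use cases, and competitor names in the example are illustrative placeholders, not part of the methodology:

```python
# Templates mirror the three intent types defined in section 3.4.
TEMPLATES = {
    "category":    "What are the best {category} tools?",
    "use_case":    "I need a tool that does {use_case}",
    "comparative": "{category} alternatives to {competitor}",
}

def build_query_set(category, use_cases, competitors):
    """Expand the three intent-type templates into a concrete query set."""
    queries = [TEMPLATES["category"].format(category=category)]
    queries += [TEMPLATES["use_case"].format(use_case=u) for u in use_cases]
    queries += [TEMPLATES["comparative"].format(category=category, competitor=c)
                for c in competitors]
    return queries

qs = build_query_set(
    "project management",
    ["sprint planning", "time tracking"],
    ["Trello"],
)
# qs holds 1 + 2 + 1 = 4 queries; scale the inputs to reach 10 per platform.
```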

4. Ghost Score Methodology

4.1 Score Calculation

The Ghost Score is calculated as:

Ghost Score = (Total Mentions / Total Queries Across All Platforms) × 100
With 10 queries × 4 platforms = 40 total queries per standard measurement cycle.
Example: 28 mentions out of 40 queries → Ghost Score of 70.
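The calculation is trivially mechanizable. A minimal sketch, reproducing the worked example above (per-platform mention counts are illustrative):

```python
def ghost_score(mentions_by_platform: dict, queries_per_platform: int = 10) -> float:
    """Ghost Score = (total mentions / total queries) x 100, with all
    platforms weighted equally as in methodology v1.0."""
    total_queries = queries_per_platform * len(mentions_by_platform)
    total_mentions = sum(mentions_by_platform.values())
    return total_mentions / total_queries * 100

# 6 + 5 + 8 + 9 = 28 mentions across 40 queries.
score = ghost_score({"chatgpt": 6, "claude": 5, "gemini": 8, "perplexity": 9})
print(score)  # 70.0
```

Keeping the per-platform counts in the input, rather than pre-summing them, preserves the per-platform breakdown that section 6.2 uses for diagnosis.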

4.2 Mention Detection

A “mention” is the brand name appearing in the AI's response in a context that correctly identifies the brand and its category:

  • Positive mention — brand name appears, correctly associated with product category. Counts toward Ghost Score.
  • Negative mention / misattribution — brand name appears, associated with incorrect information. Counted toward visibility but flagged as an accuracy issue.
  • Incidental mention — brand appears in unrelated context. Excluded from mention count.
  • Absence — brand does not appear. The core signal the Ghost Score captures.
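Classifying a raw response into these categories requires human review or an auxiliary judge model; that step is out of scope here. The counting rule itself, however, is mechanical, and a sketch of it makes the asymmetry explicit — misattributions count toward visibility but carry an accuracy flag:

```python
from enum import Enum

class MentionType(Enum):
    POSITIVE = "positive"        # correct category association
    MISATTRIBUTED = "negative"   # named, but with incorrect information
    INCIDENTAL = "incidental"    # appears in an unrelated context
    ABSENT = "absent"            # does not appear at all

def counts_toward_score(m: MentionType) -> bool:
    """Positive and misattributed mentions both count toward visibility;
    misattributions are separately flagged as accuracy issues."""
    return m in (MentionType.POSITIVE, MentionType.MISATTRIBUTED)
```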

4.3 Platform Weighting

In the current methodology (v1.0), all four platforms are weighted equally at 25% each. Equal weighting was chosen because reliable market share data for AI-assisted product discovery is not yet available. Future versions may introduce market-share-based weighting as usage data matures.

4.4 Measurement Cadence

  • Weekly — for brands actively optimizing AI visibility, to detect the impact of specific actions
  • Monthly — for baseline monitoring
  • Ad hoc — after product launches, funding announcements, PR crises, or model version updates

5. Scoring Bands and Business Interpretation

5.1 Ghosted (0–33)

What it means: The brand is invisible or nearly invisible to AI systems. Across 40 queries, the brand appears in fewer than 14 responses.

Business implication: The brand is not participating in AI-mediated discovery. When potential customers ask AI systems for recommendations, competitors are named and the brand is not. This represents a complete loss of the AI discovery channel.

Typical profiles: Early-stage startups (pre-Series A), local businesses, companies in highly specialized B2B niches, brands that have relied primarily on paid acquisition with minimal organic content footprint.

5.2 Fading (34–66)

What it means: The brand has partial visibility. AI systems know it exists but mention it inconsistently — reliably on some platforms but not others, or for some query types but not others.

Business implication: The brand participates in AI-mediated discovery intermittently. Competitors with “Known” scores are consistently chosen in head-to-head comparisons. The brand has a foundation to build on but is at risk of fading further as competitors invest.

Typical profiles: Mid-market SaaS companies, established SMBs with some press coverage, brands in competitive categories where AI systems rotate between many options.

5.3 Known (67–100)

What it means: The brand is reliably visible across AI systems. Across 40 queries, the brand appears in 27 or more responses. AI systems know the brand, describe it accurately, and recommend it with consistency.

Business implication: The brand is an active participant in AI-mediated discovery. This represents a functioning AI discovery channel that generates awareness without paid advertising.

Typical profiles: Category leaders, well-funded companies with significant PR and content operations, open-source projects with strong community presence.
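The three bands reduce to a simple threshold function. A sketch (non-integer scores, such as 27 mentions in 40 queries yielding 67.5, fall naturally into the correct band):

```python
def band(score: float) -> str:
    """Map a Ghost Score (0-100) to its scoring band."""
    if score <= 33:
        return "Ghosted"
    if score <= 66:
        return "Fading"
    return "Known"

print(band(70))    # Known
print(band(32.5))  # Ghosted (13 mentions / 40 queries)
```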

5.4 Score Distribution Context

Based on preliminary measurement across multiple B2B SaaS categories:

  • The median Ghost Score in competitive categories is approximately 25–35 (solidly Ghosted)
  • Roughly 60–70% of brands score below 33
  • Only 10–15% of brands in any given category score above 67
  • Category leaders typically score 75–90
  • Perfect scores of 100 are rare — they indicate effective monopoly on category awareness

The distribution is heavily skewed toward the low end. Most brands are more invisible to AI than they expect.

6. Platform Architecture and Why Four Platforms Matter

6.1 Two Knowledge Architectures

The four platforms fall into two architectural categories:

Search-grounded (Perplexity and Gemini). These platforms perform real-time web searches to ground responses. Visibility is primarily a function of current web footprint: dynamic, and responsive to content and SEO efforts.

Parametric (ChatGPT and Claude). These platforms rely primarily on knowledge learned during training. Visibility is a function of historical web footprint at the training cutoff, with a lag of roughly 6–18 months between web-presence improvements and parametric visibility.

6.2 Why All Four Matter

Measuring only one platform gives an incomplete picture. The four-platform measurement provides:

  • Completeness — consumers use all four; visibility on one is not visibility on all
  • Diagnostic value — per-platform breakdown reveals why visibility is what it is
  • Robustness — aggregating smooths platform-specific quirks
  • Trend detection — parametric platforms lag search-grounded platforms, revealing trajectory

7. Industry Benchmarks

The following benchmarks are based on preliminary measurement across B2B and B2C categories. They represent estimated typical Ghost Scores, not guarantees, and will be updated as the measurement dataset grows.

7.1 By Company Stage and Type

Company Profile | Est. Ghost Score | Notes
Pre-seed / bootstrapped, < 1 year | 0–5 | Effectively invisible. No training data on the brand.
Seed-stage, 1–2 years | 5–20 | Occasional mentions, usually only on Perplexity if recent web coverage exists.
Series A, moderate PR coverage | 15–35 | Emerging visibility; may appear on search-grounded platforms but rarely on parametric.
Series B+, established player | 30–55 | Inconsistent visibility; appears for broad queries but gaps on specific use cases.
Category leader, strong content engine | 60–85 | Reliably visible; appears in most relevant queries across most platforms.
Dominant market leader (e.g., Salesforce in CRM) | 80–95+ | Near-universal visibility; AI systems default to naming this brand.
Enterprise-only, minimal public content | 10–30 | Low visibility despite revenue scale; enterprise sales motions don't generate public web presence.
Open-source project, strong community | 40–75 | Often higher than revenue would suggest, due to community-generated content and GitHub presence.

7.2 By Category Competitiveness

Category Density | Leader Score | Median Score | Tail Score
Oligopoly (2–4 well-known brands) | 85–95 | 50–65 | 10–25
Moderately competitive (5–15 brands) | 75–90 | 25–40 | 0–10
Fragmented (15+ competitors) | 60–80 | 15–30 | 0–5

7.3 Platform Variance Patterns

Pattern | Perplexity/Gemini | ChatGPT/Claude | Interpretation
Rising brand | High (50–80) | Low (10–30) | Recent content gains haven't entered parametric training data. Brand on upward trajectory.
Legacy brand | Low (20–40) | High (50–70) | Strong historical presence but declining current relevance; a warning sign of fading.
Balanced brand | Similar across platforms | Similar across platforms | Consistent long-term presence. Healthiest profile.
Perplexity-only brand | High (60+) | Low (< 20) | Visibility driven by recent web content. Not yet in training data. Fragile.

8. Limitations and Open Questions

8.1 Known Limitations

  • Query set dependency. The Ghost Score is only as good as the query set used. A poorly constructed query set won't reflect real-world visibility.
  • Nondeterminism residual. Even with API-based measurement and 10 queries, individual scores have an inherent margin of error of approximately ±5–7 points. Trends are more reliable than individual snapshots.
  • Model version sensitivity. A major model update can shift scores significantly without any change in the brand's web presence. Model versions are logged and score changes are flagged when they coincide with known updates.
  • English-language bias. The current methodology uses English-language queries. Multilingual measurement is planned for future versions.

8.2 Open Questions

  • Should a brand mentioned frequently but described inaccurately score the same as one mentioned accurately? Future versions may integrate an accuracy multiplier.
  • Should proactive recommendation ('I'd suggest Brand X') carry more weight than a listing ('Options include Brand X, Y, Z')? Empirically, recommendation correlates with higher conversion, suggesting it should.
  • Should scores be weighted by platform market share? Once reliable market share data is available, this weighting would improve predictive value.
  • Should relative scoring (percentile rank within category) complement the absolute Ghost Score?

9. Glossary

AI Brand Visibility
The degree to which a brand is recognized, accurately described, and recommended by AI systems in response to queries relevant to the brand's category or use cases.
Ghost Score
A metric from 0 to 100 representing the percentage of relevant queries, across four major AI platforms, in which a brand is mentioned in the AI's response.
Ghosted (0–33)
The scoring band indicating a brand is invisible or nearly invisible to AI systems, appearing in fewer than one-third of relevant queries.
Fading (34–66)
The scoring band indicating partial, inconsistent visibility.
Known (67–100)
The scoring band indicating reliable, consistent visibility across AI systems and query types.
Parametric knowledge
Information encoded in an AI model's weights during training. Does not change between training cycles.
Search-grounded knowledge
Information retrieved via real-time web search and incorporated into an AI response. Reflects the current web.
Mention
An instance of a brand being named in an AI response in a context that correctly identifies the brand's product category. The fundamental unit of Ghost Score measurement.
Query set
The collection of queries submitted to AI platforms during a measurement cycle. Covers category queries, use-case queries, and comparative queries.

10. Citation and Usage

10.1 Citing This Document

GhostedByAI. “The Ghost Score: A Methodology for Measuring AI Brand Visibility.” Version 1.0. February 2026. https://ghostedbyai.co/methodology

10.2 Usage Terms

The Ghost Score methodology is published openly. We encourage researchers, analysts, journalists, and marketers to:

  • Reference the Ghost Score by name with attribution to GhostedByAI
  • Use the scoring bands (Ghosted / Fading / Known) with attribution
  • Adapt the methodology for internal analysis
  • Critique, extend, or build upon this framework

Commercial use of the Ghost Score name in competing products requires written permission.

10.3 Contributing

This is a living document. As measurement practices mature and empirical data accumulates, the methodology will be updated. We welcome feedback, critique, and collaboration from researchers, practitioners, and platform providers.

Contact: methodology@ghostedbyai.co