Off-Page SEO

LLMs Are Not as Complex as You Think: Here Are 10 Strategies To Improve AI Visibility

Source: AI mode citation study by Moz

Brands need to invest in video content, whether that’s working with creators, partnering with influencers, or publishing branded content on their own channels.

Video is a great way to diversify your traffic sources, and there’s evidence that it gets a ton of visibility in both LLMs and traditional search.

Charlie Clark, CEO, Minty Digital

Focus on content that produces information gain. Most content summarizes existing knowledge, which is something LLMs can already do on their own. AI models won’t surface regurgitated content because they can generate it without retrieval.

If you want to earn third-party mentions and disseminate content in a zero-click world, I’d advise you to invest in original research and net-new knowledge. These are two content formats that AI platforms can’t produce on their own.

8. Profound’s report shows that LinkedIn is now the most-cited domain for professional queries. How can brands use LinkedIn to get more LLM citations and improve Brand Authority? 

Charlie Marchant

First, a caveat: this data only matters if LinkedIn is showing up for the prompts you care about.

If your competitors’ LinkedIn content is appearing in ChatGPT, Copilot, or whichever platforms you’re tracking, go after it. If LinkedIn isn’t showing up at all for your brand’s queries, it’s less of a priority.

For most B2B brands, LinkedIn is a key visibility channel. Founders, marketing managers, and HR leads all have voices on LinkedIn, and LLMs are citing this content.

Here’s how you can take advantage of LinkedIn for AI visibility:

Find the right voices in your company

Not everyone is active on LinkedIn, and that’s fine. Start with employees who already post on LinkedIn or are interested in building a personal brand.

From what I’ve seen, corporate content doesn’t do well on LinkedIn. LinkedIn users are more likely to engage with relatable content that mirrors their lived experiences. 

Build topical pillars around each employee’s job role and expertise. For example, an HR lead’s natural territory is hiring, retention, staff benefits, and employee engagement, so start there.

Create content around the topics you want to own

Use a keyword research tool to identify the topics you want to rank for. Next, ask internal SMEs to publish newsletters and long-form LinkedIn posts around those themes. 

If someone is a better speaker than a writer, that’s not a problem. Ask them to record a podcast, webinar, or YouTube video, then use an LLM to repurpose it for LinkedIn.

At Exposure Ninja, we’ve run experiments where LinkedIn posts influenced LLMs and Google AI Overviews. The two are more connected than most people realize.

9. AI models can sometimes misrepresent a brand, which hurts visibility and conversions. How do you spot misrepresentation in AI answers, and what can you do to fix it?

Charlie Marchant

The first step is confirming whether LLMs are misrepresenting you. Large enterprise brands tend to care because they have specific adjectives and positioning statements they’re protecting.

For example, a premium brand doesn’t want to be described as cheap, and a specialist doesn’t want to be positioned as a generalist. 

Sentiment scores are a good starting point for understanding how LLMs describe you.
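If you don’t have a dedicated tool yet, a rough first pass is easy to script. Below is a minimal sketch using the open-source VADER sentiment analyzer; the example sentences are placeholders rather than real LLM output:

```python
# pip install vaderSentiment
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

# Paste in sentences an LLM produced about your brand (placeholders here).
ai_descriptions = [
    "The resorts are beautiful and the service is consistently praised.",
    "Guests complain that formalwear rentals are not available on site.",
]

for sentence in ai_descriptions:
    # compound score runs from -1 (most negative) to +1 (most positive)
    score = analyzer.polarity_scores(sentence)["compound"]
    print(f"{score:+.2f}  {sentence}")
```

Anything consistently negative is a lead worth chasing back to its source.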

Below are two examples of how this plays out in practice:

a. We ran a sentiment analysis for Beaches and Sandals, the luxury honeymoon resort brands, and found significant negative sentiment.

The source was specific to grooms who arrived at their honeymoon without a tuxedo and had no option to rent one. The feedback across the web reflected their frustration, and LLMs picked it up and parroted it back. The fix was straightforward because the problem was operational, not perceptual.

b. We worked with a financial education client offering accounting and financial analyst qualifications. 

LLMs were consistently describing them as significantly more expensive than competitors, even though their pricing was identical. We recommended updating the pricing page to make the comparison clearer, and within three days, they appeared at the top of LLM responses. Nothing changed in their pricing, just how clearly they communicated it.

Misrepresentation in LLMs is often a content problem, not a perception problem. Fix what the model is reading, and the output changes with it.

I’ll also encourage you to read this piece by Jamie Indigo, which shares useful tips for creating a defensive SEO strategy to protect your brand against harmful misrepresentation in AI search results.


Why ChatGPT Cites One Page Over Another (Study of 1.4M Prompts)

We’ve all got used to the little numbered blue links in ChatGPT’s responses. They’re the citations that back up ChatGPT’s responses with external information.

But although ChatGPT crawls dozens of pages to answer a single query, our research shows it only ends up citing ~50% of them.

Pie chart shows ChatGPT cites about half the URLs it retrieves: 49.98% cited (23.4M URLs) vs. 50.02% not cited.

Why does one page get the credit while another, which the AI clearly retrieved, gets nothing?

According to studies by AI expert Dan Petrovic, when ChatGPT retrieves results, each one comes back with the page title, a brief snippet or summary, the URL, and an ID number.

Text describing raw search results: title, description, URL, and an ID for each relevant webpage, highlighted with an orange box.

ChatGPT uses this data to decide which pages are worth opening and eventually citing in its response.

That means there’s a gatekeeping layer before ChatGPT opens and reads any of your actual page content. The title, snippet, and URL are doing the heavy lifting in that initial decision.
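As a rough mental model, each retrieval record might look something like the sketch below. The field names and filter logic are illustrative assumptions based on Petrovic’s description, not OpenAI’s actual schema:

```python
from dataclasses import dataclass

@dataclass
class RetrievedResult:
    # One entry in ChatGPT's raw retrieval data, per Petrovic's description:
    # an ID, the page title, a brief snippet or summary, and the URL.
    # Field names here are guesses, not OpenAI's actual schema.
    id: int
    title: str
    snippet: str
    url: str

def worth_opening(result: RetrievedResult, query: str) -> bool:
    """Toy stand-in for the gatekeeping step: at this point only the
    title, snippet, and URL exist, so they do all the work of deciding
    whether the page gets opened at all."""
    terms = query.lower().split()
    haystack = f"{result.title} {result.snippet} {result.url}".lower()
    return any(term in haystack for term in terms)
```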

So we wanted to know: what actually influences that decision? Does higher semantic similarity between a page’s retrieval data and the user query increase citation likelihood? Which fields matter most? Do human-readable URLs outperform opaque ones?

To find out, we analyzed 1.4 million ChatGPT 5.2 prompts from February 2025 (desktop) with the help of Ahrefs data scientist Xibeijia Guan.

But before we get into the findings, you need to understand how ChatGPT actually gathers its sources—because not all URLs enter the system the same way.

Not all sources are created equal: the ref_type hierarchy 

When ChatGPT retrieves results, it categorizes sources using an internal field called ref_type—essentially a label for the retrieval channel the URL came through.

We discovered five categories: search, news, reddit, youtube, and academia.

The citation rates between them are wildly uneven:

ref_type Citation % Total data points
search 88.46% 25,563,589
news 12.01% 3,940,537
reddit 1.93% 16,182,976
youtube 0.51% 953,693
academia 0.40% 185,337

The general “search” index dominates, both in volume and citation rate: URLs retrieved through search get cited 88.46% of the time, and the overwhelming majority of ChatGPT’s citations come from this channel.

If you want to be cited by ChatGPT, you need to be in that search selection pool—which means your content needs to rank.

This isn’t new information. By now, most people are already aware that ranking plays a part, but it’s nice to have some more data to back it up.

Specialized verticals like YouTube (e.g. youtube.com) and Academia (e.g. arXiv.org), on the other hand, are pulled in at scale but barely ever get surfaced as actual citations.

Sidenote.

The “search” ref_type does include Reddit and YouTube results too—any Reddit or YouTube page that comes back through a standard web search will show up there. 

The separate “Reddit” and “YouTube” ref_types likely represent additional results—i.e. those pulled in via dedicated API integrations—on top of whatever the web search already returned.

That’s why the volume on those channels is so high; ChatGPT is supplementing its search results with a separate feed of Reddit and YouTube content.

This matters a lot for interpreting the rest of the analysis.

On average, ChatGPT pulls ~16.57 cited URLs and ~16.58 non-cited URLs per prompt.

But because Reddit makes up 67.8% of the non-cited pool, any aggregate comparison of “cited vs. non-cited” is really comparing search results to Reddit API output. Not apples to apples.

So throughout this research, we’ve isolated the analysis by ref_type wherever possible to avoid that distortion.
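As a sketch of what that isolation looks like in practice (toy rows and column names of our own invention, not the raw dataset):

```python
import pandas as pd

# Toy stand-in for the real dataset: one row per retrieved URL,
# with its retrieval channel and whether it was ultimately cited.
df = pd.DataFrame({
    "ref_type": ["search", "search", "search", "reddit", "reddit", "news"],
    "cited":    [True,     True,     False,    False,    False,    True],
})

# The pool-wide average mixes channels with very different mechanics...
print(df["cited"].mean())                      # 0.5

# ...so compute the citation rate per retrieval channel instead.
print(df.groupby("ref_type")["cited"].mean())  # search 0.67, reddit 0.0, news 1.0
```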

67.8% of non-cited URLs are from Reddit 

This is probably the most striking finding in the dataset.

Reddit has its own dedicated ref_type in ChatGPT’s retrieval system, with over 16 million data points in our dataset.

Yet it’s cited at a rate of just 1.93%.

Meanwhile, 67.8% of all non-cited URLs come from Reddit.

In other words: ChatGPT is using Reddit extensively to understand topics, gauge consensus, and build context—but it almost never gives Reddit the credit.

It learns from the crowd, then cites another institution.

Non-cited pages have 3x more retrieval data—but that’s not the full story… 

As we’ve briefly covered, when ChatGPT retrieves search results, each one comes back with a set of fields including a title, URL, and sometimes a snippet—a short extract of page content stored in ChatGPT’s retrieval data.

We expected that having more of these fields populated would correlate with higher citation rates.

At first glance, the aggregate data seemed to tell a different story: non-cited pages actually have more populated fields in ChatGPT’s retrieval data than cited ones.

Non-cited URLs had snippets 14.81% of the time versus 4.36% for cited URLs, and were far more likely to carry a publication date (92.72% vs. 35.98%).

We almost ran with that as a finding, but I’m glad we didn’t.

When we dug into it, the discrepancy turned out to be almost entirely a compositional artifact—driven by Reddit and the mechanics of ChatGPT’s retrieval pipeline.

Because the non-cited pool is overwhelmingly Reddit (67.8%), and Reddit content pulled via API naturally carries pub_date metadata, the 92.72% figure is a Reddit artifact—not a signal about how ChatGPT evaluates web pages in general.
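A quick back-of-envelope mixture shows how composition alone can produce a figure like 92.72%. The Reddit pub_date rate below is an assumption for illustration; the search rate comes from the table further down:

```python
# If ~67.8% of the non-cited pool is Reddit, and API-fed Reddit rows
# (we assume) essentially always carry pub_date, the pool-wide figure
# is dominated by Reddit regardless of how ordinary web pages behave.
reddit_share = 0.678
reddit_pubdate_rate = 1.00   # assumption for illustration
search_pubdate_rate = 0.49   # non-cited "search" rate from the table below

blended = reddit_share * reddit_pubdate_rate + (1 - reddit_share) * search_pubdate_rate
print(f"{blended:.1%}")      # ~83.6%, most of the way to 92.72% before news is counted
```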

The snippet gap is explained differently. According to David McSweeney’s research on ChatGPT’s retrieval process, the model actually abandons the snippet field (the short content extract) once it’s decided to cite a URL—and opens the full page instead.

So, it’s not a matter of ChatGPT preferring pages with no snippets. The low snippet percentage for cited pages is likely a byproduct of how the pipeline works.

When we isolated the data to just the “search” ref_type—stripping out Reddit, news, YouTube, and the rest—the picture became a lot clearer:

Search ref_type Has snippet Has pub_date Total URLs
Cited 2.52% 33.79% 22,612,529
Not cited 0.09% 49.00% 2,951,060

Snippet data is basically non-existent for both groups within the search vertical—it’s not a usable signal. And the publication date percentages are closer, but non-cited search pages are still slightly more likely to carry a pub_date (49%) than cited ones (33.79%).

The differences we initially saw between cited and non-cited URLs seem to have been distorted by the data composition and retrieval mechanics. Any signal—if there is one—is buried under the noise. The honest takeaway: we can’t draw strong conclusions about whether the snippet or publication date fields play a meaningful role in citation from this data.

It’s worth flagging that this problem likely applies to other citation studies too. Any research comparing “cited vs. non-cited” URLs without accounting for where those URLs came from risks mistaking quirks of the data for real patterns.

Find your own citation gaps in Brand Radar

The data in this study tells you what ChatGPT values. Brand Radar tells you where you’re falling short.

Open Brand Radar, set up your brand and competitors, and head straight to the Cited Pages report.

Then, filter for responses where competitors are cited and you aren’t.

A screenshot of a "Cited pages" dashboard showing trends over time and a table of AI visibility tools.

That gap analysis gives you a concrete list of content to create, refresh, or restructure.

Titles need to be semantically relevant to fan-out queries 

To figure out what’s “citable,” ChatGPT estimates how closely an article relates to a query, in a process sometimes described as “semantic scoring.”

Since ChatGPT is a closed-source model, we don’t have visibility into exactly how it determines relevance internally.

So, in this study, we use cosine similarity, computed from embeddings generated by open-source models, to approximate how ChatGPT may judge relevance.

ChatGPT matches URLs against its own “fanout queries”—the sub-questions it generates internally (from a user’s seed prompt) to hunt for specific facts.

The data confirms that title relevance to fanout queries is an important factor in citation:

  • Prompt vs. cited URL title: 0.602
  • Prompt vs. non-cited URL title: 0.484
  • Fanout query vs. cited URL title (max match*): 0.656

Sidenote.

For each of these fanout queries, we compute its cosine similarity with the article title. The “max match” score is the highest similarity among them—for example, if scores are 0.45, 0.71, and 0.38, the max match is 0.71. This captures the best-aligned sub-question rather than averaging across all interpretations, which would dilute the signal.
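Here’s a minimal sketch of that computation using an open-source embedding model. The specific model and the example queries are our assumptions; any reasonable embedding model illustrates the method:

```python
from sentence_transformers import SentenceTransformer
import numpy as np

# Any open-source embedding model works for this approximation; the exact
# model used in the study isn't named, so this choice is an assumption.
model = SentenceTransformer("all-MiniLM-L6-v2")

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

title = "Why ChatGPT Cites One Page Over Another"
fanout_queries = [
    "how does chatgpt choose which sources to cite",
    "chatgpt citation ranking factors",
    "what data does chatgpt see before opening a page",
]

title_vec = model.encode(title)
scores = [cosine(title_vec, model.encode(q)) for q in fanout_queries]

# "Max match": keep the best-aligned fanout query instead of averaging,
# so one strong sub-question match isn't diluted by weaker ones.
print(max(scores))
```

Run against the real fanout queries from Brand Radar (covered below), this gives you a rough read on whether a title is even in contention.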

The box plots tell the story clearly. Across all ref_types, cited URLs have consistently higher similarity between their title and the original prompt:

Box plot showing that cited pages have higher cosine similarity between their titles and original ChatGPT prompts than uncited pages.

The gap widens further when we compare against fanout queries instead of the original prompt—reinforcing that content relevant to ChatGPT’s internal sub-questions is what really drives selection:

Box plot showing cosine similarity between titles and fan-out queries for cited vs. not cited pages. Cited pages show higher similarity.

When we isolate the search ref_type specifically, the pattern gets even sharper. Cited pages are clearly more relevant, and the non-cited distribution drops significantly:

Box plot comparing cosine similarity between title and original prompt for cited vs. not-cited search results.

We also found that search results with natural language URL slugs had an 89.78% citation rate, compared to 81.11% for those without.

Ultimately, if your URL and title don’t semantically align with the AI’s internal fanout queries, you’re less likely to get cited.
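If you want to audit your own URLs for this, a crude readability check takes a few lines. The heuristic below is our own rough definition of a natural language slug, not the study’s exact classification:

```python
import re
from urllib.parse import urlparse

def has_natural_language_slug(url: str) -> bool:
    """Rough heuristic: the last path segment is made of hyphen- or
    underscore-separated dictionary-style words rather than opaque IDs."""
    slug = urlparse(url).path.rstrip("/").rsplit("/", 1)[-1]
    words = [w for w in re.split(r"[-_]", slug) if w.isalpha()]
    return len(words) >= 2

print(has_natural_language_slug("https://example.com/blog/why-chatgpt-cites-pages"))  # True
print(has_natural_language_slug("https://example.com/p/8f3a2c91"))                    # False
```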

Optimize for fan-out queries using Brand Radar

You can study fanout queries directly inside Brand Radar. Head to the AI Responses report, pick any prompt, and you’ll see the fanout queries ChatGPT generated alongside the cited URLs.

Screenshot of Ahrefs' "AI responses" page, showing listed prompts, responses, fanout queries, mentions, citations, and updates.

This is the actual set of sub-questions your content needs to answer.

From there, use the AI Content Helper to check how well your page covers the topics those fanout queries address. It measures the cosine similarity between your content and the topics the SERP or AI response is trying to cover—and gives you a colored highlight as you write, showing which gaps remain.

A screenshot of a content optimization tool, showing text being edited and highlighted, with content score and topic suggestions.

If a competitor’s page is getting cited for a query where yours isn’t, this is one of the fastest ways to diagnose why.

The median cited page is 500 days old (and still getting picked)

It’s common knowledge that fresher content gets cited more by AI—and, in fact, our own study of 17 million citations supports that. We found that ChatGPT cited URLs that were 458 days newer than Google’s organic results—the strongest freshness preference of any platform we tested.

This new data doesn’t contradict that narrative, but it does add an extra layer of nuance.

For instance, when we look at the search index, cited pages span a wide range of ages—the median is around 500 days (~1.3 years old), with some cited pages over 2,700 days old (~7.4 years old).

The median age here is actually far lower than in our initial freshness study linked above (958 days back in July vs. 500 days in this dataset), suggesting that ChatGPT is skewing even younger in its citation preferences.

That said, we also found that non-cited pages are overwhelmingly very young.

Box plot shows search results cited by ChatGPT are significantly older than non-cited results, with a median age of 500 days.Box plot shows search results cited by ChatGPT are significantly older than non-cited results, with a median age of 500 days.

So within a single prompt’s retrieval set, it’s the older, more established pages that tend to get cited, and the freshest content that tends to get discarded.

In other words, ChatGPT prefers fresh content, but tends to cite comparatively “older” content more often. That sounds counterintuitive, but both things can be true at the same time.

Across the broader population of AI citations, ChatGPT does skew fresher when compared against Google results, and even against its own citation preferences from just last year.

But within a given retrieval set, freshness alone isn’t enough. Relevance still does the heavy lifting.

A new page that matches fanout queries well will get cited. A new page that doesn’t will be retrieved, yet ignored.

It’s also worth pointing out that the pool of non-cited pages (~3M) across the search ref_type is far smaller than the cited group (~23M), which limits how confidently we can interpret the age gap.

Where freshness matters most is in “news”.

In this category, title relevance scores for cited and non-cited pages are nearly identical:

Box plot showing cosine similarity between title and original prompt for cited (blue) and not cited (red) news articles.

The AI can’t decide based on relevance alone, so it defaults to a temporal tie-breaker: page age. Cited news pages skew younger:

Box plot: "Cited" pages (blue) have a median age of ~200 days, younger than "Not Cited" pages (red) with a median of ~300 days.Box plot: "Cited" pages (blue) have a median age of ~200 days, younger than "Not Cited" pages (red) with a median of ~300 days.

For news queries, younger pages have a clear advantage, even when relevance scores between cited and non-cited pages are similar.

Create the freshest news content using Firehose

If you publish news or time-sensitive content, freshness is non-negotiable.

Be the first to break news on certain stories using Ahrefs Firehose—our real-time web monitoring API that gives you a streaming feed of data from our huge crawler infrastructure.

For example, if you work in SaaS journalism, you can track content changes on pages like Google’s official blog, so you can be the first one to cover a new Google update as soon as it goes live.
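Firehose pushes changes to you in real time. To illustrate the underlying idea without Firehose access, here’s a minimal self-rolled polling loop that fingerprints a page and flags when it changes; the URL is a placeholder, and a five-minute poll is far slower than a streaming feed:

```python
import hashlib
import time
import urllib.request

WATCH_URL = "https://example.com/blog/"  # placeholder: the page you're watching

def fingerprint(url: str) -> str:
    """Hash the page body so any content change produces a new digest."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return hashlib.sha256(resp.read()).hexdigest()

last = fingerprint(WATCH_URL)
while True:
    time.sleep(300)  # poll every 5 minutes; a push feed removes this lag
    current = fingerprint(WATCH_URL)
    if current != last:
        print("Page changed — new story candidate")
        last = current
```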

A screenshot of a "Firehose" platform dashboard, showing Taps, specifically a "Google Blog" feed with recent articles.

Then, use Brand Radar’s Mentions history in the AI Responses report to track whether your ChatGPT visibility spikes after publication.

Ahrefs AI responses dashboard shows competitor mentions over time, with a graph tracking Ahrefs, Moz, SE Ranking, and Similarweb.Ahrefs AI responses dashboard shows competitor mentions over time, with a graph tracking Ahrefs, Moz, SE Ranking, and Similarweb.

What this all means for being “citable”

The 1.4 million prompts paint a pretty clear picture. ChatGPT is an aggressive editor. It favors its general search index, uses semantic similarity to select and cite sources, and treats Reddit as a textbook it’s embarrassed to admit it read.

But the data also taught us a lesson in analytical caution.

Aggregate comparisons between “cited” and “non-cited” URLs can be misleading if the non-cited pool is dominated by a single source type with its own retrieval mechanics.

What initially looked like a paradox—less-optimized pages getting cited more—turned out to be a matter of dataset composition.

We would have got that one very wrong if we hadn’t isolated by ref_type.

Ultimately, the pages that get cited are the ones whose titles and content match the questions ChatGPT is asking behind the scenes, and that surface through the right retrieval channel.

 


The Complete AI Research Workflow: From Prompt Discovery to Content Creation

Now that you’ve identified prompts that are important to your business, you can add them to your AI Visibility Dashboard to track. 

Why should I track AI prompts?

Prompt tracking allows you to monitor how your brand appears in AI-generated responses, which is critical to understanding brand reach and visibility. So just like you track rankings in SERPs, you should also track your AI visibility. By doing so, you’ll get a complete picture of how your audience is finding you and spot ways to improve your brand’s presence in the marketplace. By not tracking AI visibility, you’re missing out on understanding a huge piece of the new search landscape puzzle. 

Now, you could track your selected prompts manually by going to each LLM (like Gemini), entering the prompt, and then noting what brands are being mentioned in the response. 

But let’s be honest, nobody has time for that.
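(If you did want to script a rough version yourself, the sketch below shows the idea: send each prompt to an LLM API and check which brands come back. The model name and brand list are placeholders, and API models don’t necessarily run the same live web retrieval as the consumer apps, so treat it as a crude proxy.)

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment
BRANDS = ["Moz", "Semrush", "Ahrefs", "SE Ranking"]  # you plus competitors

def brands_mentioned(prompt: str) -> list[str]:
    """Ask the model the prompt, then check which brands it named."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; use whichever model you track
        messages=[{"role": "user", "content": prompt}],
    )
    answer = (response.choices[0].message.content or "").lower()
    return [b for b in BRANDS if b.lower() in answer]

print(brands_mentioned("What are the best SEO tools for a beginner?"))
```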

AI Visibility in Moz Pro allows you to track prompts to see where and when your brand shows up in the AI-generated responses. You can also track your visibility alongside 3 competitor brands to get a better idea of how you stack up in your industry.

There are a few ways to add prompts to track in your dashboard. If you’re working in Prompt Suggestions and you’ve spotted some prompts you want to track, select them using the checkboxes on the left and then click Track prompts in AI Visibility.


Travel Marketing: How to Compete and Future-Proof in 2026

So how do you manage to stand out and land coverage in 2026? There are five key areas I’m going to take you through. 

1. Personalized stories

The first one is personalized stories. It’s always been really important to understand your target persona and your target audience and their wants, needs, and desires. 

But it’s even more crucial this year: not just those things, but also what they value, and the real-world economic factors that are going to impact their travel plans.

Alongside that, target audiences want much more personalized itineraries and experiences that align with their hobbies and passions.

Journalists are already following this trend, and if we don’t do the same, we are going to miss out on coverage. They always have their readers front of mind, so we need to as well.

2. Human-first narratives

Next up, human-first narratives. Human-first narratives are going to help set PR content apart from AI-produced content, because AI cannot imitate them.

How do we do that? Through adding behind-the-scenes insight, untold angles, or case studies. This is going to really set us apart. Journalists are already looking for this within their stories.

3. Take an “always on” approach

Third is taking an “always on” approach. 95% of digital PRs that we surveyed use data-led creative campaigns to land coverage effectively. 

But on top of this, 71% are using newsjacking and reactive tactics to make sure that they can take this “always on” approach and maximize coverage opportunities. In fact, they said it’s become even more important over the last three to five years to help them do so.


Brand Bias in Prompts: An Experiment

This experiment looked at three sets of prompts (100 each) — brand, “soft-brand”, and non-brand — all of them based on the topic “seo tools” and a handful of pre-selected brands. We intentionally kept the scope narrow, and within a domain we understood well. Of course, results may vary across different topic areas.

Brand prompts

This is the most straightforward group. Brand prompts contained a brand name or branded product directly in the prompt. Some examples include:

  • “Can I see historical Domain Authority data in the Moz dashboard?”
  • “How many domains does the Moz link index currently track?”
  • “Is Moz or Semrush better for a beginner in SEO?”

Note that brand prompts could include brands or branded products and metrics.

Soft-brand prompts

The “non-brand” prompts were split into two groups. The soft-brand group used our query fan-out research to generate prompts in an open-ended way. Examples include:

  • “Are premium search suites worth the investment for a small blog?”
  • “Can I use a tool to find the most popular questions in my niche?”
  • “How do I reconcile keyword scores from multiple search platforms?”

There’s a bias inherent in our topic — questions about seo tools are naturally going to include specific tools and brands in the answers. So, even without including a brand name or biasing the system toward brands, we’ve already created a soft brand bias.

Non-brand prompts

Given the topic bias, we nudged the system to generate prompts that were more tool-adjacent, resulting in broader, informational questions. For example:

  • “How do you measure the organic search visibility of a new website?”
  • “Is it better to target one high-volume term or ten low-volume?”
  • “What is the best way to handle a sudden drop in rankings?”

We’ll call these our true non-brand prompts. Even from these few examples, it’s probably clear that the line between non-brand and “soft-brand” is a gray one and depends a lot on the topic. Brand mentions are an on/off switch, but brand bias is a volume knob.
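To make the three buckets concrete, here’s a toy labeler that applies the same distinction with simple keyword checks. The term lists are illustrative; the actual study generated prompts by design rather than classifying them after the fact:

```python
# Brands and branded metrics count as "brand"; generic tool language is "soft-brand".
BRAND_TERMS = {"moz", "semrush", "ahrefs", "domain authority"}
TOOL_TERMS = {"tool", "platform", "suite", "software", "dashboard"}

def label_prompt(prompt: str) -> str:
    """Illustrative labeler mirroring the experiment's three prompt buckets."""
    p = prompt.lower()
    if any(term in p for term in BRAND_TERMS):
        return "brand"
    if any(term in p for term in TOOL_TERMS):
        return "soft-brand"
    return "non-brand"

print(label_prompt("Is Moz or Semrush better for a beginner in SEO?"))     # brand
print(label_prompt("Can I use a tool to find popular questions?"))         # soft-brand
print(label_prompt("What is the best way to handle a drop in rankings?"))  # non-brand
```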


Reddit Brand Strategy for AI Search — Whiteboard Friday

Read the room

Secondly, when you’re gathering insight, it’s very important to read the room. Read what is being upvoted. Read what is being downvoted. Read what is being banned and what is being blocked. This helps you see how to avoid falling into the same traps, and how to make your own content land.

Understand subreddit rules

On Reddit, you are not posting to the platform itself. You are posting to a subreddit, and every subreddit is its own ecosystem. What works in one subreddit may well not work in another, because the norms and the rules are different. So it’s very important that you understand the rules and take note of them.

Observe subreddit activities

The next is to observe subreddit activities. Some subreddits run recurring activities, for example, self-promo Sunday or fix-it Friday. Your goal as a brand is to understand the intent behind those activities and align your participation with them.

Use the Reddit AI answers tool

Fifth, you can also use the Reddit AI answers tool. As a Redditor, you type in a phrase or a question you have, and it surfaces answers from Reddit itself, along with links to the conversations and sources it drew them from. Take note of those sources.

By doing all of these five steps, you should already have 10 to 20 subreddits that you believe you can enter and engage in, or get insight from.

