
What is a canonical tag and when should you use it?

A canonical tag tells search engines which URL is the original:

A canonical tag is an HTML element placed in the <head> section of a webpage that tells search engines which URL should be treated as the preferred version of a page. The syntax looks like this:

<link rel="canonical" href="https://example.com/preferred-page">

That single line does a specific job. When a page exists at multiple URLs — through tracking parameters, sort orders, print versions, or syndicated copies — the canonical tag points search engines at one URL. That URL becomes the page’s representation in search results.

A separate discussion of clean URLs covers what makes a URL readable. Canonical tags solve a different problem that sits underneath. Even with clean URLs, the same content can end up at multiple addresses. The canonical tag is how a site tells search engines which address counts.


Same content, different addresses, one chosen winner:

Duplicate content is more common than most site owners realize. The duplicates aren’t intentional — they’re a side effect of how sites get built.

Consider a single product page in an e-commerce store. The base URL is /products/running-shoe. Then the site adds size and color variants: /products/running-shoe?size=10&color=black. A tracking parameter shows up: /products/running-shoe?utm_source=newsletter. The same page appears under a category path: /men/shoes/running-shoe. A legacy mobile setup redirects mobile users to /m/products/running-shoe — most modern sites use responsive design and skip this, but separate-mobile architectures still exist on older builds. None of those URLs were created to be duplicates. Each had a purpose. But search engines now see five URLs serving substantially the same page.

That creates problems. Search engines aren’t sure which version to rank. Inbound links split across the variants instead of consolidating into one. Crawl budget — the time Google spends crawling the site — gets spent re-fetching duplicate pages instead of finding new content. And in the worst case, Google indexes the wrong version, so the URL that ranks isn’t the one the site wants people to see.

The canonical tag fixes this by naming a winner. The site adds <link rel="canonical" href="https://example.com/products/running-shoe"> to every variant. Now all five URLs point at the same canonical. Search engines understand the relationship. Link signals consolidate. The preferred URL is the one that ranks.
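As a rough illustration of "every variant names the same winner," the sketch below maps the five example URLs to one canonical and renders the tag. The alias table and the rule of dropping the whole query string are assumptions made for this example, not a general recipe; a real site would derive them from its own routing.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "example.com"  # assumed preferred host

# Hypothetical alias table: alternate paths that serve the same product page.
PATH_ALIASES = {
    "/men/shoes/running-shoe": "/products/running-shoe",
    "/m/products/running-shoe": "/products/running-shoe",
}


def canonical_url(url: str) -> str:
    """Map a variant URL to the preferred address for the same content."""
    parts = urlsplit(url)
    path = PATH_ALIASES.get(parts.path, parts.path)
    # Drop query string and fragment: in this example they never change
    # which product the page shows.
    return urlunsplit(("https", CANONICAL_HOST, path, "", ""))


def canonical_link_tag(url: str) -> str:
    """Render the link element to place in the <head> of the variant."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}">'


variants = [
    "https://example.com/products/running-shoe",
    "https://example.com/products/running-shoe?size=10&color=black",
    "https://example.com/products/running-shoe?utm_source=newsletter",
    "https://example.com/men/shoes/running-shoe",
    "https://example.com/m/products/running-shoe",
]

# Every variant should render the identical tag.
tags = {canonical_link_tag(v) for v in variants}
```

If the set `tags` ever holds more than one entry, some variant is declaring a different winner.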

Note that duplicate content doesn’t trigger a Google penalty. The Google documentation has been clear about this for years. Duplicate content creates inefficiency and ambiguity, not punishment. The canonical tag isn’t there to avoid a penalty — it’s there to give search engines a clear instruction about which version matters.


Canonical is a hint, not a command:

Here’s the part most articles get wrong: a canonical tag is a suggestion. Google’s documentation describes canonical signals as inputs into a decision, not as instructions Google must follow. If the page Google crawls doesn’t match the canonical declared, or if other signals contradict the canonical, Google can pick a different URL as the canonical version.

This shapes how the tag actually works. When Google crawls a page with a canonical tag, three things happen. Google fetches the canonical target and confirms it exists and serves a successful response. Google compares the content of the duplicate against the canonical target — if the canonical target’s content doesn’t substantially overlap with the page declaring it, Google may disregard the signal. And Google weighs the canonical declaration against other canonicalization signals. These include redirects, internal linking patterns, and sitemap inclusion.

Google’s documentation describes canonicalization signals in a strength hierarchy. Redirects are the strongest signal that the redirect target should be canonical. The rel="canonical" annotation is a strong signal too. Sitemap inclusion is a weaker signal that helps the listed URL become canonical. None of these methods is mandatory; sites work fine without explicit canonical preferences. But when used together, the signals stack and make the canonical choice more reliable.

The practical implication: the canonical tag works when the page declaring it and the canonical target are genuinely duplicates or near-duplicates. When they aren’t, Google’s documentation explicitly says the canonical may be disregarded. A canonical declared from a category landing page to a featured product page won’t work, because the category page and the product page aren’t the same content.


Redirects move users; canonicals move credit:

Canonical tags and 301 redirects are often confused because both deal with the question of “which URL should this be.” But they do different jobs.

| Tool | What users see | What search engines do | When to use |
| --- | --- | --- | --- |
| 301 redirect | Browser sends user to a different URL | Treats the destination as the canonical, transfers most ranking signals | When only one URL should exist long-term |
| rel="canonical" tag | User stays on the URL they requested | Treats the canonical target as preferred for indexing, consolidates signals | When multiple URLs should remain accessible but only one should rank |
| noindex meta tag | User sees the page normally | Removes the page from search results entirely | When a page shouldn't appear in search at all |
| Robots.txt block | User sees the page normally if they have the URL | Prevents Google from crawling the page (but may still index it) | Almost never the right choice for canonicalization |

The redirect-vs-canonical question comes down to whether both URLs need to keep working. Take an old product URL that’s been replaced. The old URL should send users to the new one — that’s a 301 redirect, hard and one-way. Take a product available at multiple URLs because it appears in three different categories. All three category paths should keep working for users browsing those categories — that’s a canonical tag, soft and many-to-one.

Google’s John Mueller has noted on community channels that combining noindex with a canonical tag creates contradictory instructions. The canonical says “this page is a duplicate, index the other one.” Noindex says “don’t index this page at all.” How Google resolves the conflict isn’t consistent: in some cases the canonical takes priority, in others the noindex wins, and the actual behavior shifts based on signal strength elsewhere on the page. The cleaner approach is to pick one method based on the actual goal rather than relying on Google to interpret a contradiction predictably.
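One way to catch this contradiction before it ships is to scan a page's markup for both signals. The sketch below uses Python's standard `html.parser`; the class and function names are made up for illustration, and it only looks at the robots meta tag, not at an `X-Robots-Tag` response header.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collect canonical hrefs and a robots noindex directive from markup."""

    def __init__(self) -> None:
        super().__init__()
        self.canonicals = []
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        a = {name: (value or "") for name, value in attrs}
        if tag == "link" and "canonical" in a.get("rel", "").lower().split():
            self.canonicals.append(a.get("href", ""))
        if tag == "meta" and a.get("name", "").lower() == "robots":
            directives = [d.strip() for d in a.get("content", "").lower().split(",")]
            if "noindex" in directives:
                self.noindex = True


def has_conflict(html: str, page_url: str) -> bool:
    """True when the page is noindexed AND canonicalized to another URL."""
    signals = HeadSignals()
    signals.feed(html)
    points_elsewhere = any(c and c != page_url for c in signals.canonicals)
    return signals.noindex and points_elsewhere
```

A page that is noindexed with no canonical, or canonicalized with no noindex, is not flagged; only the combination is.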


Self, cross-page, cross-domain — different scopes, same signal:

Canonical tags come in three implementation patterns. Each handles a different duplicate-content situation.

A self-referential canonical points a page at itself.

<!-- On https://example.com/blog/canonical-guide -->
<link rel="canonical" href="https://example.com/blog/canonical-guide">

This looks redundant but isn’t. Self-referential canonicals tell search engines explicitly which version of a URL is preferred when the same content might be accessed through tracking parameters, anchor fragments, or capitalization variants. Mueller has emphasized in Google Webmaster Hangouts that self-canonicals are a best practice for primary pages. They preempt parameter-based duplicates by declaring the clean URL as canonical before duplicates ever appear.

A cross-page canonical points one page at a different page on the same site.

<!-- On https://example.com/products/running-shoe?color=blue -->
<link rel="canonical" href="https://example.com/products/running-shoe">

This is the e-commerce variant pattern. The blue-variant URL declares that the main product URL is the canonical version. The variant remains accessible for users who want to share a link to the blue version specifically, but search engines consolidate signals to the main product.

A cross-domain canonical points one page at a page on a different domain.

<!-- On https://syndicated-site.com/article-republished -->
<link rel="canonical" href="https://original-publisher.com/article">

This is the content syndication pattern. When an article gets republished elsewhere, the syndicated version’s canonical tag points back to the original. Search engines understand the original publisher should rank, while the syndicated copy remains available for readers who land on it.
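The three patterns differ only in how the declaring URL relates to the target, which a few lines of Python can make concrete. This is a simplified sketch: the function name is invented here, and it treats any hostname difference (including www versus non-www) as cross-domain, which a real audit would likely refine.

```python
from urllib.parse import urlsplit


def canonical_scope(page_url: str, canonical_href: str) -> str:
    """Classify a canonical declaration by where it points."""
    page, target = urlsplit(page_url), urlsplit(canonical_href)
    if page.hostname != target.hostname:
        return "cross-domain"
    # Fragments are ignored; path and query identify the page.
    same_page = (page.path, page.query) == (target.path, target.query)
    return "self-referential" if same_page else "cross-page"


examples = [
    ("https://example.com/blog/canonical-guide",
     "https://example.com/blog/canonical-guide"),
    ("https://example.com/products/running-shoe?color=blue",
     "https://example.com/products/running-shoe"),
    ("https://syndicated-site.com/article-republished",
     "https://original-publisher.com/article"),
]
scopes = [canonical_scope(page, href) for page, href in examples]
```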

Hreflang is a separate signal for language and region variants — it works alongside canonical, not instead of it. A page’s canonical should match its language and region, not point across languages. Google’s documentation is explicit about this.


Canonical doesn’t block, doesn’t redirect, doesn’t penalize:

Misconceptions about what canonical tags do create most implementation mistakes. The tag does one thing: it tells search engines which URL should represent a piece of content for indexing and ranking purposes. It doesn’t do anything else.

Canonical tags don’t block a page from being accessed. The URL remains live. Users following links to the canonicalized URL still load that page. The only effect is on how search engines treat the URL.

Canonical tags don’t redirect users. Unlike a 301, a canonical leaves the user on whatever URL they requested. If someone visits the duplicate URL, they see the duplicate URL in their browser. The canonical is invisible to users.

Canonical tags don’t trigger penalties. The widespread idea that duplicate content earns a Google penalty is a misconception that the Google documentation has tried to correct for years. Duplicate content creates inefficiency — signals split, the wrong URL might rank, crawl budget gets wasted. Those are real problems. They aren’t penalties.

Canonical tags don’t guarantee Google will respect the declaration. As noted earlier, the signal is a hint Google weighs against other inputs. If the page Google crawls doesn’t substantially match the canonical target, Google may pick a different URL as canonical regardless of the declaration.

This is why it matters to treat the canonical tag as a plain technical instruction rather than a magic SEO fix. The tag works when it describes the actual structure of duplicates on the site. It fails when used as a shortcut to consolidate authority across pages that aren’t really duplicates.


Google’s choice can override your declaration:

When a site declares a canonical and Google picks a different URL as canonical, the discrepancy shows up in Google Search Console. The URL Inspection tool reports two values: the canonical URL the site declared, and the canonical URL Google actually selected. When these disagree, something is off.
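The same comparison can be done programmatically. The sketch below assumes a response shaped like the one returned by the Search Console URL Inspection API, with `userCanonical` (declared) and `googleCanonical` (selected) under `indexStatusResult`. Those field names are stated from memory and should be checked against the current API reference; the function name is hypothetical.

```python
from typing import Optional, Tuple


def canonical_mismatch(inspection: dict) -> Optional[Tuple[str, str]]:
    """Return (declared, selected) when the two canonicals disagree."""
    status = inspection.get("inspectionResult", {}).get("indexStatusResult", {})
    declared = status.get("userCanonical")    # assumed field name
    selected = status.get("googleCanonical")  # assumed field name
    if declared and selected and declared != selected:
        return declared, selected
    return None
```

A `None` result means either the values agree or one of them is missing from the response, so a missing field should be checked separately.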

Google overrides a declared canonical for a handful of common reasons. Content mismatch: the page declaring the canonical doesn’t share enough content with the canonical target for Google to treat them as equivalent. Conflicting signals: if the canonical target is itself canonicalized to a third URL, or if internal links point primarily at a different URL, Google may follow the stronger signal pattern. Indexing conflicts: the canonical target has a noindex tag or returns a 404, leaving Google with no way to use it. Technical errors: the canonical tag is malformed, placed in the body instead of the head, or pointing at a relative path that doesn’t resolve correctly.

The practical workflow when Google’s canonical doesn’t match the declared canonical follows four steps. Check the canonical target for noindex or non-200 status codes. Confirm the tag is in the head section and uses an absolute URL. Audit internal links to verify they point at the canonical target consistently. Compare the content of the duplicate against the canonical to confirm substantial overlap. If everything checks out and Google still picks a different canonical, the override usually reflects Google’s judgment that another URL better represents the content.
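The four checks can be written down as a small audit routine. The sketch below works on data that has already been fetched, so it makes no network calls; the dataclass fields, the function name, and the 0.8 overlap threshold are illustrative assumptions. Character-level similarity from `difflib` is a crude stand-in for however Google actually judges content overlap.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List
from urllib.parse import urlsplit

MIN_OVERLAP = 0.8  # arbitrary threshold chosen for illustration


@dataclass
class CanonicalAudit:
    declared_canonical: str           # href found on the duplicate page
    tag_in_head: bool                 # link element sits inside <head>
    target_status: int                # HTTP status of the canonical target
    target_noindex: bool              # target carries a noindex directive
    internal_link_targets: List[str]  # internal hrefs pointing at this content
    page_text: str                    # visible text of the duplicate
    target_text: str                  # visible text of the canonical target


def audit(a: CanonicalAudit) -> List[str]:
    issues = []
    # Step 1: the target must be indexable and return a 200.
    if a.target_status != 200:
        issues.append(f"target returns {a.target_status}")
    if a.target_noindex:
        issues.append("target is noindexed")
    # Step 2: the tag must be in the head and use an absolute URL.
    if not a.tag_in_head:
        issues.append("tag is outside <head>")
    parts = urlsplit(a.declared_canonical)
    if not (parts.scheme and parts.netloc):
        issues.append("canonical is not an absolute URL")
    # Step 3: internal links should agree with the declaration.
    if any(href != a.declared_canonical for href in a.internal_link_targets):
        issues.append("internal links point at other URLs")
    # Step 4: duplicate and target should substantially overlap.
    if SequenceMatcher(None, a.page_text, a.target_text).ratio() < MIN_OVERLAP:
        issues.append("content overlap is low")
    return issues
```

An empty list corresponds to the "everything checks out" case, where a remaining override reflects Google's own judgment.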


Duplicates exist, but each version has reason to stay:

The clearest test for whether a canonical tag is the right tool is this: does each duplicate URL have a reason to remain accessible? If yes, canonical. If no, redirect.

The classic use cases for canonical tags are all situations where multiple URLs need to keep working. Product variants in e-commerce: the same product available in different colors, sizes, or configurations through URL parameters. Tracking and session parameters appended to URLs for analytics: the underlying page is the same, but the parameters need to flow through. Print versions or AMP versions of articles: the content is the same but the URL serves a different rendering need. Sort orders on category pages: the underlying inventory is the same, but the URL reflects how the user is browsing it. (Pagination is a different case. Page 2 and beyond list different items, so each paginated page keeps its own self-referential canonical.) Content syndication where an article is republished on partner sites: the syndicated copies should keep working for readers who land on them.

In each case, the duplicate URLs serve a purpose that justifies keeping them accessible. The canonical tag consolidates ranking signals without breaking those access patterns.

There’s a related case for sites that anticipate parameter-based duplicates before they appear: self-referential canonicals on primary pages. By declaring a canonical that points at the clean version of a URL, the page preempts future parameter variants. When a tracking link appends ?utm_source=newsletter later, the self-canonical already tells Google the clean URL is preferred.


If only one version should exist, use a redirect instead:

Canonical tags solve a specific problem. Several other problems look similar but need different tools.

| Situation | Right tool | Wrong tool (canonical) |
| --- | --- | --- |
| Old URL that should permanently send users to a new one | 301 redirect | Canonical leaves the old URL accessible, which isn't the goal |
| Page that shouldn't appear in search at all | noindex meta tag | Canonical to another page keeps the page indexable in principle |
| Pages with substantially different content that someone wants to consolidate for SEO | Editorial restructuring (merge pages or differentiate them) | Canonical from non-duplicates is disregarded by Google |
| Category page with a featured article | Self-referential canonical (or no canonical) | Canonical from category to article makes the category disappear from search |
| Paginated component pages | Self-referential canonical on each page | Canonical from page 2+ back to page 1 hides component content |
| URLs blocked from indexing for privacy or staging | Robots.txt or authentication | Canonical doesn't prevent crawling or access |
| Language variants of the same page | hreflang with self-referential canonicals on each variant | Canonical across languages collapses the variants improperly |

The pagination case is worth flagging specifically. Google’s documentation calls out canonicalizing component pages (page 2, page 3) back to page 1 as a common mistake. Pages 2 and beyond aren’t duplicates of page 1 — they contain different items, different content. Canonicalizing them to page 1 tells Google the later pages don’t have unique content worth indexing, and the items on those pages drop out of the index entirely.
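A minimal sketch of the safer pattern: each paginated URL declares itself, keeping the parameter that changes which items appear and dropping those that do not. The parameter name `page` and the choice to treat `?page=1` as the bare category URL are assumptions for this example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed: only "page" changes which items are listed.
CONTENT_PARAMS = {"page"}


def listing_canonical(url: str) -> str:
    """Self-referential canonical for a category listing URL."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k in CONTENT_PARAMS]
    # Treat ?page=1 as the same page as the bare category URL.
    kept = [(k, v) for k, v in kept if not (k == "page" and v == "1")]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

With these assumptions, `https://example.com/shoes?page=2&sort=price` canonicalizes to `https://example.com/shoes?page=2`, not to page 1, so the items listed on page 2 stay eligible for indexing.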

The category-to-article case is the inverse mistake. Google’s Search Central blog uses an example of a pastry category page that adds a canonical pointing to a “red velvet cupcakes” featured article inside the category. The result: the category page disappears from search results because Google now treats the article as the preferred version. The pastry page was supposed to be discoverable on its own. The canonical broke that.

When in doubt, ask two questions. Do both URLs need to rank on their own? If so, they aren’t duplicates, and neither should canonicalize to the other. Does the non-ranking URL still need to work for users? If it doesn’t, the cleaner answer is usually a redirect or a structural change rather than a canonical.
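The main branches of that decision can be written as a short function. This is a deliberately coarse sketch with hypothetical parameter names; it leaves out cases such as language variants and private or staging URLs.

```python
def choose_tool(appears_in_search: bool,
                near_duplicate: bool,
                both_urls_stay_live: bool) -> str:
    """Pick a mechanism for a URL relative to a preferred URL."""
    if not appears_in_search:
        return "noindex meta tag"
    if not near_duplicate:
        # Different content: each page declares itself.
        return "self-referential canonical"
    if both_urls_stay_live:
        return "canonical tag"
    return "301 redirect"
```

For example, a color variant that must stay shareable gives `choose_tool(True, True, True)`, which returns `"canonical tag"`, while a retired product URL gives `choose_tool(True, True, False)`, which returns `"301 redirect"`.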


Two tag locations, two reinforcing signals:

The HTML <link rel="canonical"> element is the most common place a canonical tag lives, but it isn’t the only place. Canonical tags themselves can sit in two locations. Two additional signals reinforce them without being canonical tags in the strict sense.

  1. HTML <head> element (canonical tag location). The standard implementation. <link rel="canonical" href="https://example.com/page"> placed in the <head> section of the HTML. Should use an absolute URL with full protocol; relative paths are easy to get wrong. Must be in the head: Google explicitly disregards canonical tags placed in the body.
  2. HTTP response header (canonical tag location). For non-HTML resources like PDFs or images, the canonical signal travels in the HTTP header: Link: <https://example.com/page>; rel="canonical". This is the only way to attach a canonical tag to a PDF, since PDFs don’t have an HTML head to host a link element.
  3. XML sitemap (canonicalization signal, not a canonical tag). Including a URL in a sitemap is a weak signal that the listed URL should be canonical. Google’s documentation notes that sitemap-based canonicalization is less reliable than the link element or HTTP header, but the signal compounds with the others.
  4. Internal linking consistency (canonicalization signal, not a canonical tag). Consistently linking to the preferred URL throughout the site reinforces the canonical choice. Google’s documentation explicitly recommends this. If a site’s internal links all point at /products/running-shoe while the declared canonical points at /products/running-shoe?variant=default, the link pattern undermines the declaration.

The most reliable setup uses the link element plus internal linking consistency, with the sitemap reinforcing the choice. For PDFs and other non-HTML resources, the HTTP header is the only option.
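For the header route, the value follows the `Link` header syntax shown above. The sketch below builds that header and reads it back; the function names are invented here, and the parser is a simplification that handles common header shapes rather than the full Web Linking grammar.

```python
import re
from typing import Optional, Tuple

LINK_RE = re.compile(r'<([^>]*)>([^<]*)')  # target, then its parameters
REL_RE = re.compile(r'rel\s*=\s*"?([^";,]*)"?', re.IGNORECASE)


def canonical_link_header(canonical_url: str) -> Tuple[str, str]:
    """Header name and value to send with a non-HTML response such as a PDF."""
    return ("Link", f'<{canonical_url}>; rel="canonical"')


def parse_canonical_header(value: str) -> Optional[str]:
    """Return the canonical target from a Link header value, if any."""
    for target, params in LINK_RE.findall(value):
        rel = REL_RE.search(params)
        if rel and "canonical" in rel.group(1).lower().split():
            return target
    return None
```

How the header gets attached depends on the server or framework in use; the tuple above is only the name and value pair.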

A few implementation rules worth holding to. Use absolute URLs, not relative paths — https://example.com/page rather than /page. Place the link element only once per page; multiple canonical declarations on the same page confuse the signal. Point the canonical at a live URL returning a 200 status code — canonicals pointing at 404s, 301 chains, or noindex pages get ignored. Match the canonical target’s language and region to the page declaring it. And check the canonical periodically, especially after CMS changes, redesigns, or migrations, since canonical errors are easy to introduce and hard to spot.
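Several of these rules can be checked mechanically against a page's markup. The sketch below, with invented names, flags pages that declare zero or multiple canonicals, place the tag outside the head, or use a non-absolute URL. It relies on explicit `<head>` tags in the markup, so it is stricter and simpler than a browser's parsing.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urlsplit


class CanonicalFinder(HTMLParser):
    """Record each canonical link element and whether it sat inside <head>."""

    def __init__(self) -> None:
        super().__init__()
        self.in_head = False
        self.found = []  # list of (href, was_inside_head)

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link":
            a = dict(attrs)
            if "canonical" in (a.get("rel") or "").lower().split():
                self.found.append((a.get("href") or "", self.in_head))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def check_canonical_markup(html: str) -> List[str]:
    finder = CanonicalFinder()
    finder.feed(html)
    problems = []
    if len(finder.found) != 1:
        problems.append(f"expected 1 canonical, found {len(finder.found)}")
    for href, in_head in finder.found:
        if not in_head:
            problems.append(f"{href!r} is outside <head>")
        parts = urlsplit(href)
        if not (parts.scheme in ("http", "https") and parts.netloc):
            problems.append(f"{href!r} is not an absolute URL")
    return problems
```

Running a check like this after CMS changes or migrations is one way to do the periodic review; it does not verify that the target returns a 200 or matches the page's language.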


Canonical is editorial discipline that lives in HTML:

The canonical tag looks like a technical instruction, but it’s an editorial judgment encoded into HTML. The judgment is “this URL is the version of this page that matters.” Everything else about canonicalization follows from that.

The work splits into three layers. Recognize where genuine duplicates exist on the site. Decide which version should represent the content. Declare that choice consistently across every variant.

Most canonicalization mistakes come from skipping one of those layers. Declaring canonicals on pages that aren’t really duplicates. Declaring different canonicals through different mechanisms that contradict each other. Pointing canonicals at URLs that don’t exist.

What makes the tag work isn’t the syntax. The syntax is the easy part: one line of HTML in the head. What makes it work is whether the declaration matches what the site actually is. Google’s canonicalization decision was never going to follow a signal that contradicted the structure underneath. The tag tells search engines what the site already knows about itself. When the site doesn’t know which URL matters, no canonical tag is going to figure it out for it.