How to add structured data to a webpage step by step

Structured data is a second language the page speaks to machines:

Structured data is a standardized format for annotating the meaning of content on a webpage. The page already says what it says in HTML — paragraphs, headings, images, links. Structured data adds a second layer underneath: labeled facts about what those elements actually represent. A page about a chocolate cake recipe contains text describing ingredients and steps. The structured data on that page says, in machine-readable form, “this is a Recipe. The name is Chocolate Cake. The prep time is 20 minutes. The author is Maria Santos.” Same facts, different format. The HTML is for the reader. The structured data is for the search engine, the voice assistant, and any other system that wants to extract meaning programmatically.

The vocabulary that defines what kinds of things get labeled and how comes from schema.org, a collaborative project launched in 2011 by Google, Microsoft, and Yahoo, with Yandex joining later that year. It now defines hundreds of types, including Recipe, Article, LocalBusiness, Product, Event, FAQPage, and many more. Schema.org is the dictionary. Structured data is the act of using that dictionary on a page.

This isn’t optional metadata in the SEO sense — it’s the foundation of how modern search features work. Rich results, knowledge panels, AI-generated overviews, voice answers, and many specialized search experiences depend on structured data being present and correct on the source page.


Search engines guess; structured data tells:

Without structured data, search engines have to infer what a page is about by reading the content and applying machine learning to patterns. That inference is sometimes wrong. A page about a software product gets misclassified as a news article. A recipe page gets indexed without the prep time, cook time, or rating. A local business page lands in search without the hours of operation surfaced.

Structured data closes the gap between what the page says and what the search engine confidently knows. When the page declares "@type": "Recipe" with explicit prep time, cook time, ingredient list, and nutrition facts, Google doesn’t have to guess what kind of page it is. The page told it. The information becomes directly usable for rich results, recipe carousels, and voice answers.

The practical consequence shows up in two places. Eligibility for rich results is the first. Pages without correct structured data won’t appear as recipe cards, product cards, FAQ accordions, or breadcrumb trails in search results, regardless of how well the page itself is written. The second consequence is AI extraction quality. Generative search features pull structured data when it’s available, and the labeled facts make answer generation faster and more accurate than parsing prose alone.

Note that structured data isn’t a direct ranking factor. A page doesn’t rank higher just for having JSON-LD. What it gets is eligibility for richer search appearances, which frequently improves click-through rates on the same ranking position.


Three ways to say it, one Google prefers:

Structured data is implemented in three formats. All three encode the same schema.org vocabulary. The difference is how the markup is embedded in the page.

| Format | What it looks like | Where it goes | Status |
|---|---|---|---|
| JSON-LD | JavaScript object inside a <script> block | <head> or <body>, separate from content | Google's recommended format |
| Microdata | HTML attributes (itemscope, itemtype, itemprop) on existing elements | Inline with the HTML content | Supported, harder to maintain |
| RDFa | HTML attributes (vocab, typeof, property) on existing elements | Inline with the HTML content | Supported, less common |

Google’s documentation states all three formats are equally valid as long as the markup is correctly implemented. The difference is operational. JSON-LD lives in a separate block that doesn’t touch the visible HTML. Microdata and RDFa interleave with the content, which means every template change risks breaking the markup.

For most sites, JSON-LD is the right choice. It can be added without modifying any existing HTML, generated dynamically by CMS plugins, server-side templates, or JavaScript, and moved between pages without rewriting the underlying HTML structure. The trade-off is that JSON-LD duplicates content: the same facts exist in both the visible HTML and the structured data block. That duplication is the cost of separation, and the maintenance benefit outweighs it in nearly every real implementation.

There’s also a legacy vocabulary worth knowing about, even though it’s deprecated. Data-vocabulary.org was an older Google-backed vocabulary that predated schema.org. Google announced the end of rich result support for data-vocabulary markup in 2020, and the cutoff took effect in early 2021. Any site still running it should migrate to schema.org. The Rich Results Test will flag pages relying on data-vocabulary as no longer eligible.


Pick the schema type that matches what the page actually is:

Before writing any markup, the first decision is which schema.org type fits the page. Schema.org’s catalog includes hundreds of types, but most pages fit into a small set of common ones.

The matching rule is specificity. Google’s documentation says to use “the most specific applicable type” — not the parent type. A Restaurant is a kind of LocalBusiness, which is a kind of Organization, which is a kind of Thing. A restaurant page should declare "@type": "Restaurant", not "@type": "LocalBusiness" or "@type": "Organization". The more specific the type, the more relevant properties become available, and the better Google understands what the page describes.
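The specificity rule can be pictured as walking up a type hierarchy. The sketch below is only an illustration: the PARENT table is a hand-copied slice of schema.org simplified to one parent per type, and the helper names ancestors and most_specific are hypothetical, not part of any library.

```python
# Hand-copied slice of the schema.org hierarchy, simplified to one parent per
# type (on schema.org, LocalBusiness also inherits from Place).
PARENT = {
    "Restaurant": "FoodEstablishment",
    "FoodEstablishment": "LocalBusiness",
    "LocalBusiness": "Organization",
    "Organization": "Thing",
}


def ancestors(schema_type: str) -> list:
    """Return the chain of parent types, nearest parent first."""
    chain = []
    while schema_type in PARENT:
        schema_type = PARENT[schema_type]
        chain.append(schema_type)
    return chain


def most_specific(candidates: list) -> str:
    """Pick the candidate sitting deepest in the hierarchy.

    Assumes all candidates lie on the same parent chain.
    """
    return max(candidates, key=lambda t: len(ancestors(t)))


print(ancestors("Restaurant"))
print(most_specific(["Organization", "LocalBusiness", "Restaurant"]))
```

Every type in the ancestor chain technically applies to a restaurant page; the rule is to declare the one at the bottom of the chain.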

The most common types and their typical use cases follow a recognizable pattern. Article, NewsArticle, and BlogPosting cover written content. Recipe covers cooking instructions. Product covers e-commerce listings. LocalBusiness (and its many sub-types like Restaurant, MedicalClinic, Hotel) covers physical businesses. Event covers concerts, conferences, and other scheduled occurrences. FAQPage covers question-and-answer formatted content. BreadcrumbList covers navigation hierarchy. Organization covers the publisher or business behind the site. Person covers individual authors or notable people.

The mistake to avoid is forcing a schema type onto a page that doesn’t fit. Adding Recipe markup to a page that lists “tips for baking” but doesn’t have actual recipe content is misleading and violates Google’s structured data guidelines. The schema is a contract about what the page contains. Misrepresenting that contract risks a manual action from Google, which suppresses all rich results from the affected pages.

For pages that don’t fit any specific type, no structured data is the right answer. Not every page benefits from markup. A generic about-us page or a contact page has nothing schema-worthy beyond the site’s Organization markup, which lives elsewhere.


The four steps that take structured data from idea to indexed:

The setup decisions above — knowing what structured data is, why it matters, which format to use, and which schema type fits the page — set the stage. The actual implementation follows four steps. Write the JSON-LD block. Add it to the page. Validate before publishing. Monitor in Search Console after deployment. Each step has its own failure modes and its own tools.


Step 1: write the JSON-LD block:

Once the schema type is chosen, the next step is writing the actual JSON-LD. Google’s Recipe documentation lists name and image as the required properties for rich result eligibility; recipeIngredient and recipeInstructions are recommended, and a recipe block without them has little to offer. The example below includes all four, plus several other recommended properties that improve the rich result quality.

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Recipe",
  "name": "Chocolate Cake",
  "image": [
    "https://example.com/photos/chocolate-cake-1x1.jpg",
    "https://example.com/photos/chocolate-cake-4x3.jpg",
    "https://example.com/photos/chocolate-cake-16x9.jpg"
  ],
  "author": {
    "@type": "Person",
    "name": "Maria Santos"
  },
  "datePublished": "2026-03-15",
  "description": "A rich, moist chocolate cake recipe.",
  "prepTime": "PT20M",
  "cookTime": "PT35M",
  "totalTime": "PT55M",
  "recipeYield": "12 servings",
  "recipeIngredient": [
    "2 cups flour",
    "1 cup cocoa powder",
    "2 cups sugar"
  ],
  "recipeInstructions": [
    {
      "@type": "HowToStep",
      "text": "Preheat oven to 350°F (175°C)."
    },
    {
      "@type": "HowToStep",
      "text": "Combine dry ingredients in a large bowl."
    },
    {
      "@type": "HowToStep",
      "text": "Pour batter into pan and bake for 35 minutes."
    }
  ]
}
</script>

Three elements anchor every JSON-LD block. The @context is always https://schema.org for schema.org markup. The @type declares which schema type this block represents. The properties that follow depend on the type: Recipe has prepTime, cookTime, recipeIngredient, recipeInstructions; Article has headline, datePublished, author; Product has name, sku, and offers (which carries price and availability).

A few formatting rules matter for the block to validate. Use valid JSON syntax — double-quoted property names, proper comma placement, no trailing commas. Use ISO 8601 format for dates (2026-03-15) and durations (PT20M means 20 minutes; PT55M means 55 minutes). For nested objects, use the same structure — each instruction step becomes its own HowToStep object with a text property, not a plain string.
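When durations are stored as plain minutes in a CMS or database, they need converting to the ISO 8601 form before they go into the block. A minimal sketch, assuming whole minutes as input; the function name minutes_to_iso8601 is hypothetical:

```python
def minutes_to_iso8601(total_minutes: int) -> str:
    """Format a whole number of minutes as an ISO 8601 duration such as PT20M."""
    if total_minutes < 0:
        raise ValueError("duration cannot be negative")
    hours, minutes = divmod(total_minutes, 60)
    result = "PT"
    if hours:
        result += f"{hours}H"
    # Emit the minutes part when it is non-zero, or when there are no hours
    # at all (so that zero minutes becomes PT0M rather than a bare PT).
    if minutes or not hours:
        result += f"{minutes}M"
    return result


print(minutes_to_iso8601(20))  # PT20M
print(minutes_to_iso8601(55))  # PT55M
print(minutes_to_iso8601(75))  # PT1H15M
```

PT75M and PT1H15M describe the same duration; the sketch simply normalizes to hours and minutes.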

The information inside the block must match what’s actually visible on the page. Google’s structured data guidelines are explicit: don’t mark up content that isn’t visible to readers. A Recipe block claiming “prep time: 20 minutes” while the page text says “this takes about an hour” creates a mismatch that can trigger a manual action or get the markup ignored entirely. The structured data describes the page. It doesn’t invent a different page.


Step 2: add the block to the page:

JSON-LD goes inside a <script> tag with the type attribute set to application/ld+json. The conventional location is the <head> of the HTML document, but Google’s documentation confirms placement in the <body> works equally well. What matters is that the script tag is present in the HTML the server returns when Google crawls the page.

The most common placement pattern looks like this:

<!DOCTYPE html>
<html>
<head>
  <title>Chocolate Cake Recipe</title>
  <script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "Recipe",
    "name": "Chocolate Cake"
  }
  </script>
</head>
<body>
  <!-- visible page content -->
</body>
</html>

For CMS-driven sites, the implementation doesn’t involve hand-editing HTML. WordPress, Shopify, Wix, and similar platforms generate JSON-LD automatically through built-in features, apps, or plugins (on WordPress: Yoast SEO, Rank Math, Schema Pro). The plugin reads the post type, title, author, date, and other fields, then writes the corresponding JSON-LD to the page. The maintenance benefit is real: a content writer updating a recipe in WordPress doesn’t need to update the JSON-LD separately, because the plugin regenerates it on save.

For sites using JavaScript frameworks like React or Vue, structured data renders two ways. Server-side: the JSON-LD ships with the initial HTML. Client-side: JavaScript injects the script tag after the page loads. Server-side rendering is more reliable. Google crawls client-side JSON-LD too, but the timing depends on Google’s rendering queue, and complications appear more often than with static HTML.
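On the server-side path, the block is typically built from the same data that renders the visible page and serialized into the template. The sketch below shows one way this could look in Python; render_jsonld_script is a hypothetical helper, not the API of any framework, and the recipe dictionary is sample data.

```python
import json


def render_jsonld_script(data: dict) -> str:
    """Serialize a dict as a JSON-LD script tag for inclusion in server HTML."""
    payload = json.dumps(data, ensure_ascii=False, indent=2)
    # Escape "<" so a value containing "</script>" cannot close the tag early.
    # "\u003c" is a standard JSON escape, so parsers still read it as "<".
    payload = payload.replace("<", "\\u003c")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


recipe = {
    "@context": "https://schema.org",
    "@type": "Recipe",
    "name": "Chocolate Cake",
    "prepTime": "PT20M",
}
html_snippet = render_jsonld_script(recipe)
print(html_snippet)
```

Because the output is a plain string, it ships in the initial HTML response and does not depend on any rendering queue.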

One implementation note worth holding to: Google Tag Manager can inject structured data, but the practice is risky. Tag Manager runs in JavaScript, which means the structured data lives in a separate code path from the visible page content. Drift between the two is easy to introduce, and the duplicate-source problem becomes a maintenance trap. Direct HTML or template-level injection is cleaner.


Step 3: validate before publishing:

Validation should happen before the page goes live and after any change that might affect the markup. Two tools cover most of the work.

The Rich Results Test is Google’s primary validation tool. Paste a URL or a chunk of HTML into the tool, and it reports four things: whether Google detects valid structured data, which schema types it found, which rich result types the page is eligible for, and any errors or warnings in the markup. Errors block rich result eligibility. Warnings don’t block eligibility but flag recommended properties that are missing. The tool is essential for catching typos, malformed JSON, missing required properties, and type mismatches before publishing.

The Schema Markup Validator (validator.schema.org) is a secondary tool that validates against the full schema.org specification rather than just Google’s supported subset. Google’s Rich Results Test only validates the properties Google uses for its rich results. The Schema Markup Validator catches issues with properties Google ignores but other consumers (Bing, voice assistants, alternative search engines) might use. For sites that care about structured data beyond Google, both tools are worth running.

Manual inspection is the third check. Open the page in a browser, view the page source, and confirm the JSON-LD block is present in the HTML the server returns. If the JSON-LD only appears after JavaScript runs (in client-side rendered apps), view-source won’t show it; test the live URL in the Rich Results Test instead of pasting code, because the URL input fetches and renders the page much as Googlebot would.
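The view-source check can also be scripted. The following is a rough sketch using only Python's standard library: it scans raw server HTML (no JavaScript is executed) for JSON-LD script tags and tries to parse each one. The names JsonLdExtractor and extract_jsonld are hypothetical, and the sample HTML is made up for illustration.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []  # successfully parsed JSON-LD values
        self.errors = []  # parser messages for blocks that are not valid JSON

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError as exc:
                self.errors.append(str(exc))


def extract_jsonld(html: str):
    """Return (parsed_blocks, error_messages) for the given HTML source."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks, parser.errors


sample_html = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Recipe", "name": "Chocolate Cake"}
</script>
</head><body></body></html>"""

blocks, errors = extract_jsonld(sample_html)
print(blocks, errors)
```

An empty result for a page that shows rich markup in the browser's element inspector is a hint that the JSON-LD is being injected client-side.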

The most common errors caught by these tools: missing required properties; wrong data types (a string where a date is expected); malformed JSON syntax; type names that don’t exist in schema.org. Each error has a specific fix, and the tools show the exact line and property where the problem occurs.
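A missing-property check is simple enough to run locally before reaching for the official tools. In the sketch below, the REQUIRED table is an illustrative placeholder rather than an authoritative list (the real requirements live in Google's documentation for each type and change over time), and missing_required is a hypothetical helper name.

```python
# Illustrative placeholder lists only; confirm each type's requirements
# against the current Google Search Central documentation.
REQUIRED = {
    "Recipe": ("name", "image"),
    "BreadcrumbList": ("itemListElement",),
}


def missing_required(block: dict, required: dict = REQUIRED) -> list:
    """Return the required properties that are absent or empty in a block."""
    schema_type = block.get("@type")
    if not isinstance(schema_type, str):
        return []  # multi-typed or untyped blocks are out of scope here
    expected = required.get(schema_type, ())
    return [prop for prop in expected if block.get(prop) in (None, "", [], {})]


draft = {"@context": "https://schema.org", "@type": "Recipe", "name": "Chocolate Cake"}
print(missing_required(draft))
```

This only catches absent or empty values. Wrong data types and unknown type names still need the Rich Results Test or the schema.org validator.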

A successful validation looks like this in the Rich Results Test. The tool reports “Page is eligible for rich results” with a green checkmark. It lists the schema types detected — for the Recipe example above, it shows “Recipe” as the eligible rich result type. Below that, the tool displays detected properties grouped by required, recommended, and optional. A page passing validation shows zero items under “Errors” and may show warnings under “Suggestions” for missing recommended properties like aggregateRating or nutrition. Eligibility doesn’t guarantee the rich result appears in actual search results — Google decides display per-query — but passing validation is the prerequisite.


Step 4: monitor what Google actually sees in Search Console:

After deployment, Google Search Console reports on structured data in two places worth checking regularly.

The Enhancements section of Search Console lists each rich result type Google has detected on the site — Recipes, FAQs, Products, Articles, Breadcrumbs, and others. Each rich result report shows the total number of valid items, the number with errors, the number with warnings, and the trend over time. A sudden drop in valid items signals a template change broke the structured data. A spike in errors after a deployment means the new code introduced bad markup that needs investigation.

The URL Inspection tool drills into a specific page. It shows whether Google has crawled the page, which structured data Google detected on the latest crawl, and any errors found during processing. For diagnosing why a specific page isn’t appearing as a rich result, URL Inspection is the first stop.

The two reports answer different questions. The Enhancements report answers “how is structured data performing across the site?” The URL Inspection answers “why is this specific page not showing up as expected?” Together they give a complete picture.

One caveat about Search Console reporting. The reports reflect what Google has crawled and processed, not what’s actually on the page right now. After a fix, the report won’t update until Google recrawls the affected URLs, which takes days to weeks depending on the page’s priority. For urgent fixes, requesting recrawl through URL Inspection accelerates this.


Common schema types and what each is for:

The schema.org catalog is large, but most sites only need a handful of types. The table below covers the most common cases.

| Schema type | Use case | Common properties |
|---|---|---|
| Article / NewsArticle / BlogPosting | Written content, blog posts, news | headline, image, datePublished, author |
| Recipe | Cooking instructions | name, image, recipeIngredient, recipeInstructions |
| Product | E-commerce product listing | name, image, offers (with price, priceCurrency, availability) |
| LocalBusiness (and sub-types) | Physical business location | name, address (required); telephone, openingHours, geo, priceRange (recommended) |
| Event | Scheduled occurrence (concert, conference) | name, startDate, location |
| FAQPage | Q&A formatted content | mainEntity (list of Question / Answer pairs) |
| HowTo | Step-by-step instructional content | name, step (list of HowToStep) |
| BreadcrumbList | Site navigation hierarchy | itemListElement (list of ListItem) |
| Organization | Publisher or company behind the site | No required properties; name, url, logo, address commonly used |
| Person | Individual author or notable person | name; jobTitle, sameAs, image commonly used |

Each type has its own page in Google’s Search Central documentation that lists required and recommended properties, with examples. The required properties are the minimum a page needs to be eligible for the corresponding rich result. The recommended properties don’t block eligibility but improve the richness of the rich result when Google shows it. Some types — notably Organization and Person — don’t have strict required properties from Google’s perspective; the guidance is to include as many relevant properties as apply to the content. Requirements also vary by sub-type and rich result feature, so checking the specific Google documentation page for each type before implementing is worth the time.

For sites with multiple types on a single page, JSON-LD supports two patterns. Nested structures: Recipe with aggregateRating inside it. Arrays of separate items: Recipe plus BreadcrumbList plus Organization as three independent blocks. Both patterns are valid. Which one to use depends on the relationship between the items — nested when one item is a property of another, separate when they’re independent facts about the same page. Larger implementations sometimes combine multiple related entities into a single "@graph" structure to express relationships across the page more explicitly, but the underlying principles stay the same.
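To make the "@graph" pattern concrete, here is a sketch that assembles three entities into one block and links them by "@id". Everything in it is sample data: the example.com URLs, the "Example Kitchen" name, and the rating figures are hypothetical placeholders, not values from a real site.

```python
import json

SITE = "https://example.com"  # hypothetical site used for illustration

graph = {
    "@context": "https://schema.org",
    "@graph": [
        {
            "@type": "Organization",
            "@id": f"{SITE}/#organization",
            "name": "Example Kitchen",
            "url": SITE,
        },
        {
            "@type": "BreadcrumbList",
            "itemListElement": [
                {"@type": "ListItem", "position": 1, "name": "Recipes",
                 "item": f"{SITE}/recipes"},
                {"@type": "ListItem", "position": 2, "name": "Chocolate Cake",
                 "item": f"{SITE}/recipes/chocolate-cake"},
            ],
        },
        {
            "@type": "Recipe",
            "name": "Chocolate Cake",
            # Reference the Organization node above instead of repeating it.
            "publisher": {"@id": f"{SITE}/#organization"},
            # Nested pattern: the rating is a property of the recipe itself.
            "aggregateRating": {
                "@type": "AggregateRating",
                "ratingValue": "4.8",   # placeholder value
                "ratingCount": "125",   # placeholder value
            },
        },
    ],
}

print(json.dumps(graph, indent=2))
```

The Recipe node shows both patterns at once: aggregateRating is nested because it belongs to the recipe, while the Organization stays a separate node that the recipe points to by identifier.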

BreadcrumbList, for example, has its own implementation specifics that overlap with this one: same JSON-LD format, same <script> tag placement, different properties. Each schema type follows the same structural pattern with different vocabulary.


Eligibility isn’t entitlement:

A common misconception about structured data: adding the correct markup will produce rich results on Google. The reality is more nuanced. Structured data makes a page eligible for rich result display, not guaranteed to receive it.

Google’s documentation states this explicitly. Eligibility is necessary but not sufficient. Whether Google actually shows the rich result depends on additional factors. Content quality matters — Google won’t surface rich results for pages it doesn’t consider high quality, even if the markup is perfect. Query context matters. The same page might appear as a rich result for one query and as a standard blue link for another, depending on what Google thinks the searcher wants. Competition matters — for high-volume queries, only a few rich results appear in the SERP, and Google selects which pages get the slot.

This affects how to think about structured data ROI. A site that adds Recipe markup to a thin recipe page won’t see traffic gains, because Google won’t surface low-quality content as a rich result. A site that adds Recipe markup to a well-developed recipe page with strong engagement signals is far more likely to see the rich result appear.

There’s another wrinkle worth understanding. Rich result formats change. Google adds new rich result types: How-to, FAQ, Course, and others have appeared over the years. Google retires or restricts others: FAQ rich results were significantly reduced in 2023. And Google adjusts the visual treatment of existing ones. A rich result that worked at launch can show up differently or less often over time. Monitoring Search Console’s Enhancements report catches these shifts as they happen.

The mistake is treating structured data as a one-time setup. The right framing is treating it as ongoing infrastructure that needs maintenance — schema updates, validation after template changes, monitoring after Google’s algorithm updates.


Seven structured data anti-patterns:

Most structured data problems on real sites cluster into a recognizable set of patterns.

  1. Marking up content that isn’t on the page. A Recipe block claims “prep time: 20 minutes” but the page text doesn’t mention prep time at all. Google’s structured data guidelines explicitly prohibit this, and Google may ignore the markup or issue a manual action. Fix: structured data has to match visible page content. If the prep time isn’t on the page, either add it to the page or remove the property from the JSON-LD.
  2. Forcing the wrong schema type on a page. A “10 best recipes” listicle gets marked up as a single Recipe instead of an Article that mentions recipes. The page isn’t actually one recipe; it’s a list of recipes. Fix: match the schema type to what the page actually is. A list of recipes might be an ItemList of Recipes, or just an Article. It’s not a single Recipe.
  3. Missing required properties. A Product block whose offer has no price, or a Recipe block without an image. Without required properties, the page isn’t eligible for the corresponding rich result. Fix: check the Google Search Central documentation for each type. Add the missing properties.
  4. Invalid JSON syntax. Trailing commas, unquoted property names, missing closing brackets. The structured data fails to parse and gets ignored entirely. Fix: validate with the Rich Results Test before publishing. The tool catches syntax errors with line numbers.
  5. JSON-LD generated only by client-side JavaScript. The HTML the server returns has no structured data; JavaScript adds it after the page loads. Google sometimes processes this, but reliability is lower than with server-rendered structured data. Fix: render JSON-LD on the server. For SPAs, use server-side rendering, static generation, or a build step that pre-renders the JSON-LD.
  6. Duplicate structured data with conflicting values. One JSON-LD block declares the product price as $29.99; a second block on the same page declares it as $24.99. Google may use either value or ignore both. Fix: consolidate structured data into a single source of truth per page. Audit for duplicate blocks introduced by stacked plugins.
  7. Structured data behind robots.txt. The page declaring the JSON-LD is blocked from crawling. Google can’t fetch the page, doesn’t read the structured data, and rich results never appear. Fix: allow Google to crawl pages that have structured data. The robots.txt vs meta robots distinction matters here: blocking the page from crawling defeats the purpose of having structured data at all.
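The conflicting-duplicates pattern in the list above lends itself to an automated audit. A minimal sketch, assuming the page's JSON-LD blocks have already been parsed into dictionaries and that two blocks describe the same thing when their "@type" and "name" match (a simplification; real pages may need "@id" or URL matching). The function name find_conflicts is hypothetical.

```python
from collections import defaultdict


def find_conflicts(blocks: list, key_props: tuple = ("@type", "name")) -> list:
    """Report properties that differ between blocks describing the same item.

    Returns tuples of (identity, property, first_value, conflicting_value).
    """
    seen = defaultdict(dict)  # identity -> {property: first value seen}
    conflicts = []
    for block in blocks:
        identity = tuple(str(block.get(key)) for key in key_props)
        known = seen[identity]
        for prop, value in block.items():
            if prop in known and known[prop] != value:
                conflicts.append((identity, prop, known[prop], value))
            known.setdefault(prop, value)
    return conflicts


page_blocks = [
    {"@type": "Product", "name": "Cake Pan",
     "offers": {"@type": "Offer", "price": "29.99", "priceCurrency": "USD"}},
    {"@type": "Product", "name": "Cake Pan",
     "offers": {"@type": "Offer", "price": "24.99", "priceCurrency": "USD"}},
]

for identity, prop, first, second in find_conflicts(page_blocks):
    print(identity, prop, first, second)
```

The comparison is at the top-level property, so the report points at "offers" rather than at the nested price. That is usually enough to find which plugin or template emitted the second block.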

An eighth pattern worth flagging: stale structured data after content updates. A blog post’s published date in JSON-LD says 2022; the page itself has been substantially updated and shows a 2026 “last updated” date. Google’s understanding of the page lags behind the content. Fix: keep datePublished and dateModified in sync with what the page actually shows. Most CMS plugins handle this automatically; custom implementations need explicit logic.
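For custom implementations, the "explicit logic" can be as small as a date comparison run at publish time. The sketch below assumes date-only ISO strings in the markup and that the page's visible "last updated" date is available as a Python date; date_problems is a hypothetical helper and the dates are sample values.

```python
from datetime import date


def date_problems(block: dict, visible_last_updated: date) -> list:
    """Compare the dates in a JSON-LD block with the date shown on the page."""
    problems = []
    published = date.fromisoformat(block["datePublished"])
    # Without dateModified, the markup implies the page is unchanged since publication.
    modified = date.fromisoformat(block.get("dateModified", block["datePublished"]))
    if modified < published:
        problems.append("dateModified is earlier than datePublished")
    if modified != visible_last_updated:
        problems.append(
            f"markup says {modified.isoformat()}, "
            f"page shows {visible_last_updated.isoformat()}"
        )
    return problems


stale_post = {"@type": "BlogPosting", "datePublished": "2022-05-10"}
print(date_problems(stale_post, date(2026, 3, 15)))
```

Timestamps with a time or timezone component would need a different parser; this sketch sticks to the date-only form used in the recipe example earlier.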


What structured data is, what it isn’t:

Structured data does specific things. It declares what kind of content a page contains, in a vocabulary search engines understand. It makes pages eligible for rich result display. It feeds AI extraction systems with labeled facts. It travels with the page wherever the URL is referenced.

Structured data does not do other things attributed to it. It isn’t a ranking factor. Adding JSON-LD to a thin page won’t lift its position. It isn’t a substitute for content quality. Google won’t surface a rich result for a page it considers low value, regardless of how perfect the markup is. It isn’t a guarantee of rich result display. Eligibility and appearance are different states. It isn’t visible to users on the page itself. The structured data lives in a script tag that the browser doesn’t render.

The boundary is worth holding to. Structured data is an infrastructure layer that makes good pages eligible for richer presentation. It compounds with content quality, technical performance, and user signals. It doesn’t replace any of them. A site with strong content and clean structured data outperforms a site with strong content alone for rich result eligibility. A site with weak content and elaborate structured data sees little benefit from the markup at all. The structured data extends what the page already is. It doesn’t change what the page is.