v02 - Google Compatibility Schema With No Validation Flags

Dual-Compatibility Dataset Hub Schema ➤ No Validation Flags

This article shows a safe pattern to store custom AI properties for LLMs while keeping schema.org-compliant elements for Google. Use additionalProperty, knowsAbout, subjectOf, or about wrappers and maintain a separate machine JSON endpoint for richer AI-only metadata.

Layer	File	Purpose
Schema.org compliant	Head JSON-LD	Google-safe + visible
Experimental AI layer	`/ai/llm-extensions.json`	Full experimental data
Dataset core	`/ai/` catalog + datasets	Structured, validated
Awareness map	`/ai/llm.json`	Connects everything together

Why do this?

Google expects schema.org properties and will flag unknown properties (warnings only; not penalties).
LLMs and AI crawlers benefit from extra context (topic clusters, internal scores, custom IDs).
The dual pattern keeps the validator happy while preserving the richer graph for AI consumption.

[Figure: DLSF diagram]

Pattern Overview (3 parts)

Primary page JSON-LD — Strict schema.org properties only (what Google sees).
Wrapped custom fields — Put experimental keys into additionalProperty, knowsAbout, or subjectOf so they are valid by schema rules.
AI-only endpoint — Host a separate JSON file (e.g. /ai/llm-extensions.json) that contains full experimental fields for LLM crawlers and is linked from your llm.json or catalog.

Example — BEFORE (the simple custom property that triggers warnings)


The unrecognized "topicCluster" key in this snippet is what triggers schema warnings:
{
  "@type": "WebSite",
  "name": "Real SEO™ Life",
  "topicCluster": "Digital Karma Federation",
  "url": "https://www.realseolife.com/"
}

Example — AFTER (dual-compatible pattern)

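Place this in your page head JSON-LD. It uses only schema.org property names while preserving the experimental information in wrappers: knowsAbout describes the topic cluster as a Thing, and additionalProperty stores the named values as PropertyValue pairs.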


{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "@id": "https://www.realseolife.com/#site",
  "name": "Real SEO™ Life",
  "url": "https://www.realseolife.com/",
  "knowsAbout": [
    {
      "@type": "Thing",
      "name": "Digital Karma Federation",
      "description": "Topic cluster: Digital Karma (used for AI routing)."
    }
  ],
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://www.realseolife.com/",
    "additionalProperty": [
      {
        "@type": "PropertyValue",
        "name": "digitalKarmaScoreVersion",
        "value": "v1.2"
      },
      {
        "@type": "PropertyValue",
        "name": "topicCluster",
        "value": "Digital Karma Federation"
      }
    ]
  }
}

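Why this works: knowsAbout and additionalProperty (with PropertyValue) are defined by schema.org, so a validator does not see unknown keys, yet AI crawlers can still read the descriptive values and term names. One caveat: schema.org defines each property for specific types (knowsAbout for Person and Organization; additionalProperty for types such as Product, Place, and Offer), so a strict validator may still warn when they appear on WebSite or WebPage. If it does, move the wrapper onto a node of a matching type, such as the site's publisher Organization.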
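The wrapping step can also be scripted rather than written by hand. The following is a minimal Python sketch, not part of the original pattern: the helper names wrap_custom_fields and with_additional_properties are hypothetical, and it assumes your custom fields are a flat dictionary of simple values.

```python
import json


def wrap_custom_fields(custom: dict) -> list:
    """Turn {"key": value} pairs into schema.org PropertyValue nodes."""
    return [
        {"@type": "PropertyValue", "name": name, "value": value}
        for name, value in custom.items()
    ]


def with_additional_properties(node: dict, custom: dict) -> dict:
    """Return a copy of node carrying the custom fields under additionalProperty."""
    return {**node, "additionalProperty": wrap_custom_fields(custom)}


if __name__ == "__main__":
    # Node and values taken from the AFTER example above.
    page = {"@type": "WebPage", "@id": "https://www.realseolife.com/"}
    custom = {
        "digitalKarmaScoreVersion": "v1.2",
        "topicCluster": "Digital Karma Federation",
    }
    print(json.dumps(with_additional_properties(page, custom), indent=2))
```

The original node is left untouched, so the same custom dictionary can feed both the page JSON-LD and the AI-only file.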

AI-only endpoint (recommended)

For maximum flexibility, maintain a separate JSON file that contains your richer experimental properties. Link to it from your canonical llm.json or catalog.json.


/ai/llm-extensions.json  (served as application/json)
{
  "@context": "https://schema.org",
  "@id": "https://www.realseolife.com/ai/llm-extensions.json",
  "@type": "CreativeWork",
  "llmExtensions": {
    "topicCluster": "Digital Karma Federation",
    "digitalKarmaScore": {
       "version": "v1.2",
       "weights": {
         "domainRating": 1.5,
         "socialScore": 1.1
       }
    },
    "entityIds": {
       "RealSEOWebsite": "REALSEO001"
    }
  },
  "dateModified": "2025-11-02"
}
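If the extensions file is generated by a build step, a small script keeps its shape and dateModified consistent. This is a sketch under assumptions: the function names and the output path are hypothetical, and the document layout simply mirrors the example above.

```python
import json
from datetime import date


def build_extensions_doc(file_url: str, extensions: dict, modified: date) -> dict:
    """Assemble the AI-only document in the same shape as the example file."""
    return {
        "@context": "https://schema.org",
        "@id": file_url,
        "@type": "CreativeWork",
        "llmExtensions": extensions,
        "dateModified": modified.isoformat(),
    }


def write_extensions_file(path: str, doc: dict) -> None:
    """Write the document as UTF-8 JSON; the output path is up to your deployment."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(doc, fh, indent=2, ensure_ascii=False)
        fh.write("\n")
```

Passing the date in explicitly, rather than calling date.today() inside the function, keeps the output reproducible across builds.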

Then add this small pointer to your main llm.json or homepage JSON-LD so crawlers can find it:


"hasPart": [
  { "@id": "https://www.realseolife.com/ai/llm-extensions.json" },
  { "@id": "https://www.realseolife.com/ai/digital-karma-dataset.json" }
]
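When the pointer is added programmatically, it helps to merge rather than overwrite, so existing hasPart entries survive and repeated runs do not create duplicates. A minimal sketch, assuming hasPart entries are objects with an "@id" key as in the fragment above; add_has_part is a hypothetical helper name.

```python
def add_has_part(doc: dict, urls: list) -> dict:
    """Return a copy of doc whose hasPart lists each URL once, keeping order."""
    existing = doc.get("hasPart", [])
    if isinstance(existing, dict):
        # JSON-LD allows a single object where a list is expected.
        existing = [existing]
    seen = {part.get("@id") for part in existing if isinstance(part, dict)}
    merged = list(existing)
    for url in urls:
        if url not in seen:
            merged.append({"@id": url})
            seen.add(url)
    return {**doc, "hasPart": merged}
```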

Validation & Testing

Run the Rich Results Test on the homepage URL, plus the Schema Markup Validator if you want a full schema.org check. Expect no "unknown property" warnings for the main page JSON-LD.
Check your AI endpoint directly: curl -I https://www.realseolife.com/ai/llm-extensions.json — should return 200 OK and Content-Type: application/json.
Confirm llm.json references the extensions file and the catalog. This forms a discoverable graph for LLM crawlers.
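The three checks above can be approximated locally before running the online tools. The sketch below is illustrative only: the allow-list is something you would maintain yourself (it is not a schema.org API), the function names are hypothetical, and endpoint_ok expects the status code and Content-Type value you read from the curl -I output.

```python
def unknown_keys(node: dict, allowed: set) -> list:
    """List top-level keys that are neither JSON-LD keywords (@...) nor allowed."""
    return sorted(k for k in node if not k.startswith("@") and k not in allowed)


def references(doc, target_url: str) -> bool:
    """True if any "@id" anywhere in the document equals target_url."""
    if isinstance(doc, dict):
        if doc.get("@id") == target_url:
            return True
        return any(references(value, target_url) for value in doc.values())
    if isinstance(doc, list):
        return any(references(item, target_url) for item in doc)
    return False


def endpoint_ok(status: int, content_type: str) -> bool:
    """Check for 200 OK and a JSON media type, ignoring charset parameters."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return status == 200 and media_type == "application/json"
```

For example, running unknown_keys on the BEFORE snippet with an allow-list of name and url reports topicCluster, which is the key the validator would flag.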

Best Practices / Rules of Thumb

Do not hide malicious or deceptive content. This is about compatibility and future-proofing, not cloaking.
Keep human-readable descriptions next to any experimental fields — they help LLMs understand context (and sometimes Google too).
Prefer valid containers: additionalProperty, knowsAbout, subjectOf, mainEntityOfPage, about.
Use the AI-only endpoint for large or rapidly changing structures (scores, weights, vectors metadata).
Document changes in the file (e.g., dateModified and inline comments in your source files) so audits are easy.

Quick checklist before you push

Last Updated: 03 November 2025

Real SEO™