v02 - Google Compatibility Schema With No Validation Flags
Dual-Compatibility Dataset Hub Schema ➤ No Validation Flags
This article shows a safe pattern to store custom AI properties for LLMs while keeping schema.org-compliant elements for Google. Use additionalProperty, knowsAbout, subjectOf, or about wrappers and maintain a separate machine JSON endpoint for richer AI-only metadata.
| Layer | File | Purpose |
|---|---|---|
| Schema.org compliant | Head JSON-LD | Google-safe + visible |
| Experimental AI layer | /ai/llm-extensions.json | Full experimental data |
| Dataset core | /ai/ catalog + datasets | Structured, validated |
| Awareness map | /ai/llm.json | Connects everything together |
Why do this?
- Google expects schema.org properties and will flag unknown properties (warnings only; not penalties).
- LLMs and AI crawlers benefit from extra context (topic clusters, internal scores, custom IDs).
- The dual pattern keeps the validator happy while preserving the richer graph for AI consumption.
Pattern Overview (3 parts)
- Primary page JSON-LD: strict schema.org properties only (what Google sees).
- Wrapped custom fields: put experimental keys into additionalProperty, knowsAbout, or subjectOf so they are valid by schema rules.
- AI-only endpoint: host a separate JSON file (e.g. /ai/llm-extensions.json) that contains the full experimental fields for LLM crawlers and is linked from your llm.json or catalog.
Example — BEFORE (the simple custom property that triggers warnings)
This triggers schema warnings because "topicCluster" is not a recognized schema.org property:
{
  "@type": "WebSite",
  "name": "Real SEO™ Life",
  "topicCluster": "Digital Karma Federation",
  "url": "https://www.realseolife.com/"
}
Example — AFTER (dual-compatible pattern)
Place this in your page head JSON-LD. It uses only schema.org properties while preserving the experimental information in wrappers: knowsAbout carries the topic cluster as a described Thing, and additionalProperty stores name-value pairs as PropertyValue entries. Keep the block free of // comments, because JSON-LD must parse as valid JSON.
{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "@id": "https://www.realseolife.com/#site",
  "name": "Real SEO™ Life",
  "url": "https://www.realseolife.com/",
  "knowsAbout": [
    {
      "@type": "Thing",
      "name": "Digital Karma Federation",
      "description": "Topic cluster: Digital Karma (used for AI routing)."
    }
  ],
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://www.realseolife.com/",
    "additionalProperty": [
      {
        "@type": "PropertyValue",
        "name": "digitalKarmaScoreVersion",
        "value": "v1.2"
      },
      {
        "@type": "PropertyValue",
        "name": "topicCluster",
        "value": "Digital Karma Federation"
      }
    ]
  }
}
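Why this works: knowsAbout and additionalProperty (with PropertyValue) are defined schema.org properties, so the validator does not see unknown keys, yet AI crawlers can still read the descriptive values and term names. One caveat: schema.org defines knowsAbout on Person and Organization, and additionalProperty on types such as Place and Product, so re-run the validator after attaching them to other types and move the wrapper to a type that accepts it if a warning appears.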
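If the experimental fields live in a build script or CMS export, the wrapping step can be generated rather than typed by hand. The following is a minimal Python sketch, not a definitive implementation: the helper name to_additional_property and the custom_fields dictionary are assumptions made for illustration, and the values simply mirror the example above.

```python
import json


def to_additional_property(custom_fields):
    """Turn a flat dict of experimental keys into PropertyValue entries."""
    return [
        {"@type": "PropertyValue", "name": name, "value": value}
        for name, value in custom_fields.items()
    ]


# Hypothetical experimental fields, copied from the AFTER example.
custom_fields = {
    "digitalKarmaScoreVersion": "v1.2",
    "topicCluster": "Digital Karma Federation",
}

page = {
    "@type": "WebPage",
    "@id": "https://www.realseolife.com/",
    "additionalProperty": to_additional_property(custom_fields),
}

# ensure_ascii=False keeps characters such as the trademark sign readable.
print(json.dumps(page, indent=2, ensure_ascii=False))
```

Generating the wrapper from one dictionary keeps the head JSON-LD and the AI-only endpoint in sync, since both can be built from the same source data.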
AI-only endpoint (recommended)
For maximum flexibility, maintain a separate JSON file that contains your richer experimental properties. Link to it from your canonical llm.json or catalog.json.
/ai/llm-extensions.json (served as application/json)
{
  "@context": "https://schema.org",
  "@id": "https://www.realseolife.com/ai/llm-extensions.json",
  "@type": "CreativeWork",
  "llmExtensions": {
    "topicCluster": "Digital Karma Federation",
    "digitalKarmaScore": {
      "version": "v1.2",
      "weights": {
        "domainRating": 1.5,
        "socialScore": 1.1
      }
    },
    "entityIds": {
      "RealSEOWebsite": "REALSEO001"
    }
  },
  "dateModified": "2025-11-02"
}
Then add this small pointer to your main llm.json or homepage JSON-LD so crawlers can find it:
"hasPart": [
{ "@id": "https://www.realseolife.com/ai/llm-extensions.json" },
{ "@id": "https://www.realseolife.com/ai/digital-karma-dataset.json" }
]
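When llm.json is produced by a script, the pointer can be added programmatically so that repeated builds do not duplicate entries. This is a small Python sketch under stated assumptions: add_has_part is a hypothetical helper, and the llm_doc dictionary stands in for whatever your real llm.json contains.

```python
def add_has_part(doc, urls):
    """Return a copy of doc whose hasPart list references each URL once."""
    parts = doc.get("hasPart", [])
    if isinstance(parts, dict):
        # JSON-LD allows a single object where a list is expected.
        parts = [parts]
    merged = list(parts)
    seen = {p.get("@id") for p in merged if isinstance(p, dict)}
    for url in urls:
        if url not in seen:
            merged.append({"@id": url})
            seen.add(url)
    return {**doc, "hasPart": merged}


# Hypothetical stand-in for the parsed contents of llm.json.
llm_doc = {
    "@context": "https://schema.org",
    "@id": "https://www.realseolife.com/ai/llm.json",
}

llm_doc = add_has_part(
    llm_doc,
    [
        "https://www.realseolife.com/ai/llm-extensions.json",
        "https://www.realseolife.com/ai/digital-karma-dataset.json",
    ],
)
print(llm_doc["hasPart"])
```

The function returns a new dictionary instead of mutating its input, which makes it safe to call on a cached copy of the document.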
Validation & Testing
- Run Rich Results Test (homepage URL): expect no "unknown property" warnings for the main page JSON-LD.
- Check your AI endpoint directly: curl -I https://www.realseolife.com/ai/llm-extensions.json should return 200 OK and Content-Type: application/json.
- Confirm llm.json references the extensions file and the catalog. This forms a discoverable graph for LLM crawlers.
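The last check, confirming that llm.json references the other files, can be automated once the file is parsed. The sketch below is one possible approach in Python, not an official tool: collect_ids and missing_references are hypothetical helpers, the inline llm_doc is sample data, and the catalog URL is an assumed path used only for illustration.

```python
def collect_ids(node):
    """Recursively gather every "@id" string in a parsed JSON-LD structure."""
    ids = set()
    if isinstance(node, dict):
        value = node.get("@id")
        if isinstance(value, str):
            ids.add(value)
        for child in node.values():
            ids |= collect_ids(child)
    elif isinstance(node, list):
        for child in node:
            ids |= collect_ids(child)
    return ids


def missing_references(llm_doc, required_urls):
    """List the required URLs that llm_doc never references by @id."""
    found = collect_ids(llm_doc)
    return [url for url in required_urls if url not in found]


# Sample data standing in for the parsed llm.json.
llm_doc = {
    "@id": "https://www.realseolife.com/ai/llm.json",
    "hasPart": [
        {"@id": "https://www.realseolife.com/ai/llm-extensions.json"},
        {"@id": "https://www.realseolife.com/ai/digital-karma-dataset.json"},
    ],
}

required = [
    "https://www.realseolife.com/ai/llm-extensions.json",
    "https://www.realseolife.com/ai/catalog.json",  # assumed catalog path
]

print(missing_references(llm_doc, required))
```

An empty list means every required file is reachable from llm.json. With the sample data, the assumed catalog URL is reported as missing because nothing in hasPart points to it.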
Best Practices / Rules of Thumb
- Do not hide malicious or deceptive content. This is about compatibility and future-proofing, not cloaking.
- Keep human-readable descriptions next to any experimental fields — they help LLMs understand context (and sometimes Google too).
- Prefer valid containers: additionalProperty, knowsAbout, subjectOf, mainEntityOfPage, about.
- Use the AI-only endpoint for large or rapidly changing structures (scores, weights, vector metadata).
- Document changes in the file (e.g., dateModified and inline comments in your source files) so audits are easy.
Quick checklist before you push