Sitemap
Workflow
- Fetch /sitemap.xml (or sitemap index)
- Run scripts/parse-sitemap.ts → validate structure
- Cross-check URL coverage vs crawled pages
- Detect orphan URLs (in sitemap but not linked)
- Detect missing URLs (linked but not in sitemap)
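The orphan and missing checks above amount to two set differences between the sitemap URLs and the URLs found as links during the crawl. A minimal sketch, assuming both lists are already normalized to the same form (trailing slashes, casing); `compareCoverage` and `CoverageReport` are hypothetical names, not part of scripts/parse-sitemap.ts:

```typescript
// Hypothetical helper for illustration; not taken from scripts/parse-sitemap.ts.
interface CoverageReport {
  orphans: string[]; // listed in the sitemap, but not linked from any crawled page
  missing: string[]; // linked from crawled pages, but absent from the sitemap
}

function compareCoverage(
  sitemapUrls: Iterable<string>,
  linkedUrls: Iterable<string>,
): CoverageReport {
  const inSitemap = new Set(sitemapUrls);
  const linked = new Set(linkedUrls);

  // Set difference in each direction; sorted so reports are stable between runs.
  const orphans = Array.from(inSitemap).filter((u) => !linked.has(u)).sort();
  const missing = Array.from(linked).filter((u) => !inSitemap.has(u)).sort();

  return { orphans, missing };
}
```

Exact string comparison is used deliberately: if the crawler and the sitemap disagree on URL form, that mismatch is usually worth surfacing rather than hiding.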
Best Practices
- Max 50,000 URLs or 50 MB (uncompressed) per sitemap (use a sitemap index if larger)
- <lastmod>: actual file modification date (not generation date)
- <priority> and <changefreq>: largely ignored by Google
- HTTPS only, absolute URLs
- Reference in robots.txt: Sitemap: https://example.com/sitemap.xml
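Several of these rules can be checked mechanically. The sketch below is one possible approach, assuming the URLs have already been extracted from the sitemap and its uncompressed size in bytes is known; `checkSitemap` and `chunkForIndex` are hypothetical helpers, not functions from this repository:

```typescript
// Limits from the list above: 50,000 URLs or 50 MB (uncompressed) per sitemap.
const MAX_URLS = 50000;
const MAX_BYTES = 50 * 1024 * 1024;

// Hypothetical helper: returns human-readable issues; an empty array means no problems found.
function checkSitemap(urls: string[], xmlByteLength: number): string[] {
  const issues: string[] = [];
  if (urls.length > MAX_URLS) {
    issues.push(`too many URLs: ${urls.length} > ${MAX_URLS}`);
  }
  if (xmlByteLength > MAX_BYTES) {
    issues.push(`file too large: ${xmlByteLength} bytes > ${MAX_BYTES}`);
  }
  for (const url of urls) {
    // An absolute HTTPS URL must carry the https:// scheme explicitly.
    if (!url.startsWith("https://")) {
      issues.push(`not an absolute HTTPS URL: ${url}`);
    }
  }
  return issues;
}

// Hypothetical helper: splits a URL list into chunks, one per child sitemap of an index.
function chunkForIndex(urls: string[], size: number = MAX_URLS): string[][] {
  const chunks: string[][] = [];
  for (let i = 0; i < urls.length; i += size) {
    chunks.push(urls.slice(i, i + size));
  }
  return chunks;
}
```

Each chunk returned by `chunkForIndex` would become its own sitemap file, with the sitemap index listing those files.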
Special Sitemaps
- News (publishers): sitemap-news.xml, only articles from last 48h
- Image: sitemap-image.xml with <image:image> tags
- Video: sitemap-video.xml with <video:video> tags
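For the news sitemap, the 48-hour rule is a simple date filter applied before the file is generated. A minimal sketch, assuming each article carries a publication date; the `Article` shape and `recentArticles` function are illustrative assumptions:

```typescript
// Hypothetical article shape; real data will likely carry more fields (title, language, ...).
interface Article {
  url: string;
  publishedAt: Date;
}

const NEWS_WINDOW_MS = 48 * 60 * 60 * 1000; // 48 hours in milliseconds

// Keeps only articles published within the 48 hours before `now`.
// `now` is a parameter so the function can be tested with a fixed clock.
function recentArticles(articles: Article[], now: Date = new Date()): Article[] {
  const end = now.getTime();
  const start = end - NEWS_WINDOW_MS;
  return articles.filter((a) => {
    const t = a.publishedAt.getTime();
    return t >= start && t <= end; // also drops future-dated articles
  });
}
```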
Templates
- templates/sitemap/sitemap.xml
- templates/sitemap/sitemap-news.xml
- templates/sitemap/sitemap-image.xml
- templates/robots/robots-saas.txt
- templates/robots/robots-ecommerce.txt