Sitemaps — Reference
The Sitemaps surface governs how every SGEN site is discovered by external search engines. It owns the generated sitemap files, the rules that decide which records appear in them, the split between multiple sitemap files when a site grows past a single-file threshold, the last-modified timestamp handling on each entry, and the ping mechanism that notifies search engines when the sitemaps change.
This page is a reference for platform engineers and integrators who need to understand the surface before extending it, scripting against it, or wiring a custom record type into the sitemap output. Customer-facing how-tos live in the customer docs set; this page describes the shape of the surface, not the steps to drive it.
Overview
Sitemaps live under the Sitemaps module in SG-Admin. The module renders three primary views (the sitemap configuration form, the per-sitemap entry inspector, and the ping history) and exposes write operations to configure inclusion rules, regenerate sitemaps, configure the multi-file split, and trigger a ping to registered search engines.
The module is split into two halves by convention: one half renders the views and prepares the data; the other handles writes and dispatches generation work to a background task. Engineers reading the SG-Admin source will see this split across two controller files; the reference below describes the combined surface as it appears to a calling integration.
The generated sitemap files are served from a fixed path on the public surface of the site. The Sitemaps module is the administrative layer over that output; the public path itself is not gated by the admin check (search engines need to reach it without credentials). A robots-policy entry on the public root points to the sitemap location.
Where it lives in SG-Admin:
- Sidebar: SG-Admin → Sitemaps
- URL prefix: /sg-admin/sitemaps/
- View templates: application/views/Admin/Sitemaps/
┌──────────────────────────────────────────────────────────────────────┐
│ SG-Admin → Sitemaps → Configuration                                  │
├──────────────────────────────────────────────────────────────────────┤
│ Auto-generation: ✔ enabled        Cadence: hourly                    │
│ Public location: /sitemap.xml                                        │
│ Robots reference: ✔ present                                          │
├──────────────────────────────────────────────────────────────────────┤
│ Record type       Included   Excluded   Entries                      │
│ ───────────────   ────────   ────────   ───────                      │
│ Page              ✔          3          124                          │
│ Post              ✔          8          892                          │
│ Product           ✔          0          118                          │
│ KB article        ✔          2          713                          │
│ Tag archive       —          —          —                            │
│                                                                      │
│ Split threshold: 5,000 entries per file (currently 1 file)           │
│ [Regenerate now]   [Ping search engines]   [Configure inclusion]     │
└──────────────────────────────────────────────────────────────────────┘
Actions
The Sitemaps surface exposes the following operations. Each is described by what it does to the data, not by its internal method name.
View configuration
Returns the current sitemap setup — whether auto-generation is enabled, the regeneration cadence, the public location of the sitemap file, whether the robots-policy reference is present, the per-record-type inclusion summary, the current entry count, and the file-split state. Used as the entry view to confirm the sitemap is healthy before any tuning work begins.
Configure inclusion rules
Sets, per record type, whether the record type is included in the sitemap at all and any per-record exclusion condition (for example, posts whose status is not published, or pages explicitly marked private). The inclusion model is intentionally separate from the Search module's indexing rules — sites may want a record discoverable externally while excluded from internal search, or vice versa.
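The two-level rule described above (a type-level switch plus an optional per-record condition) can be sketched roughly as follows. This is a minimal illustration, not the SG-Admin implementation: the `InclusionRule` class, the `select_records` helper, and the `slug` and `status` record fields are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Record = dict  # hypothetical record shape: a plain mapping of field name to value


@dataclass
class InclusionRule:
    record_type: str
    included: bool
    # Returns True when a record should be EXCLUDED; None means no per-record filter.
    exclusion_filter: Optional[Callable[[Record], bool]] = None


def select_records(rule: InclusionRule, records: list[Record]) -> list[Record]:
    """Return the records of one type that should appear in the sitemap."""
    if not rule.included:
        return []  # the whole record type is switched off
    if rule.exclusion_filter is None:
        return list(records)
    return [r for r in records if not rule.exclusion_filter(r)]


# Example: include posts, but drop any post whose status is not "published".
post_rule = InclusionRule(
    record_type="post",
    included=True,
    exclusion_filter=lambda r: r.get("status") != "published",
)
posts = [
    {"slug": "hello", "status": "published"},
    {"slug": "draft-idea", "status": "draft"},
]
selected = select_records(post_rule, posts)
```

Here `selected` holds only the published post. A rule with `included=False` yields an empty list regardless of the filter.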
Configure per-entry attributes
Sets, per record type, how each entry's last-modified timestamp is computed (which record field to read), how the change-frequency hint is computed (a fixed value, or derived from the record's edit history), and how the relative-priority hint is computed. Defaults are sensible — last-modified reads the record's standard updated timestamp, change-frequency is set per record type, priority is uniform — but each can be overridden.
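To make the three per-entry attributes concrete, the sketch below builds a single sitemap `<url>` element with Python's standard `xml.etree.ElementTree`. The `build_entry` function and the example URL are assumptions made for illustration; only the element names and the change-frequency values come from the sitemap format itself.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

CHANGEFREQ_VALUES = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}


def build_entry(loc: str, lastmod: datetime, changefreq: str, priority: float) -> ET.Element:
    """Build one <url> element. `lastmod` must be a timezone-aware datetime."""
    if changefreq not in CHANGEFREQ_VALUES:
        raise ValueError(f"unknown changefreq: {changefreq!r}")
    if not 0.0 <= priority <= 1.0:
        raise ValueError("priority must be between 0.0 and 1.0")
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    # Normalise to UTC and emit a W3C datetime string.
    ET.SubElement(url, "lastmod").text = lastmod.astimezone(timezone.utc).strftime(
        "%Y-%m-%dT%H:%M:%SZ"
    )
    ET.SubElement(url, "changefreq").text = changefreq
    ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return url


entry = build_entry(
    "https://example.com/about",
    datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc),
    "weekly",
    0.5,
)
xml_text = ET.tostring(entry, encoding="unicode")
```

An override for a record type would change only the arguments passed in (a different source timestamp, change-frequency value, or priority), not the shape of the entry.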
Regenerate now
Dispatches a full regeneration of every sitemap file. Regeneration runs in the background; the surface returns immediately with a tracking identifier. While regeneration is in progress, the previously generated files continue to serve from the public path. The new files replace the old atomically when generation completes.
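One common way to get the "old file keeps serving until the new one is complete" behaviour is to write to a temporary file and rename it over the target. The sketch below shows that technique for a single file; it is an assumption about how such a swap could be done, not a description of the SG-Admin generation task, and `publish_atomically` is a hypothetical name.

```python
import os
import tempfile
from pathlib import Path


def publish_atomically(public_path: Path, content: str) -> None:
    """Write `content` to a temp file beside the target, then swap it into place."""
    # The temp file lives in the same directory so the final rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=public_path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(content)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk before the swap
        os.chmod(tmp_name, 0o644)  # mkstemp creates owner-only files; sitemaps must be readable
        os.replace(tmp_name, public_path)  # readers see either the old file or the new one
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

A reader that opens the public path during regeneration gets the previous file in full. Swapping an index file together with its child files as one unit would need an extra step (for example, generating into a fresh directory and switching a pointer), which this sketch does not cover.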
Configure the multi-file split
Sets the entry-count threshold above which sitemaps are split across multiple files. The default threshold is well below the per-file upper bound that search engines accept (the sitemaps protocol caps a single file at 50,000 URLs and 50 MB uncompressed). When the entry count exceeds the threshold, the surface generates a sitemap index file at the public location plus one or more child sitemap files, each containing a slice of the entries.
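The split decision reduces to a chunking step. The sketch below plans the file layout for a list of entries; the `plan_split` helper and the child file names (`sitemap-1.xml`, `sitemap-2.xml`, and so on) are hypothetical, since this page does not specify how child files are named.

```python
from math import ceil


def plan_split(entries: list[str], threshold: int) -> dict:
    """Decide between a single sitemap file and an index plus child files."""
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    if len(entries) <= threshold:
        # At or below the threshold: everything fits in the one public file.
        return {"index": None, "files": {"sitemap.xml": list(entries)}}
    count = ceil(len(entries) / threshold)
    files = {
        f"sitemap-{i + 1}.xml": entries[i * threshold:(i + 1) * threshold]
        for i in range(count)
    }
    # Above the threshold: the public location becomes an index that lists the children.
    return {"index": "sitemap.xml", "files": files}


plan = plan_split([f"/page/{n}" for n in range(12)], threshold=5)
```

With 12 entries and a threshold of 5, the plan contains an index and three child files holding 5, 5, and 2 entries.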
Ping search engines
Notifies the configured list of search engines that the sitemap has changed. Ping is dispatched as background work; the surface returns immediately. The ping history view records the destination, the dispatch time, and the response captured from the destination. Ping is also triggered automatically after a successful regeneration when the auto-ping setting is enabled.
View ping history
Shows past ping attempts, paginated, with destination, dispatch time, response status, and any captured error. Used to confirm that the ping mechanism is reaching its destinations and to investigate failed deliveries.
Configure auto-ping
Toggles automatic ping dispatch after regeneration. When enabled, every successful regeneration is followed by a ping to each registered destination. When disabled, regeneration completes silently and a ping must be triggered manually if external notification is desired.
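The relationship between manual ping, auto-ping, and the ping history can be sketched as below. The example runs synchronously and takes the network call as an injected `send` function, whereas the real surface dispatches pings as background work; `PingRecord`, `dispatch_pings`, and `after_regeneration` are hypothetical names, with fields modelled on the ping history described on this page.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable


@dataclass
class PingRecord:
    destination: str
    dispatched_at: datetime
    response_state: str   # "succeeded" or "failed" in this synchronous sketch
    response_detail: str
    triggered_by: str     # "auto" or "manual"


def dispatch_pings(
    destinations: list[str], send: Callable[[str], str], triggered_by: str
) -> list[PingRecord]:
    """Ping every destination and capture one history record per attempt."""
    history = []
    for destination in destinations:
        dispatched_at = datetime.now(timezone.utc)
        try:
            detail, state = send(destination), "succeeded"
        except Exception as exc:  # a failed delivery is recorded, not raised
            detail, state = str(exc), "failed"
        history.append(PingRecord(destination, dispatched_at, state, detail, triggered_by))
    return history


def after_regeneration(
    auto_ping_enabled: bool, destinations: list[str], send: Callable[[str], str]
) -> list[PingRecord]:
    """Hook run after a successful regeneration; pings only when auto-ping is on."""
    if not auto_ping_enabled:
        return []
    return dispatch_pings(destinations, send, triggered_by="auto")
```

A manual ping would call `dispatch_pings(..., triggered_by="manual")` directly, so both paths produce the same kind of history record.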
Data model
The Sitemaps surface manages several related record types. Field names below are the conceptual shape — the on-disk column names match closely but are not contractually stable across releases.
Sitemap configuration:
| Field | Type | Notes |
|---|---|---|
| auto_generation_enabled | boolean | Whether the scheduled regeneration task runs. |
| cadence | enum | hourly, daily, weekly. |
| split_threshold | integer | Entry count above which the index-plus-children layout is used. |
| auto_ping_enabled | boolean | Whether each regeneration triggers a ping. |
| ping_destinations | array | Registered search-engine ping endpoints. |

Per-record-type inclusion rule:
| Field | Type | Notes |
|---|---|---|
| record_type | string | Slug of the record type. Primary key. |
| included | boolean | Whether the record type appears in the sitemap at all. |
| exclusion_filter | structured | Optional condition that excludes individual records (for example, posts whose status is not published). |
| lastmod_source_field | string | Name of the field used to populate the last-modified timestamp on each entry. |
| changefreq_value | enum | always, hourly, daily, weekly, monthly, yearly, never. |
| priority_value | decimal | Zero to one. Defaults to a uniform value per record type. |

Ping history record:
| Field | Type | Notes |
|---|---|---|
| id | integer | Primary key. |
| destination | string | Endpoint that was notified. |
| dispatched_at | timestamp | Set on dispatch, immutable. |
| response_state | enum | succeeded, failed, pending. |
| response_detail | string | Captured detail from the destination. |
| triggered_by | enum | auto (post-regeneration), manual (operator action). |

Last-modified semantics: each entry's last-modified timestamp is read from the configured source field on the source record at generation time. Records that lack the configured field fall back to the record's standard updated timestamp. An entry never carries a timestamp later than the most recent regeneration, because search engines treat a newer last-modified value as a signal that the entry has changed.
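The lookup, fallback, and upper bound described above can be sketched as a single function. This is one reading of the semantics, not the shipped logic: clamping to the generation time is an assumption about how the upper bound is enforced, and `updated_at` is a hypothetical name for the record's standard updated timestamp.

```python
from datetime import datetime, timezone

STANDARD_UPDATED_FIELD = "updated_at"  # hypothetical name for the standard updated timestamp


def resolve_lastmod(record: dict, source_field: str, generated_at: datetime) -> datetime:
    """Pick the last-modified value for one entry at generation time."""
    value = record.get(source_field)
    if value is None:
        # The record lacks the configured field: fall back to the standard timestamp.
        value = record[STANDARD_UPDATED_FIELD]
    # Assumption: a timestamp later than this regeneration is clamped to it.
    return min(value, generated_at)


generated_at = datetime(2024, 6, 1, 9, 0, tzinfo=timezone.utc)
product = {
    "updated_at": datetime(2024, 5, 20, 8, 0, tzinfo=timezone.utc),
    "price_changed_at": datetime(2024, 5, 28, 14, 30, tzinfo=timezone.utc),
}
page = {"updated_at": datetime(2024, 4, 2, 10, 0, tzinfo=timezone.utc)}

product_lastmod = resolve_lastmod(product, "price_changed_at", generated_at)
page_lastmod = resolve_lastmod(page, "price_changed_at", generated_at)
```

The product entry uses its configured field, while the page entry falls back to its standard updated timestamp because it has no `price_changed_at` value.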
SOURCE RECORDS (page / post / product / kb / custom)
        │
        │  filtered by inclusion rules
        ▼
PER-RECORD-TYPE INCLUSION
├── included = true / false
├── exclusion_filter
├── lastmod_source_field
├── changefreq_value
└── priority_value
        │
        │  entry count check
        ▼
SPLIT DECISION
├── entries ≤ threshold ──▶ single sitemap file
└── entries > threshold ──▶ index + N child files
        │
        ▼
ATOMIC PUBLISH
(new files replace old at public path)
        │
        ▼
AUTO-PING (if enabled)
├── destination 1
├── destination 2
└── …
        │
        ▼
PING HISTORY RECORD
Permissions
Access to the Sitemaps surface is gated at two layers.
Layer 1 — admin gate. Every action under SG-Admin passes through the platform's standard admin access check at request entry. An unauthenticated request never reaches the Sitemaps administrative surface. Search-engine retrieval of the generated sitemap files at the public path is not gated by the admin check — search engines fetch them as anonymous visitors.
Layer 2 — per-action capability. Within SG-Admin, each Sitemaps action checks a capability associated with the calling operator's role. The default role configuration ships with three roles — Administrator, Editor, Viewer — and the capability map is:
| Capability | Administrator | Editor | Viewer |
|---|---|---|---|
| View configuration | ✔ | ✔ | ✔ |
| Configure inclusion rules | ✔ | — | — |
| Configure per-entry attributes | ✔ | — | — |
| Regenerate now | ✔ | ✔ | — |
| Configure the multi-file split | ✔ | — | — |
| Ping search engines | ✔ | ✔ | — |
| View ping history | ✔ | ✔ | ✔ |
| Configure auto-ping | ✔ | — | — |
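
For integrations that need to mirror the default capability map, the table above can be expressed as a lookup. The action slugs and the `is_allowed` helper below are hypothetical; only the role names and the grants come from the table, and a site may ship a different role configuration.

```python
# Default capability map, transcribed from the table above.
# The action slugs are illustrative, not SG-Admin identifiers.
DEFAULT_CAPABILITIES = {
    "view_configuration": {"administrator", "editor", "viewer"},
    "configure_inclusion_rules": {"administrator"},
    "configure_per_entry_attributes": {"administrator"},
    "regenerate_now": {"administrator", "editor"},
    "configure_multi_file_split": {"administrator"},
    "ping_search_engines": {"administrator", "editor"},
    "view_ping_history": {"administrator", "editor", "viewer"},
    "configure_auto_ping": {"administrator"},
}


def is_allowed(role: str, action: str) -> bool:
    """Return True when the role holds the capability; unknown actions are denied."""
    return role.lower() in DEFAULT_CAPABILITIES.get(action, set())
```

Denying unknown actions by default keeps a typo in an action slug from silently granting access.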
Self-protection rules. A regeneration cannot be triggered while a prior regeneration is still in progress — the surface returns a structured rejection that names the running regeneration's tracking identifier. The split threshold is bounded between a minimum value (below which file proliferation becomes unhelpful) and a maximum (above which generated files would exceed the size limits search engines accept). Attempts to set the threshold outside the bounded range are rejected.
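Both self-protection rules can be sketched in a few lines. The guard below is an in-process illustration only (a real deployment would need shared state across workers), and every name in it is hypothetical. The page does not state the actual threshold bounds, so they are passed in as parameters rather than hard-coded.

```python
import threading
import uuid
from typing import Optional


class RegenerationInProgress(Exception):
    """Structured rejection that names the regeneration already running."""

    def __init__(self, tracking_id: str):
        super().__init__(f"regeneration {tracking_id} is still in progress")
        self.tracking_id = tracking_id


class RegenerationGuard:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._running_id: Optional[str] = None

    def start(self) -> str:
        """Claim the regeneration slot and return a new tracking identifier."""
        with self._lock:
            if self._running_id is not None:
                raise RegenerationInProgress(self._running_id)
            self._running_id = uuid.uuid4().hex
            return self._running_id

    def finish(self, tracking_id: str) -> None:
        """Release the slot, but only for the regeneration that holds it."""
        with self._lock:
            if self._running_id == tracking_id:
                self._running_id = None


def validate_split_threshold(value: int, minimum: int, maximum: int) -> int:
    """Reject thresholds outside the bounded range; the bounds are caller-supplied."""
    if not minimum <= value <= maximum:
        raise ValueError(f"split threshold must be between {minimum} and {maximum}")
    return value
```

A second `start()` call before `finish()` raises `RegenerationInProgress`, whose `tracking_id` attribute carries the identifier of the running regeneration.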
Audit. Every write — inclusion-rule change, per-entry-attribute change, regeneration dispatch, split-threshold change, manual ping, auto-ping toggle — emits an Activity Log entry. The log records the acting operator, the target (record type, attribute name, or setting slug), and the change shape. Activity Log retention is governed by the site's general settings.
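As a rough picture of what such an entry might carry, the sketch below models the three recorded items (acting operator, target, and change shape). The field names, the before/after representation of the change, and the `recorded_at` timestamp are all assumptions; the page does not define the Activity Log schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class ActivityLogEntry:
    operator_id: int        # acting operator
    action: str             # e.g. "split_threshold_change" (illustrative slug)
    target: str             # record type, attribute name, or setting slug
    change: dict            # assumed shape: {"before": ..., "after": ...}
    recorded_at: datetime   # assumed field, not stated on this page


def record_change(
    operator_id: int, action: str, target: str, before: Any, after: Any
) -> ActivityLogEntry:
    """Build an immutable log entry describing one write."""
    return ActivityLogEntry(
        operator_id=operator_id,
        action=action,
        target=target,
        change={"before": before, "after": after},
        recorded_at=datetime.now(timezone.utc),
    )


log_entry = record_change(7, "split_threshold_change", "split_threshold", 5000, 10000)
```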
ADMIN OPERATOR REQUEST            SEARCH ENGINE RETRIEVAL
             │                                 │
             ▼                                 ▼
┌─────────────────────────┐       ┌─────────────────────────┐
│ Admin gate              │       │ No admin gate           │
│ (SG-Admin entry)        │       │ (public sitemap path)   │
└────────────┬────────────┘       └────────────┬────────────┘
             │                                 │
             ▼                                 ▼
┌─────────────────────────┐       ┌─────────────────────────┐
│ Capability check        │       │ Atomic file read        │
│ (per-action)            │       │ (last published state)  │
└────────────┬────────────┘       └────────────┬────────────┘
             │                                 │
             ▼                                 ▼
┌─────────────────────────┐            retrieval served
│ Self-protection rules   │
│ (running-regen / split  │
│  threshold bounds)      │
└────────────┬────────────┘
             │ passes
             ▼
      action executes
             │
             ▼
    Activity Log entry
Related references
- Settings — Reference. Owns the role definitions, the public location of the sitemap files, the robots-policy reference content, and the registered ping destinations.
- Search — Reference. Sitemaps and on-site search share a similar inclusion model and a similar background-generation architecture but operate independently — a record excluded from one may still appear in the other.
- Pages — Reference. Pages are a default-included record type; the inclusion rule for pages respects the published state and any explicit private flag on the page.
- Posts — Reference. Posts are a default-included record type; the inclusion rule for posts respects the published and scheduled states.
- Users — Reference. Operator identifiers on inclusion-rule writes and configuration changes resolve through the Users surface.
- Logs — Reference. Regeneration progress, ping responses, and any error captured by the generation task surface on the appropriate channels for operator investigation.
- Tools — Reference. The Activity Log search surface lives under Tools; configuration changes to the sitemap surface are findable there.