Sitemaps — Reference

The Sitemaps surface governs how every SGEN site is discovered by external search engines. It owns the generated sitemap files, the rules that decide which records appear in them, the split into multiple sitemap files when a site grows past a single-file threshold, the last-modified timestamp handling on each entry, and the ping mechanism that notifies search engines when the sitemaps change.

This page is a reference for platform engineers and integrators who need to understand the surface before extending it, scripting against it, or wiring a custom record type into the sitemap output. Customer-facing how-tos live in the customer docs set; this page describes the shape of the surface, not the steps to drive it.


Overview

Sitemaps live under the Sitemaps module in SG-Admin. The module renders three primary views (the sitemap configuration form, the per-sitemap entry inspector, and the ping history) and exposes write operations to configure inclusion rules, regenerate sitemaps, configure the multi-file split, and trigger a ping to registered search engines.

The module is paired by convention: one half renders the views and prepares the data, the other half handles writes and dispatches generation work to a background task. Engineers reading the SG-Admin source will see this split across two controller files; the reference below describes the combined surface as it appears to a calling integration.

The generated sitemap files are served from a fixed path on the public surface of the site. The Sitemaps module is the administrative layer over that output; the public path itself is not gated by the admin check (search engines need to reach it without credentials). A robots-policy entry on the public root points to the sitemap location.

Where it lives in SG-Admin:

  • Sidebar: SG-Admin → Sitemaps
  • URL prefix: /sg-admin/sitemaps/
  • View templates: application/views/Admin/Sitemaps/

The module surface is closely related to the Search module (both share a similar inclusion model) but operates independently. A record can be included in the sitemap but excluded from on-site search (a publicly findable page that does not surface in the visitor's site search), and vice versa.

┌──────────────────────────────────────────────────────────────────────┐
│ SG-Admin → Sitemaps → Configuration                                  │
├──────────────────────────────────────────────────────────────────────┤
│ Auto-generation:  ✔ enabled        Cadence: hourly                   │
│ Public location:  /sitemap.xml                                       │
│ Robots reference: ✔ present                                          │
├──────────────────────────────────────────────────────────────────────┤
│ Record type       Included   Excluded   Entries                      │
│ ───────────────   ────────   ────────   ───────                      │
│ Page              ✔          3          124                          │
│ Post              ✔          8          892                          │
│ Product           ✔          0          118                          │
│ KB article        ✔          2          713                          │
│ Tag archive       —          —          —                            │
│                                                                      │
│ Split threshold: 5,000 entries per file (currently 1 file)           │
│ [Regenerate now]  [Ping search engines]  [Configure inclusion]       │
└──────────────────────────────────────────────────────────────────────┘

Actions

The Sitemaps surface exposes the following operations. Each is described by what it does to the data, not by its internal method name.

View configuration

Returns the current sitemap setup — whether auto-generation is enabled, the regeneration cadence, the public location of the sitemap file, whether the robots-policy reference is present, the per-record-type inclusion summary, the current entry count, and the file-split state. Used as the entry view to confirm the sitemap is healthy before any tuning work begins.

Configure inclusion rules

Sets, per record type, whether the record type is included in the sitemap at all and any per-record exclusion condition (for example, posts whose status is not published, or pages explicitly marked private). The inclusion model is intentionally separate from the Search module's indexing rules — sites may want a record discoverable externally while excluded from internal search, or vice versa.
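
The two-level decision described above (a type-wide switch plus an optional per-record exclusion condition) can be sketched roughly as follows. This is an illustration only: the InclusionRule class, the select_for_sitemap helper, and the representation of the exclusion condition as a Python predicate are assumptions made for the example, not the module's actual storage format or API.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Record = dict  # hypothetical: a source record represented as a plain mapping


@dataclass
class InclusionRule:
    record_type: str
    included: bool
    # Hypothetical representation: a predicate returning True for records
    # that must be left out of the sitemap.
    exclusion_filter: Optional[Callable[[Record], bool]] = None


def select_for_sitemap(records: list[Record], rule: InclusionRule) -> list[Record]:
    """Return the records of one type that should appear in the sitemap."""
    if not rule.included:
        return []  # the whole record type is switched off
    if rule.exclusion_filter is None:
        return list(records)
    return [r for r in records if not rule.exclusion_filter(r)]


# Example from the text: exclude posts whose status is not published.
post_rule = InclusionRule(
    record_type="post",
    included=True,
    exclusion_filter=lambda r: r.get("status") != "published",
)
posts = [
    {"slug": "hello", "status": "published"},
    {"slug": "draft-idea", "status": "draft"},
]
selected = select_for_sitemap(posts, post_rule)
```

Because the rule object is independent of anything search-related, the same records could be passed through a different rule for on-site search without either decision affecting the other.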

Configure per-entry attributes

Sets, per record type, how each entry's last-modified timestamp is computed (which record field to read), how the change-frequency hint is computed (a fixed value, or derived from the record's edit history), and how the relative-priority hint is computed. By default, last-modified reads the record's standard updated timestamp, change-frequency is set per record type, and priority is uniform; each default can be overridden.
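
As a rough illustration of where these three attributes end up, the sketch below builds a single entry using the element names of the public sitemaps.org protocol (loc, lastmod, changefreq, priority). It assumes SGEN emits standard protocol XML; the build_entry helper and the example URL are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def build_entry(loc: str, lastmod: datetime, changefreq: str, priority: float) -> ET.Element:
    """Build one <url> element; it would sit inside a <urlset> root."""
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    # W3C datetime format in UTC, as accepted by the sitemap protocol.
    stamp = lastmod.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    ET.SubElement(url, "lastmod").text = stamp
    ET.SubElement(url, "changefreq").text = changefreq
    ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return url


entry = build_entry(
    "https://example.com/about",
    datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc),
    changefreq="weekly",
    priority=0.5,
)
entry_xml = ET.tostring(entry, encoding="unicode")
```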

Regenerate now

Dispatches a full regeneration of every sitemap file. Regeneration runs in the background; the surface returns immediately with a tracking identifier. While regeneration is in progress, the previously generated files continue to serve from the public path. The new files replace the old atomically when generation completes.
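
One common way to get the "old file keeps serving until the new one is complete" behaviour is to write to a temporary file in the same directory and rename it over the public file. The sketch below shows that technique for a single file. It is an assumption about how such a swap could work, not a description of SGEN's generation task; publish_atomically is a hypothetical name.

```python
import os
import tempfile
from pathlib import Path


def publish_atomically(public_path: Path, content: str) -> None:
    """Write content to a temp file beside public_path, then swap it in."""
    fd, tmp_name = tempfile.mkstemp(dir=public_path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(content)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk first
        # os.replace is atomic when source and target share a filesystem,
        # so a reader sees either the old file or the new one, never a mix.
        os.replace(tmp_name, public_path)
    except BaseException:
        # On any failure, leave the previously published file untouched.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Swapping an index file together with several child files takes more care than this single-file sketch shows, for example by publishing the children first and the index last.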

Configure the multi-file split

Sets the entry-count threshold above which sitemaps are split across multiple files. The default threshold is well below the upper bound that search engines accept (the sitemaps.org protocol caps a single file at 50,000 URLs). When the entry count exceeds the threshold, the surface generates a sitemap index file at the public location plus one or more child sitemap files, each containing a slice of the entries.
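
The split decision can be sketched as a simple chunking step followed by building an index that lists the children. The helper names and the child file naming scheme (sitemap-1.xml, sitemap-2.xml, and so on) are assumptions for illustration; the source does not state how child files are named.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def split_entries(locs: list[str], threshold: int) -> list[list[str]]:
    """Return one chunk if the count fits, else slices of at most threshold."""
    if threshold < 1:
        raise ValueError("threshold must be positive")
    if len(locs) <= threshold:
        return [list(locs)]  # single sitemap file, no index needed
    return [locs[i:i + threshold] for i in range(0, len(locs), threshold)]


def build_index(child_locs: list[str]) -> str:
    """Build a sitemap index document that points at each child file."""
    root = ET.Element("sitemapindex", {"xmlns": SITEMAP_NS})
    for loc in child_locs:
        child = ET.SubElement(root, "sitemap")
        ET.SubElement(child, "loc").text = loc
    return ET.tostring(root, encoding="unicode")


locs = [f"https://example.com/p/{n}" for n in range(12)]
chunks = split_entries(locs, threshold=5)
# Hypothetical child naming scheme, used only for this example.
index_xml = build_index(
    [f"https://example.com/sitemap-{i}.xml" for i in range(1, len(chunks) + 1)]
)
```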

Ping search engines

Notifies the configured list of search engines that the sitemap has changed. Ping is dispatched as background work; the surface returns immediately. The ping history view records the destination, the dispatch time, and the response captured from the destination. Ping is also triggered automatically after a successful regeneration when the auto-ping setting is enabled.

View ping history

Shows past ping attempts, paginated, with destination, dispatch time, response status, and any captured error. Used to confirm that the ping mechanism is reaching its destinations and to investigate failed deliveries.

Configure auto-ping

Toggles automatic ping dispatch after regeneration. When enabled, every successful regeneration is followed by a ping to each registered destination. When disabled, regeneration completes silently and a ping must be triggered manually if external notification is desired.
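
The relationship between regeneration, the auto-ping setting, and the ping history can be sketched as below. The transport is injected as a callable so the example performs no network access; ping_all, after_regeneration, and the send callable are hypothetical names, and the record fields mirror the ping history fields described under Data model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable


@dataclass
class PingRecord:
    destination: str
    dispatched_at: datetime
    response_state: str   # "succeeded", "failed", or "pending"
    response_detail: str
    triggered_by: str     # "auto" or "manual"


def ping_all(destinations: list[str], send: Callable[[str], str],
             triggered_by: str) -> list[PingRecord]:
    """Ping each destination and capture one history record per attempt."""
    history = []
    for dest in destinations:
        dispatched_at = datetime.now(timezone.utc)
        try:
            detail, state = send(dest), "succeeded"
        except Exception as exc:  # a failed delivery is recorded, not raised
            detail, state = str(exc), "failed"
        history.append(PingRecord(dest, dispatched_at, state, detail, triggered_by))
    return history


def after_regeneration(succeeded: bool, auto_ping_enabled: bool,
                       destinations: list[str],
                       send: Callable[[str], str]) -> list[PingRecord]:
    """Auto-ping only after a successful regeneration with the toggle on."""
    if succeeded and auto_ping_enabled:
        return ping_all(destinations, send, triggered_by="auto")
    return []
```

With the toggle off, after_regeneration returns an empty list, which corresponds to the "completes silently" case; a manual ping would call ping_all directly with triggered_by set to "manual".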


Data model

The Sitemaps surface manages several related record types. Field names below are the conceptual shape — the on-disk column names match closely but are not contractually stable across releases.

Sitemap configuration:

Field                     Type         Notes
───────────────────────   ──────────   ─────────────────────────────────────────
auto_generation_enabled   boolean      Whether the scheduled regeneration task runs.
cadence                   enum         hourly, daily, weekly.
split_threshold           integer      Entry count above which the index-plus-children layout is used.
auto_ping_enabled         boolean      Whether each regeneration triggers a ping.
ping_destinations         array        Registered search-engine ping endpoints.

Per-record-type inclusion rule:

Field                     Type         Notes
───────────────────────   ──────────   ─────────────────────────────────────────
record_type               string       Slug of the record type. Primary key.
included                  boolean      Whether the record type appears in the sitemap at all.
exclusion_filter          structured   Optional condition that excludes individual records (for example, posts whose status is not published).
lastmod_source_field      string       Name of the field used to populate the last-modified timestamp on each entry.
changefreq_value          enum         always, hourly, daily, weekly, monthly, yearly, never.
priority_value            decimal      Zero to one. Defaults to a uniform value per record type.

Ping history record:

Field                     Type         Notes
───────────────────────   ──────────   ─────────────────────────────────────────
id                        integer      Primary key.
destination               string       Endpoint that was notified.
dispatched_at             timestamp    Set on dispatch, immutable.
response_state            enum         succeeded, failed, pending.
response_detail           string       Captured detail from the destination.
triggered_by              enum         auto (post-regeneration), manual (operator action).

Generation semantics: the sitemap is generated as a whole. Every included record is read from its source surface and emitted as an entry. There is no incremental sitemap update; small edits trigger a full regeneration on the next scheduled run. The atomic file replacement guarantees that search engines never see a partially generated file.
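
The value constraints listed for the per-record-type inclusion rule in this section (the change-frequency enum and the zero-to-one priority range) can be checked with a small validator like the one below. It is a sketch against the conceptual field names only; validate_inclusion_rule is a hypothetical helper, and the on-disk column names may differ.

```python
CHANGEFREQ_VALUES = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}


def validate_inclusion_rule(rule: dict) -> list[str]:
    """Return a list of problems; an empty list means the rule is well formed."""
    errors = []
    record_type = rule.get("record_type")
    if not isinstance(record_type, str) or not record_type:
        errors.append("record_type must be a non-empty string")
    if not isinstance(rule.get("included"), bool):
        errors.append("included must be a boolean")
    if rule.get("changefreq_value") not in CHANGEFREQ_VALUES:
        errors.append("changefreq_value is not a recognised value")
    priority = rule.get("priority_value")
    is_number = isinstance(priority, (int, float)) and not isinstance(priority, bool)
    if not is_number or not 0 <= priority <= 1:
        errors.append("priority_value must be between zero and one")
    return errors


problems = validate_inclusion_rule(
    {"record_type": "post", "included": True,
     "changefreq_value": "fortnightly", "priority_value": 1.5}
)
```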

Last-modified semantics: each entry's last-modified timestamp is read from the configured source field on the source record at generation time. Records that lack the configured field fall back to the record's standard updated timestamp. Because the timestamp is read at generation time, an entry never shows a timestamp later than the most recent regeneration; search engines treat a newer last-modified timestamp as a signal that the entry has changed.
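
The fallback rule can be sketched in a few lines. The field names reviewed_at and updated_at are hypothetical stand-ins for a configured source field and the record's standard updated timestamp; the source does not name either.

```python
from datetime import datetime, timezone


def resolve_lastmod(record: dict, source_field: str,
                    standard_field: str = "updated_at") -> datetime:
    """Read the configured field, falling back to the standard timestamp."""
    value = record.get(source_field)
    if value is None:
        # The record lacks the configured field: use the standard one.
        value = record[standard_field]
    return value


edited = datetime(2024, 3, 1, tzinfo=timezone.utc)
reviewed = datetime(2024, 4, 15, tzinfo=timezone.utc)

with_field = resolve_lastmod(
    {"updated_at": edited, "reviewed_at": reviewed}, source_field="reviewed_at"
)
without_field = resolve_lastmod({"updated_at": edited}, source_field="reviewed_at")
```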

SOURCE RECORDS (page / post / product / kb / custom)
        │
        │ filtered by inclusion rules
        ▼
PER-RECORD-TYPE INCLUSION
  ├── included = true / false
  ├── exclusion_filter
  ├── lastmod_source_field
  ├── changefreq_value
  └── priority_value
        │
        │ entry count check
        ▼
SPLIT DECISION
  ├── entries ≤ threshold ──▶ single sitemap file
  └── entries > threshold ──▶ index + N child files
        │
        ▼
ATOMIC PUBLISH
(new files replace old at public path)
        │
        ▼
AUTO-PING (if enabled)
  ├── destination 1
  ├── destination 2
  └── …
        │
        ▼
PING HISTORY RECORD

Permissions

Access to the Sitemaps surface is gated at two layers.

Layer 1 — admin gate. Every action under SG-Admin passes through the platform's standard admin access check at request entry. An unauthenticated request never reaches the Sitemaps administrative surface. Search-engine retrieval of the generated sitemap files at the public path is not gated by the admin check — search engines fetch them as anonymous visitors.

Layer 2 — per-action capability. Within SG-Admin, each Sitemaps action checks a capability associated with the calling operator's role. The default role configuration ships with three roles — Administrator, Editor, Viewer — and the capability map is:

Capability                       Administrator   Editor   Viewer
──────────────────────────────   ─────────────   ──────   ──────
View configuration
Configure inclusion rules
Configure per-entry attributes
Regenerate now
Configure the multi-file split
Ping search engines
View ping history
Configure auto-ping

Custom roles defined under Settings → Roles override the default map. The capability slugs are stable; the role slugs are configurable.

Self-protection rules. A regeneration cannot be triggered while a prior regeneration is still in progress — the surface returns a structured rejection that names the running regeneration's tracking identifier. The split threshold is bounded between a minimum value (below which file proliferation becomes unhelpful) and a maximum (above which generated files would exceed the size limits search engines accept). Attempts to set the threshold outside the bounded range are rejected.
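
Both guards can be sketched as small checks that return a structured rejection rather than raising. The numeric bounds below are placeholders, since the source does not state the actual minimum and maximum; the Rejection shape, the reason strings, and the function names are likewise assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Union

# Placeholder bounds: the real minimum and maximum are not given in the source.
MIN_SPLIT_THRESHOLD = 100
MAX_SPLIT_THRESHOLD = 50_000


@dataclass
class Rejection:
    reason: str
    running_tracking_id: Optional[str] = None


def request_regeneration(running_tracking_id: Optional[str],
                         new_tracking_id: str) -> Union[str, Rejection]:
    """Refuse to start while another regeneration is still in progress."""
    if running_tracking_id is not None:
        # Name the running job so the caller can follow it instead.
        return Rejection("regeneration_in_progress", running_tracking_id)
    return new_tracking_id


def set_split_threshold(value: int) -> Union[int, Rejection]:
    """Accept the threshold only inside the bounded range."""
    if not MIN_SPLIT_THRESHOLD <= value <= MAX_SPLIT_THRESHOLD:
        return Rejection("split_threshold_out_of_range")
    return value
```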

Audit. Every write — inclusion-rule change, per-entry-attribute change, regeneration dispatch, split-threshold change, manual ping, auto-ping toggle — emits an Activity Log entry. The log records the acting operator, the target (record type, attribute name, or setting slug), and the change shape. Activity Log retention is governed by the site's general settings.

  ADMIN OPERATOR REQUEST            SEARCH ENGINE RETRIEVAL
             │                                 │
             ▼                                 ▼
┌─────────────────────────┐       ┌─────────────────────────┐
│ Admin gate              │       │ No admin gate           │
│ (SG-Admin entry)        │       │ (public sitemap path)   │
└────────────┬────────────┘       └────────────┬────────────┘
             │                                 │
             ▼                                 ▼
┌─────────────────────────┐       ┌─────────────────────────┐
│ Capability check        │       │ Atomic file read        │
│ (per-action)            │       │ (last published state)  │
└────────────┬────────────┘       └────────────┬────────────┘
             │                                 │
             ▼                                 ▼
┌─────────────────────────┐            retrieval served
│ Self-protection rules   │
│ (running-regen / split  │
│ threshold bounds)       │
└────────────┬────────────┘
             │ passes
             ▼
      action executes
             │
             ▼
     Activity Log entry

Related references

  • Settings — Reference. Owns the role definitions, the public location of the sitemap files, the robots-policy reference content, and the registered ping destinations.
  • Search — Reference. Sitemaps and on-site search share a similar inclusion model and a similar background-generation architecture but operate independently — a record excluded from one may still appear in the other.
  • Pages — Reference. Pages are a default-included record type; the inclusion rule for pages respects the published state and any explicit private flag on the page.
  • Posts — Reference. Posts are a default-included record type; the inclusion rule for posts respects the published and scheduled states.
  • Users — Reference. Operator identifiers on inclusion-rule writes and configuration changes resolve through the Users surface.
  • Logs — Reference. Regeneration progress, ping responses, and any error captured by the generation task surface on the appropriate channels for operator investigation.
  • Tools — Reference. The Activity Log search surface lives under Tools; configuration changes to the sitemap surface are findable there.

For the corresponding customer-facing walkthrough (getting a site listed in search engines, troubleshooting a missing record, configuring multi-sitemap splits for a large catalog), see the Sitemaps section of the customer docs at /docs/sitemaps.