SGEN performance and reliability
How SGEN keeps pages fast under load and the platform dependable through change — at the architectural layer, not the configuration layer.
The SGEN performance and reliability model is one connected story told across three architectural concerns: how pages are served fast (performance), how the platform keeps serving when something breaks (reliability), and how the model holds up as one site becomes ten and ten becomes a hundred (scalability). This page consolidates that story so an operator can answer the four questions that surface in every platform evaluation — why is it fast, how does it stay up, how does it scale, and what does the operator have to do — without reading four separate pages.
This is a sibling to the Data Model Overview, the Multi-Site Architecture Overview, the Observability and Monitoring Overview, the Backup and Restore Architecture Overview, and the Deployment and Release Overview. Read this one when the question on your mind is "how does SGEN hold up — under traffic, under change, and over time?"
What is this for?
This page answers four concept-level questions every operator carries into the first quarter of running a site that matters:
- Why does the site feel fast without me configuring anything, and what are the layers behind that speed?
- How dependable is the platform under load, during a release, and when something downstream breaks?
- How does the model scale from one site to many — and does Site A's traffic ever slow Site B?
- What can I — as an operator, not a developer — see, tune, or escalate?
When to use this
Reach for this page when you are doing one of the following:
- Evaluating SGEN against another platform during procurement and you need to explain the speed-and-dependability story to a stakeholder.
- Planning a marketing campaign with a sharp expected spike (a Black Friday weekend, a podcast feature, a product launch) and you want to know whether to give your account team a heads-up.
- Diagnosing a "this page feels slow" report and you want to understand which layer might be the source before opening a support thread.
- Briefing a developer or contractor on the performance model so the custom code they write does not undermine the platform defaults.
- Migrating a multi-site portfolio to SGEN and you need to confirm that one tenant's spike cannot drag the rest of the portfolio.
- Auditing an inherited site that is slower than it should be, and you want a checklist of architectural assumptions to validate before changing anything.
What NOT to use this for
This page is concept-level. It does not give you any of the following:
- A configuration walkthrough. (See the module docs for cache purge, image optimization, smart-loading toggles, and per-page cache rules.)
- A service-level guarantee. (See your contract and the platform status page for the formal uptime commitment language.)
- A benchmark report. (Independent measurements vary by site; talk to your account team for site-specific numbers.)
- Stack-specific implementation detail. (Vendor names, datacenter regions, and instance sizes stay engineer-only.)
- A pricing or tier-feature breakdown. (See the pricing page for what each plan tier includes.)
Architecture model
Performance, reliability, and scalability are three faces of the same set of architectural decisions. This section walks the model in three parts that share the same primitives.
The performance model — why pages feel fast. Four design decisions stack on top of one another, each absorbing a different cause of slowness.
Decision one is server-first rendering. A SGEN page is built into HTML on the server, then sent to the browser as a finished document. The browser does not have to assemble the page from many small data requests before it can show anything. The first paint lands fast because the work that turns data into pixels has already happened before the response leaves the server. This trade-off is chosen deliberately for content sites: server-first rendering optimizes first paint, which is the metric that matters most when a visitor decides whether the site feels fast.
Decision two is smart loading. Below-the-fold images, deferred scripts, and non-blocking media load on a schedule that matches the visitor's path through the page. The hero loads at full quality on arrival. Lazy assets fetch as the visitor scrolls toward them. Analytics, chat widgets, and tracking pixels run on a schedule that does not delay what the visitor sees first.
Decision three is layered caching. Repeat requests for the same content reuse the work done for the first request. The cache absorbs the load; the platform pays for the work once and serves the result many times. The layers and their cadence are walked in the cache-layer table later in this section.
Decision four is operator-tunable knobs. A small set of controls — cache durations, image optimization defaults, per-page cache rules, lazy-load thresholds — let an operator sharpen the speed for the specific shape of the site. The defaults work; the knobs exist for the sites that benefit from tuning.
The cache layers — three layers between the visitor and the origin. The cache is not one thing; it is three layers stacked, each with its own invalidation cadence and operator surface.
Browser cache lives on the visitor's device. It stores the response according to the cache directives the platform sends with each one. Browser cache is per-device and per-browser profile — clearing it on one device does not affect any other. It is the layer closest to the visitor and the layer most likely to cause an operator to see a fresh version while a visitor still sees the old one.
Edge cache lives at geographically distributed nodes positioned close to visitor populations. When a visitor's request crosses the public internet to SGEN, it lands at the nearest edge node before reaching the platform origin. If that edge node holds a valid cached copy, the visitor receives the response from the edge — no platform origin contacted, no application work done. Edge cache is shared across visitors who land at the same edge node, which is why one region can see stale content while another region sees fresh content during a propagation window after a publish.
Origin cache lives at the platform origin. It absorbs the cost of repeated work for pages that have not changed since the last build. Origin cache clears immediately on every publish; the edge and browser layers refresh on their own cadence after that signal.
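The lookup order across the three layers can be sketched as a toy model. This is a minimal illustration, not SGEN's implementation: the class name, the dictionaries standing in for each layer, and the instant edge purge are all assumptions made for the example (real edge propagation takes time, and every visitor has their own browser cache).

```python
from dataclasses import dataclass, field


@dataclass
class LayeredCache:
    """Toy model of the browser, edge, and origin layers (hypothetical names)."""
    browser: dict = field(default_factory=dict)  # one visitor's device
    edge: dict = field(default_factory=dict)     # one edge node
    origin: dict = field(default_factory=dict)   # platform origin
    renders: int = 0                             # times the application built a page

    def render(self, url: str, content: dict) -> str:
        self.renders += 1
        return f"<html>{content[url]}</html>"

    def get(self, url: str, content: dict) -> tuple[str, str]:
        """Return (body, layer that answered), filling layers on the way back."""
        if url in self.browser:
            return self.browser[url], "browser"
        if url in self.edge:
            body, layer = self.edge[url], "edge"
        elif url in self.origin:
            body, layer = self.origin[url], "origin"
        else:
            body, layer = self.render(url, content), "application"
            self.origin[url] = body
        self.edge[url] = body
        self.browser[url] = body
        return body, layer

    def publish(self, url: str) -> None:
        """Origin clears immediately and the edge is purged; the browser is untouched."""
        self.origin.pop(url, None)
        self.edge.pop(url, None)  # simplification: treated as instant here
```

In this model a publish leaves the browser copy in place, which matches the situation described above where an operator sees a fresh version while a visitor still sees the old one.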
The reliability posture — uptime, failover, and what stays running when one layer trips. Reliability is the architectural promise that a fast site stays serving. SGEN's reliability posture rests on four observable commitments.
Uptime targets are documented for the platform's hosted delivery tier and tracked against measurement that the platform publishes on its status surface. The status surface is the source of truth for current operating condition; an operator who suspects a platform-level issue checks status before assuming the problem is local to their site.
Failover is structured so that a problem in any one layer is contained by the layers around it. If the edge layer has trouble, traffic falls back to the origin — slower for the affected region during the incident, but functional. If one read replica has trouble, reads route to other replicas. If one application server has trouble, that server drops out of rotation and the others pick up. These transitions are usually invisible to visitors. If a transition is user-facing, it appears on the status surface with the impact and the estimated time to recovery.
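The "drops out of rotation and the others pick up" behavior can be sketched with a small round-robin pool. The class, method names, and server labels are hypothetical, and the health marking is manual here; how the platform actually detects trouble is not described on this page.

```python
from itertools import cycle


class ServerPool:
    """Round-robin pool that skips members marked unhealthy (illustrative only)."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._ring = cycle(self.servers)

    def mark_down(self, server):
        # Take a server out of rotation without removing it from the pool.
        self.healthy.discard(server)

    def mark_up(self, server):
        # Return a known server to rotation.
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        """Return the next healthy server, or raise if none are left."""
        for _ in range(len(self.servers)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy servers in the pool")
```

With three servers and one marked down, requests alternate between the remaining two, which is why the transition is usually invisible to visitors.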
Backup and restore sit alongside the reliability model rather than inside it. Backups protect against operator error, data loss, and intentional rollback. The reliability model protects against infrastructure or platform issues. The two work together — the reliability layers keep the site serving during transient trouble, and the Backup and Restore Architecture covers the longer-window protection against state loss.
Audit-ready logging means every meaningful action on the platform is recorded so that post-incident analysis can be precise. Operators do not have to maintain the logging; it is part of the platform's operating commitment.
The scalability model — how the same layers absorb growth from one site to a portfolio. The scaling story is built from the same four primitives in a different arrangement.
Layer one is the edge. For a typical content site, the edge absorbs eighty to ninety-five percent of total traffic. Most requests stop at the edge without ever reaching the platform application tier.
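As a rough worked example of what that absorption range means for the application tier, the arithmetic looks like this. The request volume of 100,000 is an arbitrary illustration, and the function name is hypothetical; only the eighty and ninety-five percent figures come from the text above.

```python
def origin_requests(total_requests: int, edge_hit_rate: float) -> int:
    """Requests that miss the edge and reach the application tier."""
    if not 0.0 <= edge_hit_rate <= 1.0:
        raise ValueError("edge_hit_rate must be between 0 and 1")
    return round(total_requests * (1.0 - edge_hit_rate))


for rate in (0.80, 0.95):
    reached = origin_requests(100_000, rate)
    print(f"{rate:.0%} edge hit rate -> {reached:,} requests reach the application tier")
# 80% edge hit rate -> 20,000 requests reach the application tier
# 95% edge hit rate -> 5,000 requests reach the application tier
```

The gap between the two ends of the range is a factor of four in application-tier load, which is why cache hit rate is worth watching as traffic grows.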
Layer two is the application server pool. When a request cannot be served from the edge, it reaches the application tier. The pool autoscales: as concurrent requests climb, more servers come online; as they drop, the pool shrinks. The operator does not configure this — it happens.
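One common way to express this kind of scaling rule is a target pool size derived from concurrent requests and clamped between a floor and a ceiling. The sketch below assumes that approach; the per-server capacity and the minimum and maximum pool sizes are made-up numbers, since the platform's actual policy and sizes are not published here.

```python
import math


def target_pool_size(
    concurrent_requests: int,
    per_server_capacity: int = 50,  # hypothetical requests one server handles at once
    min_servers: int = 2,           # hypothetical floor kept warm at all times
    max_servers: int = 40,          # hypothetical ceiling
) -> int:
    """Servers needed for the current load, clamped to the pool's bounds."""
    needed = math.ceil(concurrent_requests / per_server_capacity)
    return max(min_servers, min(max_servers, needed))
```

Under these assumed numbers, 500 concurrent requests would call for 10 servers, and an idle site would still keep the floor of 2.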
Layer three is the datastore tier. Application servers read from and write to the datastore. Reads in a typical content site dominate writes, so reads serve from replicas — distributing load away from the primary. Writes go to the primary and replicate out to the read pool. The datastore scales vertically with plan tier and horizontally with read-replica count.
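The read and write split can be sketched as a small router. The class and node names are hypothetical, and random replica selection is an assumption made for brevity; the fallback to the primary when no replica is available follows the graceful-degradation table later on this page.

```python
import random


class DatastoreRouter:
    """Send writes to the primary and reads to an available replica (illustrative)."""

    def __init__(self, primary: str, replicas: list[str]):
        self.primary = primary
        self.replicas = list(replicas)
        self.down: set[str] = set()  # replicas currently out of service

    def route(self, operation: str) -> str:
        if operation == "write":
            return self.primary
        if operation != "read":
            raise ValueError(f"unknown operation: {operation!r}")
        available = [r for r in self.replicas if r not in self.down]
        # If every replica is down, the primary absorbs the reads.
        return random.choice(available) if available else self.primary
```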
Layer four is asset storage and delivery. Media uploads, exports, and large files live in object storage with a content delivery layer in front. This layer scales independently of the application tier — uploads and downloads do not compete with page renders for capacity.
Where the four primitives intersect. Performance, reliability, and scalability share the same primitives in different ratios.
| Architectural primitive | Role in performance | Role in reliability | Role in scalability |
|---|---|---|---|
| Edge cache | Absorbs repeat requests for visitor-facing speed | Falls back to origin if a node has trouble | Absorbs the vast majority of total traffic per site |
| Application pool | Renders the response when the cache misses | Pool members rotate out on trouble; others pick up | Autoscales on concurrent request rate |
| Datastore tier | Serves cached query results to the application | Reads route to other replicas if one has trouble; the primary absorbs overflow | Scales vertically with plan tier; reads scale horizontally |
| Asset delivery | Serves media files close to the visitor | Cached assets keep serving if storage has trouble; new uploads queue until it returns | Scales independently of the application tier; uploads do not compete with page renders |
The cache invalidation cadence — at-a-glance reference.
| Layer | Where it lives | Invalidates on publish? | Operator can manually purge? | Typical TTL window |
|---|---|---|---|---|
| Browser cache | The visitor's device | No — browser holds its copy until its own TTL expires | No — visitor must clear their own cache | Minutes to hours, controlled by cache directives |
| Edge cache | Geographically distributed edge nodes | Yes — purge signal sent on every publish; propagation completes within two minutes typically | Yes — from the admin performance panel | Up to five minutes after publish under normal conditions |
| Origin cache | Platform origin | Yes — cleared immediately on every publish | Yes — re-publishing the page clears it | Cleared on every publish |
Where cache state surfaces in admin. Cache state is an operational signal that lives in three surfaces an operator has access to.
The admin performance panel shows the cache status of published pages, the last-invalidation timestamp for each page, and whether a manual purge is pending. This is the correct surface for diagnosing origin- or edge-layer questions about a specific page.
The browser's own developer tools show whether a response was served from the browser's cache or fetched fresh from the network. On any page a visitor reports as stale, opening the Network panel and reloading shows the source of each response. This is the most direct diagnostic for browser-layer questions.
The platform status surface reports cache invalidation incidents that affect multiple regions. If multiple visitors in multiple regions report stale content at the same time, checking status is the right first step.
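The three surfaces above suggest a simple triage order for a stale-content report, sketched below. The function name and its two inputs are hypothetical; the routing itself only restates the guidance in this section.

```python
STATUS = "check the platform status surface"
BROWSER = "browser cache: inspect the Network panel in developer tools"
PANEL = "edge or origin cache: check the admin performance panel"


def stale_content_first_step(hard_reload_fixes_it: bool, regions_reporting: int) -> str:
    """Pick the first surface to look at for a stale-content report."""
    if regions_reporting > 1:
        # Simultaneous reports from several regions point at a platform-level event.
        return STATUS
    if hard_reload_fixes_it:
        # The network had the fresh copy, so the stale one lived on the device.
        return BROWSER
    return PANEL
```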
Operational characteristics
This section covers what the operator sees, where it surfaces, and what to do with the signals.
What the platform measures, and where it shows up. Performance and reliability are observed through three surfaces operators interact with regularly.
The analytics surface shows page-level performance over time — page weight, time-to-first-paint trends, the slowest pages, the heaviest images, the third-party scripts contributing most to the load. This is the surface for "is my site getting faster or slower over time?"
The performance panel in admin shows cache state, last publish times, and the per-page invalidation history. This is the surface for "why is this specific page showing old content?"
The status surface shows current operating condition across the platform — uptime, incident history, regional health. This is the surface for "is this slowdown local to my site or platform-wide?"
Traffic tiers — what most operators see as their site grows. Different traffic tiers exercise the architecture differently. Here is the typical operator experience at each tier.
| Monthly visitors | Operator experience |
|---|---|
| Under 10,000 | Edge cache handles almost everything. The application pool is rarely busy. No tuning needed. |
| 10,000 to 100,000 | Still no operator action required. Cache hit rate starts to matter — make sure cacheable pages have not been accidentally marked no-cache. |
| 100,000 to 1 million | Consider a higher plan tier for larger datastore headroom. Watch analytics for pages that bypass the cache. |
| 1 million to 10 million | Plan-tier check, usually a conversation with your account team. Review heavy queries; verify cache policy on every page type. |
| 10 million or more | Direct planning with the platform team. Specialized configuration is available at this tier. |
Portfolio scaling: how multi-site portfolios grow. For operators running more than one site, the model extends naturally. Every site scales independently of every other site. The isolation rule: Site A getting traffic does not slow Site B. The datastore is isolated per tenant. Asset storage is isolated per tenant. Cache configuration is per site. The application pool is shared infrastructure, but autoscaling means a spike on Site A spins up more capacity rather than starving Site B.
| Portfolio size | What operators usually do |
|---|---|
| 1 to 5 sites | Manage each site individually through the dashboard. No special configuration needed. |
| 6 to 25 sites | Use the dashboard's portfolio views — cross-site analytics, shared billing, batched user management. |
| 26 to 100 sites | Org-tier conversation. Shared brand standards, batched deploy windows, portfolio-wide health monitoring. |
| Over 100 sites | Direct relationship with the platform team. Custom rollout cadences and reporting available. |
Operator-tunable knobs — what to adjust and when. The defaults work for most sites. The knobs exist for the sites that have a specific shape and need to sharpen the speed for that shape.
| Knob | Default | When to adjust |
|---|---|---|
| Cache duration per content type | Long | Shorten if content updates many times per day; lengthen if updates are rare |
| Per-page cache rule | Cache on | Set no-cache only on genuinely personalized pages |
| Image format default | WebP with original fallback | Adjust if the site relies on specific formats |
| Image quality default | Visually lossless | Adjust if the site is photography-heavy and the quality knob matters |
| Lazy-load threshold | Standard | Tighten for long-scroll designs |
Cache purge: when to do it and what happens. Most operators rarely have to purge manually; publish-triggered purges and natural expiry handle most updates. The cases where manual purge matters are real and worth knowing.
When the page content changes and the new version should be live now, normal save-and-publish triggers a targeted purge of the affected page; the cached copy is dropped and the next visitor gets the new version. This is automatic.
When an asset is replaced under the same filename, the asset cache is keyed on URL, so the old URL still serves the cached old asset. The fix is to either rename the new file or to manually purge the asset URL.
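Because the asset cache is keyed on URL, one general way to apply the "rename the new file" fix consistently is to put a short content hash in the file name, so a changed file always gets a new URL. The helper below is a sketch of that general technique, not a documented SGEN feature; the function name and the eight-character digest length are assumptions, and it expects a bare file name rather than a path.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(filename: str, data: bytes, digest_chars: int = 8) -> str:
    """Return e.g. 'hero.<hash>.jpg', where the hash is derived from the file bytes."""
    digest = hashlib.sha256(data).hexdigest()[:digest_chars]
    path = PurePosixPath(filename)
    return f"{path.stem}.{digest}{path.suffix}"
```

Uploading the replacement under its versioned name means the old cached asset is never requested again, so no manual purge is needed for that file.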
When a query result changes because of a relationship update, most relationship changes automatically purge affected query caches. Edge cases such as a long-running update or a batch import can leave query caches stale for the short window until the next natural expiry.
When a configuration change should take effect immediately, site-wide changes trigger a broader purge that affects every page using the changed block; the operator does not have to walk the site purging page by page.
| Action | Automatic purge | When to manual-purge |
|---|---|---|
| Save and publish a page | Yes — that page | Rarely needed |
| Replace an asset under same filename | No | Yes — purge the asset URL |
| Edit a taxonomy term | Yes — affected listings | Rarely needed |
| Site-wide block edit | Yes — every page using it | Rarely needed |
| Bulk import | Partial | Yes — purge the affected sections |
Capacity events worth a heads-up. Some traffic patterns deserve advance notice. Not because the platform cannot handle them, but because pre-warming caches and watching for hot spots smooths the experience.
Traffic events worth a short note to your account team:
- A marketing campaign launch with ten times or more your expected baseline (Black Friday, product launch, a press feature, a viral moment you can see coming).
- A scheduled television or podcast mention with unpredictable timing.
- A scheduled migration import of ten thousand or more pieces of content in a single batch.
- A planned subscription drop, ticket release, or any event where load will land on a specific URL at a specific minute.
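The list above can be turned into a quick pre-campaign check. This is only a sketch: the function and parameter names are hypothetical, and the thresholds (ten times baseline, ten thousand items) are taken directly from the bullets.

```python
def worth_a_heads_up(
    expected_peak: float = 0.0,
    baseline: float = 0.0,
    import_items: int = 0,
    fixed_minute_event: bool = False,
    unpredictable_mention: bool = False,
) -> bool:
    """True when any condition from the heads-up list applies."""
    traffic_spike = baseline > 0 and expected_peak >= 10 * baseline
    big_import = import_items >= 10_000
    return traffic_spike or big_import or fixed_minute_event or unpredictable_mention
```

For example, a campaign expected to bring 50,000 daily visits to a site that normally sees 4,000 crosses the ten-times threshold, so a short note to the account team is warranted.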
Examples
Three concrete walkthroughs of the model in operation.
Example one — A mid-size content site responds to a viral mention. A mid-size publication is mentioned on a popular podcast. Traffic to a specific article climbs from a few hundred visits per day to fifty thousand visits in two hours.
The first request from each region misses the edge and falls through to the application tier, which renders the article from cached query results and returns the response to the edge, where it is cached. The next thousand visitors in that region get the response from the edge node in milliseconds. The application servers never see most of the traffic. Within minutes, the edge has warm copies in every region the visitors are arriving from.
The visitor experience: every visitor sees the article fast. The site does not slow down. The application pool sees only the first request per region; the cache absorbs the rest. The operator does not have to do anything. Two days later, traffic has settled back to normal. The cache holds the article ready for the next time it surges.
Example two — A retailer launches a planned product page at a specific minute. A retailer launches a new product on a known date and time. The product page has been live as a placeholder for two weeks; on launch day, the placeholder body swaps to the launch content, the hero image updates, and inventory becomes available.
The cache state before launch: the placeholder page has been served thousands of times. Edge caches in every region hold the placeholder version. Application caches hold rendered HTML of the placeholder. The team updates the page content while it is still under the placeholder URL using the publish-later workflow that holds the new version until the scheduled time.
At the launch moment, the scheduled publish fires. The purge signal for the page goes out to every edge region, with propagation typically completing within two minutes. The new version then serves on the next request in each region. The asset cache picks up the new hero image on the first request from each region; the asset URL has changed because the file is new, so the cache treats it as a fresh asset.
The launch lands as expected. The cache absorbs the wave. The application pool autoscales for the inventory queries. No operator action during the launch is required.
Example three — A multi-brand agency with one site going viral while nineteen sites stay quiet. A multi-brand agency runs twenty client sites, none individually high-traffic. One site has a viral moment when a video is featured by a major creator. That site's traffic climbs fifty times for six hours. The other nineteen sites in the agency's portfolio see no effect.
The datastore is isolated per tenant — a heavy query on the viral site cannot lock the datastore for the others. Asset storage is isolated per tenant — the viral site's image bandwidth does not saturate the others. The application pool is shared but autoscales: a spike on the viral site adds capacity rather than starving the others.
The agency's account team watches the incident in real time on the org dashboard. No client work is disrupted on any of the other nineteen sites. The viral site's owner sees the surge, hears about it from the agency, and gets a quiet "you are fine; here is the traffic snapshot" message instead of an emergency. This is the isolation model working as designed.
Edge cases
A few performance and reliability shapes hit corners worth flagging up front.
The every-page-is-personalized site. A membership site where every page renders differently for every visitor cannot use the edge cache the way a content site can. The edge layer is bypassed; the application cache picks up the slack for the parts of the page that are the same across visitors; the personalized parts render fresh per request. The site is still fast — not as fast as a fully cacheable content site.
The third-party-script-ate-my-page case. A custom script added to every page that blocks the first paint will undermine server-first rendering. The fix is to load the script asynchronously, defer it, or move it to a lower-priority slot. The platform exposes the controls; the responsibility for using them sits with the operator who added the script.
The huge-image case. An uncompressed multi-megabyte image dropped on the homepage will be slow everywhere, regardless of caching. The platform's automatic image optimization handles most cases; the rare uploads that bypass optimization (very large originals, unusual formats) are worth reviewing before publish.
The cache-cleared-at-the-wrong-time case. A bulk import or a mass-update can trigger a wide cache purge. The next few minutes of traffic rebuild the cache. For a high-traffic site, this can look like a brief slowdown. The fix is to schedule mass updates for low-traffic windows or to use the batched-purge controls.
The hard-reload-and-still-see-old-content case. A hard reload (Ctrl+Shift+R, or Cmd+Shift+R on macOS) bypasses the browser's cached copy and forces a network request. If a hard reload still returns old content, the browser cache is not the source: the request reached the edge layer and the edge node returned its cached copy. The fix is to issue a manual edge purge from the admin performance panel.
The SEO-metadata-still-stale-in-search-results case. Search engines cache their own crawled versions of pages on their own infrastructure, which is outside SGEN's cache layers entirely. Clearing SGEN's edge cache ensures that the next time a search engine crawls the page it receives the current version — but crawl frequency is controlled by the search engine, not by SGEN. For time-sensitive metadata updates, a manual URL submission to the relevant search console requests a recrawl. This is outside SGEN's scope.
The bulk-edit-is-slow-to-propagate case. A bulk edit to fifty or more pages produces a large invalidation queue that is processed sequentially. Edge propagation for a bulk event typically completes within five to ten minutes rather than the standard two. Check the admin performance panel for queue status; the status surface flags active incidents if the queue is platform-wide.
The campaign-exceeded-the-pre-warmed-capacity case. Sometimes a campaign exceeds even the pre-warmed capacity. The autoscaling pool is climbing, but climbing has a finite rate. Cache hit rate stabilizes once the wave breaks. If the experience does not recover within a few minutes, contact your account team — specialized configuration may be available for your tier.
The graceful-degradation table — what visitors see when one layer trips.
| Layer in trouble | Visitor-facing effect |
|---|---|
| Edge cache | Traffic falls back to origin servers. Slower for the affected region during the incident, but functional. |
| One read replica | Reads route to other replicas; the primary absorbs overflow. Visitor sees no effect. |
| Asset storage | Cached assets continue serving; new uploads queue until the storage layer returns. |
| One application server | That server drops out of rotation; others pick up. Visitor sees no effect. |
What the operator can do, helps or hurts
The architecture handles scale. The operator's posture either helps or hurts.
Posture that helps the model:
- Cacheable pages stay cacheable. Do not accidentally add personalization to a page that does not need it.
- Images are sized and compressed (preferably WebP) before upload, or the default image optimization is left on.
- Forms and dynamic surfaces have rate-limiting where it makes sense.
- Custom-code injections do not add heavy synchronous third-party scripts to every page.
- Heads-up arrives before any predictable big event.
Posture that hurts the model:
- Marking every page no-cache because of one personalized block. Move the personalization to a client-side fetch instead, so the rest of the page rides the cache while the personalization renders fresh.
- Heavy datastore queries from custom code that run on every page render.
- Massive uncompressed image uploads that bypass the optimization path.
- Third-party tracking scripts loaded synchronously, blocking the page render on every page.
Common performance mistakes operators make in the first month
Three patterns show up repeatedly in the first month of running a SGEN site. Each is reversible; each is harder to clean up than to avoid.
Uploading hero images at full camera resolution. A photographer or designer drops a multi-megabyte image into the hero slot, the page goes live, and the first paint feels sluggish for every visitor. The platform's image optimization handles most cases automatically, but operators sometimes bypass it by uploading through paths that skip the optimization step. The fix is to upload through the media library, accept the default optimization, and reserve the full-resolution original for archive purposes.
Marking every page no-cache because of one personalized block. A page has one personalized greeting at the top, the operator marks the whole page no-cache, and the entire page bypasses the edge cache forever. The result: the page is slower than it needs to be for every visitor for the lifetime of the site. The fix is to keep the page cacheable and move the personalized block to a client-side fetch.
Adding heavy third-party scripts to every page. An operator installs a chat widget, an analytics script, a marketing pixel, and a heatmap tracker — all loaded synchronously on every page. Each one is reasonable in isolation; the four together undermine the first paint everywhere on the site. The fix is to load every third-party script asynchronously or deferred and to audit the script list quarterly for the ones no longer earning their cost.
The first month of performance operations sets the baseline for the year. A monthly review of cache hit rate, image sizes, and third-party scripts catches most drift before it compounds.
How custom code interacts with the model
Custom code added through the platform's custom-code surface can either ride along with the performance model or undermine it. Three patterns are worth knowing.
Asynchronous custom scripts are fine. A script marked async or deferred loads on a schedule that does not block the first paint. The page renders fast; the script runs when the browser has cycles for it.
Synchronous custom scripts in the head are a tax. A script in the head without async or defer blocks the first paint while the browser waits for it. The first paint is delayed by however long the script takes to download and run. This is the most common cause of "the site felt fast and now it does not" reports.
Inline custom CSS is fine. A small block of inline CSS in the head is part of the document and does not slow the page. Custom CSS loaded as a separate file is also fine, as long as it is sized and cached like any other stylesheet.
The pattern: custom code that loads asynchronously is invisible to the performance model. Custom code that loads synchronously in the head is a cost the operator decided to pay. Both are supported; the choice has consequences.
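A quick way to spot the costly pattern is to scan a page's HTML for external scripts in the head that carry neither async nor defer. The sketch below uses Python's standard html.parser module; the class and function names are hypothetical, and it deliberately ignores inline scripts and treats type="module" scripts as deferred, which is how browsers load them by default.

```python
from html.parser import HTMLParser


class BlockingScriptFinder(HTMLParser):
    """Collect src values of head scripts that would block the first paint."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.blocking = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "script" and self.in_head:
            attr_map = dict(attrs)
            non_blocking = (
                "async" in attr_map
                or "defer" in attr_map
                or attr_map.get("type") == "module"
            )
            if "src" in attr_map and not non_blocking:
                self.blocking.append(attr_map["src"])

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_blocking_head_scripts(html: str) -> list[str]:
    finder = BlockingScriptFinder()
    finder.feed(html)
    return finder.blocking
```

Running this over the rendered HTML of a page that "felt fast and now does not" gives a short list of scripts to move to async or defer.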
How this story evolves over time
Architecture evolves. Cache topologies improve, datastore engines get faster, edge networks add new regions. The shape on this page is the current model; the specific implementation behind each layer is reviewed and updated on a roadmap the account team can share at the appropriate detail level.
What does not change without notice:
- Server-first rendering as the default. Pages render on the server and arrive at the browser as finished documents.
- The cache layer count and shape. Browser, edge, and origin remain the three layers an operator reasons about.
- The isolation rule. Sites in a portfolio stay isolated from one another at the datastore, asset, and capacity layers.
- The operator-facing simplicity. You do not configure datacenters. You do not size cache instances. You do not tune the application server pool.
What changes over time:
- Cache topology improvements. As the edge network grows and the cache algorithms improve, the per-layer absorption rates climb. The site gets faster on the same configuration.
- Image format expansion. As new image formats become widely supported, the platform adds them to the default optimization output.
- New tunable knobs. As operators tell the platform team which shapes need more control, the knob set expands. New knobs are always optional; existing defaults keep working.
The shortest possible summary
If you read nothing else on this page, take this with you:
SGEN delivers fast pages through server-first rendering, smart loading, three cache layers (browser, edge, origin), and a small set of operator-tunable knobs. It stays dependable through autoscaling capacity, contained failover at every layer, and an observable status surface. It scales from one site to a portfolio because every site is isolated at the datastore, asset, and traffic layers. The defaults work; the knobs exist; the cache absorbs the load; the platform team handles the sizing decisions.
Send that paragraph to a stakeholder who needs the mental model and you have done the job this page is for.
Related
This concept doc sits in the architecture cluster. The siblings cover the surrounding ground:
- Data Model Overview — the queries that the datastore cache layer is caching.
- Multi-Site Architecture Overview — how cache state, datastore tenancy, and asset isolation stay separated per site in a portfolio.
- Observability and Monitoring Overview — where cache invalidation, application-pool autoscaling, and incident events surface as platform signals.
- Backup and Restore Architecture Overview — the longer-window protection against state loss that sits alongside the reliability model.
- Deployment and Release Overview — how new versions of the platform itself ship to your site without disruption.
- Extensibility Overview — how custom code interacts with the performance model.
- SGEN Glossary — definitions for edge cache, first paint, lazy loading, failover, autoscaling, and other terms used on this page.
