Eptura - Visitor - S2 - System slowness and Performance Issues

Incident Report for Eptura

Postmortem

We are grateful for your continued support and loyalty. We value your feedback and appreciate your patience as we worked to resolve this incident.

Type of Event

S2 – Visitor application performance degradation

 

Services/Modules Impacted

Eptura Visitor – Admin Dashboard
Eptura Visitor – APIs / Webhook
Eptura Visitor – Check-in / LogBook
Eptura Visitor – Kiosk
Eptura Visitor – Notifications

 

Issue Summary/Background

Between January 28 and February 10, 2026, some customers experienced intermittent performance degradation within the Eptura Visitor application. Affected users reported a range of symptoms, including slow login, delayed visitor check‑ins, sluggish dashboard loading, slower‑than‑normal API responses, and delays in receiving notifications. While not all customers were impacted, those who were experienced inconsistent performance, with periods of normal functionality followed by renewed slowdowns—most notably during peak business hours.

Throughout the event, our engineering and infrastructure teams conducted multiple investigation cycles to identify the underlying drivers of the performance fluctuations. During this time, we deployed several hotfixes and performed targeted infrastructure and database tuning to stabilize the environment. One of these hotfix deployments resulted in a temporary login disruption for a subset of customers, which was identified and resolved shortly after it occurred.

Following the final round of optimizations, including additional database adjustments and tuning, overall system performance returned to normal and has remained stable and within baseline expectations since February 10, 2026.

 

Root Cause

The performance issues in the Eptura Visitor application were caused by two related problems that created delays during periods of high activity. The first issue was a defect in the SCIM user‑lookup process. Under certain conditions, the system handled some lookup requests incorrectly, causing them to take significantly longer than expected—sometimes over 40 seconds. These slow requests consumed more system resources than normal and contributed to overall application slowness.

At the same time, the investigation identified inefficiencies in how the application interacted with the database during common operations. Some features, such as retrieving kiosk information, made many small, individual database requests instead of using a more efficient, combined approach. Under increased traffic, these repeated calls added unnecessary load and slowed down responses.
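
As an illustration only, the difference between many small requests and one combined request can be sketched as follows. The `kiosks` table, its columns, and both function names are hypothetical and are not taken from the Eptura Visitor schema; an in-memory SQLite database stands in for the production database.

```python
import sqlite3


def setup() -> sqlite3.Connection:
    """Create a hypothetical kiosks table with 50 rows in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE kiosks (id INTEGER PRIMARY KEY, location_id INTEGER, name TEXT)"
    )
    conn.executemany(
        "INSERT INTO kiosks (id, location_id, name) VALUES (?, ?, ?)",
        [(i, i % 5, f"kiosk-{i}") for i in range(1, 51)],
    )
    return conn


def fetch_one_by_one(conn, kiosk_ids):
    """Issue one query per kiosk: N round trips for N kiosks."""
    rows = []
    for kiosk_id in kiosk_ids:
        row = conn.execute(
            "SELECT id, name FROM kiosks WHERE id = ?", (kiosk_id,)
        ).fetchone()
        if row is not None:
            rows.append(row)
    return rows, len(kiosk_ids)  # second value: number of queries issued


def fetch_batched(conn, kiosk_ids):
    """Issue a single query that returns all requested kiosks at once."""
    if not kiosk_ids:
        return [], 0
    placeholders = ",".join("?" for _ in kiosk_ids)
    rows = conn.execute(
        f"SELECT id, name FROM kiosks WHERE id IN ({placeholders}) ORDER BY id",
        list(kiosk_ids),
    ).fetchall()
    return rows, 1


if __name__ == "__main__":
    conn = setup()
    ids = list(range(1, 21))
    _, slow_queries = fetch_one_by_one(conn, ids)
    _, fast_queries = fetch_batched(conn, ids)
    print(f"one-by-one: {slow_queries} queries, batched: {fast_queries} query")
```

Both functions return the same rows; only the number of database round trips differs, and each avoided round trip is time a connection is not being held.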

These slower operations placed additional pressure on the database, which was further limited by a configuration that allowed only a small number of active connections at a time. During busy hours, long‑running requests filled those connections, causing new requests to wait, queue, or time out. As requests retried, they added even more demand on the system.
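
A rough way to see why a few long-running requests can exhaust a small connection limit is Little's law: the average number of connections in use equals the request arrival rate multiplied by how long each request holds a connection. The sketch below uses made-up numbers (a pool of 10 connections, 100 fast requests per second at 50 ms each, and one slow lookup every 5 seconds holding a connection for 40 seconds); only the 40-second figure comes from this report, and the actual pool size and traffic rates are assumptions.

```python
def connections_needed(arrival_rate_per_s: float, hold_seconds: float) -> float:
    """Little's law: average connections in use = arrival rate x hold time."""
    return arrival_rate_per_s * hold_seconds


def pool_demand(workloads) -> float:
    """Total average connection demand for a list of (rate, hold_seconds) pairs."""
    return sum(connections_needed(rate, hold) for rate, hold in workloads)


def is_saturated(pool_size: int, workloads) -> bool:
    """True when average demand exceeds the pool, so new requests must queue."""
    return pool_demand(workloads) > pool_size


# Hypothetical figures for illustration; not measured values from the incident.
POOL_SIZE = 10
fast_only = [(100.0, 0.05)]                     # 100 req/s, 50 ms each
with_slow_lookups = [(100.0, 0.05), (0.2, 40.0)]  # plus 1 slow lookup per 5 s at 40 s

if __name__ == "__main__":
    for label, workloads in (("fast only", fast_only),
                             ("with slow lookups", with_slow_lookups)):
        demand = pool_demand(workloads)
        print(f"{label}: demand={demand:.1f} of {POOL_SIZE}, "
              f"saturated={is_saturated(POOL_SIZE, workloads)}")
```

Under these assumed numbers, a very small share of slow requests accounts for most of the connection demand, which is consistent with the queueing and timeouts described above.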

Together, the slow user‑lookup requests, inefficient database access, and limited connection capacity created a cycle where delays increased during peak usage and spread across multiple parts of the application. This is what caused the intermittent and sometimes significant performance degradation customers experienced during the incident window.

 

Remediation

To restore and stabilize performance for the Eptura Visitor application, we executed a series of corrective actions and hotfix deployments:

  • Immediate triage and validation:
    We initiated an urgent review of application and infrastructure health, evaluating API behavior, server performance, and database utilization to confirm where delays were originating.
  • Targeted hotfix for SCIM performance and errors:
    We deployed a fix to correct the SCIM v1 user‑lookup issue that was causing long‑running requests. The update ensured values were processed correctly, eliminated repeated failures, and reduced unnecessary load on affected services.
  • API and infrastructure scaling:
    We increased API server capacity and adjusted backend resource allocation to better distribute traffic and lower the risk of delays during peak usage.
  • First performance hotfix deployment (February 4, 2026):
    A performance-focused hotfix was released to address the most impactful API bottlenecks, improving response times across several high‑traffic endpoints. During this deployment, some customers experienced a brief login disruption, which was resolved shortly afterward.
  • Follow‑up hotfixes and database optimizations (February 8–9, 2026):
    Additional updates refined database connection handling and improved query efficiency. These changes reduced the number of sequential calls, shortened query duration, and helped prevent long‑held connections that could lead to slowdowns under peak load.
  • Security configuration adjustments:
    We adjusted specific security cipher configurations to reduce system overhead while ensuring that required security standards remained fully in place.
  • Continuous monitoring and verification:
    Throughout and after these updates, we closely monitored system metrics to verify that performance remained stable. Following these combined remediation steps, response times returned to normal baseline levels with no further widespread performance issues detected.

 

Timeline (UTC)

January 28, 2026

  • 12:32 PM – We received the first customer report of login and application slowness in Eptura Visitor.
  • 01:00 PM – Additional customer reports confirmed a broader performance issue.
  • 01:39 PM – A critical incident was opened, and an investigation began with our engineering and infrastructure teams.
  • 01:48 PM – Active triage and performance analysis were initiated.
  • 03:14 PM – We confirmed a performance degradation affecting parts of the Visitor application.
  • 03:29 PM – A public status page incident was opened to inform customers and provide updates.
  • 04:02 PM – Recent changes were identified, and deeper analysis began across the application and infrastructure.
  • 04:46 PM – Customers began confirming improved performance, and we continued close monitoring.

January 29, 2026

  • Ongoing – The platform remained under heightened monitoring. Additional reports were reviewed, and customers whose issues had improved continued to confirm resolution.

January 30, 2026

  • 01:57 PM – Some customers reported continued slowness, although overall performance appeared to be improving.
  • 04:04 PM – We increased API server capacity to improve responsiveness and identified specific authentication‑related endpoints as a contributing factor.
  • 04:06 PM – The identified performance issue was escalated to Engineering for focused remediation.
  • 04:52 PM – The first status page incident was marked resolved based on observed stability and confirmation from impacted customers.

February 2, 2026

  • A customer reported intermittent slowness at a specific location. System monitoring did not indicate a widespread issue, but we kept the service under heightened observation.

February 3, 2026

  • 08:40 AM – New customer reports of application slowness and login delays indicated a recurrence of the issue.
  • 09:15 AM – The incident was re‑escalated, and Product and Engineering teams were re‑engaged to support remediation.
  • 01:03 PM – Engineering confirmed defects in user synchronization and authentication flows and prepared two hotfixes to address them.
  • 01:18 PM – A second public status page incident was opened to communicate ongoing Visitor performance issues and our remediation efforts.

February 4, 2026

  • 12:10 AM – The first hotfix was found to have issues during review, so deployment was deferred to avoid additional disruption while corrections were made.
  • 05:10 AM – Deployment of corrected hotfixes began. Customers were advised that temporary slowness could occur during the release window.
  • 06:08 AM – Some customers experienced login failures and temporary application unavailability during deployment.
  • 06:43 AM – Login functionality was restored following completion of the core hotfix steps.
  • 07:18 AM – Customers confirmed that login issues were resolved; minor slowness remained under observation.
  • 07:22 AM – Both hotfixes were confirmed as successfully applied, with initial testing showing improved performance, though intermittent latency was still being monitored.

February 5, 2026

  • 09:14 AM – After continued monitoring and positive customer feedback, the second status page incident was marked resolved, while we continued to track minor intermittent slowness.
  • 03:11 PM – We continued to receive occasional reports of slowness, and metrics confirmed that further optimization work was required.

February 6, 2026

  • 09:44 AM – Our teams identified additional contributors to the performance issues and planned further optimizations for the upcoming weekend maintenance window.
  • 04:19 PM – A third public status page incident was opened to inform customers of ongoing intermittent performance impact and scheduled optimization work.

February 8–9, 2026

  • Weekend maintenance window – A second round of hotfixes and database optimizations was deployed to improve how the application uses the database and handles higher traffic levels.
  • February 9, 06:12 AM – Deployment of these changes was completed, and we continued active monitoring.
  • February 9, 12:19 PM – We observed significant and sustained performance improvements, and no new widespread customer reports were received.

February 10, 2026

  • 11:45 AM – Post‑maintenance monitoring confirmed stable performance across all locations. The incident remained in a monitoring state to ensure continued stability.

February 12, 2026

  • 04:32 PM – After extended stability and no new widespread reports, the third status page incident was marked resolved, and the overall event was closed.

 

Total Duration of Event

308 hours (January 28, 2026 – February 10, 2026)
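
The report does not state which timestamps bound this figure. As a sanity check under that caveat, the sketch below computes elapsed hours from two candidate start points in the timeline (the first customer report and the opening of the public status page incident) to the February 10 stability confirmation. The choice of endpoints is an assumption, not a statement of how the figure was derived.

```python
from datetime import datetime, timezone


def hours_between(start: datetime, end: datetime) -> float:
    """Elapsed time between two timezone-aware datetimes, in hours."""
    return (end - start).total_seconds() / 3600


# Timestamps copied from the timeline above (UTC).
first_report = datetime(2026, 1, 28, 12, 32, tzinfo=timezone.utc)
status_page_opened = datetime(2026, 1, 28, 15, 29, tzinfo=timezone.utc)
stable_confirmed = datetime(2026, 2, 10, 11, 45, tzinfo=timezone.utc)

if __name__ == "__main__":
    print(f"from first report:       {hours_between(first_report, stable_confirmed):.1f} h")
    print(f"from status page opened: {hours_between(status_page_opened, stable_confirmed):.1f} h")
```

Measured from the status page opening, the elapsed time rounds down to 308 hours; measured from the first customer report, it is about three hours longer.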

 

Preventive Actions

To prevent similar performance issues and directly address the identified root causes, we are implementing the following improvements:

  • Stronger validation of SCIM and authentication changes
    We are enhancing pre‑production testing for SCIM and authentication flows, including parameter handling (e.g., dates and UUIDs) and end‑to‑end performance testing during simulated peak hours.
  • Expanded performance testing for high‑traffic APIs and database usage
    We are broadening load and stress testing for kiosk, check‑in, and dashboard APIs with a focus on query efficiency, batching, and connection utilization to eliminate long‑running operations.
  • Refined database connection and capacity planning
    We are formalizing capacity planning with scheduled reviews of database connection thresholds and server scaling to ensure headroom ahead of projected growth and seasonal peaks.
  • Stricter review for security and infrastructure configuration changes
    We are establishing a mandatory cross‑functional review (Product, Engineering, SRE, Security) for encryption, cipher, and infrastructure updates, including explicit performance‑impact assessments prior to rollout.
  • Improved monitoring and alerting for latency and error patterns
    We are strengthening real‑time monitoring to detect rising response times, connection saturation, and SCIM‑related errors earlier, with clearer automated alerts and runbooks for faster mitigation.
  • Safer deployment strategy for hotfixes
    We are adopting phased rollouts, enhanced rollback procedures, and proactive customer communications to minimize risk and customer impact during remediation releases.
  • Ongoing database health and index management
    We are instituting regular maintenance and index reviews to reduce fragmentation, retire unused indexes, and maintain query performance within baseline targets.
  • Sustained verification
    We will continue close monitoring to confirm stability and ensure that performance remains within defined baselines.
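
As a minimal sketch of the latency and connection-saturation alerting described above, the check below raises an alert when the 95th-percentile response time or the share of database connections in use crosses a threshold. The function names and threshold values (2,000 ms and 80%) are hypothetical placeholders and do not describe the monitoring actually deployed.

```python
import math


def percentile(samples, pct: int) -> float:
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def check_alerts(latencies_ms, active_connections: int, pool_size: int,
                 p95_threshold_ms: float = 2000.0,
                 saturation_threshold: float = 0.8):
    """Return a list of alert messages; an empty list means no alert fired."""
    alerts = []
    p95 = percentile(latencies_ms, 95)
    if p95 > p95_threshold_ms:
        alerts.append(f"p95 latency {p95:.0f} ms exceeds {p95_threshold_ms:.0f} ms")
    utilization = active_connections / pool_size
    if utilization >= saturation_threshold:
        alerts.append(f"connection pool at {utilization:.0%} of capacity")
    return alerts


if __name__ == "__main__":
    healthy = [100.0] * 20
    degraded = [100.0] * 18 + [41000.0] * 2  # two requests taking about 41 s
    print(check_alerts(healthy, active_connections=2, pool_size=10))
    print(check_alerts(degraded, active_connections=9, pool_size=10))
```

A percentile is used instead of an average because a small number of very slow requests, like the long-running lookups in this incident, can be hidden by a healthy mean.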

All services are operating normally, and we will continue to monitor the platform closely to ensure consistent performance. We appreciate your partnership and the patience you demonstrated while we worked through this issue. Your feedback and collaboration play an important role in helping us strengthen the Eptura Visitor experience, and we remain committed to delivering the reliability and performance your organization expects.

Posted Feb 24, 2026 - 11:00 UTC

Resolved

We are pleased to inform you that the slowness issue has been resolved. Our team has completed the necessary actions and verified that the service is now functioning normally.
A Root Cause Analysis (RCA) will be conducted to understand the incident in detail and will be made available on our Status Page within 10 days.
Thank you for your patience and cooperation throughout this process. If you have any further questions or concerns, please feel free to reach out.
Posted Jan 30, 2026 - 16:52 UTC

Update

We’re actively monitoring the system and will keep you updated as soon as new information becomes available.
Thank you for your patience and understanding.
Posted Jan 29, 2026 - 18:37 UTC

Update

We are continuing to monitor the system closely and will provide additional updates as they become available.

Thank you for your continued patience.
Posted Jan 29, 2026 - 00:00 UTC

Monitoring

System performance has improved, and the response times remain stable.

We are monitoring the system closely and will share further updates as needed.

Thank you for your continued patience.
Posted Jan 28, 2026 - 16:47 UTC

Update

We are currently aware of slowness across the application, impacting page load times and certain workflows. Our teams are actively investigating the root cause and have started observing improved response times. Users may continue to experience intermittent slowness.

We continue to investigate and monitor system behavior to ensure sustained recovery. We will provide another update if there are any changes or once the issue is fully resolved.

Thank you for your patience.
Posted Jan 28, 2026 - 15:37 UTC

Update

We are continuing to investigate this issue.
Posted Jan 28, 2026 - 15:31 UTC

Investigating

We have received reports of issues with the Visitor system and are investigating.
Posted Jan 28, 2026 - 15:29 UTC
This incident affected: Eptura Visitor (Admin Dashboard, APIs / Webhook, Check in / LogBook, Notifications, SAML Sign On).