Investigating Degraded App Performance

Incident Report for JobNimbus

Postmortem

Platform Performance Degradation - March 13, 2026

Beginning yesterday, March 12, at 10:48 PM MDT, our platform experienced degraded performance affecting login times, payment processing, and general application responsiveness. We understand how critical JobNimbus is to your daily operations and sincerely apologize for the disruption.

What happened: An internal process generated an unexpectedly high volume of database operations, which impacted authentication, payments, and API response times across the platform. Our existing performance monitors did not trigger overnight, as the degradation did not fully manifest until normal morning usage amplified the load on our infrastructure.
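
For illustration only, here is a minimal sketch of the kind of load-based check described above: alerting on database operation volume against a time-of-day baseline, rather than on user-facing response times alone. A latency-only monitor can stay quiet overnight because light traffic masks the extra load. All names, numbers, and thresholds below are hypothetical, not our production configuration.

```python
# Hypothetical sketch: flag database load that is abnormal *for the time
# of day*, even while user-facing response times still look healthy.
from dataclasses import dataclass

@dataclass
class LoadSample:
    hour: int            # hour of day, 0-23
    ops_per_sec: float   # observed database operations per second

# Hypothetical per-hour baselines learned from historical traffic.
BASELINE_OPS = {h: 400.0 if 6 <= h <= 20 else 50.0 for h in range(24)}
RATIO_THRESHOLD = 3.0    # alert when load is 3x the baseline for that hour

def is_abnormal(sample: LoadSample) -> bool:
    return sample.ops_per_sec > BASELINE_OPS[sample.hour] * RATIO_THRESHOLD

# 900 ops/sec at 11 PM is 18x the overnight baseline, so the alert fires
# even though total load is still below the daytime peak...
assert is_abnormal(LoadSample(hour=23, ops_per_sec=900.0))
# ...while the same 900 ops/sec at noon is within normal daytime range.
assert not is_abnormal(LoadSample(hour=12, ops_per_sec=900.0))
```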

What we did: Our engineering team identified and resolved the root cause, restoring normal database and system performance. We also accelerated processing of items that had queued up during the delays. Site performance returned to normal by 8:35 AM MDT, and all lagging processes were cleared by 10:16 AM MDT. The payments impact was isolated to a small number of specific payment attempts; our Fintech team is working through those inconsistencies today and reaching out to affected accounts as needed.

What we're doing next: We are implementing additional automated monitoring to detect abnormal infrastructure load patterns independent of user-facing response times, hardening our safeguards to prevent recurrence, and adding rate-limiting protections to better isolate workloads across our distributed infrastructure.
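
As a rough illustration of the rate-limiting piece, the sketch below shows a simple token-bucket limiter that caps how quickly any single internal workload can issue database operations, so a runaway process is throttled rather than starving other work. The class, workload names, and limits are hypothetical, not our actual implementation.

```python
# Hypothetical sketch: a per-workload token bucket that throttles a runaway
# internal process before it can monopolize shared database capacity.
import time

class TokenBucket:
    def __init__(self, rate_per_sec: float, burst: float):
        self.rate = rate_per_sec    # steady-state operations/sec allowed
        self.capacity = burst       # short bursts up to this many operations
        self.tokens = burst
        self.last = time.monotonic()

    def try_acquire(self) -> bool:
        """Return True if one operation may proceed, False if it should wait."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# One bucket per workload keeps a misbehaving job from crowding out the rest.
limits = {
    "bulk_backfill": TokenBucket(rate_per_sec=100.0, burst=200.0),
    "payments": TokenBucket(rate_per_sec=500.0, burst=500.0),
}

def execute(workload: str, op) -> bool:
    if limits[workload].try_acquire():
        op()            # proceed against the database
        return True
    return False        # throttled: caller backs off or queues the work
```

In practice, limits like these are often enforced at the connection-pool or database-proxy layer rather than in application code, but the isolation principle is the same.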

We take the reliability of our platform seriously and recognize that we fell short today. Thank you for your patience, and please don't hesitate to reach out to our support team if you have any concerns.

Posted Mar 13, 2026 - 14:24 MDT

Resolved

Update: Normal operations restored. We will post details from our root cause analysis by end of day today.
Posted Mar 13, 2026 - 10:16 MDT

Update

Update: We have deployed backend changes and are seeing the expected improvements. We are monitoring as the system recovers and queued work continues to drain.

Estimated time for delayed processing to fully clear: ~35 minutes. We will provide further updates as needed.
Posted Mar 13, 2026 - 10:01 MDT

Monitoring

Update: Application performance is back at normal levels, and we are monitoring the final processing of backend queues.
Posted Mar 13, 2026 - 09:15 MDT

Update

Update: We have addressed the root cause and are seeing application performance return to normal. We are mitigating the delays in backend queue processing and will provide another update in 15 minutes.
Posted Mar 13, 2026 - 08:56 MDT

Identified

Update: We have identified the root cause and are working to remediate it and mitigate the impact.
Posted Mar 13, 2026 - 08:25 MDT

Update

Update: We continue to see abnormal back pressure and elevated database utilization. As morning traffic scales up, the additional load is causing further disruptions.

While we investigate the root cause, we are also pursuing mitigations in parallel to restore normal operations. We will continue to provide updates.
Posted Mar 13, 2026 - 08:10 MDT

Update

We are currently experiencing disruptions to logins and page loads. This is caused by high utilization on a core database, which is creating back pressure across our distributed systems.

Our team is actively investigating the root cause and evaluating mitigation strategies. We will provide updates as we learn more.
Posted Mar 13, 2026 - 07:49 MDT

Investigating

We are observing increased latency and error rates across the web application. We are actively investigating.
Posted Mar 13, 2026 - 07:00 MDT
This incident affected: Login and Web Application Performance.