How To Eliminate Internal Status Emails Using PagerDuty And StatusPage
C2FO is the world’s market for working capital and risk-free profit. C2FO is the only working capital exchange that allows companies to optimize their working capital positions in a live, bid/ask environment. Companies across the globe use C2FO to increase their operating income while simultaneously producing vital working capital flows to their supply chain.
User: Patrick McDonald, Director of Network Ops
Infrastructure: Manages 20 servers in AWS, 10 in SoftLayer’s data centers
Team: Full 24x7 on-call organization with a 3-person Ops team in addition to Patrick
Challenges: Redundant status emails and missed alerts
Before using StatusPage and PagerDuty, Patrick spent the majority of his day communicating incident status to the rest of the company whenever backend issues arose. Instead of focusing fully on fixing the issue at hand, Patrick had to copy Jira ticket information into manual status emails. He also had to reply individually to every internal email from colleagues looking for more information on the current status of production systems.
Patrick and his team rely on Nagios to monitor their systems, but had neither the time nor the resources to customize alerts and escalation workflows within Nagios. Without the proper workflows, alert fatigue set in. To improve both system reliability and organization-wide communication, his team needed a solution that made the team and the company aware of issues immediately, without taking on the overhead of managing a system themselves.
Solution: Streamline incident response and communication
In an effort to eliminate repetitive status communication and save Patrick a day’s worth of time during and after an outage, C2FO set up StatusPage and integrated it with PagerDuty and HipChat. Now, the whole company has one place to check during an outage, and every employee can opt in to real-time email and SMS status notifications. Whenever Nagios detects an issue, it sends the alert to PagerDuty, and PagerDuty routes it to the right on-call engineers for the job. While the engineers are working to restore service, StatusPage notifies the employees who have opted in to notifications through C2FO's status page. This automatic process provides C2FO with one source of truth to check during an outage.
With StatusPage and PagerDuty working together, Patrick can now focus on restoring services during outages, knowing that everyone else in the company can use C2FO’s internal status page as the authoritative source for status information.
"Before StatusPage, when we had an outage, I would spend a significant amount of time talking to employees about the outage and giving status updates, leaving no time to help fix the actual problem. Now that we have StatusPage implemented, and hooked it up with PagerDuty, that time has been significantly reduced. StatusPage is THE authoritative source of information within our organization."
"It’s easy for an email to slip through if someone has been up 14 hours already. PagerDuty allowed us to ensure we respond to every actionable alert and has improved our resolution times by 300%."
StatusPage.io + PagerDuty Integration
To set up the integration, C2FO has created several rules around their PagerDuty services, such as 'Nagios - Production' pictured below. In this example, whenever the Nagios - Production service triggers an alert in PagerDuty, StatusPage flips the 'Production Environment' component to 'Degraded Performance' and opens an incident on the status page using a pre-existing template. As the incident progresses in PagerDuty, StatusPage updates the incident using another pre-made template, and finally resolves the incident when it receives the resolution from PagerDuty.
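For readers who think in code, the rule described above can be sketched as a small lookup table. This is only an illustrative Python model, not StatusPage's or PagerDuty's actual API: the names RULES, StatusPageState, and apply_event are hypothetical, and mapping a PagerDuty acknowledgement to an 'identified' incident update is an assumption about what "as the incident progresses" might look like.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical rule table: PagerDuty event type -> (component status, incident status).
# The "acknowledge" row is an assumption; the article only says the incident
# is updated "as the incident progresses".
RULES = {
    "trigger": ("degraded_performance", "investigating"),
    "acknowledge": ("degraded_performance", "identified"),
    "resolve": ("operational", "resolved"),
}


@dataclass
class StatusPageState:
    """Toy model of one status page component and its current incident."""
    component: str = "Production Environment"
    component_status: str = "operational"
    incident_status: Optional[str] = None
    history: List[str] = field(default_factory=list)


def apply_event(state: StatusPageState, service: str, event_type: str,
                watched_service: str = "Nagios - Production") -> StatusPageState:
    """Apply a PagerDuty event to the page if it matches the rule's service."""
    if service != watched_service or event_type not in RULES:
        return state  # Events from other services leave the page untouched.
    state.component_status, state.incident_status = RULES[event_type]
    state.history.append(event_type)
    return state


page = StatusPageState()
apply_event(page, "Nagios - Production", "trigger")
apply_event(page, "Nagios - Production", "resolve")
```

In this sketch, a trigger from any service other than 'Nagios - Production' is ignored, which mirrors the idea that each rule is scoped to one PagerDuty service.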
Using Incident Templates
C2FO uses 3 separate templates as shown below for creating, updating, and resolving incidents.
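To give a rough idea of how three such templates fit together, here is a minimal Python sketch using the standard library's string.Template. The template wording, the TEMPLATES dictionary, and the render helper are all hypothetical; C2FO's real templates live in StatusPage and their text is not reproduced here.

```python
from string import Template

# Hypothetical template text, one per incident stage (create, update, resolve).
TEMPLATES = {
    "create": Template(
        "Investigating: $component is experiencing $impact. "
        "On-call engineers have been paged."
    ),
    "update": Template("Update: engineers are working to restore $component."),
    "resolve": Template("Resolved: $component is operating normally again."),
}


def render(stage: str, **fields: str) -> str:
    """Fill in the template for the given stage with incident details."""
    return TEMPLATES[stage].substitute(**fields)


message = render("resolve", component="Production Environment")
# message == "Resolved: Production Environment is operating normally again."
```

Keeping the wording in templates means nobody has to write a status message by hand during an outage; only the component name and impact change from one incident to the next.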
Viewing The Incident
Using the pre-made rules and templates, StatusPage will automatically create incidents based on PagerDuty triggers, allowing any C2FO employee to view the latest incident information in real time and receive instant notifications.
Receiving Email And SMS Notifications
All employees, from the customer support team to C-Level Executives, can opt into notifications, keeping everyone in the company informed during an incident.
To test the integration out for yourself, create a free account today at statuspage.io/signup and watch the PagerDuty integration video below. Feel free to give us a shout as well if you have any questions!