Integrating Chat With Your Status Page
One of our customers, Mathias Meyer from Travis CI, wrote an awesome blog post today on 'Operating Your Site with Hubot'. Mathias has open-sourced his Hubot script, allowing you to automatically update your status page with just one line in your chat client. A few other customers have asked about this integration so we thought we'd go ahead and share parts of the post below. Thanks, Mathias!
(all content below reposted with permission from their blog)
Operating Your Site with Hubot - Travis CI
While we're still a pretty small team, we heavily rely on Campfire for our daily work and for communicating. Adding Hubot to the mix was only natural.
While Hubot is great for the pug bombs alone, we drew some more inspiration from GitHub's daily workflow, in particular for the operational part of Travis CI.
For one, I wanted a seamless way to push incident updates to our status page.
Let's have a look at how this works out.
We're using the fine services from StatusPage.io for our status page. I heckled them early on to throw an API our way so we can integrate it with Hubot. They were kind enough to ship one very shortly after, and we were off!
The integration supports a few very simple commands. One of them lets you update the status of specific components. For instance, to set the status of our component "Build Processing" to a degraded state, all that's required is the command:
/status Build Processing degraded.
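To make the shape of that command concrete, here is a minimal sketch of how a chat line like this could be split into a component name and a status keyword. It is written in Python for illustration only and is not the actual Hubot script. The names `parse_status_command`, `StatusCommand`, and `KNOWN_STATES` are hypothetical, and the list of status keywords is an assumption, since the post only shows "degraded".

```python
import re
from typing import NamedTuple, Optional

# Assumed status keywords; the post only demonstrates "degraded".
KNOWN_STATES = ("operational", "degraded", "down")


class StatusCommand(NamedTuple):
    """Hypothetical container for a parsed '/status' chat command."""
    component: str
    state: str


# The component name may contain spaces, so it is matched lazily and the
# last word must be one of the known status keywords. A trailing period
# is tolerated so the command can end a sentence in chat.
_PATTERN = re.compile(
    r"^/status\s+(?P<component>.+?)\s+(?P<state>%s)\.?$" % "|".join(KNOWN_STATES),
    re.IGNORECASE,
)


def parse_status_command(line: str) -> Optional[StatusCommand]:
    """Return the parsed command, or None if the line does not match."""
    match = _PATTERN.match(line.strip())
    if match is None:
        return None
    return StatusCommand(match.group("component"), match.group("state").lower())


if __name__ == "__main__":
    print(parse_status_command("/status Build Processing degraded"))
```

Treating the final word as the status keeps multi-word component names such as "Build Processing" intact without requiring quotes in chat.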
Whenever a component is in a non-operational state, a status message should go along with it to explain what's going on.
Here's an example from a recent outage where our build processing was having issues because of problems with the infrastructure handling our private cloud setup. We noticed an increasing number of errors and updated our status page right away.
The incident immediately pops up on our status page and on the Twitter feed for our system status.
During an outage, we send updates to the page with the '/status update' command. The same goes for resolving an issue once the site has recovered from the incident.
Insert from StatusPage: Here's what the incident now looks like on Travis CI's page after the update.
You may wonder why we're going through the trouble of building the scripts and integrating them rather than going through the beautiful web app that's powering StatusPage.io.
Switching contexts is expensive, even more so during an outage. We're a small team, and the less context-switching is required, the more we can focus on communication and handling the incident.
In the spirit of Travis CI, we open sourced our Hubot integrations for StatusPage.io and OpsGenie. It's a continuous effort, and we're working on adding more of our essential services to the mix.