Troubleshooting problems from a chat room with Slack and OpsGenie

Jun 17, 2016 by Berkay Mollamustafaoglu

With OpsGenie, you have the option of executing actions directly through our OpsGenie app. We described how this capability can be used to gather additional information, and to help alert recipients assess problems efficiently, in the post “You woke me up, now what?”

Through Slack, you can (1) forward alert activity to Slack channels and (2) allow users to interact with alerts: acknowledge, comment, close, etc. Refer to our blog post titled “Bi-Directional Integration with Slack”.

When you combine these capabilities, you can execute custom commands, such as ping or traceroute, directly from a Slack channel.

OpsGenie Slack Diagram

Here’s how you execute custom actions:

  1. When you create alerts in OpsGenie, you can specify actions relevant to each alert. (A network-related alert may have actions like ping and traceroute, an application problem may have an action that gathers info from a log file, etc.)
  2. OpsGenie forwards all alert activity to Slack. So, when there is a new alert or when a note is added to an alert, users can see it in their Slack channels.
  3. Thanks to Slack’s support for command execution, users can execute the custom action from Slack (/Genie commands). One of the commands the OpsGenie integration supports is the “exec” command, which executes the custom action.
  4. OpsGenie passes the alert and the action executed by the user to the customer’s systems. The Marid utility subscribes to these actions and executes the relevant script.
  5. Marid collects the output and adds it to the alert as a note, which is passed on to Slack, so the user can see the output of the command.
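
To make steps 4 and 5 more concrete, here is a minimal sketch of the kind of script that could sit behind a custom action. It is written in Python purely for illustration and is not Marid's actual script interface. The `ACTION_COMMANDS` table, the function names, and the note format are all hypothetical. The idea is simply to map an action name to a fixed command, run it, and return the output as text that could be attached to the alert as a note.

```python
import shlex
import subprocess

# Hypothetical mapping from alert action names to fixed command templates.
# Only actions listed here can be executed.
ACTION_COMMANDS = {
    "ping": ["ping", "-c", "4", "{host}"],
    "traceroute": ["traceroute", "{host}"],
}


def build_command(action, host):
    """Return the argument list for an allowed action, or raise ValueError."""
    template = ACTION_COMMANDS.get(action)
    if template is None:
        raise ValueError(f"unsupported action: {action}")
    if not host or host.startswith("-"):
        # Refuse values that the command could mistake for an option.
        raise ValueError(f"invalid host: {host!r}")
    return [part.format(host=host) for part in template]


def run_action(action, host, runner=subprocess.run, timeout=30):
    """Run the command for an action and return text suitable for an alert note."""
    command = build_command(action, host)
    # No shell is involved: the command is passed as an argument list.
    result = runner(command, capture_output=True, text=True, timeout=timeout)
    if result.returncode == 0:
        output = result.stdout
    else:
        output = result.stderr or result.stdout
    return f"$ {shlex.join(command)}\n{output.strip()}"
```

Calling `run_action("ping", "example.com")` on a machine with `ping` installed would return the command line followed by its output. Keeping the commands in a fixed table, rather than running whatever text arrives from the chat room, limits what a user can trigger from Slack to the actions you have chosen to expose.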

There are numerous advantages to executing actions directly from the chat room. The problems are visible to users in the chat room, as well as to anyone who gets alert notifications from OpsGenie and uses OpsGenie’s app or web UI. If the alert is escalated, the next person has full visibility into what has been done. Making relevant actions available on alerts also helps guide users to follow a common procedure.

Take a look at this short screencast to see how it all comes together.

