Apr 20, 2016 by Berkay Mollamustafaoglu
In the coming weeks, OpsGenie will help buyers looking for a reliable, scalable, and customizable
alerting and incident management solution by comparing the features, toolsets, and functionality of
OpsGenie, PagerDuty, and VictorOps in a series of detailed blog posts. Our goal is to shed light on
who does what, and on the real differences between the three popular technologies. We will concentrate
on the areas of our platform that we believe matter most when choosing an alerting and incident
management solution for Dev & Ops teams and the IT community in general. This week we will focus on
Email Integration.
PART 1: Email Integration -- A comparative assessment of the direct contrasts between OpsGenie,
PagerDuty, and VictorOps.
OpsGenie’s email integration enables customers to integrate OpsGenie with any system that can send
alerts via email. Email integration is the most commonly used integration method by our customers since
it is easy to use and almost any system out there can send emails.
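As a rough illustration of why email works as a universal integration path, here is a minimal Python sketch of the kind of alert email a monitoring system might compose. The addresses, the `[ALERT]` subject prefix, and the `build_alert_email` helper are all hypothetical and are not OpsGenie's actual format; the sketch only builds the message and does not send it.

```python
from email.message import EmailMessage


def build_alert_email(source, summary, details, to_addr):
    """Build a plain-text alert email that a monitoring system could send."""
    msg = EmailMessage()
    # Sender domain is a placeholder, not a real monitoring host.
    msg["From"] = f"{source}@monitoring.example.com"
    msg["To"] = to_addr
    msg["Subject"] = f"[ALERT] {summary}"
    msg.set_content(details)
    return msg


alert = build_alert_email(
    source="nagios",
    summary="Disk usage above 90% on db-01",
    details="Filesystem /var is at 93% capacity.",
    to_addr="alerts@yourcompany.example.com",  # hypothetical integration address
)
```

Any tool that can produce a message like this, from a cron job to a full monitoring suite, can feed an email-based integration without a dedicated plugin.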
Apr 5, 2016 by Berkay Mollamustafaoglu
For most of us in ops, it is vital to be notified as soon as possible about problems that impact the
services we provide. It’s often a race against time to restore the service or to prevent an outage. But
not all alerts require an immediate response; some can wait. Enabling users to deal efficiently with
alerts that don’t require an immediate response is just as important for preventing alert fatigue and
keeping us fresh.
At OpsGenie, our mission is to empower our users to handle critical as well as non-critical,
non-urgent incidents efficiently.
Mar 25, 2016 by the OpsGenie Team
The OpsGenie team recently had a thorough and heated discussion (KAPOW!!) on who would be better with
on-call alerts and incident management: Superman or Batman? Who would come out as the winner when
pitted against each other in a war of on-call alerts and response time? So, we thought we would hash it
out here on our blog in a completely fictional format. We’ll try to examine each area of alerting and
incident management to see who we think we would want on our on-call team.
Mar 11, 2016 by Nadia Mehra
As if you needed another reason to love OpsGenie and all its capabilities: we released an OpsGenie app
for the hi-tech, sleek Apple Watch, so now you can get the most out of your weekends and look stylish
doing it.
Mar 3, 2016 by Berkay Mollamustafaoglu
Continuing the discussion on how OpsGenie can help alleviate alert fatigue, we will examine the bulk
actions that on-call employees can take to reduce the excessive alerts that often hinder operations.
Feb 25, 2016 by Berkay Mollamustafaoglu
The concept of “Alert Fatigue” is well known in industries such as healthcare, and awareness is increasing in IT operations as well.
Fighting alert fatigue has been a key design objective for OpsGenie since our inception.
As summarized in the earlier post, some of the key capabilities that OpsGenie provides can be used to alleviate alert fatigue.
In a two-part series, I go into more detail on how these features can improve the alert signal-to-noise ratio.
Feb 8, 2016 by Berkay Mollamustafaoglu
Since we launched the OpsGenie phone call routing feature last year, we’ve had a great response from
customers. So much, in fact, that we’re dusting off this blog post from last year and updating it for
everyone who is not yet familiar with the feature. Is it easy to use? Yes, it is! You see, OpsGenie
routes alerts to the appropriate on-call individual using policies, on-call schedules, and so on.
Before the launch last year, we heard similar questions from a number of our OpsGenie customers, such
as “Can we route phone calls to the right person like we route the alerts?” This turned out to be a
great question, one that resonated with many of our customers. For a product team, customer feedback
like this is priceless!
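To make the idea of schedule-based routing concrete, the following is a minimal Python sketch of looking up who is on call at a given moment. The `Shift` type, the `on_call_at` function, and the sample names and times are all assumptions made for illustration; this is not OpsGenie's implementation or API.

```python
from datetime import datetime
from typing import NamedTuple


class Shift(NamedTuple):
    person: str
    start: datetime  # inclusive
    end: datetime    # exclusive


def on_call_at(schedule, when):
    """Return the person whose shift covers `when`, or None if nobody is on call."""
    for shift in schedule:
        if shift.start <= when < shift.end:
            return shift.person
    return None


# Hypothetical two-shift schedule for a single day.
schedule = [
    Shift("alice", datetime(2016, 2, 8, 0, 0), datetime(2016, 2, 8, 12, 0)),
    Shift("bob", datetime(2016, 2, 8, 12, 0), datetime(2016, 2, 9, 0, 0)),
]

recipient = on_call_at(schedule, datetime(2016, 2, 8, 9, 30))
```

The same lookup can serve an alert or an incoming phone call: once the on-call person is known, only the delivery channel differs.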
Jan 29, 2016 by Berkay Mollamustafaoglu
As an alert notification solution, our first priority is to ensure that the right person is notified when
there is a problem. OpsGenie sends multiple notifications through several channels to ensure that
critical alerts don’t get missed.
As crucial as that is, if an alert notification system just stops at “waking you up”, it becomes part of the problem rather than a solution.
Jan 4, 2016 by Tuba Öztürk
Every service provider wants their services to be available 24x7x365. But outages and planned maintenance are inevitable occurrences for online software services.
Dealing with outages and communicating with users during the outage is as important as the availability of the services provided.
To keep users informed, many service providers use web-based “status pages” that contain up-to-date information about the health of the services, incidents, and what the provider is doing to resolve the issues.
OpsGenie is an incident management system for Dev & Ops teams. Customers use OpsGenie to consolidate the alerts generated by their monitoring systems and route them to the right people using on-call schedules and escalations.
Because OpsGenie is an essential tool during outages and holds vital information about the incidents, our customers have been asking whether we can create “status pages” programmatically based on the alerts generated in OpsGenie.
Responding to this request, we’ve taken up the challenge of providing a solution to manage status pages for OpsGenie customers.
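As a sketch of what deriving a status page from alerts could look like, the Python snippet below maps open alerts to a per-service status. The alert fields (`service`, `state`, `critical`), the three status levels, and the `service_statuses` function are assumptions chosen for illustration and do not reflect OpsGenie's data model.

```python
# Status levels, ordered from least to most severe (assumed for this sketch).
SEVERITY_ORDER = ["operational", "degraded", "outage"]


def service_statuses(services, alerts):
    """Derive a status per service from the currently open alerts."""
    statuses = {name: "operational" for name in services}
    for alert in alerts:
        if alert["state"] != "open":
            continue  # closed alerts no longer affect the status page
        service = alert["service"]
        if service not in statuses:
            continue  # ignore alerts for services not shown on the page
        level = "outage" if alert["critical"] else "degraded"
        # Keep the most severe level seen so far for this service.
        if SEVERITY_ORDER.index(level) > SEVERITY_ORDER.index(statuses[service]):
            statuses[service] = level
    return statuses


# Hypothetical alerts for three hypothetical services.
alerts = [
    {"service": "api", "state": "open", "critical": False},
    {"service": "api", "state": "open", "critical": True},
    {"service": "email", "state": "closed", "critical": True},
]
page = service_statuses(["api", "email", "web"], alerts)
```

Here `page` reports the most severe open alert per service, so a closed critical alert on `email` leaves that service shown as operational.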
Dec 18, 2015 by Kadir Türker Gülsoy
As long as our applications are in production, boosting uptime and avoiding outages is the highest priority for us as
developers and operations teams. Despite great care, achieving 100% uptime is a challenging task for even the most
rigorous DevOps teams. Imagine that one of your data centers stops responding and, in turn, your email service is
completely out, or your payment service goes offline during Black Friday. Remember the AWS outage in April 2011 that
lasted four days and affected countless cloud services. It is a good example that outages happen even in the most
secure environments. Now what? Are you going to examine huge log files to find out what went wrong? Are you going to
notify all of your operational teams and developers at the same time to investigate the cause? Unless you allocate large
resources for chaos engineering like Netflix does, you will most likely have very limited time to overcome the issue, so
those aren’t realistic options for most organizations.