When users experience problems with a service, they frequently turn to Downdetector and other online resources to verify and troubleshoot the issue. Many will visit Downdetector directly; some will seek information using search engines such as Google. Others will turn to Twitter and other social media outlets for validation and help. Downdetector monitors and analyzes signals from these platforms in real time to automatically detect incidents and service disruptions in their very early stages.
A small number of users reporting a problem does not constitute a large-scale incident. To ensure that incidents are correctly represented, Downdetector calculates a baseline volume of typical problem reports for each service it monitors. Downdetector reports an incident only when the number of problem reports is significantly higher than that baseline, which verifies that the issue affects a large group of individuals.
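The baseline comparison described above can be sketched roughly as follows. This is only an illustration, not Downdetector's actual method: the use of a median as the baseline, the `factor` multiplier, the `min_reports` floor, and all function names are assumptions made for the example.

```python
from statistics import median


def baseline_volume(historical_counts):
    """Typical report volume for a service.

    Assumption: the baseline is taken as the median of past report counts.
    """
    return median(historical_counts)


def is_incident(current_count, historical_counts, factor=3.0, min_reports=10):
    """Return True only when reports are well above the typical volume.

    `factor` and `min_reports` are hypothetical tuning values: the current
    count must exceed `factor` times the baseline and also reach an absolute
    minimum, so that a handful of reports never counts as an incident.
    """
    baseline = baseline_volume(historical_counts)
    return current_count >= min_reports and current_count > factor * baseline
```

With a history of `[2, 3, 4, 3, 2]`, five reports would not qualify under these assumed settings, while forty would.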
Downdetector status, reports, and insights are based entirely on signals from consumers of online services. Downdetector is not integrated into the infrastructure or platforms of any of those services.
Sources of Data
- Indicators: When a user visits a company status page on a Downdetector website, they can explicitly report a service problem by clicking the Red Button and selecting which of the company’s services they are currently having a problem with. These options are called Indicators and are configured for every company monitored. Indicators are presented in the local language of the specific Downdetector country website. When a user submits a report with an Indicator specified, that information is collected along with contextual details such as the date, time, geolocation, and the specific Downdetector website used.
- Tweets: Each monitored company is configured with a list of inclusion words that are used to identify Tweets that may be relevant to the status of that company’s online service. Using the Twitter API, a statistical subset of the Tweets that match the inclusion words is collected, analyzed, and filtered to determine whether the content signals an issue or problem with the monitored company. Each collected Tweet is analyzed using a proprietary, multi-stage, language-specific natural language processing algorithm. Tweets are scored on their relevance to the monitored company and the sentiment of their content to determine whether they should be counted as problem reports. Tweets that match or exceed a minimum confidence threshold for the specific language are counted as reports.
- Others: A proprietary Downdetector mechanism monitors user activity from other sources to identify when issues occur. For example, a sudden increase in visits to a company status page on a Downdetector website under certain conditions is a strong signal that users are experiencing problems with that service. To the degree possible, these signals are collected along with contextual details such as the date, time, geolocation, and the specific Downdetector website involved.
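To make the Tweet filtering steps above more concrete, here is a minimal sketch. The scoring algorithm itself is proprietary and is not reproduced: the example assumes each Tweet already carries a confidence score from some upstream scorer. The `ScoredTweet` type, the threshold values in `MIN_CONFIDENCE`, and the function names are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical per-language minimum confidence thresholds.
MIN_CONFIDENCE = {"en": 0.8, "de": 0.85}


@dataclass
class ScoredTweet:
    text: str
    language: str
    confidence: float  # assumed output of an upstream NLP scorer, in [0, 1]


def matches_inclusion_words(text, inclusion_words):
    """Case-insensitive check for any of the company's inclusion words."""
    lowered = text.lower()
    return any(word.lower() in lowered for word in inclusion_words)


def count_problem_reports(tweets, inclusion_words):
    """Count Tweets that match an inclusion word and meet their language's threshold."""
    count = 0
    for tweet in tweets:
        if not matches_inclusion_words(tweet.text, inclusion_words):
            continue
        threshold = MIN_CONFIDENCE.get(tweet.language)
        # Tweets in languages without a configured threshold are ignored.
        if threshold is not None and tweet.confidence >= threshold:
            count += 1
    return count
```

In this sketch a Tweet is counted only if it passes both stages, mirroring the "match, then score, then threshold" order described in the text.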
A look into a service page on the Downdetector consumer website
Every company/service page on the Downdetector consumer website includes information on what’s happening in real time, based on consumer-driven signals and reports. These insights include:
- Current Status: Downdetector computes the strength of evidence for an incident by comparing the current number of reports with the appropriate baseline. Based on the strength of evidence and the duration for which that evidence has been observed, a company is placed in one of three states:
  - No problems: there is no evidence or only weak evidence that the company is experiencing an incident.
  - Possible problems: there is moderate evidence, sustained for a sufficient duration, that the company may be experiencing an incident.
  - Problems: there is strong evidence, sustained for a sufficient duration, that the company is experiencing an incident.
- Indicators: A list of the company’s services that consumers most commonly use. Indicators allow consumers to select the type of service they are having an issue with.
- Outages reported in the last 24 hours chart: This chart shows the problem reports submitted in the past 24 hours compared to the typical volume of reports by time of day. The chart also shows a baseline value: a variable baseline of the typical volume of reports and indicators, calculated for each 15-minute interval of each of the previous 365 days, where that data exists.
- Live outage map: Downdetector uses GeoIP data from MaxMind to infer latitude and longitude coordinates for each problem report (except Tweets) based on the IP address of the user who generated the report. When available, information about the user’s region, including country and city, is also provided. In addition, Downdetector asks mobile users for permission to collect location details, which contribute to the location information for reports when possible.
- Comments: Downdetector uses a third-party system, Disqus, to manage user comments on each Downdetector website. User comments are not counted as problem reports, and they do not contribute to the incident detection system. However, they can contain helpful information about the underlying problems users are experiencing.
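As a rough illustration of how the three Current Status states described above could be derived from evidence strength and duration, consider the sketch below. It is not Downdetector's implementation: measuring evidence as the ratio of current reports to the baseline, the `moderate` and `strong` cut-offs, and the `min_intervals` duration requirement are all assumptions chosen for the example.

```python
def evidence_strength(current, baseline):
    """Assumed measure of evidence: ratio of current reports to the baseline.

    The baseline is floored at 1 to avoid dividing by zero.
    """
    return current / max(baseline, 1)


def classify_status(observations, moderate=2.0, strong=4.0, min_intervals=2):
    """Map recent observations to one of the three states.

    `observations` is a list of (current, baseline) pairs, oldest first,
    one pair per time interval. Evidence must hold for the last
    `min_intervals` intervals to count as sustained.
    """
    recent = observations[-min_intervals:]
    if len(recent) < min_intervals:
        return "No problems"  # not enough history to show sustained evidence
    # The weakest recent interval decides: evidence must hold throughout.
    weakest = min(evidence_strength(c, b) for c, b in recent)
    if weakest >= strong:
        return "Problems"
    if weakest >= moderate:
        return "Possible problems"
    return "No problems"
```

Taking the weakest interval in the window is one simple way to express "for a sufficient duration": a single spike followed by a return to normal volume does not change the status in this sketch.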