This article lists known issues with Pingdom. Any big or common issue is listed here together with a workaround. We update this article when we become aware of a new issue and when an existing issue goes away, so you might want to hit the subscribe button in the top right section of this article.
Current and historical outages on the Pingdom side are listed on status.pingdom.com.
A mismatch between root cause analysis and error given.
The Root Cause Analysis is performed after an outage is confirmed, so if the outage is short the analysis can sometimes fail to detect it. Full details:
When one of our probe servers cannot connect to a site or server, Pingdom's system first marks the check as Unconfirmed Down and then asks another probe server to try the same connection. We call this a Second Opinion. We pick a second probe server that is as geographically distant from the first as possible, which makes it easier to determine where the issue is. Your check (site or server) is only marked as confirmed Down if the second test also fails.
After the second opinion has been made, both servers each send one more request for the analysis. This means that the Root Cause Analysis is made after the initial errors were found by our probe servers. Sometimes, especially with really short-lived errors, the error is already gone by the time the analysis runs, so the analysis finds nothing wrong. The error that our servers reported first, however, is still what they saw when they initially requested the site or server.
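The sequence described above can be sketched in a few lines of Python. This is only an illustration of the order of events, not Pingdom's actual implementation: the names `run_check` and `flaky_site` and the error text "HTTP 500" are hypothetical, and what happens after a successful second opinion is not modeled.

```python
from typing import Callable, Optional

# A probe returns an error message, or None when the request succeeds.
Probe = Callable[[], Optional[str]]


def run_check(first: Probe, second: Probe) -> dict:
    """Walk through first test, Second Opinion, and the follow-up analysis."""
    result = {"state": "up", "reported_error": None, "analysis": []}

    first_error = first()
    if first_error is None:
        return result

    # The first probe failed: the check is Unconfirmed Down.
    result["state"] = "unconfirmed down"
    result["reported_error"] = first_error

    # Second Opinion: only a second failure confirms the outage.
    if second() is None:
        return result

    result["state"] = "confirmed down"

    # Analysis: both probes each send one more request after confirmation.
    result["analysis"] = [first(), second()]
    return result


def flaky_site(failures: int) -> Probe:
    """Hypothetical site that fails the first `failures` requests, then recovers."""
    remaining = [failures]

    def probe() -> Optional[str]:
        if remaining[0] > 0:
            remaining[0] -= 1
            return "HTTP 500"  # made-up error text for the example
        return None

    return probe


site = flaky_site(2)
outcome = run_check(site, site)
print(outcome)
```

In this example the outage only lasts for two requests, so the check is confirmed Down with the originally reported error, while both analysis requests come back without an error. That is the mismatch this section describes.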
To troubleshoot further, use the Test Result Log, which you can open from the icon next to the Root Cause Analysis. It gives you the exact time of each of our servers' requests, as well as an additional error message. The same errors may also appear in any access logs you have for your website, and comparing those logs against the exact times our servers sent their requests can help you further identify the cause of the outage.
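One way to do that comparison is to look for access log entries that fall within a few seconds of each request time from the Test Result Log. The sketch below assumes you have already parsed both sources into Python `datetime` values in the same time zone. The function name `entries_near`, the five-second window, and the sample data are all made up for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple


def entries_near(
    probe_times: List[datetime],
    access_entries: List[Tuple[datetime, str]],
    window_seconds: int = 5,
) -> Dict[datetime, List[str]]:
    """Map each probe request time to the access log lines close to it."""
    window = timedelta(seconds=window_seconds)
    matches: Dict[datetime, List[str]] = {}
    for probe_time in probe_times:
        matches[probe_time] = [
            line
            for logged_at, line in access_entries
            if abs(logged_at - probe_time) <= window
        ]
    return matches


# Made-up sample data: one probe request and two access log entries.
probe_times = [datetime(2024, 1, 1, 12, 0, 0)]
access_entries = [
    (datetime(2024, 1, 1, 11, 59, 58), "GET / 500"),
    (datetime(2024, 1, 1, 12, 0, 30), "GET / 200"),
]

for probe_time, lines in entries_near(probe_times, access_entries).items():
    print(probe_time.isoformat(), lines)
```

If a request time has no matching entries at all, the request may never have reached your web server, which points towards a network or connection-level problem rather than an application error.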
CRITICAL - Cannot make SSL connection in root cause analysis
Our root cause analysis works by running additional tests after an outage is confirmed. It is, however, a bit outdated, so in some cases the response in the analysis does not match the real reason for the outage. The most common one looks like this:
CRITICAL - Cannot make SSL connection
17145:error:14077410:SSL routines:SSL23_GET_SERVER_HELLO:sslv3 alert handshake failure:s23_clnt.c:602:
GET /YOURURL HTTP/1.0
All this means is that the root cause analysis could not find the issue. To troubleshoot, use the Test Result Log instead, which lists the timestamp and error reason for each request. These timestamps and errors, together with your own access logs, can help you find the reason for the outage. These are not false alerts, just somewhat faulty root cause analysis reports.