The GoDaddy DNS event (which I wrote about here) has been the subject of many a post-mortem and water-cooler conversation in the web performance world for the last week. In addition to the many well-publicized issues that have been discussed, there was one more hidden effect that most folks may not have noticed – unless you use Firefox.
Firefox uses OCSP (Online Certificate Status Protocol) lookups to validate SSL certificates. When you visit a new site and connect using SSL, Firefox checks with the certificate's OCSP responder that the cert has not been revoked. The results of the lookup are cached and stored for some time (I have heard 3 days, but this could be incorrect) before Firefox checks again.
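To make that caching behavior concrete, here is a minimal Python sketch of a TTL-based result cache. This is emphatically not Firefox's actual code: the 3-day lifetime is just the unconfirmed figure I mentioned above, and do_ocsp_lookup is a hypothetical stand-in for the real network check.

```python
import time

OCSP_CACHE_TTL = 3 * 24 * 60 * 60  # the rumored 3-day lifetime, in seconds

# cert_id -> (time of last check, result of that check)
_cache: dict[str, tuple[float, bool]] = {}

def do_ocsp_lookup(cert_id: str) -> bool:
    # Hypothetical stand-in for the real check: resolve the responder's
    # hostname, open a TCP connection, and send the OCSP request over HTTP.
    return True

def check_cert(cert_id: str) -> bool:
    entry = _cache.get(cert_id)
    if entry is not None:
        checked_at, is_valid = entry
        if time.time() - checked_at < OCSP_CACHE_TTL:
            return is_valid  # fresh cache hit: no network round trip
    # Cache miss or stale entry: do the full lookup. This is the path
    # that stalled for every GoDaddy-signed cert during the outage.
    is_valid = do_ocsp_lookup(cert_id)
    _cache[cert_id] = (time.time(), is_valid)
    return is_valid
```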
Before the security wonks in the audience get upset, realize that I'm not an OCSP or SSL expert, and I would love comments and feedback that help the rest of us understand exactly how this works. What I do know is that anyone who came to a site that relied on an SSL cert issued and/or signed by GoDaddy at some point in its validation path discovered a nasty side-effect of this otherwise great idea when the GoDaddy DNS outage occurred: if you can't reach the cert signer, the performance of your site will be significantly delayed.
Remember this: It was GoDaddy this time; next time, it could be your cert signing authority.
How did this happen? Performing an OCSP lookup requires opening a new TCP connection so that an HTTP request can be made to the OCSP provider. A new TCP connection requires a DNS lookup. If you can’t perform a successful DNS lookup to find the IP address of the OCSP host…well, I think you can guess the rest.
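To see where the chain breaks, here is a rough Python sketch of just the first step. It assumes the third-party cryptography package and a PEM certificate saved at example.pem (a hypothetical path): it pulls the OCSP responder's URL out of the cert's Authority Information Access extension, then attempts the DNS resolution that has to succeed before any OCSP request can even be sent.

```python
# Sketch only: why an OCSP check stalls when the responder's DNS is down.
# Assumes the 'cryptography' package and a PEM certificate (with an
# Authority Information Access extension) at ./example.pem.
import socket
from urllib.parse import urlparse

from cryptography import x509
from cryptography.x509.oid import AuthorityInformationAccessOID, ExtensionOID

with open("example.pem", "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read())

# The cert's AIA extension names the OCSP responder to query.
aia = cert.extensions.get_extension_for_oid(
    ExtensionOID.AUTHORITY_INFORMATION_ACCESS
).value
ocsp_urls = [
    desc.access_location.value
    for desc in aia
    if desc.access_method == AuthorityInformationAccessOID.OCSP
]

for url in ocsp_urls:
    host = urlparse(url).hostname
    try:
        # The DNS lookup that must succeed before the TCP connection and
        # HTTP request can happen. During the GoDaddy outage, this is
        # where the browser sat waiting for a timeout.
        socket.getaddrinfo(host, 80)
        print(f"{host}: resolvable, the OCSP request could proceed")
    except socket.gaierror:
        print(f"{host}: DNS failed; the OCSP check blocks until timeout")
```

Multiply that timeout across every new SSL connection from a Firefox user, and the "significantly delayed" I mentioned above starts to add up fast.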
Unlike other third-party outages, these are not ones that can be shrugged off. These outages affect page rendering by blocking the download of the mobile or web application content you present to customers.
I am not someone who can comment on the effectiveness of OCSP lookups in increasing web and mobile security. But OCSP lookups in Firefox are one more indication of how complex the design and management of modern online applications have become.
Learning from a near-disaster and preventing it from happening again is more important than a disaster post-mortem. The signs of potential complexity collapse exist throughout your applications, if you take the time to look. And while something like OCSP may look like a minor inconvenience, when it affects a discernible portion of your Firefox users, it becomes a very large mouse scaring a very jumpy elephant.
When I asked whether traditional Web performance still mattered, the post generated a flurry of comments and questions of a kind I hadn't seen in a long time.
After some reflection and discussions with people who have been tackling this problem for longer than I have, the answer is yes, it does matter. However, synthetic Web performance measurement will not matter the way it does now. The synthetic approach will decrease in importance within fully evolved companies, organizations that have strong cultures of Web performance.
In these organizations, the questions change as the approach becomes foundational and integral to the operation of the online business. Ways of examining competition and performance improvement evolve, and the focus moves from the perspective of "We have a problem" to one of "Our customers/visitors have a problem."
The shift is fundamental and critical. For as long as I have been in the business, synthetic measurements have served as a proxy for customer experience. But unless you get into the browser, out to where and how the customer uses the online application, the margin of error will remain large.
The customer is not an operational issue. There is no technical fix for perceived performance.
There is no easy solution for evolving the experience of performance.