Intermittent PHP-FPM failures on Opal4 and Opal5 Sunday 18th October 2020 15:34:00


We've been investigating intermittent failures of PHP sites hosted on Opal4 (Dallas) and Opal5 (Frankfurt) which result in "503 Service Unavailable" responses on websites. We'll post more details here as they become available.

We've rebuilt most of our PHP stack and this issue is now resolved. Updated documentation and maybe a blog post are on the way.

Over the last few weeks we’ve dealt with various PHP-FPM related problems. After trying various combinations of fixes to the server, Apache, and PHP-FPM itself, we’ve developed a new, more stable PHP-CGI stack to replace PHP-FPM. With PHP and WordPress powering large parts of the Internet (and businesses), having a stable, reliable stack for them is critical.

We rolled the new stack out to all servers yesterday afternoon and found some edge cases that our previous testing had missed.

We’ll be rolling those changes out again tonight. While we do, there may be brief downtime of less than 10 minutes across PHP-based sites as the update happens. Once the sites are updated we’ll begin monitoring for any breakages that didn’t show up in our testing and fix them ASAP.

We’ll follow up with a blog post on exactly what we’ve changed and how things will work going forward.

Our troubleshooting is ongoing, but we've improved the stability of PHP-FPM quite a bit over the past few days and are seeing no customer-facing problems at this time. We'll leave this incident open until we're satisfied that we've got this issue nailed down.

There have been intermittent PHP-FPM outages on Opal5 in the past several hours. Troubleshooting is ongoing.

PHP-FPM for PHP 7.3 on Opal5 was briefly disrupted about 15 minutes ago. We'll continue to troubleshoot and monitor.

We've just deployed a potential fix and will continue to monitor.