September 6, 2014

Migrating Legacy PHP Apps to Heroku

There are lots of ways to host, deploy and maintain applications. We often use Heroku because giving up some control gets you a lot of operational benefits that are expensive to build and maintain yourself. In this post we're going to talk about the hurdles of moving a legacy PHP application to Heroku.

Note: It's been years since I've been active with PHP and the rest of my team hasn't had the pleasure. There's a good chance I'm missing obvious things. If I get something wrong please reach out on twitter or email and I'll update this article.

I have a lot to say about Heroku. Their 12 factor app design philosophy speaks to me. Once upon a time, I helped bring 3 of the 12 factors (codebase, logging and disposability) to a large company. We still had many, many problems, but even just improving those 3 was a huge boon to reliability and productivity.

Heroku is somewhat similar to the Docker based hosting that's starting to mature. They were an influence, if not an inspiration, for many of the LXC based systems we've been seeing. They enforce strict code and data separation that not only inspires scalable designs but also increases security and reliability. Their API lets you easily clone environments for development, testing, and demos. Heroku's operational support, logging, PostgreSQL, buildpacks, scalability of machines, and ease of use and setup make them a great fit for most early stage projects or projects without a permanent team.

We've been tasked with bringing a homegrown tracking and reporting application into a stable state while we start a rewrite. It's a big ball of nightmares running on a "snowflake" and now it's ours.

I should also point out that the app isn't built on Wordpress, which has horrible code and data separation that makes it a nightmare to deploy, debug, and recover from errors.

Administrators usually let Wordpress manage itself and set up nightly code and database backups. Then they just hope the backups will work together.

We should also take a good long hard look at what PHP puts in its system configuration (php.ini) vs. in its runtime configuration. Many, many of the language-provided functions (e.g. mail()) are configured as language options instead of application options. I'm happy to say the worst bits have been deprecated, but this is a legacy app so, of course, we'll have to deal with them anyway.

I'll break the conversion into the different areas of concern.

Differences between Heroku and "traditional hosting"

This usually comes up with Wordpress, where "wordpress just works on my host" but "it doesn't work" on Heroku. There are several reasons for this.

Heroku maintains an ephemeral filesystem which is reset every time a new dyno starts. Combining this with Wordpress's auto update functionality results in updates and file uploads that just disappear.

Heroku prefers PostgreSQL and you need to go out of your way to get a MySQL database via an addon. Most PHP apps prefer MySQL despite the project's questionable future.

Traditional web hosts have a single web server. Heroku calls their web servers dynos and lets you easily run multiple ones. There's no major difference between a single web server and a dyno until you add a second dyno. Since the dynos operate in isolation and don't share anything, some basic functionality, including sessions and file uploads, will appear to break. This isn't an issue with Heroku, as you'd have the same issues with two web servers, but you'll need to adjust to accommodate it.

PHP version and configuration

Heroku uses Composer for versioning and dependency management. Chances are, your legacy app doesn't. Composer is a command line tool that runs against your project's composer.json config file. For our project we could use the latest version of PHP. We also needed the built in MySQL functions. Initially our file looked like this.

{
  "require": {
    "php": ">=5.5",
    "ext-mysql": "*"
  }
}

You'll need to run composer update after each change to the composer.json to also update the composer.lock file.

Now we needed to tweak some of the php.ini settings. Heroku makes this easy by allowing you to make a .user.ini file in your project that it will load. Ours enabled outputting errors to the logs and enabled the mysql extension that we required from composer.

; php config
display_errors = Off
html_errors = Off
log_errors = On
error_reporting = E_ALL & ~E_DEPRECATED
extension = mysql.so

MySQL Database Setups

Now that we have PHP with the MySQL extension set up, we need a database to connect to. The only offering on the Heroku platform is ClearDB. They are a pretty good MySQL shop with fantastic support. (I had a decent conversation with their CTO once about some networking issues.) It bothers me they don't have machines in Heroku's data center, but this hasn't been a performance issue for any of these apps yet. An alternative would be using Amazon's RDS, which shares the same data center but requires a bit more setup and management.

Once we have a database, we'll need to connect to it. Heroku addons publish connection info as URLs made available in environment variables. While you can look up the connection info, don't hard code these credentials in your app. Instead, parse the URL into a PHP array (or hash, as most people call it).

$connection_info = parse_url($_ENV['CLEARDB_DATABASE_URL']);

//database server
define('DB_SERVER', $connection_info['host']);

//database name
define('DB_DATABASE', substr($connection_info['path'], 1));

//database login name
define('DB_USER', $connection_info['user']);

//database login password
define('DB_PASS', $connection_info['pass']);
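If you'd rather keep the parsing in one place, it can be wrapped in a small function. This is only a sketch: parse_database_url() is a hypothetical helper, not something from our app or from Heroku, and the URL below is made up. It assumes the URL has the usual scheme://user:pass@host/dbname shape and falls back to MySQL's default port when none is given.

```php
<?php
// Hypothetical helper (not part of the original app): split a
// Heroku-style database URL into the pieces legacy code expects.
function parse_database_url($url)
{
    $parts = parse_url($url);

    // parse_url() returns false on badly malformed input, and an
    // empty or unset value will have no host at all.
    if ($parts === false || !isset($parts['host'], $parts['path'])) {
        throw new InvalidArgumentException('Malformed database URL');
    }

    return array(
        'host'     => $parts['host'],
        // Assumption: default to MySQL's standard port if the URL omits it.
        'port'     => isset($parts['port']) ? $parts['port'] : 3306,
        // The path is "/dbname", so strip the leading slash.
        'database' => ltrim($parts['path'], '/'),
        // Credentials may contain URL-encoded characters.
        'user'     => isset($parts['user']) ? urldecode($parts['user']) : '',
        'pass'     => isset($parts['pass']) ? urldecode($parts['pass']) : '',
    );
}

// Made-up URL for illustration; in the app you would pass the value
// of the CLEARDB_DATABASE_URL environment variable instead.
$info = parse_database_url('mysql://someuser:s3cret@db.example.com/app_db?reconnect=true');
```

Throwing on a missing host means a forgotten environment variable fails loudly at startup instead of turning into a confusing connection error later.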

Email

PHP includes the mail() function, which has been the bane of mail services on shared hosting since its inception. It makes it very easy to use the server it's running on to send email. It used to be the school of thought that email was an operating system service. This was back when your user account on a machine was also probably your email account. This is no longer the case. These days, applications connect to remote email servers with dedicated credentials. There's a lot to be said for centralizing email, as it helps prevent spam. However, you now need the help and permission of a 3rd party to send email reliably, whereas in the past anyone could do it from any computer.

Philosophy aside, Heroku won't let you send email directly from their machines; they require you to use a 3rd party such as Sendgrid. The mail() function won't work.

Sendgrid knows how to send email and offers a litany of addon services to help you do it reliably. Their non-email infrastructure appears to be a mess. Bugs in their website and account creation systems have been a problem for me in the past. However, for the basic use case they usually "just work".

We opted to use Sendgrid's PHP API client because talking SMTP isn't something PHP can do without a library. If we're going to bring in a library, let's bring in one that works with the mail service we're using.

We'll add it to our composer.json:

{
  "require": {
    "php": ">=5.5",
    "ext-mysql": "*",
    "sendgrid/sendgrid": "2.1.1"
  }
}

And we'll change our calls to mail() to use the SendGrid API.

$sendgrid = new SendGrid($_ENV['SENDGRID_USERNAME'], $_ENV['SENDGRID_PASSWORD']);
$email = new SendGrid\Email();
// Note: $to is used for the sender and reply-to as well as the recipient;
// swap in your own sender address if they should differ.
$email->addTo($to)->
     setFrom($to)->
     setFromName("My C00L W3BSITE")->
     setReplyTo($to)->
     setSubject($subject)->
     setText($message)->
     addHeader('X-Sent-Using', 'SendGrid-API')->
     setHtml("<strong>{$message}</strong>");
$sendgrid->send($email);

File uploads

We had none in this application, thank god. Heroku's ephemeral filesystem allows you to upload files, but you can only store them temporarily on the dyno. You'll want to upload them to a cloud file storage service such as Amazon S3. If we ever need to figure this out I'll update this article.

Sessions

PHP has a $_SESSION variable that holds a PHP array of data. It's a superglobal, PHP's term for a built-in variable that's available in every scope, so it can be accessed from anywhere. To use a PHP session you call session_start() and by default it will read/write a unique identifier to a cookie, and store your session data in a temporary file. If you have more than one dyno they can't read each other's sessions. One technique to work around this is to store the sessions in memcache. Heroku has a nice article about using memcache to store PHP sessions which we'll follow.

First, we'll add the memcache extension to our composer.json and update the lockfile.

{
  "require": {
    "php": ">=5.5",
    "ext-mysql": "*",
    "ext-memcached": "*",
    "sendgrid/sendgrid": "2.1.1"
  }
}

And then set up a memcache addon in Heroku. I'm using memcachier today.

Since sessions are handled by PHP itself instead of your app, you'll need to set up memcache in the .user.ini PHP configuration. I added the following lines to do a persistent connection to memcachier's memcache servers. All the connection info is provided in the environment variables.

; Session handling
session.save_handler=memcached
memcached.sess_binary=1
memcached.sess_sasl_username=${MEMCACHIER_USERNAME}
memcached.sess_sasl_password=${MEMCACHIER_PASSWORD}

; Use persistent connections
session.save_path="PERSISTENT=myapp_session ${MEMCACHIER_SERVERS}"

You can test session handling by logging into your app and then running heroku restart to restart your dynos, which will destroy the ephemeral filesystem. When it starts back up you should still be logged in.
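If the app doesn't have a convenient login to test with, a throwaway counter page does the same job. The following is a rough sketch, not code from our app: bump_visit_counter() and the 'visits' key are names I made up for illustration. The function takes the session array by reference so it can be tried with a plain array outside of a web request.

```php
<?php
// Hypothetical helper: bump a visit counter stored in a session-like array.
function bump_visit_counter(array &$session)
{
    if (!isset($session['visits'])) {
        $session['visits'] = 0;
    }
    $session['visits']++;

    return $session['visits'];
}

// In a real page you would use the session superglobal:
//   session_start();
//   echo 'Visits this session: ' . bump_visit_counter($_SESSION);
//
// Here a plain array stands in for $_SESSION so the sketch runs anywhere.
$fake_session = array();
bump_visit_counter($fake_session);
$visits = bump_visit_counter($fake_session);
```

Load the page a few times, run heroku restart, and load it again. If the sessions are really stored in memcache, the count should keep climbing instead of starting over.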

Logging

You'll want to add a logging add-on such as Papertrail to keep a small archive of your logs.

Conclusions

You'll now have a much more reliable and easier to manage setup, and can take advantage of all the services that Heroku has to offer.

It should also be noted that if you ever get hacked, the server cleanup will be a lot easier, and you'll never run the risk of undiscovered malicious files lying around.

I hope this helps!

-Francis

Roborooter.com © 2024