Wednesday 28 July 2010

Automating So You Don't Forget

This is a bit of an introductory-level post/rant/tutorial, but I've been peppered by enough "why on earth would you do this?" questions by various (seemingly experienced) project team members and on various mailing lists that I thought I'd just write my own take on this and point people to it when useful.


I'm pulling my (semihemidemiexistent) hair out on four different PHP projects at the moment. Not because they take all my time (they don't, unfortunately), but because three of them are in "maintenance" mode and express that in different ways. Take version control: one uses Mercurial (my favourite DVCS package); another, Subversion (once "subversive," now the "safe" non-DVCS choice); and the third (tragically), git. Each project has its own coding standards. One of those standards is actually widely enough used that PHP CodeSniffer comes with support for it right in the tin; the others are relatively easy to code "sniffs" for. (Do remember that none of the three "maintenance" projects was using CodeSniffer or an equivalent, and all three have made only very sporadic use of their main VCS repositories.)

Wait a minute...now I've got to remember which standards go with which projects? And oh, yeah, it would be Really Nice™ to have any changes automatically saved in version control... if they're worthy.

What do I mean by "worthy?" Well, before I worry overmuch about how code is formatted, I should be able to prove that it works properly. After all, the most beautifully formatted code that doesn't work is still (essentially) useless. This, of course, is where a tool like PHPUnit comes in. Once you have sufficient coverage of your code with automatable tests, especially if you write the tests before you write (new) code, you can make changes confidently and quickly, because a) your tests prove that the code works as expected, and b) you're making sensible use of a (D)VCS, so that when your wonderful new code goes south and doesn't come back, you can follow your virtual-breadcrumb trail back up the face of the cliff. Only after PHPUnit blesses the code should CodeSniffer get a crack at it.

The new folks are scribbling away: "first test everything, then comply with standards, and then update version control." The rest of you are saying "hang on a minute; that problem's been sorted in any of several different ways."

Precisely. If you're developing in the Java world, you're spoilt for choice: you can do perfectly reasonable build/test/deploy automation using Ant, or if you want to keep a large number of people (allegedly) gainfully employed managing a J2EE-on-steroids project, you can go for Maven.

In the PHP world, we've got a nice "little" analogue to Ant called Phing. It will quickly become "dead-finger" technology; you'll wonder how (or why) you ever did a reasonably "serious" project without it. And yet, most of the open-source PHP projects I've seen (on SourceForge and elsewhere) don't use such a tool; they rely on error-prone, manual steps. This manual process, with steps easily forgotten or mangled, is the source of many bugs in released software, in any language.

Enter Phing (or equivalent). You set up the moral equivalent of a makefile with the steps you want to have performed the same way in the same order, every time. Phing supports properties, which can be stored separately from the "master" build file that references them. This allows you to set up consistent process and policy (defined in the build file) and plug in the values for a specific project using the separate properties file.
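To make the "policy in one place, project values in another" split concrete, here is a minimal sketch in Python rather than in Phing's own build-file syntax. It assumes a simple key=value properties format with # comments; the key names (vcs, coding.standard, tests.dir) and the default values are made up for illustration, not anything Phing defines.

```python
# Shared policy: what every project gets unless it says otherwise.
# These keys and values are hypothetical, chosen only for this sketch.
DEFAULTS = {"vcs": "hg", "coding.standard": "PEAR", "tests.dir": "tests"}


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # no "=" on the line; ignored in this sketch
        props[key.strip()] = value.strip()
    return props


def project_settings(properties_text):
    """Start from the shared defaults, then apply one project's overrides."""
    settings = dict(DEFAULTS)
    settings.update(parse_properties(properties_text))
    return settings


if __name__ == "__main__":
    legacy = "# the Subversion project\nvcs = svn\ncoding.standard = Zend\n"
    print(project_settings(legacy))
```

The point of the design is that moving to another project means swapping the properties text, not editing the process itself.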

So how much difference does all this make? Let's take an example set of steps, some variation of which I follow in my build files:

  1. First, clean out all the files created by steps that come later (like test reports);
  2. Then, run unit tests, displaying the output as they run. If tests fail, stop;
  3. Verify compliance with your chosen coding standards; if a problem is found, stop. Either fix the problem if it's in a file you've touched or add the file to the ignore list if it's a legacy file;
  4. I like to run phpDocumentor to automatically generate developer documentation from comments left in the code. CodeSniffer will check these, too, so by the time phpdoc gets its grubby virtual paws on your code, it shouldn't find any problems;
  5. If all is well, then it's on to version control. I have Phing show a "diff" report of what's changed since the last checkin, and then prompt me for a checkin comment. If I want to run the whole process but not check in to VCS (maybe I'm coming back to a project after a while away and just want to see the earlier steps run), I can hit the Return key, and my build file will skip the VCS checkin because I've supplied an empty comment (which it checks for).
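The control flow behind those five steps is small enough to sketch. What follows is not my actual build file, just a Python illustration of the logic: run the steps in order, stop at the first failure, and skip the check-in when the comment comes back empty. The function and step names are hypothetical, and each step is a plain callable returning True or False; in real life each would shell out to the tool in question.

```python
def run_build(steps, ask_comment, check_in):
    """Run (name, action) steps in order, halting at the first failure.

    Each action returns True on success. The check-in happens only when
    every step has passed and the comment is non-empty. Returns the names
    of the steps attempted, with "checkin" appended if a check-in was made.
    """
    attempted = []
    for name, action in steps:
        attempted.append(name)
        if not action():
            return attempted  # a failed step stops everything after it
    comment = ask_comment().strip()
    if comment:
        check_in(comment)
        attempted.append("checkin")
    return attempted


if __name__ == "__main__":
    ok = lambda: True  # stand-in for a tool run that succeeded
    steps = [("clean", ok), ("test", ok), ("sniff", ok),
             ("docs", ok), ("diff", ok)]
    # An empty comment (just hitting Return) skips the check-in.
    print(run_build(steps, lambda: "", print))
```

Because the comment is only asked for after every step has passed, a failing test never gets as far as prompting you, let alone touching the repository.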

Great, so (since I've followed a few conventions), all I need to do is type "phing" at the command line and it's off to the races. Trivially easy to use and, much more importantly, proof against a very high level of idiocy.

What's that? You in the back... I'm putting the cart before the horse, you say? I shouldn't do a process that drives VCS checkin, but a VCS checkin "hook" that does the validation and so on instead?

To some degree, that's a matter of taste. From a very practical perspective, though, having your build-and-test automation drive VCS instead of the other way 'round means that you can use any VCS operable from a command line, with minimal pain moving between projects. Not every VCS implements a pre-commit hook in the same way; some apparently don't implement them at all. (Yes, we know they're toys, but they're "enterprisey" big-ticket toys. Some managers will buy anything.) So, by having a single-command process execution/enforcement tool, you'll generally find that the internal and external quality of your project improves considerably and quickly; you'll also find that the risk involved with sweeping changes or audacious new features drops to a more comfortably survivable level.
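As a rough illustration of why "operable from a command line" is all the build process needs, here is a Python sketch that maps each of the three VCS tools mentioned earlier to its diff and commit commands. The table and function names are hypothetical. The commands themselves are the ordinary ones, though "git diff" on its own leaves out anything already staged, which is a simplification here.

```python
# Command-line argument lists for each supported VCS.
VCS_COMMANDS = {
    "hg": {"diff": ["hg", "diff"], "commit": ["hg", "commit", "-m"]},
    "svn": {"diff": ["svn", "diff"], "commit": ["svn", "commit", "-m"]},
    "git": {"diff": ["git", "diff"], "commit": ["git", "commit", "-a", "-m"]},
}


def vcs_command(vcs, action, comment=None):
    """Return the argument list for a VCS action, ready to hand to a shell."""
    try:
        argv = list(VCS_COMMANDS[vcs][action])
    except KeyError:
        raise ValueError("unsupported VCS or action: %s %s" % (vcs, action))
    if action == "commit":
        if not comment:
            raise ValueError("a commit needs a non-empty comment")
        argv.append(comment)  # the comment follows the -m flag
    return argv


if __name__ == "__main__":
    print(vcs_command("hg", "diff"))
    print(vcs_command("svn", "commit", "Tidy up legacy includes"))
```

Supporting one more VCS means adding one more row to the table; nothing else in the process has to know or care.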

And that's why my answer to the question "What tools should I be using for my PHP development?" always includes at least:

  • Your project's version control tool of choice (again, I recommend Mercurial);
  • Phing;
  • PHPUnit;
  • PHP CodeSniffer; and
  • phpDocumentor.

Once we get people used to a core set of tools and practices, we can then go on to the thorny religious issues like, "which PHP framework should I use?"

Next question?
