Friday, 4 July 2008

Six valuable Phing build file refactorings

Some weeks months ago I finally got my hands on the ThoughtWorks Anthology and got immediately hooked on one of the featured essays called 'Refactoring Ant Build Files' contributed by Julian Simpson aka the build doctor. After absorbing and studying the provided catalogue of overall 24 refactorings, I spent some time to transform a few health-promoting ones to the Phing universe. So the following post will outline six five basic, but valuable Phing build file refactorings by showing the smelly example first, followed by the scentless one and a closing refactoring description.

Making build files 'first-class' codebase citizens

You might ask yourself if there even is any need to refactor and care about mostly poorly treated project artifacts like build files. Well according to the book market there are growing needs and catalogues for refactoring non-sourcecode matters like databases and nowadays even (X)HTML, and as build tools are often used to automate the whole build and delivery process of complete projects, their feed build files should be readable and clear, painless maintainable and easily customizable as the controlled project takes new directions and hurdles.

Today build files are also often used to drive the Continuous Integration(CI) process and are heavily used to run local development builds prior to commiting the finished development tasks, therefor messy and clotty build files will have a counterproductive impact on the build management lifecycle and on making required changes to it. So when striving for codebase citizen equality agree upon a build file coding standard (e.g. 5. Introduce distinct target naming or a common element indentation), reside all build files of a project in a SCM system like Subversion or Git and try not to neglect the build files constantly just because they aren't actual business logic.

Trapeze balancing without a safety net

While developing an applications business logic 'ideally' automated behaviour verifying tests are written, whether in a test-first or test-last approach, and these are building the implicit and required safety net for all follow-up refactorings. For build files there are currently no tailored testing tools/frameworks available to warrant their external behaviour during and after a refactoring, though it might be possible to create a basic safety net by using for example a combination of PHPUnit's assertFileExists and assertContains assertions. And even if there were such tools available, it's rather questionable that these would be applied to test-drive mostly simple starting and incremental evolving build files. So currently this flavour of refactoring needs to be applied with much more descipline and caution than classic sourcecode refactorings, even 'old-school' manual tests have to be run frequently and only with a proceeding practice more and more agressive refactorings will become an everyday tool. After this short note of caution, let us jump right into the refactoring catalogue extract.

1. Extract target

Take parts of a large target, declare them as independent targets and preserve dependencies by using the
target depends attribute.
Before
<target name="what-a-clew">

<phplint>
<fileset dir="${build.src}">
<include name="**/*.php"/>
</fileset>
</phplint>

<phpcodesniffer standard="PEAR" format="summary">
<fileset dir="${build.src}">
<include name="**/*.php"/>
</fileset>
</phpcodesniffer>

<phpunit>
<formatter todir="reports" type="xml"/>
<batchtest>
<fileset dir="="${build.tests}">
<include name="**/*Test*.php"/>
</fileset>
</batchtest>
</phpunit>

</target>
After:
<target name="phplint-report">
<phplint>
....
</phplint>
</target>

<target name="sniff-report" depends="phplint-report">
<phpcodesniffer standard="PEAR" format="summary">
....
</phpcodesniffer>
</target>

<target name="test-report" depends="sniff-report">
<phpunit>
....
</phpunit>
</target>
The Extract target refactoring is similar to the well-known Extract method refactoring catalogued by Martin Fowler and should be applied to unclutter long targets, which can become hard to understand and troubleshoot while maintaining or extending a build file. This refactoring is achieved by taken each atomic task (e.g. phplint) of the cluttered target and provide them each a own target (e.g. phplint-report) while the former tasks execution sequence can be obtained by utilizing the target depends attribute. You can compare this refactoring to the technique of tackling a method that's to large and infringes upon the single responsibility principle.

While twiddling with this refactoring I came up with a follow-up and hand in hand refactoring that might go by the name of Introduce facade target, which simply orchestrates the target execution sequence so you can remove the depends attribute of all orchestrated targets and thereby use them separately if needed and advisable. The following build file extract shows the result of this refactoring in action.

1.1 Introduce facade target

Provide a facade target to obtain the task execution sequence and to make each involved target a single callable unit.
After:
<target name="phplint-report">
<phplint>
....
</phplint>
</target>

<target name="sniff-report" depends="phplint-report">
<phpcodesniffer standard="PEAR" format="summary">
....
</phpcodesniffer>
</target>

<target name="test-report" depends="sniff-report">
<phpunit>
....
</phpunit>
</target>

<target name="quality-report" depends="lint-report, sniff-report, test-report"
description="Generates the overall projects quality report" />

2. Introduce property file

Move infrequently changing properties from the build file body to a flat file.
Before:
<property name="db.port" value="3306" />
<property name="db.name" value="example" />
<property name="db.user" value="funkdoc" />
....
<property name="runtime.property.x" value="default" />
....
After:

build.poperties file
[example properties]
db.port = 3306
db.name = example
db.user = funkdoc
....

build file
<property file="build.properties" />
<property name="runtime.property.x" value="default" />
....
The Introduce property file refactoring can be applied for moving infrequently changing or static properties out of the main build file body to raise the overall legibility and keep them distinct from runtime properties. The downside of this refactoring is a lost of property visibility and breaking up the former single build file into multiple units, which is contradictory to the third ANT best practice named 'Prefer a Single Buildfile' of an older best practice catalogue compiled by Eric M. Burke. So in this case, like in any case, you have to make your own choice based on your needs and requirements.

3. Replace comment with description

Annotate elements(targets) with the description attribute instead of XML comments.
Before:
<!-- This target runs the PHP_Codesniffer task and reports coding standard violations --!>
<target name="sniff-report">
....
</target>
After:
<target name="sniff-report" description="Runs the PHP_Codesniffer task and reports coding standard violations">
....
</target>
Often build files are accentuated with plain XML comments to retain the mechanics and purpose of build file elements(i.e. targets) and can become a diversionary/obscuring source while maintaining or extending a build file. By using the available description attribute of the target element to annotate its purpose it's possible to reduce that kind of noise and even better, if used constantly, they can provide valuable information about all accumulated targets of a build file when phing is called with the -l(ist) option. As you can see the Replace comment with description refactoring requires a minimum of effort/investment to achieve a very valuable impact.

4. Reuse elements by id

Declare an instance e.g. a fileset once and make references to it elsewhere to reduce duplication and increase clarity.
Before:
<target name="phplint-report">
<phplint>
<fileset dir="${build.src}">
<include name="**/*.php"/>
</fileset>
</phplint>
</target>

<target name="sniff-report" depends="phplint-report">
<phpcodesniffer standard="PEAR" format="summary">
<fileset dir="${build.src}">
<include name="**/*.php"/>
</fileset>
</phpcodesniffer>
</target>
After:
<fileset id="src_artifacts" dir="${build.src}">
<include name="**/*.php"/>
</fileset>

<target name="phplint-report">
<phplint>
<fileset refid="src_artifacts" />
</phplint>
</target>

<target name="sniff-report" depends="phplint-report">
<phpcodesniffer standard="PEAR" format="summary">
<fileset refid="src_artifacts" />
</phpcodesniffer>
</target>
The Reuse elements by id refactoring is, as the short description states, tailor-made to increase clarity while reducing code duplication, which is when present a risk that an alternation made to one element will be skipped for the other duplicates, by declaring top-level elements once by assigning an id attribute to it and then referring to it thoughout the rest of the build file. This refactoring is best compared to the classic sourcecode refactoring called Pull Up Method also catalogued by Martin Fowler and moreover it enforces the compliance with the DRY principle by providing a single point of change for futurities alternations.

5. Introduce distinct target naming

Use a different punctuation for targets and properties to enhance readability.
Before:
<property name="example.property1" value="abc" />
<property name="example_property2" value="def" />
<property name="example-Property3" value="ghi" />

<target name="example.target1">
<echo msg="${example.property1}" />
....
</target>

<target name="example-target2">
<echo msg="${example-Property3}" />
....
</target>
After:
<property name="example.property1" value="abc" />
<property name="example.property2" value="def" />
<property name="example.property3" value="ghi" />

<target name="example-target1">
<echo msg="${example.property1}" />
....
</target>

<target name="example-target2">
<echo msg="${example.property3}" />
....
</target>
The Introduce distinct target naming refactoring once again tackles the improvement of readability in a build file by applying a constant and different punctuation on the common elements: targets and properties. The appliance of this refactoring leaves you and coworkers with an immediate reply whether you're looking at a property value or a target and can lead towards an agreed upon in-house/project build file coding standard. For target names underscores and dashes are suitable, although dashes are preferred by me as a hypen is used when enforcing internal targets, while for the build properties dots should be considered to build namespaces and their names should 'always' be lowercased except for environment variables/properties.

In case this blog post whetted your appetite for more, heavier and here uncovered build file refactorings, you might consider picking up a copy of the ThoughtWorks Anthology book. Last but not least a shout out goes to the build doctor for the remix permission and until the next post I'm ghost like dog.

9 comments:

Julian said...

Raphael, nice remix. Thanks for the link back!

Alexey Shockov said...

Good job, Raphael!

One mistake: ...target name="quality-report" depends="lint-report, sniff-report, unittest-report"... test-report instead of unittest-report? ;)

Raphael Stolt said...

Hi Alexey,

Yeah you totally right missed that on while proof-reading and corrected it, thanks for pointing me out on that one.

Anonymous said...

Again, a great Phing related post. Thanks!

Federico

Travis Swicegood said...

Really nice post. Glad to see that I already use most of these in my build files for the various Domain51_ packages and PHPT.

FWIW: You can't just create new patterns. Three people have to discover the same thing independently for it to become a pattern.

That said, I've been using the facade pattern that you describe here for about a year. Just one more to go :-)

Raphael Stolt said...

Hi Travis,

Thanks for stopping by again and the kudos. So where do we get that one to make this an 'official' pattern? ;)

Paul said...

There's an error in the code:

<phpcodesniffer standard="PEAR" format="summary" />
<fileset dir="${build.src}">
<include name="**/*.php"/>
</fileset>
</phpcodesniffer>

You're closing the 'phpcodesniffer' task on the opening line (/>), so it can't be closed again 4 lines under (</phpcodesniffer>)

Gr, Paul Edenburg

Raphael Stolt said...

Hi Paul,

Thanks for pointing me to it; it's fixed now. I guess it's just another proof that copy and past is evil.

Anonymous said...

You have hit the mark.