Posts Tagged ‘continuous integration’

CSSUnit : experimenting with unit testing presentation code.

Thursday, October 1st, 2009

Not all developers are created equal.

In a perfect world, everyone would be super diligent and proficient at creating CSS, but in reality this is not the case. Less experienced developers can make mistakes, write inconsistent code or fail to reuse existing code. Even when more than one experienced developer works on a project, you can still end up with inconsistency and repeated code simply by dint of different working styles.

I thought I’d start experimenting with automated testing of front-end presentation code, focusing on regression testing. This topic is not discussed much; the standard response is that there is no replacement for an eyeball test. But humans are by nature unreliable beasts, and I’d like to change that and make front-end development more of an accepted science.

My hope is that by trying out some techniques and bringing them into the forum, I can at least start a discussion that results in an advancement of this field, allowing us to escape some of the common traps we see at the moment and mitigate some of the risks associated with our profession, in the same way back-end coding has with unit testing.

Existing approaches

This is by no means the first time anyone has looked at this problem. There are other software solutions available, like HP’s WinRunner, but in my opinion they are generally unsuccessful or not fit for purpose. The existing solutions rely on algorithmic, pixel-based comparisons of screens. The process goes like this: a screen is designed and built, a “good version” of the screen is captured and stored, and then every time the application goes through the build process, the screen is re-captured and compared against the master. Any deviations are noted and the build process fails.

Now, this works for static screens. But the problem is that most of the time our applications do not have static screens: the content changes dynamically, and every time it does so, the build process fails, invalidating the automated nature of the process. Not only that, but these screens render differently in each browser, so you end up taking 4-8 captures, which multiplies the potential to fail incorrectly. In essence these tests are too brittle to be very useful, as they break too easily.
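To make the brittleness concrete, here is a hypothetical sketch of the pixel-comparison approach described above (the names and threshold are mine, not from any real tool). Real tools work on captured screenshots; here two renders are modelled as flat RGBA byte arrays of equal length.

```javascript
// Count the fraction of pixels that differ between a stored "good" capture
// and a fresh capture of the same screen.
function pixelDiffRatio(master, candidate) {
  if (master.length !== candidate.length) return 1; // different dimensions: total failure
  let differing = 0;
  for (let i = 0; i < master.length; i += 4) {      // step one RGBA pixel at a time
    if (master[i] !== candidate[i] ||
        master[i + 1] !== candidate[i + 1] ||
        master[i + 2] !== candidate[i + 2]) {
      differing++;
    }
  }
  return differing / (master.length / 4);
}

// A zero-tolerance build gate fails as soon as any content changes,
// even if the change is legitimate dynamic content:
const master    = new Uint8Array([255, 0, 0, 255, 255, 0, 0, 255]);
const candidate = new Uint8Array([255, 0, 0, 255, 0, 255, 0, 255]); // one "pixel" of new content
console.log(pixelDiffRatio(master, candidate)); // 0.5 — half the pixels differ, build fails
```

The failure mode falls straight out of the code: any dynamic region pushes the ratio above whatever threshold you pick, so the gate trips on perfectly healthy pages.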

Taking a step back

Given that very low-level, atomic checking seems to be unhelpful, let’s analyse the actual process a human developer uses to validate a page by eye.

When I look at a page against a design, I don’t compare pixel by pixel; I compare at a higher level. First I look at the design, and from it I create a number of mental rules: a list of all the different font variations, the different colors used and the rough layout. If the design has too many variations in these things, then it is inconsistent and hence bad design, in which case I end up going back to the designers and asking for a higher level of standardisation. Once we have this basic checklist of “design principles”, we can compare the font size, weight and face, compare colors, and compare widths, heights and alignments against the implementation. This enables us to take a design and an implementation and quickly gauge at a high level whether it is likely to be correct.

This principle changed the way I started thinking about unit testing CSS: what if we could formalise this set of design principles and turn them into a programmatic set of rules that we could test each page against?

For this, my experiment led me to create cssUnit, a framework for checking style consistency.
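To show the idea rather than cssUnit’s actual API (the names below are illustrative assumptions), the style guide becomes data, and each rendered element’s computed style is checked against it. Styles are plain objects here; in a browser they would come from `window.getComputedStyle(element)`.

```javascript
// The "design principles" distilled from the design, expressed as data.
const designPrinciples = {
  fontSizes: ['12px', '14px', '18px', '24px'],        // the only sizes the design allows
  colors:    ['rgb(51, 51, 51)', 'rgb(0, 102, 204)'], // the approved palette
};

// Check one element's computed style against the principles; an empty
// array of failures means the element is consistent with the guide.
function checkStyle(style, principles) {
  const failures = [];
  if (!principles.fontSizes.includes(style.fontSize)) {
    failures.push(`font-size ${style.fontSize} is not in the style guide`);
  }
  if (!principles.colors.includes(style.color)) {
    failures.push(`color ${style.color} is not in the style guide`);
  }
  return failures;
}

// A rogue 13px element fails the consistency test:
console.log(checkStyle({ fontSize: '13px', color: 'rgb(51, 51, 51)' }, designPrinciples));
// → ['font-size 13px is not in the style guide']
```

The point of the data-driven shape is that the dynamic content problem disappears: the test doesn’t care what the text says, only that whatever is rendered obeys the rules.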

Where cssUnit might help.

Like all software processes, this one is not suitable for every situation. You will have to evaluate whether CSS unit testing is right for your project; the key factors I would take into account are project lifespan, number of developers and size of site.

The scenarios in which cssUnit testing will be helpful are:

  • A large corporate web presence where there is an overall style guide but many different sites/microsites maintained by lots of developers
  • Where you have a very fluid team, often made up of short-term freelancers
  • Where you are training up a young or unfamiliar team of front-end developers
  • Situations where there is overlap between front-end and back-end work, but not necessarily the capacity to maintain quality
  • Co-located or remote teams

As well as being a tool to maintain quality, unit testing is also a way of communicating and distributing knowledge in a direct fashion. Few developers will bother to spend the time reading the style guide, but many will learn the rules through failing tests.

What cssUnit is not.

cssUnit is not a number of things; I thought I would mention the ones I did not intend it to be.

Not cross-browser tested - although in principle it should be easy to make cross-browser, it is not, not yet anyway. If it proves to be useful, then perhaps there is a case for making it cross-browser.

Not beta - cssUnit is not by any means complete and, as expressed before, is more of an experiment. I have only coded for the scenarios I thought of; I’m sure there are better ways of doing this, but it is at least a starting point to explore how it might be done.

Not a service - lots of testing platforms have become services recently, but this is not currently my aim. There are many directions this could take, at the moment I’m not sure making it a service is the correct path.

Not easy - like all unit testing, cssUnit takes time to set up and implement in the beginning. The hope is that in the long term it saves you more time than it costs. It is also dependent on you having a strict style guide; without standardisation in your design, cssUnit becomes useless.

Seeking the perfect build process.

Monday, May 4th, 2009

I am currently working on a project with a continuously integrating multi-stage build process.

Every time you commit, a new build gets triggered and flows down the pipeline of testing stages. The problem is, it doesn’t work very well.

One of the primary problems is that the tests being run are incredibly brittle and unstable, which causes a lot of broken builds. Which in turn prevents people checking in. Which in turn creates a backlog of check-ins. Which results in people forgetting what the code they need to check in does. Which results in more broken builds. Which results in a myriad of other problems, including people losing code, and so massive wastage. Very un-lean.

So the process is broken. The solution is that a broken build shouldn’t stop other developers working.

How do we implement this?

Well, one suggestion is an automated revert process: every time a build fails, it automatically reverts to the last good build. This works in principle - a developer commits, it breaks the build, the build reverts, the next developer checks in, the broken commit gets pushed to the back of the queue, and this creates an incentive to check in working code.
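The revert idea can be sketched in a few lines, assuming a hypothetical `runBuild(revision)` that returns true on a green build (both names are mine, not from any real CI tool). The repository is modelled as an ordered list of revisions; on failure, the head rolls back to the last good one.

```javascript
// Process revisions in commit order, keeping the last revision that built
// green as the repository head; failed revisions are simply rolled back.
function autoRevert(revisions, runBuild) {
  let lastGood = null;
  for (const rev of revisions) {
    if (runBuild(rev)) {
      lastGood = rev;  // this revision becomes the new baseline
    } else if (lastGood !== null) {
      console.log(`build ${rev} failed, reverting to ${lastGood}`);
    }
  }
  return lastGood;     // head of the repository after all reverts
}

// r2 breaks the build, r1 and r3 are good:
console.log(autoRevert(['r1', 'r2', 'r3'], rev => rev !== 'r2')); // 'r3' ends up as head
```

Note that this sketch treats each build as instantaneous and each revision as independent, which is precisely the assumption that breaks down in practice.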

However, in reality the build is not an instantaneous process. It takes around 15 minutes to get through all the various stages. Arse.

This means the above scenario actually results in: a developer checks in broken code, the build runs, meanwhile another developer checks in working code that requires some bit of the broken code, the build fails, the build reverts, and the next checked-in code also fails as it requires the reverted code. Everyone stabs each other in the face.

So, how do we solve this?

Well, let’s create an analogy. If we look at the build like a Lego brick wall, each commit is a new brick in the wall. Suppose someone clicks in a brick which is structurally unsound. What are the other bricklayers doing?

Well, some of them are working on the adjacent walls, and their brickwork is unaffected by removing the broken brick, so their bricks should remain unchanged as long as they don’t sit upon the broken brick or any of the bricks that were installed at the same time as it.

Bricks that were built on top of the broken brick, or any of the bricks installed at the same time as it, well, they’re fucked. Because it’s Lego, you can’t just slide out the broken brick and replace it; you have to dismantle the wall and take out any bricks above it in order to replace the broken one, which results in a large number of angry brickies f’ing and blindin’ and going off to read The Sun and get a bacon sandwich.

So one way to fix this is to change the build process. Instead of letting people build on top of other bricks before they’re checked for stability, you take their bricks and hold them until the bricks they rely on have been checked. Then, if the parent bricks are safe, the dependent bricks are put on and checked themselves. If further bricks are put on, they are put in the dependent queue and checked in turn.
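The hold-and-check process above can be sketched as a gated queue, under assumed names of my own: each commit declares the commit it builds on, a commit is only tested once its parent has passed, and `runBuild` stands in for the real pipeline.

```javascript
// Test commits in dependency order: a commit whose parent hasn't been
// verified yet is held; a commit whose parent failed is rejected outright.
function gatedBuild(commits, runBuild) {
  const passed = new Set([null]); // the empty wall is always "good"
  const results = {};
  let queue = [...commits];
  let progressed = true;
  while (queue.length && progressed) {
    progressed = false;
    const held = [];
    for (const c of queue) {
      if (passed.has(c.parent)) {       // parent brick verified: test this one
        results[c.id] = runBuild(c);
        if (results[c.id]) passed.add(c.id);
        progressed = true;
      } else if (c.parent in results) { // parent tested and failed: reject
        results[c.id] = false;
        progressed = true;
      } else {
        held.push(c);                   // parent not tested yet: keep holding
      }
    }
    queue = held;
  }
  return results;
}

// 'b' builds on the broken 'a'; 'c' builds on the known-good baseline:
const out = gatedBuild(
  [{ id: 'a', parent: null }, { id: 'b', parent: 'a' }, { id: 'c', parent: null }],
  c => c.id !== 'a'
);
console.log(out); // { a: false, b: false, c: true }
```

The key property is that ‘c’ sails through independently of the broken ‘a’, while ‘b’ is rejected without ever wasting a 15-minute pipeline run.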

Voila, you have a smooth parallel build process.