We spent the last year or so building and maintaining a Continuous Integration (CI) system for our major development group. It was incredibly daunting and frustrating, because we had to build it around a problematic legacy version control system: ClearCase Base.
In the textbooks it seems pretty simple: every commit (check-in) to the repository triggers a build, which in turn checks out the latest version of the repository from source control and runs the build process. This way, if a build fails, you know exactly who broke it – the person who performed the last commit – and if the build passes, you have full traceability between the build artifacts and the source code they were built from.
Now, let’s try that with ClearCase Base.
ClearCase Base (unlike ClearCase UCM) uses the same underlying concepts as the much older CVS. The most important – and problematic – aspect is that the unit of operation is a single file. Among other things, this means there are no atomic commits¹, no cheap branching, and no global tagging. On top of that, we have to deal with inherent performance issues and a total lack of integration with non-Rational tools.
- “Every commit triggers a build” – you can’t do that in ClearCase Base, as there is no ‘commit’. There is only a single-file check-in, and no way to know whether that file is one of many in a change set that is still being checked in.
Our approach is to perform a view update every 30 minutes. If the update finds any changes, we update again (to make sure we’re not missing subsequent check-ins), and only then launch the build.
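The polling cycle described above can be sketched roughly like this. It is a minimal illustration, not our actual job: `update_view` and `launch_build` are hypothetical callables standing in for whatever wraps the real view update and the real build trigger, and `update_view` is assumed to return the list of files the update changed.

```python
from typing import Callable, List


def poll_and_build(update_view: Callable[[], List[str]],
                   launch_build: Callable[[List[str]], None]) -> bool:
    """Run one polling cycle; return True if a build was launched.

    update_view  -- hypothetical wrapper that updates the snapshot view
                    and returns the paths it changed
    launch_build -- hypothetical wrapper that starts the build
    """
    changed = update_view()
    if not changed:
        # Nothing was checked in since the previous cycle: no build.
        return False

    # Update a second time to pick up check-ins that landed while the
    # first (slow) update was still running.
    changed_again = update_view()

    all_changed = sorted(set(changed) | set(changed_again))
    launch_build(all_changed)
    return True
```

A scheduler (cron, or the CI server's own timer) would call `poll_and_build` every 30 minutes. Note that the second update narrows the window for a half-finished change set but cannot close it.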
- “Check out the latest repository” – when the repository is big, updating a ClearCase view takes a long time. This is because ClearCase needs to go over every single file, compare it to the server’s content, and update the local view if needed.
We have over 150,000 files, and it takes 12 minutes on average just to perform the view update. Using ClearCase ‘dynamic views’ (a network-based workspace that is always in sync) is not an option, since their I/O is very slow and build jobs tend to be very I/O intensive.
Our approach is to try and optimize the ‘load rules’ which allow cherry-picking specific files and folders to update. It’s a major investment, and requires constant maintenance. But it enables us to reduce the update time by a few minutes.
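Part of that maintenance can be scripted. The sketch below assumes you keep a list of the directories a build actually needs and want to turn it into `load` lines for the view's config spec, dropping entries that are already covered by a parent directory. The function name and the paths in the usage note are made up for illustration, and forward-slash paths are assumed.

```python
from typing import Iterable, List


def build_load_rules(paths: Iterable[str]) -> List[str]:
    """Return one 'load' line per top-most path, dropping nested entries."""
    kept: List[str] = []
    # Sorting puts a parent directory before anything underneath it.
    for path in sorted(set(p.rstrip("/") for p in paths)):
        # Skip a path if one of its parents is already being loaded.
        if any(path.startswith(parent + "/") for parent in kept):
            continue
        kept.append(path)
    return [f"load {p}" for p in kept]
```

For example, feeding it the hypothetical paths `/vobs/app/src/ui`, `/vobs/app/src` and `/vobs/app/build/` yields two rules, one for `/vobs/app/build` and one for `/vobs/app/src`. The hard part, deciding which directories the build really needs, remains manual.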
- “Know who broke the build” – you can’t really do that either. Several files may have been changed by different people since the last build, and there is no single commit with a single person to hold responsible.
Yes, it’s possible to figure out which files were updated since the last build, obtain the list of people who checked them in, and notify them all. But that’s complicated and time-consuming, and it ends up with multiple people being responsible, which usually means no one takes responsibility.
Our approach is to try and identify the specific file which caused the build to fail (source file, maybe some build script) and notify the last person who worked on it.
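As a rough illustration of that idea, the sketch below pulls the first failing file out of a build log and looks up who touched it last. Two loud assumptions: the log uses gcc-style `path:line:col: error:` messages (your compiler or build scripts may print something else, and Windows drive-letter paths would not match this pattern), and `last_author_of` is a hypothetical lookup you would implement on top of the ClearCase history.

```python
import re
from typing import Callable, Optional, Tuple

# Assumption: errors look like "src/net.c:42:7: error: message".
# Warnings are deliberately not matched.
ERROR_LINE = re.compile(
    r"^(?P<path>[^\s:][^:\n]*):\d+(?::\d+)?: (?:fatal )?error\b",
    re.MULTILINE,
)


def find_culprit(build_log: str,
                 last_author_of: Callable[[str], Optional[str]]
                 ) -> Optional[Tuple[str, str]]:
    """Return (file, last author) for the first error in the log, if any."""
    match = ERROR_LINE.search(build_log)
    if match is None:
        return None  # no recognizable compiler error in the log
    path = match.group("path")
    author = last_author_of(path)  # hypothetical history lookup
    if author is None:
        return None  # file is unknown to version control (e.g. generated)
    return path, author
```

The first error is used because later errors are often just fallout from it. When the function returns `None`, falling back to notifying the build maintainers is probably the sensible default.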
- “Source traceability” – the standard approach is to tag all the version-controlled files that participated in the specific build. But in ClearCase, applying a label to 150,000 files takes over an hour.
Our approach was twofold. First, we associate each build with a timestamp; if needed, a ClearCase view can then be created to represent the state of the repository at that specific timestamp. Second, we perform labeling only during full build cycles, which take a long time anyway (a few hours), and we run it in parallel with the deployment and testing stages.
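To make the timestamp idea concrete, here is a small sketch that turns a recorded build time into config spec text for such a view. It relies on the config spec `time` rule as I understand it, and it assumes the build was taken from `/main/LATEST`; a real config spec would carry whatever branch rules the build actually used. Time zone handling is left out, and the function names are invented for the example.

```python
from datetime import datetime
from typing import Iterable

# Spelled out explicitly so the output does not depend on the locale.
_MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def clearcase_time(ts: datetime) -> str:
    """Format a timestamp as day-month-year.hour:minute:second."""
    return f"{ts.day}-{_MONTHS[ts.month - 1]}-{ts.year}.{ts:%H:%M:%S}"


def config_spec_at(ts: datetime, load_rules: Iterable[str] = ()) -> str:
    """Build config spec text that selects versions as of the build time."""
    lines = [
        f"time {clearcase_time(ts)}",
        "element * /main/LATEST",  # assumption: the build used /main/LATEST
        "end time",
    ]
    lines.extend(load_rules)  # optional 'load' lines for a snapshot view
    return "\n".join(lines) + "\n"
```

Storing only the timestamp next to the build artifacts is cheap, and the view is generated on demand when someone needs to reproduce a build.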
To summarize: We managed to create a working and reliable CI system for ClearCase Base, but it wasn’t easy, or pretty, or even interesting. Plus, it requires a lot of maintenance and support, which isn’t fun either.
Stay away from ClearCase Base, if you can.
¹ The ‘cleartool checkin’ command was enhanced with the ‘-atomic’ option, but it is not available in the GUI or in the IDE integrations, so most developers (at least on Windows) do not use it. Even when it is used, it only means that those check-ins are completed, or rolled back, together; in the ClearCase history they are still recorded as a series of independent, file-specific operations, and not as a single commit with a single purpose.