How to measure code quality and implement best practices to enable high performing, high throughput development teams?
I will answer this from my own experience running a development organization, describing the processes and tools that worked well for us. Other alternatives may be more applicable in other scenarios; this is simply a set of tools and best practices that worked for us in our environment.
Code quality can mean different things to different stakeholders. Management equates it with easily maintainable code, resulting in minimal total cost of ownership (TCO) and faster time to market; QA defines it as fewer bugs; developers define it as code that is readable and easy to understand (implicitly following certain coding standards); and there is universal agreement that it has to be reliable, scalable, and deliver low response times. So I empowered my development team to come up with their own definition and processes. The only guideline I gave them was that it should be rigorous enough, yet something they were signing up to sustain for the long run (having the team's buy-in is very important). Here is what they came up with:
- Common, preferably automatically enforceable, coding standards
- Good performance (sub-second response for most pages at over 1,000 concurrent users)
- Well documented and easy to understand for new team members
- Continuous integration (CI) to find defects earlier
- Testable code (80% coverage in unit testing, plus automated regression scripts for integration testing and QA acceptance)
Coding standards :
We set up only those rules/standards that fall into one of these 3 categories:
- can be automatically monitored through tools such as PMD (http://pmd.sourceforge.net/), or
- the team makes a conscious decision to invest the effort to manually review it as part of the code reviews or
- security standards that are subject to complete code reviews by external security auditors.
Signing up for any more than these would result in standards that cannot be easily enforced/monitored, and that just causes frustration. Aspects that can be automated are:
- Consistency in application (naming, structuring, etc.); PMD comes with several Java coding standards out of the box.
- Common boilerplate coding issues (like exception handling)
- Detecting usage of deprecated/unapproved APIs/libraries
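As a sketch, the automatable aspects above can be collected into a PMD ruleset like the following. The rule names are real rules from PMD's standard Java rulesets (current category-based layout), but the ruleset name and thresholds are our own choices, so adjust for your PMD version:

```xml
<?xml version="1.0"?>
<ruleset name="team-standards"
         xmlns="http://pmd.sourceforge.net/ruleset/2.0.0">
  <description>Automatically enforceable team coding standards</description>

  <!-- consistency: naming conventions out of the box -->
  <rule ref="category/java/codestyle.xml/MethodNamingConventions"/>

  <!-- common boilerplate issue: swallowed exceptions -->
  <rule ref="category/java/errorprone.xml/EmptyCatchBlock"/>

  <!-- keep methods simple; flag anything above our complexity goal -->
  <rule ref="category/java/design.xml/CyclomaticComplexity">
    <properties>
      <property name="methodReportLevel" value="10"/>
    </properties>
  </rule>
</ruleset>
```

A ruleset file like this can be checked into source control next to the code, so the standards are versioned along with the codebase they apply to.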
Documentation :
- Our goal was just enough documentation to avoid future overheads. Following the standards and naming conventions allowed us to write self-explanatory code.
- We used Javadoc, integrated with CI to generate the Javadocs once a day, and linked to them from a wiki.
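For reference, the nightly Javadoc step can be as small as an Ant target like this (the attribute names are from Ant's standard javadoc task, but the paths and package names are illustrative):

```xml
<!-- illustrative Ant target a nightly CI build could invoke -->
<target name="javadoc">
  <javadoc sourcepath="src/main/java"
           destdir="build/docs/api"
           packagenames="com.example.*"
           windowtitle="MyApp API"/>
</target>
```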
Code complexity is a significant factor in measuring the ease of maintenance.
- We measured cyclomatic complexity; PMD ships rules for both cyclomatic complexity and the related NPath complexity. You should determine a threshold for your application; our goal was to keep it under 10, and exceptions had to be justified.
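To make the metric concrete, here is a hypothetical example (the class and method names are illustrative, not from our codebase): the first method adds one decision point per region, giving a cyclomatic complexity of about 5, while the table-driven rewrite stays near the floor of 2:

```java
import java.util.Map;

public class ShippingRates {

    // Cyclomatic complexity ~5: the base path plus four if-branches.
    public static double rateByBranches(String region) {
        if (region.equals("NORTH")) return 5.0;
        else if (region.equals("SOUTH")) return 6.5;
        else if (region.equals("EAST")) return 4.0;
        else if (region.equals("WEST")) return 7.25;
        else throw new IllegalArgumentException("unknown region: " + region);
    }

    // Refactored to a lookup table: only one decision point remains.
    private static final Map<String, Double> RATES = Map.of(
            "NORTH", 5.0, "SOUTH", 6.5, "EAST", 4.0, "WEST", 7.25);

    public static double rateByLookup(String region) {
        Double rate = RATES.get(region);
        if (rate == null) {
            throw new IllegalArgumentException("unknown region: " + region);
        }
        return rate;
    }
}
```

Adding a fifth region changes only the data in the second version, which is exactly the kind of change the complexity threshold nudges you toward.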
- Code duplication: Another aspect is avoiding copy-pasting of the same code in several places. The Copy/Paste Detector (CPD) that comes with PMD can be configured with a block-size threshold. We set it to 25 lines (i.e., any repeated code block over 25 lines is flagged).
- Principle of single responsibility: Wikipedia defines it as follows: "In object-oriented programming, the single responsibility principle states that every object should have a single responsibility, and that responsibility should be entirely encapsulated by the class. All its services should be narrowly aligned with that responsibility."
- Exceptions that were discovered by the tool were discussed within the team.
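A small hypothetical illustration of the single responsibility principle (the class names are invented for this example): formatting a report and persisting it are two unrelated reasons to change, so they live in separate classes instead of one class that does both:

```java
import java.util.ArrayList;
import java.util.List;

// Responsibility 1: turning report data into text.
class ReportFormatter {
    String format(String title, List<String> lines) {
        StringBuilder sb = new StringBuilder(title).append('\n');
        for (String line : lines) {
            sb.append(" - ").append(line).append('\n');
        }
        return sb.toString();
    }
}

// Responsibility 2: persisting reports (an in-memory stand-in for a real store).
class ReportStore {
    private final List<String> saved = new ArrayList<>();

    void save(String report) { saved.add(report); }

    int count() { return saved.size(); }
}

public class SingleResponsibilityDemo {
    public static void main(String[] args) {
        ReportFormatter formatter = new ReportFormatter();
        ReportStore store = new ReportStore();
        String report = formatter.format("Daily", List.of("build green", "coverage 82%"));
        store.save(report);
        System.out.println(report);
    }
}
```

A change to the output format now touches only ReportFormatter, and a change to storage touches only ReportStore; neither class risks breaking the other.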
Security : We also checked for common coding errors from a security perspective, and had an external vendor audit the entire code base for each major release (we had a PCI compliance mandate).
Intellectual property contamination : Particularly in shops where intellectual property is being built and claimed, the code should also be periodically (and automatically) reviewed by tools such as MOSS (commercially made available at http://www.similix.com/) to ensure developers are not simply copying code from another source without a proper licensing arrangement. (I do not have experience with this tool, though.)
Unit testing :
We used JUnit along with Mockito, Hamcrest, and HSQLDB for our application. These were integrated with CI so that the tests executed after the nightly build. The build was set to fail if a unit test failed, and no developer wanted to be the culprit who broke it, as the whole dev team got an email when that happened. We tested negative paths as well as boundary conditions, and used mocking to avoid integration pitfalls.
This single CI process alone made a significant difference to quality, as it helped discover defects early on and gave us a repeatable "one-click" build process. One limitation: our unit tests covered all layers except the UI layer.
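To show the mocking idea without pulling in any library (our real tests used JUnit and Mockito; OrderService and PaymentGateway here are hypothetical names for illustration): the integration point is hidden behind an interface, and the test substitutes a stub so that no real system is hit:

```java
// The integration point we do not want to reach from a unit test.
interface PaymentGateway {
    boolean charge(String account, double amount);
}

class OrderService {
    private final PaymentGateway gateway;

    OrderService(PaymentGateway gateway) { this.gateway = gateway; }

    // Succeeds only for positive amounts that the gateway accepts.
    boolean placeOrder(String account, double amount) {
        if (amount <= 0) return false;          // boundary condition
        return gateway.charge(account, amount); // mocked in tests
    }
}

public class OrderServiceTestSketch {
    public static void main(String[] args) {
        // Stubs standing in for the real gateway: one accepts, one declines.
        OrderService accepted = new OrderService((acct, amt) -> true);
        OrderService declined = new OrderService((acct, amt) -> false);

        System.out.println(accepted.placeOrder("A-1", 10.0));  // positive path
        System.out.println(declined.placeOrder("A-1", 10.0));  // negative path
        System.out.println(accepted.placeOrder("A-1", 0.0));   // boundary case
    }
}
```

With Mockito the stubs become one-liners (`when(gateway.charge(...)).thenReturn(true)`), but the structure, injecting the dependency through the constructor, is the same.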
Code coverage : Measures how much code is exercised by unit testing. The EclEmma plugin (http://www.eclemma.org/) is a great tool that provides method-, class-, and package-level views of how many lines (and what percentage) were covered by the unit test cases. It was integrated with CI for every build.
Integration Testing :
Our integration testing covered all layers of the application by verifying business functionality, including integration with multiple systems (using web services). We focused on areas where unit-test code coverage was low. It was integrated with CI and executed once a day after deployment to the dev integration environment (yes, we used one of our dev environments for this).

Selenium is a great tool for setting up these scripts. However, as with any regression testing tool, it needs dedicated SDETs (software development engineers in test) to keep the scripts current as changes are made. A specific limitation of our Selenium setup was that it ran only against Mozilla Firefox; even so, it gave us confidence that the overall business functionality had not regressed since the previous day. (We did not run into missed browser-specific quirks, as we did not heavily use browser-specific code.)

We defined a goal of 80% coverage for our application. That is not a magic number depicting the quality of the application, because of these pitfalls: a) one can artificially reach better coverage with inaccurate assertions in the tests; b) coverage does not mean the code works, only that it was executed; c) cost effectiveness should be considered when setting a reasonable goal for the application.
Exception Monitoring: Ensure you are monitoring the logs for all layers to ensure there are no uncaught / unexplained exceptions.
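One cheap JVM-level safety net for this kind of monitoring is a default uncaught-exception handler. This is a hypothetical sketch (the class and field names are invented): it records the failure in a field for demonstration, where a real application would write to the log layer being monitored:

```java
import java.util.concurrent.atomic.AtomicReference;

public class UncaughtLoggingDemo {
    // Demo stand-in for the monitored log; a real app would use its logging layer.
    static final AtomicReference<String> LAST_UNCAUGHT = new AtomicReference<>();

    static void install() {
        // Any exception that escapes a thread is captured instead of vanishing.
        Thread.setDefaultUncaughtExceptionHandler((thread, err) ->
                LAST_UNCAUGHT.set(thread.getName() + ": " + err.getMessage()));
    }

    public static void main(String[] args) throws InterruptedException {
        install();
        Thread worker = new Thread(() -> {
            throw new IllegalStateException("simulated failure");
        }, "worker-1");
        worker.start();
        worker.join(); // by now the handler has recorded the exception
        System.out.println("captured: " + LAST_UNCAUGHT.get());
    }
}
```

This does not replace scanning the logs of every layer, but it ensures nothing escapes a background thread silently.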
Continuous integration (CI), as defined by Martin Fowler, is "a software development practice where members of a team integrate their work frequently; usually each person integrates at least daily, leading to multiple integrations per day. Each integration is verified by an automated build (including test) to detect integration errors as quickly as possible. Many teams find that this approach leads to significantly reduced integration problems and allows a team to develop cohesive software more rapidly." CI is a prescribed Agile best practice.
We used CruiseControl as our CI tool and scheduled periodic builds to get quick feedback. A build was scheduled every 2 hours if any developer had changed the codebase; if a build failed, the developers who had changed source code within those 2 hours were notified. Separate targets were executed for nightly builds, including the full Javadoc generation and integration testing. The codebase was deployed 3 times a day to the sandbox and development servers, to ensure we had a view of how the project/application was progressing on the factors discussed over a period of time.
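A skeletal CruiseControl config.xml along these lines is sketched below. The element names (modificationset, schedule, ant, publishers) come from CruiseControl's configuration reference, but the project name, paths, and addresses are placeholders, so treat this as a starting point rather than a working config:

```xml
<!-- sketch of a CruiseControl config.xml; names and paths are illustrative -->
<cruisecontrol>
  <project name="myapp">
    <!-- poll source control; build only when something actually changed -->
    <modificationset quietperiod="60">
      <svn localWorkingCopy="checkout/myapp"/>
    </modificationset>

    <!-- check every 2 hours (7200 seconds), matching our schedule -->
    <schedule interval="7200">
      <ant buildfile="checkout/myapp/build.xml" target="build-and-test"/>
    </schedule>

    <!-- notify the team when a build breaks -->
    <publishers>
      <htmlemail mailhost="smtp.example.com"
                 returnaddress="ci@example.com"
                 buildresultsurl="http://ci.example.com/buildresults/myapp"/>
    </publishers>
  </project>
</cruisecontrol>
```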
Exceptions log :
- Maintained in a wiki and SharePoint for every project/application.
- Every exception to the factors discussed was documented here, along with the justification/need.
- Updated by developers during the course of the project.
- Reviewed periodically to ensure we were not adding more debt to the product.
Continuously improving quality by refactoring :
Priorities were established as we focused on certain factors to improve at a given time (for example, in one build we focused on logging, and all developers had a special focus on it). The Boy Scout Rule was a good guideline for us: "Leave the code base cleaner than it was when checked out." Some of the code refactoring that was done:
- Add unit tests if missing before refactoring
- Usage of proper design patterns to achieve testable code
- Improve documentation within code
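As a hypothetical illustration of refactoring toward testable code (GreetingService is an invented example, not code from our application): the original version called LocalTime.now() directly, so its behavior could not be pinned down in a unit test; injecting java.time.Clock turns the time source into a seam the test controls:

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalTime;
import java.time.ZoneOffset;

class GreetingService {
    private final Clock clock; // injected so tests can fix the time

    GreetingService(Clock clock) { this.clock = clock; }

    String greeting() {
        int hour = LocalTime.now(clock).getHour();
        return hour < 12 ? "Good morning" : "Good afternoon";
    }
}

public class RefactorForTestabilityDemo {
    public static void main(String[] args) {
        // Fixed clocks make the behavior deterministic under test.
        Clock nineAm = Clock.fixed(Instant.parse("2024-01-01T09:00:00Z"), ZoneOffset.UTC);
        Clock threePm = Clock.fixed(Instant.parse("2024-01-01T15:00:00Z"), ZoneOffset.UTC);

        System.out.println(new GreetingService(nineAm).greeting());
        System.out.println(new GreetingService(threePm).greeting());
    }
}
```

Production code passes Clock.systemDefaultZone(); the tests pass a fixed clock. The same constructor-injection move applies to any hidden dependency (random numbers, file systems, network clients), which is what "proper design patterns to achieve testable code" meant for us in practice.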
I hope this post gives you some ideas for establishing your own development process. Note that each shop and team will need to evolve its own process, as one team's process will seldom work for another. While I covered the Java side of the world, similar best practices can be implemented on the .NET side (see http://www.troyhunt.com/2010/04/measuring-code-quality-with-ndepend.html).
Lastly, I have to give full credit to the developers and architects in my team without whom none of this would have been possible.