Tuesday, February 12, 2013

The main obstacle in writing unit tests

I am an advocate of unit testing and try to write tests as much as possible. At the beginning of a software project this is relatively easy, and most of my tests pass. As the code base grows, tests start to fail. What is interesting is that the pass/fail behavior is not smooth: the number of passing tests can be lower than it was a month earlier, even though I have written many new tests in the meantime.


As my unit test base gets larger, the main cause of failure becomes out-of-date tests that need to be updated to match the current design. In the messy middle portion of the project, I realize that I spend more time updating tests than fixing bugs found by them. There is so much oscillation that I end up on the verge of losing hope and consider abandoning unit tests altogether. If I persevere, I eventually reach a point where all my tests pass, but it takes a lot of willpower. That, for me, is the main unit test challenge.

Bonus tips:
  • How do you know whether you have good coverage for your new code? Try removing a line or a constraint check; if all tests still pass, you don't have enough coverage and probably need to add another unit test.
  • Testing only public members leads to tests that can withstand constant refactoring and internal implementation changes while still verifying that the overall functionality stays the same; see the sketch after this list.
  • When approached poorly, unit tests can achieve the opposite result, stealing valuable time and complicating the testing process.
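
To illustrate the second tip, here is a minimal JUnit 4 sketch. The IntStack class and its push/pop members are hypothetical stand-ins for whatever class is under test; the point is that the test touches only the public API:

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class IntStackTest {
    @Test
    public void popReturnsElementsInLifoOrder() {
        IntStack stack = new IntStack(); // hypothetical class under test
        stack.push(42);
        stack.push(7);
        // Only the public contract is asserted; swapping the internal
        // storage (array, linked list, ...) would not break this test.
        assertEquals(7, stack.pop());
        assertEquals(42, stack.pop());
    }
}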

Friday, February 01, 2013

Dude, WTF is this shit?!

I am involved in maintaining legacy Java code written by long-forgotten souls and came across a mesmerizing function that finds the minimum of an array. It boils down to the following:
static int findMin(int[] orgArray) {
    // A typical "WTF is this shit?!" case...
    int[] sortedArray = Arrays.copyOf(orgArray, orgArray.length);
    int minValue = 0;
    Arrays.sort(sortedArray);
    // After sorting, sortedArray[0] already *is* the minimum; this loop
    // merely searches the original array for that same value.
    for (int i = 0; i < sortedArray.length; i++) {
        if (orgArray[i] == sortedArray[0]) {
            minValue = orgArray[i];
            break;
        }
    }
    return minValue;
    //TODO: This function should be simplified as follows:
    /*
    int minValue = orgArray[0];
    for (int i = 1; i < orgArray.length; i++) { // start at 1; element 0 is already in minValue
        if (orgArray[i] < minValue) {
            minValue = orgArray[i];
        }
    }
    return minValue;
    */
}

If we analyze the first algorithm, we can assume that the Arrays.sort() call runs in O(n*log2(n)) time. The for loop that follows runs in O(n). In total that is O(n*log2(n)) + O(n), which asymptotically reduces to O(n*log2(n)).

The second algorithm uses a single for loop and runs in O(n).
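
As a side note, neither version above guards against an empty array: the original quietly returns 0, while the simplified one would throw an ArrayIndexOutOfBoundsException on orgArray[0]. Here is a standalone O(n) sketch with an explicit guard; the IllegalArgumentException is my choice, not part of the original code:

static int findMinSimple(int[] orgArray) {
    // The guard clause is an addition; how to handle empty input is a design choice.
    if (orgArray == null || orgArray.length == 0) {
        throw new IllegalArgumentException("orgArray must be non-empty");
    }
    int minValue = orgArray[0];
    for (int i = 1; i < orgArray.length; i++) { // single O(n) pass
        if (orgArray[i] < minValue) {
            minValue = orgArray[i];
        }
    }
    return minValue;
}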

Update - November 5th, 2016: There is a somewhat similar-looking case (determining whether an array contains repeated elements) where using Arrays.sort improves performance ["Data Structures and Algorithms in Java", 6th Edition, pp. 174-175]:
/** Returns true if there are no duplicate elements in the array. */
public static boolean unique1(int[] data) {
    int n = data.length;
    for (int j = 0; j < n - 1; j++)
        for (int k = j + 1; k < n; k++)
            if (data[j] == data[k])
                return false; // found duplicate pair
    return true; // if we reach this, elements are unique
}
The above function runs in O(n^2). Using Arrays.sort, we can decrease the running time to O(n*log2(n)):
public static boolean unique2(int[] data) {
    int n = data.length;
    int[] temp = Arrays.copyOf(data, n);
    Arrays.sort(temp); // duplicates become adjacent after sorting
    for (int j = 0; j < n - 1; j++)
        if (temp[j] == temp[j + 1])
            return false; // found duplicate pair
    return true; // if we reach this, elements are unique
}
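
A quick way to sanity-check the two versions against each other is a small harness like the one below. This is my own sketch, not from the book; it assumes unique1 and unique2 are pasted into the same class:

import java.util.Arrays;
import java.util.Random;

public class UniqueCheck {
    public static void main(String[] args) {
        Random rnd = new Random(42);              // fixed seed for reproducibility
        for (int trial = 0; trial < 1000; trial++) {
            int[] data = new int[20];
            for (int i = 0; i < data.length; i++) {
                data[i] = rnd.nextInt(30);        // small range so duplicates occur often
            }
            if (unique1(data) != unique2(data)) {
                throw new AssertionError("mismatch on " + Arrays.toString(data));
            }
        }
        System.out.println("unique1 and unique2 agree on all trials");
    }
}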