blog.willbadart.com

This is the blog of Will Badart.

Archive Series Tags Feed

C Linked List Follow-up

08 Oct 2018

About a week ago I walked through my solution to comparing two linked lists in C. At the end, I hinted I might write some benchmarks to compare the two approaches discussed. Well, here they are.

Recap

The basic solution in the simplest terms looks at each pair of elements, and asks (1) if they’re the same, and (2) if the rest of the list is the same. It’s so short, I’ll just copy it here:

bool compare_lists(Node* head1, Node* head2) {
    if(!head1 && !head2)
        return true;
    else if(head1 ^ head2)
        return false;
    else
        return head1->data == head2->data
            && compare_lists(head1->next, head2->next);
}

The other solution “improved” on this by enabling tail call optimization with the addition of an accumulator parameter (please read the post linked above if this sentence didn’t track).

Testing

Note: if you’d like to follow along with the code, please see this gist.

When I first ran the benchmarks on 10,000 items, the run times of each program were pretty close, but the “optimized” version was actually running a couple nanoseconds slower, consistently. (Both had been compiled with -O3)

Puzzled, I decided to see if I could make sense of what the compiler had done by checking out the assembly. To do this, you can either disassemble the object file with objdump -d ... (here’s the man page) or re-assemble from source using gcc -S .... I chose the latter since it includes labels.

The first thing that struck me was how many fewer instructions the first version generated: 40 lines compared to 117 lines. This is probably the explanation for the constant-order time difference from my initial testing. Next, and more importantly, I failed to find a call to the call instruction, which, as I’ve been led to believe by this Stack Overflow post, means that GCC (I’m on version 8.2.1) was actually able to tail-call optimize the first version too. Surely enough, when I recompiled with -O0, the call was present. The unoptimized version couldn’t even handle a million elements without blowing the stack/ running out of memory.

So, I suppose the moral of the story is that GCC is really smart, and so are the people who contribute to it. Mad respect.


Recent posts:

Next post: Coffee Log 26
Previous post: Simple BST Validation