Parallel cursor vs. secondary key

This one bugged me for a while. Whenever I implement an inner loop that depends on the outer loop, I start wondering which solution to use. So I decided to answer the question "parallel cursor vs. secondary key" for myself – and to make a post about it in the process 🙂

Initial Setup

Let me introduce my mighty initial setup…

… a simple head <-> item loop. And this is my data generation for it:
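To make the setup concrete, here is a Python sketch of the shape of that data and of the naive nested loop – the table sizes, field names, and counters are illustrative assumptions, not the original ABAP:

```python
import random

# Hypothetical stand-in for the two internal tables: a header table and
# an item table in which every header owns a block of item rows.
HEAD_COUNT = 200
ITEMS_PER_HEAD = 25  # 25 matching rows per header, as discussed later

heads = [{"head_id": h} for h in range(HEAD_COUNT)]
items = [{"head_id": h, "item_no": i}
         for h in range(HEAD_COUNT)
         for i in range(ITEMS_PER_HEAD)]
random.shuffle(items)  # unsorted, like a freshly filled standard table

def naive_nested_loop(heads, items):
    """Initial solution: for every header, scan the whole item table
    (the equivalent of LOOP AT items ... WHERE head_id = <head>)."""
    total = 0
    for head in heads:
        for item in items:  # full scan of the item table per header
            if item["head_id"] == head["head_id"]:
                total += 1  # stand-in for the real per-item work
    return total
```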

Before we start the comparison, I measured the initial solution:

So now we know what we are tuning for.


Parallel cursor – Take 1

I implemented the parallel cursor solution first, even though it was not my preferred one. I am a creature of habit, and once I have picked a technique up it takes some work to change it. Remember that for this to work, the inner table has to be sorted first:

This is the implementation:
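In Python terms, my first attempt looked roughly like this – a sketch, not the original ABAP. `bisect_left` plays the role of the binary search for the starting index, and the filter on the inner loop mirrors the WHERE condition I kept:

```python
from bisect import bisect_left

def parallel_cursor_take1(heads, items):
    """Take 1: binary-search the starting index in the sorted item
    table, but keep the WHERE-style filter on the inner loop."""
    keys = [item["head_id"] for item in items]  # sorted key column
    total = 0
    for head in heads:
        start = bisect_left(keys, head["head_id"])
        # Like LOOP AT items FROM start WHERE ...: this still runs from
        # the starting index all the way to the END of the table,
        # filtering every single row as it goes.
        for idx in range(start, len(items)):
            if items[idx]["head_id"] == head["head_id"]:
                total += 1
    return total
```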

This gave me a slight boost (the sort was included in the measurement).

But I was actually quite disappointed – I expected far more. The improvement was not even a factor of two, and I remembered this being a very potent technique.

Secondary key

So now my hopes rested on the secondary key of the table – which I created after the subroutine was called:
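ABAP declares this directly on the table definition; the closest Python analogue I can offer is a hash index built once up front – again a sketch with made-up names:

```python
from collections import defaultdict

def with_secondary_key(heads, items):
    """Analogue of a non-unique hashed secondary key: build a hash
    index over head_id once, then visit only the matching rows."""
    index = defaultdict(list)  # the "secondary key" over head_id
    for item in items:
        index[item["head_id"]].append(item)
    total = 0
    for head in heads:
        for item in index[head["head_id"]]:  # O(1) lookup per header
            total += 1
    return total
```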

Implementing my secondary key was very straightforward. Now I was excited to see the runtime.

Now this seems right. This result is far more desirable. We got a performance improvement of 6,271%. Scaling also improved: while the runtime difference between the sequential work process and the individual parallel work processes in the initial solution was +95%, we have improved to "just" 73%.

But this seemed far too big a difference. I was getting second thoughts… it seemed too good to be true. My favorite technique had won, but it had won by such a large margin that it also seemed ridiculous to me.

And in my experience, something that seems too good to be true usually isn't. So my conclusion was that I must have messed something up. So I searched – and I found it.

Parallel cursor – Take 2

I did discover what I had messed up. I even found it interesting enough to share, because I imagine more people could fall into this trap without knowing that they are leaving so much potential on the table.

Let me highlight the magical spot:

Take a look at my WHERE condition. Notice anything unusual? Let me help.

Tell me: what happens when we are done looping over the 25 matching rows in the item table? We quit? Really?

We actually do not quit, because the loop does not know that we are done after the first 25 rows. It still has to scan the rest of the table to make sure there are no matches in another part of it. And this is a problem.

Remember our initial code?

This code was slow, and it was slow for a reason: searching with a generic WHERE condition forces a full scan of the inner table for every outer row, which gives the nested loop a quadratic runtime curve. And this is exactly what I forced in my first parallel cursor implementation:

I just gave it another starting point. But the search type stays the same and so does the runtime curve – which makes this implementation pointless.

I then made a little change and made the code rely on my previous sort:

I deleted the WHERE clause in the inner loop and replaced it with an IF statement. You should place it right at the beginning of the loop body in order to avoid bugs.
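Sketched in the same Python terms as before (again with illustrative names, not the original ABAP), the fix looks like this:

```python
from bisect import bisect_left

def parallel_cursor_take2(heads, items):
    """Take 2: no WHERE-style filter. Start at the binary-searched
    index and EXIT (break) as soon as the key stops matching.
    Relies on items being sorted by head_id."""
    keys = [item["head_id"] for item in items]
    total = 0
    for head in heads:
        start = bisect_left(keys, head["head_id"])
        for idx in range(start, len(items)):
            if items[idx]["head_id"] != head["head_id"]:
                break  # the IF right at the top of the loop body
            total += 1
    return total
```

Because the table is sorted, the first non-matching row guarantees there are no further matches, so the inner loop now touches only the relevant rows plus one.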

This worked very nicely:

This solution beat my secondary key solution by 398%, and it is 31,654% faster than the initial solution.

So if you have a similar situation, I would recommend a parallel cursor solution.

And I will have to change my habit after all – this blows my previous favorite out of the water…


Take care