This one bugged me for a while. Whenever I implement an inner loop that depends on the outer loop, I start to wonder which solution to implement. So I decided to answer the question parallel cursor vs. secondary key for myself – and write a post about it in the process 🙂
Initial Setup
Let me introduce my mighty initial setup…
types: begin of head,
         head_id type i,
       end of head,
       begin of item,
         head_id type i,
         item_id type i,
       end of item.

types tab_head type table of head.
types tab_item type table of item.

form process_data using it_head type tab_head
                        it_item type tab_item.

  field-symbols: <ls_head> like line of it_head,
                 <ls_item> like line of it_item.

  loop at it_head assigning <ls_head>.
    loop at it_item assigning <ls_item>
         where head_id = <ls_head>-head_id.
*     Please insert bug...
    endloop.
  endloop.

endform.
… a simple head <-> item loop logic. And this is my data generation for this:
* data declarations assumed from the types above
data: lt_head type tab_head,
      lt_item type tab_item,
      ls_head type head,
      ls_item type item.

do 1000 times.
* sy-index holds the current DO iteration (sy-tabix is only set by table statements)
  ls_head-head_id = sy-index.
  do 25 times.
    ls_item-head_id = ls_head-head_id.
    append ls_item to lt_item.
  enddo.
  append ls_head to lt_head.
enddo.
Before we start the comparison, I measured the initial solution:
sequential:
parallel:
So now we know what we are tuning for.
Parallel cursor – Take 1
I wanted to implement the parallel cursor solution first, although it was not my preferred one. I am a creature of habit, and once I have picked one up it takes some work to change it. Remember that for this to work, the inner table has to be sorted beforehand:
sort lt_item by head_id ascending.
This is the implementation:
form process_data using it_head type tab_head
                        it_item type tab_item.

  data: lv_from type i value 0.

  field-symbols: <ls_head> like line of it_head,
                 <ls_item> like line of it_item.

  loop at it_head assigning <ls_head>.

    read table it_item assigning <ls_item>
         with key head_id = <ls_head>-head_id
         binary search.
    lv_from = sy-tabix.

    loop at it_item assigning <ls_item> from lv_from
         where head_id = <ls_head>-head_id.
*     Please insert bug...
    endloop.

  endloop.

endform.
This gave me a slight boost (the sort was included in the measurement).
sequential:
parallel:
But I was actually quite disappointed; I had expected much more. The improvement was not even a factor of two, and I remembered this being a very potent technique.
Secondary key
So now my hopes rested on a secondary table key – which I set up inside the subroutine after it was called:
form process_data using it_head type tab_head
                        it_item type tab_item.

  data: lt_item type table of item
          with non-unique sorted key head_id components head_id.

  field-symbols: <ls_head> like line of it_head,
                 <ls_item> like line of lt_item.

  lt_item = it_item.

  loop at it_head assigning <ls_head>.
    loop at lt_item assigning <ls_item> using key head_id
         where head_id = <ls_head>-head_id.
*     Please insert bug...
    endloop.
  endloop.

endform.
Implementing the secondary key was simple and straightforward. Now I was excited to see the runtime.
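As a side note: copying it_item into lt_item costs time and memory on its own. If you are free to change the table type at the call site (which I was not assuming here), the secondary key can be declared directly on the type, so the copy disappears. A minimal sketch, assuming a release that supports secondary table keys (7.02 or later); the type and form names are made up for illustration:

* hypothetical variant: the secondary key lives on the table type itself,
* so the subroutine can loop over it_item directly without copying it
types tab_item_keyed type standard table of item
      with default key
      with non-unique sorted key head_id components head_id.

form process_data_keyed using it_head type tab_head
                              it_item type tab_item_keyed.

  field-symbols: <ls_head> like line of it_head,
                 <ls_item> like line of it_item.

  loop at it_head assigning <ls_head>.
    loop at it_item assigning <ls_item> using key head_id
         where head_id = <ls_head>-head_id.
*     same processing as before
    endloop.
  endloop.

endform.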
sequential:
parallel:
Now this seems right. This result is far more desirable. We got a performance improvement of 6.271 %. Scaling also improved: while the runtime difference between the sequential work process and the individual parallel work processes was +95% in the initial solution, it is now “just” 73%.
But this seemed like far too big a difference. I was having second thoughts… it seemed too good to be true. My favorite technique had won, but it had won by such a large margin that it also seemed ridiculous to me.
And in my experience, something that seems too good to be true usually isn’t true. So my conclusion was that I must have messed something up. I searched – and I found it.
Parallel cursor – Take 2
I discovered what I had messed up. I even found it interesting enough to share, because I imagine more people could fall into this trap without realizing how much potential they are leaving untapped.
Let me highlight the magical spot:
loop at it_item assigning <ls_item> from lv_from
     where head_id = <ls_head>-head_id.
* Please insert bug...
endloop.
Take a look at my where condition. Notice anything unusual? Let me help.
Tell me: what happens once we are done looping over the 25 matching rows in the item table? Do we quit? Really?
We actually do not quit, because the loop does not know that we are done after the first 25 rows. It has to keep scanning to make sure there are no further matches somewhere else in the table. And that is the problem.
Remember our initial code?
loop at it_item assigning <ls_item>
     where head_id = <ls_head>-head_id.
* Please insert bug...
endloop.
This code was slow, and it was slow for a reason: a loop with a where condition on a standard table is a full table scan, so the nested loops produce a quadratic runtime curve. And this is exactly what I forced in my first parallel cursor implementation:
loop at it_item assigning <ls_item> from lv_from
     where head_id = <ls_head>-head_id.
* Please insert bug...
endloop.
I just gave it another starting point. But the search type stays the same and so does the runtime curve – which makes this implementation stupid.
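A rough back-of-the-envelope estimate with the test data above makes this concrete (ignoring constant factors): with 1000 heads and 25 items each, the initial solution scans all 25,000 item rows for every head, roughly 25 million row visits. Take 1 only moves the starting point forward, so it still scans from there to the end of the table, on average about half of it, roughly 12.5 million row visits – which matches the “not even a factor of 2” I measured. A version that stops at the first non-matching row touches only about 26 rows per head plus a binary search of roughly 15 steps, so on the order of 40,000 operations in total.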
I then made a little change and made the code rely on my previous sort:
form process_data using it_head type tab_head
                        it_item type tab_item.

  data: lv_from type i value 0.

  field-symbols: <ls_head> like line of it_head,
                 <ls_item> like line of it_item.

  loop at it_head assigning <ls_head>.

    read table it_item assigning <ls_item>
         with key head_id = <ls_head>-head_id
         binary search.
    lv_from = sy-tabix.

    loop at it_item assigning <ls_item> from lv_from.
*     leave the inner loop as soon as we hit the next head_id block
      if <ls_item>-head_id <> <ls_head>-head_id.
        exit.
      endif.
*     Please insert bug...
    endloop.

  endloop.

endform.
I deleted the where clause in the inner loop and replaced it with the if statement. Place the check right at the beginning of the loop body; otherwise the first row of the next head_id block would still be processed before the loop exits.
This worked very nicely:
sequential:
parallel:
This solution beat my secondary key solution by 398%. And it is 31.654% faster than the initial solution.
So if you have a similar situation, I would recommend a parallel cursor solution.
And I have to change my habit; after all, this blows my previous favorite out of the water…
Take care
Dmitrii