After more benchmarking, the reading time appears to grow exponentially with the number of entries. So it seems that even cdb_get_objects internally walks the whole list from the beginning to find each next item. Splitting the list into a two-level hierarchy therefore greatly improves the total reading time.
Good that you have found a solution that works for you. A few remarks that might help you or others reading this:
The CDB API may indeed not be optimal for traversing large lists due to how it handles indexes (though that would not explain exponential behavior, which would really be a reason for concern); you might want to have a look at MAAPI instead.
Using bulk methods (such as .._get_objects) improves performance only up to a point. It does not help to try to retrieve thousands of entries in one shot, and transferring large amounts of data at once might actually degrade performance when you hit memory issues. My rule of thumb is that a few hundred entries per call are more than enough.
@SCadilhac: I.e. retrieve the data in chunks of, say, 100 objects / list entries at a time. For your use case, you will then likely notice a 10x improvement in wall clock time to retrieve 50k entries, and the time should be linear in the number of entries.
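To make the chunking advice concrete, here is a minimal sketch of the index arithmetic only. The helper names chunk_count and chunk_calls are hypothetical and not part of the ConfD API; the sketch just makes sure the last request does not ask for entries past the end of the list when the total is not a multiple of the chunk size.

```c
#include <assert.h>

/* Hypothetical helper (not a ConfD function): how many entries to request
 * in the chunk that starts at index `start`, given `total` list entries
 * and a maximum chunk size `chunk`. Returns 0 once `start` is past the end. */
static int chunk_count(int total, int start, int chunk)
{
    assert(chunk > 0);
    if (start >= total)
        return 0;
    int remaining = total - start;
    return remaining < chunk ? remaining : chunk;
}

/* Hypothetical helper: number of bulk calls needed to cover `total` entries. */
static int chunk_calls(int total, int chunk)
{
    assert(chunk > 0);
    return total <= 0 ? 0 : (total + chunk - 1) / chunk;
}
```

In a read loop, the value returned by chunk_count(total, i, CHUNK) would be passed as the nobj argument of cdb_get_objects, and i advanced by the same amount. The total could presumably be obtained up front with cdb_num_instances, though that is an assumption about the surrounding code rather than something shown in this thread.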
Here is an updated test to check the loading time vs the number of entries to read (reading 100 list entries at a time):
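#define CHUNK 100

/* rsock is an already-connected CDB read socket; get_time_ms() is a local
   helper returning the current time in milliseconds. */
for (int max = 5000; max < 50000; max += 5000) {
    uint64_t before = get_time_ms();
    confd_value_t values[CHUNK * 5];   /* 5 values per list entry */
    int i = 0;
    while (i < max) {
        /* Read CHUNK entries of /line, starting at entry index i */
        cdb_get_objects(rsock, values, 5, i, CHUNK, "/line");
        i += CHUNK;
    }
    uint64_t after = get_time_ms();
    /* Cast so the format specifier matches the argument on every platform */
    printf("Loaded first %d entries in %llu ms\n", i,
           (unsigned long long)(after - before));
}
printf("DONE\n");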
Which gives the following output:
Loaded first 5000 entries in 936 ms
Loaded first 10000 entries in 2499 ms
Loaded first 15000 entries in 5067 ms
Loaded first 20000 entries in 7699 ms
Loaded first 25000 entries in 12325 ms
Loaded first 30000 entries in 16217 ms
Loaded first 35000 entries in 23926 ms
Loaded first 40000 entries in 28852 ms
Loaded first 45000 entries in 38569 ms
DONE
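One way to read these numbers is to look at the average cost per 1000 entries: with linear scaling it would stay roughly constant. The sketch below, whose helper names are made up for illustration, computes that average from the measurements above. It rises from about 187 ms to about 857 ms per 1000 entries, and 9 times as many entries take about 41 times as long, so the growth looks superlinear but short of quadratic (which would predict 81 times) rather than exponential.

```c
#include <assert.h>
#include <stddef.h>

/* Measurements copied from the output above. */
static const int measured_entries[] = {
    5000, 10000, 15000, 20000, 25000, 30000, 35000, 40000, 45000
};
static const long measured_ms[] = {
    936, 2499, 5067, 7699, 12325, 16217, 23926, 28852, 38569
};

/* Average cost in milliseconds per 1000 entries for one measurement. */
static double ms_per_thousand(int entries, long total_ms)
{
    assert(entries > 0);
    return (double)total_ms * 1000.0 / (double)entries;
}

/* Returns 1 if the average cost per entry strictly increases from each
 * measurement to the next (a sign of superlinear scaling), 0 otherwise. */
static int cost_keeps_rising(const int *entries, const long *total_ms, size_t n)
{
    for (size_t k = 1; k < n; k++) {
        if (ms_per_thousand(entries[k], total_ms[k]) <=
            ms_per_thousand(entries[k - 1], total_ms[k - 1]))
            return 0;
    }
    return 1;
}
```

Calling cost_keeps_rising(measured_entries, measured_ms, 9) on this data returns 1, i.e. every additional 5000 entries makes the average entry more expensive to read.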