I have around 800,000 rows of data stored in the boost shared memory from the database. The data are in the form:
Id Color Length Size 1 1 2 4 2 3 4 5 3 2 2 0 4 1 2 4......and so on The Colors can be the value from 1-12 the length 1-4 and the size 1-5. The Id,Length,Color,Size are stored in seperate vector of 800,000 size in the shared memory. So there is Id vector for Id, Color vector for Color and so on.
I want to filter the data before I perform some computation. So I want data for which Color is 1 and length is 2 and Size 4 i.e row 1 and 4 in above case. Is there any efficient way to filter the data without using for loop and going through all the 800,000 images and checking the condition?
Right now I am just using mysql statement to get the Ids of the data which satisify the condition.
"select Id from features_table where Color=1 and Length=2 and Size =4" But is there a faster way to do this? Or should I stick to this method? I am looking for a faster method so I am not sure whether fetching the Ids from the database will increase the execution time of the algorithm.
What are the other options that I can consider in this case? I read about Hash table, B-Tree, Binary Search tree and I am confused which is suitable for this case. Will kd-tree be helpful in this case? Because many images may have the same combination of color,length and size. I am not sure if kd-tree is the right thing to do. I read about FLANN in opencv used for kd-tree. Is there any example or resources which may be helpful in this case? Or are there any built in c++ libraries?