Dictionary of large objects vs. Dictionary of array indices

Question

I have large objects that I want to access via a string identifier. My current approach is to use a Dictionary containing those:

    var myObjects = new Dictionary<string, LargeObjectClass>();
    var specificObject = myObjects["identifier"];

Now I was wondering whether storing many of those large objects in the Dictionary might be bad for performance and I would be better off using a Dictionary to store indices into an array that actually stores the objects:

    var myObjects = new LargeObjectClass[size];
    var objectIndices = new Dictionary<string, int>();
    var specificObject = myObjects[objectIndices["identifier"]];

This is obviously a bad approach if the size of myObjects is unknown in advance or might change at runtime. But since the Dictionary is smaller, and I read somewhere that arrays are more efficient than Dictionaries, I thought this approach might perform better in cases where the size is fixed.

Which of these approaches is more efficient, assuming the objects are very large?

Why not go ahead and test it yourself? Then you would know for sure.
– user469104, Commented Jul 1, 2015 at 20:13
In Dictionary<string, object>, the object is just a reference to the original object. The objects are not cloned. Using an int instead of the object "address" provides a small memory benefit where an int is smaller than an object reference. But storing the reference directly is more efficient in terms of performance, because one indirection is skipped. Either way, none of these benefits or disadvantages is really significant. In conclusion, keep your first approach, i.e. the dictionary of objects.
– Graffito, Commented Jul 1, 2015 at 20:23
@Graffito: Using ints in the dictionary would make the dictionary itself smaller, but adding an array to that which has to store all the object references just adds all that saved space back again, right?
– StriplingWarrior, Commented Jul 1, 2015 at 20:52
@StriplingWarrior: In scenia's code, the objects are stored in an array, and thus already created.
– Graffito, Commented Jul 1, 2015 at 20:57
@StriplingWarrior: I read scenia's post too fast. I was confused by "an array that actually stores the objects". I erroneously assumed that the objects were initially loaded into an array (or a list) and that the dictionary was used to create an index. Apologies...
– Graffito, Commented Jul 1, 2015 at 21:43

StriplingWarrior · Accepted Answer · 2015-07-01 20:50:54Z

You're better off just using a Dictionary<> in this case. Remember that both the Dictionary and the array store only references to your large objects, because class instances are reference types. So the Dictionary will be only slightly smaller if it stores ints than if it stores object references. That small difference is then outweighed by the fact that your array would itself be storing the object references, so the combined total would take up more space than the Dictionary alone.
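The point about references can be checked directly. The following is a minimal sketch, not code from the question: LargeObjectClass here is a hypothetical stand-in with a 1 MB payload, and ReferenceDemo and SameInstance are names made up for illustration. It puts one instance into both a Dictionary and an array and confirms that neither container holds a copy.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the question's LargeObjectClass.
class LargeObjectClass
{
    public byte[] Payload = new byte[1_000_000]; // roughly 1 MB per instance
}

static class ReferenceDemo
{
    // Returns true when the dictionary entry and the array slot
    // both refer to the very same heap object.
    public static bool SameInstance()
    {
        var big = new LargeObjectClass();

        var myObjects = new Dictionary<string, LargeObjectClass>();
        myObjects["identifier"] = big;   // stores a reference, not a copy

        var asArray = new LargeObjectClass[1];
        asArray[0] = big;                // also stores just a reference

        return ReferenceEquals(myObjects["identifier"], asArray[0])
            && ReferenceEquals(asArray[0], big);
    }

    static void Main()
    {
        Console.WriteLine(SameInstance()); // True

        // Size in bytes of a pointer in the current process; an object
        // reference stored in a container takes this much space,
        // however large the object it points to.
        Console.WriteLine(IntPtr.Size);
    }
}
```

Because each container slot is only reference-sized, the size of the objects themselves does not affect the size of the Dictionary.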

An array would only give you better performance if it would allow you to avoid using a dictionary at all. This might happen, for example, if you were keying your objects based on consecutive int values rather than strings. But adding an array on top of a dictionary is going to be worse in every way.
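As a rough illustration of the case where an array does help, here is a sketch that assumes the objects can be identified by consecutive int values from 0 to size - 1, so no dictionary is needed at all. The Name property, the Build helper, and the size of 100 are assumptions made for this example only.

```csharp
using System;

// Hypothetical stand-in for the question's LargeObjectClass.
class LargeObjectClass
{
    public string Name;
    public LargeObjectClass(string name) { Name = name; }
}

static class ArrayOnlyDemo
{
    // Builds an array in which the index itself is the key.
    public static LargeObjectClass[] Build(int size)
    {
        var objects = new LargeObjectClass[size];
        for (int i = 0; i < size; i++)
            objects[i] = new LargeObjectClass("object " + i);
        return objects;
    }

    static void Main()
    {
        var myObjects = Build(100);

        // A single bounds-checked index; no hashing, no string comparison.
        var specificObject = myObjects[42];
        Console.WriteLine(specificObject.Name); // object 42
    }
}
```

With string keys this shortcut is not available, which is why the array in the question only adds a second lookup on top of the dictionary lookup.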

Also, as a general rule, you should use the simplest, most maintainable approach until you have a performance problem. A Dictionary<> is highly unlikely to cause any performance problems unless you're invoking it millions of times.
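If a performance problem is ever suspected, a quick measurement settles it. The sketch below times both approaches from the question with System.Diagnostics.Stopwatch. It is a rough comparison under assumed numbers (100 keys, one million lookups, a small hypothetical LargeObjectClass), not a rigorous benchmark, and the timings will vary by machine and runtime.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical stand-in for the question's LargeObjectClass.
class LargeObjectClass
{
    public byte[] Payload = new byte[1024];
}

static class LookupTiming
{
    const int Count = 100;          // assumed number of distinct keys
    const int Lookups = 1_000_000;  // assumed number of lookups per approach

    // Runs both approaches, prints the elapsed times, and returns a
    // checksum so the lookups cannot be optimized away.
    public static long Run()
    {
        var direct = new Dictionary<string, LargeObjectClass>();
        var array = new LargeObjectClass[Count];
        var indices = new Dictionary<string, int>();
        var keys = new string[Count];

        for (int i = 0; i < Count; i++)
        {
            keys[i] = "id" + i;
            var obj = new LargeObjectClass();
            direct[keys[i]] = obj;   // approach 1: key -> object
            array[i] = obj;          // approach 2: key -> index -> object
            indices[keys[i]] = i;
        }

        long checksum = 0;

        var sw = Stopwatch.StartNew();
        for (int n = 0; n < Lookups; n++)
            checksum += direct[keys[n % Count]].Payload.Length;
        sw.Stop();
        Console.WriteLine("Dictionary of objects: " + sw.ElapsedMilliseconds + " ms");

        sw.Restart();
        for (int n = 0; n < Lookups; n++)
            checksum += array[indices[keys[n % Count]]].Payload.Length;
        sw.Stop();
        Console.WriteLine("Dictionary of indices: " + sw.ElapsedMilliseconds + " ms");

        return checksum;
    }

    static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

Both loops perform the same hash lookup on the string key; the second one adds an array access on top, so any difference comes from that extra step rather than from the size of the objects.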

I think you should explain why the dictionary will be fine, i.e. by mentioning references.
I assume using an enum or constants instead of the Dictionary is no better?
@scenia: How many possible values are we talking about? And are all of its possible values known at compile time? It is possible that an array indexed by constant values would give you slightly better performance. You might even be better off creating a class with fields to represent all the possible values. But if there are an enormous number of possible values then it could get pretty unwieldy. Let me reiterate: your first priority should be maintainability. Your chances of running into performance issues in this part of code are pretty slim.
The keys are created from a file via a custom parser class. So there are pretty much unlimited possibilities. Only a single file is read though, and it won't have more than about 100 different values. I was thinking of creating an enum or defining constants at runtime, but now that I write it out, it feels pretty stupid...
