Is there a reason why one should use ContainsKey over TryGetValue

Question

I have tried some tests and it does not matter if there are all hits or miss. TryGetValue is always faster. When one should use ContainsKey?

Your timing test is probably not perfoming correctly (assuming you're using a Dictionary), both of those functions call one function FindEntry() except that TryGetValue executes more code after (albeit just a single assignment!), so it can't be faster. referencesource.microsoft.com/#mscorlib/system/collections/… — starlight54
– starlight54, Commented Oct 5, 2016 at 23:37
Your benchmarks are wrong. It's hard to say why, because we can't see them. ContainsKey is considerably faster; you use it when you don't need to retrieve the contents but just need to know if it's there or not for the specific reason that it's faster. — Ken White
– Ken White, Commented Oct 5, 2016 at 23:44
@KenWhite the benchmark is correct if he takes the value after that ! — mybirthname
– mybirthname, Commented Oct 5, 2016 at 23:57
@mybirthname: If the poster is doing that, their benchmark is absolutely incorrect, because it's not comparing ContainsKey and TryGetValue; it would be comparing TryGetValue against ContainsKey+TryGetValue. — Ken White
– Ken White, Commented Oct 6, 2016 at 0:22
Same when setting a value: d[key] = value; skips the extra lookup compared to: if (!d.ContainsKey(key)) d.Add(key, value); — Brain2000
– Brain2000, Commented Apr 21, 2020 at 22:42

mybirthname · Accepted Answer · 2016-10-06 17:46:19Z

It depends for what are you using the methods. If you open Reference source you will see.

public bool TryGetValue(TKey key, out TValue value) { int index = this.FindEntry(key); if (index >= 0) { value = this.entries[index].value; return true; } value = default(TValue); return false; } public bool ContainsKey(TKey key) { return (this.FindEntry(key) >= 0); }

Like you see TryGetValue is same as ContainsKey + one array lookup.

If your logic is only to check if the key is existing in the Dictionary and nothing else related to this key (taking the value for the key) you should use ContainsKey.

If you want to take the value for the specific key, TryGetValue is faster than

if(dic.ContainsKey(keyValue)) { dicValue = dic[keyValue]; // here you do more work! }

Logic about Dictionary[key]

 public TValue this[TKey key] { get { int i = FindEntry(key); if (i >= 0) return entries[i].value; ThrowHelper.ThrowKeyNotFoundException(); return default(TValue); } set { Insert(key, value, false); } }

So pretty much you will go 2 times in FindEntry method + array lookup, if you use the key with ContainsKey and after that take the value with dic[key]. You have overhead of calling FindEntry one additional time.

Paul Tsai · Accepted Answer · 2016-10-06 16:22:30Z

Conceptually, the two methods are very different. ContainsKey simply checks to see if a given key is in the dictionary. TryGetValue will attempt to return the value for a given key, provided it's in the dictionary. Both can be fast, depending on what you want to do.

Consider the following method, which returns a value from dictionary or returns string.Empty.

Dictionary<int,string> Dict = new Dictionary<int,string>(); string GetValue1(int key) { string outValue; if (!Dict.TryGetValue(key, out outValue)) outValue = string.Empty; return outValue; }

Compared to

string GetValue2(int key) { return (Dict.ContainsKey(key)) ? Dict[key] : string.Empty; }

Both are relatively fast and the performance between the two is negligible in most situations (see unit tests below). If we wanted to be picky, using the TryGetValue requires the use of an out parameter, which is more overhead than a regular parameter. This variable will get set to the default for a type if it's not found, which is null for strings. In the above example, this null value isn't used, but we incur the overhead anyway. Finally, the GetValue1 requires the use of a local variable outValue, while GetValue2 doesn't.

BACON pointed out that GetValue2 would use 2 lookups in cases where the value is found, which is comparatively expense. This is true and also implies that in cases where the key is not found, GetValue2 would perform faster. So if the program is expecting for a majority of lookups to be misses, use GetValue2. Otherwise use GetValue1.

If dealing with a ConcurrentDictionary, use TryGetValue since operations on it should be atomic, i.e. once you check for a value in a multi-threaded environment, that check could be incorrect when you try to access the value.

Included below are 2 unit tests, testing the performance between the two methods with dictionaries with different Key types (int vs string). This benchmark is only showing how the gap between the two methods can close/widen as their context changes.

[TestMethod] public void TestString() { int counter = 10000000; for (var x = 0; x < counter; x++) DictString.Add(x.ToString(), "hello"); TimedLog("10,000,000 hits TryGet", () => { for (var x = 0; x < counter; x++) Assert.IsFalse(string.IsNullOrEmpty(GetValue1String(x.ToString()))); }, Console.WriteLine); TimedLog("10,000,000 hits ContainsKey", () => { for (var x = 0; x < counter; x++) Assert.IsFalse(string.IsNullOrEmpty(GetValue2String(x.ToString()))); }, Console.WriteLine); TimedLog("10,000,000 misses TryGet", () => { for (var x = counter; x < counter*2; x++) Assert.IsTrue(string.IsNullOrEmpty(GetValue1String(x.ToString()))); }, Console.WriteLine); TimedLog("10,000,000 misses ContainsKey", () => { for (var x = counter; x < counter*2; x++) Assert.IsTrue(string.IsNullOrEmpty(GetValue2String(x.ToString()))); }, Console.WriteLine); } [TestMethod] public void TestInt() { int counter = 10000000; for (var x = 0; x < counter; x++) DictInt.Add(x, "hello"); TimedLog("10,000,000 hits TryGet", () => { for (var x = 0; x < counter; x++) Assert.IsFalse(string.IsNullOrEmpty(GetValue1Int(x))); }, Console.WriteLine); TimedLog("10,000,000 hits ContainsKey", () => { for (var x = 0; x < counter; x++) Assert.IsFalse(string.IsNullOrEmpty(GetValue2Int(x))); }, Console.WriteLine); TimedLog("10,000,000 misses TryGet", () => { for (var x = counter; x < counter * 2; x++) Assert.IsTrue(string.IsNullOrEmpty(GetValue1Int(x))); }, Console.WriteLine); TimedLog("10,000,000 misses ContainsKey", () => { for (var x = counter; x < counter * 2; x++) Assert.IsTrue(string.IsNullOrEmpty(GetValue2Int(x))); }, Console.WriteLine); } public static void TimedLog(string message, Action toPerform, Action<string> logger) { var start = DateTime.Now; if (logger != null) logger.Invoke(string.Format("{0} Started at {1:G}", message, start)); toPerform.Invoke(); var end = DateTime.Now; var span = end - start; if (logger != null) logger.Invoke(string.Format("{0} Ended at {1} lasting {2:G}", message, end, span)); }

Results of TestInt

10,000,000 hits TryGet Started at ... 10,000,000 hits TryGet Ended at ... lasting 0:00:00:00.3734136 10,000,000 hits ContainsKey Started at ... 10,000,000 hits ContainsKey Ended at ... lasting 0:00:00:00.4657632 10,000,000 misses TryGet Started at ... 10,000,000 misses TryGet Ended at ... lasting 0:00:00:00.2921058 10,000,000 misses ContainsKey Started at ... 10,000,000 misses ContainsKey Ended at ... lasting 0:00:00:00.2579766

For hits, ContainsKey is approx 25% slower TryGetValue

For misses, TryGetValue is approx 13% slower than ContainsKey

Result of TestString

10,000,000 hits TryGet Started at ... 10,000,000 hits TryGet Ended at ... lasting 0:00:00:03.2232018 10,000,000 hits ContainsKey Started at ... 10,000,000 hits ContainsKey Ended at ... lasting 0:00:00:03.6417864 10,000,000 misses TryGet Started at ... 10,000,000 misses TryGet Ended at ... lasting 0:00:00:03.6508206 10,000,000 misses ContainsKey Started at ... 10,000,000 misses ContainsKey Ended at ... lasting 0:00:00:03.4912164

For hits, ContainsKey is approx 13% slower TryGetValue

For misses, TryGetValue is approx 4.6% slower than ContainsKey

The documentation for TryGetValue states "This method combines the functionality of the ContainsKey method and the Item property." While it is true that, compared to GetValue2, GetValue1 has the "overhead" of a local variable and passing a parameter by reference, this is negligible compared to GetValue2 where you are performing two lookups in the case where Dict contains the key key: one lookup in the call to Dict.ContainsKey(key), and another in retrieving Dict[key].
Thank you for the comment. I updated my post to reflect your point.
From the source we know that when the key is not found TryGetValue only has to perform an additional jump and write to the output parameter compared to ContainsKey. I do think this answer is too focused on micro-optimizations such as that and avoiding local variables and output parameters. This benchmark shows that for 10 million lookups (all misses) TryGetValue is only marginally slower than ContainsKey. Unless there are exceptional circumstances where TryGetValue is proving to be a bottleneck and testing shows that [continued]
[continued] ContainsKey + Dict[key] provide a worthwhile performance improvement, I think the general recommendation should be to choose a Dictionary<> method based on its intent: ContainsKey is for checking the existence of a key whose value is not needed, and TryGetValue is for attempting to retrieve the value for a key that doesn't necessarily exist.
Your points are valid. Thanks for the benchmark link, I've updated my post to include some benchmarks with dictionaries of different type keys and of larger sizes. It's clear that there are conditions where one does perform significantly better than the other, and even the Key's type can affect performance.

Collectives™ on Stack Overflow

Is there a reason why one should use ContainsKey over TryGetValue

2 Answers 2

Comments

5 Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Comments

5 Comments

Linked

Related