24

I'm looking for an easy way to clone off an IEnumerable<T> parameter for later reference. LINQ's ToArray extension method seems like a nice, concise way to do this.

However, I'm not clear on whether it's always guaranteed to return a new array instance. Several of the LINQ methods check the actual type of the enumerable and take a shortcut if possible; e.g., Count() checks whether the source implements ICollection<T> and, if so, reads its Count property directly; it only iterates the collection if it has to.

Given that mindset of short-circuiting where practical, it seems that, if I call ToArray() on something that already is an array, ToArray() might short-circuit and simply return the same array instance. That would technically fulfill the requirements of a ToArray method.

From a quick test, it appears that, in .NET 4.0, calling ToArray() on an array does return a new instance. My question is, can I rely on this? Can I be guaranteed that ToArray will always return a new instance, even in Silverlight and in future versions of the .NET Framework? Is there documentation somewhere that's clear on this point?

1
  • This would be so easy to test. Commented Apr 13, 2022 at 0:21

3 Answers

31

For non-empty collections, ToArray will always return a new array - changing it to return an existing instance would be a horribly breaking change, and I'm utterly convinced that the .NET team wouldn't do this. It's an important thing to be able to rely on, in terms of the effect of modifying the resulting array. It's a shame it's not documented :(
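As a minimal sketch of what that guarantee means in practice (assuming a C# 9+ console project with top-level statements; `Snapshot` is a hypothetical helper name, not part of LINQ), the copy stays independent of later changes to the source array:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int[] original = { 1, 2, 3 };
int[] copy = Snapshot(original);

// Mutating the original after the snapshot leaves the copy untouched.
original[0] = 42;

Console.WriteLine(ReferenceEquals(original, copy)); // False for a non-empty source
Console.WriteLine(copy[0]);                         // 1

// Hypothetical helper (not part of LINQ): takes a defensive copy of the input.
static int[] Snapshot(IEnumerable<int> source) => source.ToArray();
```

If ToArray ever returned the source array itself, the second line would print 42 instead, which is why such a change would break existing code.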

There are lots of subtle bits of behaviour in LINQ to Objects which probably aren't worth relying on, but in this case it's such a massive bit of behaviour, I would be absolutely astonished for it to change.

Short-circuiting is great when it doesn't affect behaviour, but generally LINQ to Objects is pretty good about only optimizing in valid cases. You might want to look at the two posts in my Edulinq series covering optimization.

For empty source collections, some versions will return a different empty array on each call, and some will return the same empty array on each call. While that could break code, it's much less problematic than a change of implementation to say "if it's already an array, just return it" would be: empty arrays are naturally immutable, so the only way you'll be able to observe the difference is if you compare for reference identity.

Example:

    var empty1 = new string[0];
    var empty2 = new string[0];
    var array1 = empty1.ToArray();
    var array2 = empty2.ToArray();
    // Prints True in some versions, and False in others
    Console.WriteLine(ReferenceEquals(array1, array2));

2 Comments

Came searching cuz I wasn't 100% sure. XML comment on the method says Creates an array from a System.Collections.Generic.IEnumerable`1 so it's fairly "kinda documented" and "fairly convincing" (at least these days) but certainly I needed some more assurance. My use case was doing .ToArray() on a collection of EF entities before looping over and deleting them so I can not be removing an item from the thing I'm iterating. Definitely wouldn't expect this to change in the fx as it'd break A LOT of stuff for folks.
Here's a link to the ToArray source code for anyone that finds it useful: github.com/dotnet/runtime/blob/…
16

Since that ToArray method is internal to the .NET framework, I wouldn't stake my life on MS never changing it. However, what I would do is add a unit test asserting that ToArray returns a new array instance.

Assert.AreNotSame(myArray, myArray.ToArray()); 

That way, if you later change .NET framework versions, you will automatically know if the functionality changes.
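A framework-free sketch of the same check, assuming a C# 9+ console project with top-level statements (the variable names are illustrative). It deliberately uses a non-empty array, because an empty result may be a shared instance in some versions:

```csharp
using System;
using System.Linq;

// Use a non-empty array: empty results may be a shared instance in some versions.
string[] source = { "a", "b" };
string[] result = source.ToArray();

if (ReferenceEquals(source, result))
{
    throw new InvalidOperationException("ToArray returned the source array itself.");
}

Console.WriteLine("ToArray returned a new instance.");
```

In a real test project the `if`/`throw` would normally be replaced by the assertion method of whichever test framework is in use.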

Comments

0

Update: Jon has updated the accepted answer to cover this case.

The accepted answer is wrong as of this writing.

If your IEnumerable<T> is empty, you'll get back the singleton Array.Empty<T> instance.

https://source.dot.net/#System.Linq/EnumerableHelpers.Linq.cs,75

via https://source.dot.net/#System.Linq/System/Linq/ToCollection.cs,10

3 Comments

While this is interesting, it doesn't change the validity of the answer.
I've edited the answer now.
@Enigmativity I don't follow. The accepted answer had previously stated that "ToArray will always return a new array." If it sometimes returns the same empty instance, then how is that not invalid? The reason I arrived here was because I had learned that ToArray always returns a new array instance, and was surprised to track down a rare bug where that fact was relied upon to distinguish array instances by reference. Jon has already updated his answer, and even mentioned exactly the unusual case I had encountered!
