0

I have two functions they are:

extension Array where Element: Hashable { func uniqueOrderly() -> [Element] { let startTime = CFAbsoluteTimeGetCurrent() var set = Set<Element>() var array = [Element]() for element in self { if set.contains(element) { continue } set.insert(element) array.append(element) } let timeElapsed = CFAbsoluteTimeGetCurrent() - startTime print("Time for uniqueOrderly: \(timeElapsed)") return array } } 

And second one:

public extension Sequence where Element: Equatable { func unique() -> [Element] { let startTime = CFAbsoluteTimeGetCurrent() var unique: [Element] { return reduce(into: []) { unique, x in if !unique.contains(x) { unique.append(x) } } } let timeElapsed = CFAbsoluteTimeGetCurrent() - startTime print("Time for unique: \(timeElapsed)") return unique } } 

And I am doing function execution time measurement on the array.

That is :

let arrayToFilter = [1,2,4,6,1,2,5,7,9,3,3,1,1,2,4,6,1,2,5,7,9,3,3,1,1,2,4,6,1,2,5,7,9,3,3,1,1,2,4,6,1,2,5,7,9,3,3,1,1,2,4,6,1,2,5,7,9,3,3,1,1,2,4,6,1,2,5,7,9,3,3,1,1,2,4,6,1,2,5,7,9,3,3,1,1,2,4,6,1,2,5,7,9,3,3,1,1,2,4,6,1,2,5,7,9,3,3,1,1,2,4,6,1,2,5,7,9,3,3,1,1,2,4,6,1,2,5,7,9,3,3,1,1,2,4,6,1,2,5,7,9,3,3,1,1,2,4,6,1,2,5,7,9,3,3,1,1,2,4,6,1,2,5,7,9,3,3,1,1,2,4,6,1,2,5,7,9,3,3,1,1,2,4,6,1,2,5,7,9,3,3,1,1,2,4,6,1,2,5,7,9,3,3,1,1,2,4,6,1,2,5,7,9,3,3,1,1,2,4,6,1,2,5,7,9,3,3,1,1,2,4,6,1,2,5,7,9,3,3,1,1,2,4,6,1,2,5,7,9,3,3,1,1,2,4,6,1,2,5,7,9,3,3,1,1,2,4,6,1,2,5,7,9,3,3,1,1,2,4,6,1,2,5,7,9,3,3,1] 

Actual function calls and results are in this order:

arrayToFilter.unique() //Time for unique: 0.00012195110321044922 arrayToFilter.uniqueOrderly() Time for uniqueOrderly: 0.02329099178314209 

But when I change order of the function calls my unique() function show horrible time measurements.

arrayToFilter.uniqueOrderly() //Time for uniqueOrderly: 0.0013059377670288086 arrayToFilter.unique() //Time for unique: 8.940696716308594e-06 

So my question issue why do I have such situation with different function call order? Additionally when I run these tests in for loop, measurements are quite different. (Aproximatly +- 1 sec)

All mesurants were done in playground and real iOS application with Release build setting (on emulator).

Test Spec:

Xcode Version 10.1

Swift 4.2

3
  • 1
    First, where are you testing this? Make sure you are testing in an optimized build on a real device. Never test performance in the playground. Commented Nov 29, 2018 at 16:14
  • @rmaddy I update my question with information that you have requested. Commented Nov 29, 2018 at 16:22
  • @rmaddy I can add log with drop of the time execution if you need. Commented Nov 29, 2018 at 16:24

1 Answer 1

2

Your unique() implementation calculates the elapsed time before it calculates unique. unique is a computed var, and is evaluated when accessed.

Your uniqueOrderly() implementation will still be slower than unique() for fairly short lists like this one since inserting into the set is expensive compared to a linear search that almost always hits in the first few elements.

As to the ordering problem, that is almost certainly an artifact of how you're testing it. Micro-profiling is very challenging. I tested these by putting each into its own .swift file and running as swift -O <file>. However, if I put them together in the same file the way you have, and fix unique() to actually time its activity, then the times are pretty consistent when running as swift -O <file>:

Time for uniqueOrderly: 1.800060272216797e-05 [1, 2, 4, 6, 5, 7, 9, 3] Time for unique: 2.0265579223632812e-06 [1, 2, 4, 6, 5, 7, 9, 3] Time for unique: 2.002716064453125e-05 [1, 2, 4, 6, 5, 7, 9, 3] Time for uniqueOrderly: 2.9802322387695312e-06 [1, 2, 4, 6, 5, 7, 9, 3] 

After a little more testing, I suspect these differences are partially memory allocation (memory is fetched from the OS in chunks, so the first allocator pays a greater price than the second), and L1 caching. Almost all of the difference evaporates if you sort two different arrays (which points to the memory cache). But this algorithm is also very sensitive to the makeup of the array, since it has so many duplicates.

In any case, this kind of micro-optimizing is not meaningful. You're going to just chase ghosts all day. It is very difficult to performance test tiny pieces of code over tiny blocks of data and optimize it in a way that applies to real-world usage. At a minimum, you need much, much larger arrays and testing over many different kinds of distributions (lots of duplicates vs very few). You must test outside of Playgrounds, and you must use the optimizer.

Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.