Skip to main content
edited tags
Link
Alexey Popkov
  • 62.5k
  • 7
  • 163
  • 405
deleted 3 characters in body; edited tags; edited title; edited title
Source Link
rm -rf
  • 89.8k
  • 21
  • 303
  • 498

Finding all elements within a certain range in a sorted list

Suppose we have a sorted list of values. Let's use list = Sort@RandomReal[1, 1000000]; for this example.

I need a fast function window[list, {xmin, xmax}] which will return all list elements $x$ for which $x_\textrm{min} \le x \le x_\textrm{max}$.

How can this be implemented in MathematicaMathematica? I am looking both for fast and for elegant solutions. The direct solution is implementing binary search, but there are several ways to do this, and perhaps MathematicaMathematica already has something built in that I am not aware of.


Here's the most naïve implementation:

window[list_, {xmin_, xmax_}] := list[[LengthWhile[list, # < xmin &] + 1 ;; LengthWhile[list, # <= xmax &]]] 

Summary:

Here are the timingtimings I get for the different solutions for some random data of a million machine reals which also contains duplicates:

  • My original naive solution: 3.85 s

  • Leonid, using binary search: 0.01 s (close to the measurable limit, $\log n$ complexity)

  • R.M., using Clip: 0.59 s (linear time, no sorting required)

  • faleichik, using Nearest: 1.29 s (strangely, this also runs in linear time, by measurement)

  • kguler, using Map (which autocompiles) and Pick: 0.30 s (also linear time, the fastest simple linear time solution so far, does not require sorting either)

For sorted data, the fastest solution is Leonid'sLeonid's, which uses binary search and has logarithmic complexity.

For unsorted data, the fastest (and also one of the simplest) is kguler'skguler's. A not so obvious trick was using Boole on the condition to allow it to be automatically compiled.

Finding elements in a sorted list

Suppose we have a sorted list of values. Let's use list = Sort@RandomReal[1, 1000000]; for this example.

I need a fast function window[list, {xmin, xmax}] which will return all list elements $x$ for which $x_\textrm{min} \le x \le x_\textrm{max}$.

How can this be implemented in Mathematica? I am looking both for fast and for elegant solutions. The direct solution is implementing binary search, but there are several ways to do this, and perhaps Mathematica already has something built in that I am not aware of.


Here's the most naïve implementation:

window[list_, {xmin_, xmax_}] := list[[LengthWhile[list, # < xmin &] + 1 ;; LengthWhile[list, # <= xmax &]]] 

Summary:

Here are the timing I get for the different solutions for some random data of a million machine reals which also contains duplicates:

  • My original naive solution: 3.85 s

  • Leonid, using binary search: 0.01 s (close to the measurable limit, $\log n$ complexity)

  • R.M., using Clip: 0.59 s (linear time, no sorting required)

  • faleichik, using Nearest: 1.29 s (strangely, this also runs in linear time, by measurement)

  • kguler, using Map (which autocompiles) and Pick: 0.30 s (also linear time, the fastest simple linear time solution so far, does not require sorting either)

For sorted data, the fastest solution is Leonid's, which uses binary search and has logarithmic complexity.

For unsorted data, the fastest (and also one of the simplest) is kguler's. A not so obvious trick was using Boole on the condition to allow it to be automatically compiled.

Finding all elements within a certain range in a sorted list

Suppose we have a sorted list of values. Let's use list = Sort@RandomReal[1, 1000000]; for this example.

I need a fast function window[list, {xmin, xmax}] which will return all list elements $x$ for which $x_\textrm{min} \le x \le x_\textrm{max}$.

How can this be implemented in Mathematica? I am looking both for fast and for elegant solutions. The direct solution is implementing binary search, but there are several ways to do this, and perhaps Mathematica already has something built in that I am not aware of.


Here's the most naïve implementation:

window[list_, {xmin_, xmax_}] := list[[LengthWhile[list, # < xmin &] + 1 ;; LengthWhile[list, # <= xmax &]]] 

Summary:

Here are the timings I get for the different solutions for some random data of a million machine reals which also contains duplicates:

  • My original naive solution: 3.85 s

  • Leonid, using binary search: 0.01 s (close to the measurable limit, $\log n$ complexity)

  • R.M., using Clip: 0.59 s (linear time, no sorting required)

  • faleichik, using Nearest: 1.29 s (strangely, this also runs in linear time, by measurement)

  • kguler, using Map (which autocompiles) and Pick: 0.30 s (also linear time, the fastest simple linear time solution so far, does not require sorting either)

For sorted data, the fastest solution is Leonid's, which uses binary search and has logarithmic complexity.

For unsorted data, the fastest (and also one of the simplest) is kguler's. A not so obvious trick was using Boole on the condition to allow it to be automatically compiled.

added 105 characters in body
Source Link
Szabolcs
  • 238.9k
  • 32
  • 653
  • 1.3k

Suppose we have a sorted list of values. Let's use list = Sort@RandomReal[1, 1000000]; for this example.

I need a fast function window[list, {xmin, xmax}] which will return all list elements $x$ for which $x_\textrm{min} \le x \le x_\textrm{max}$.

How can this be implemented in Mathematica? I am looking both for fast and for elegant solutions. The direct solution is implementing binary search, but there are several ways to do this, and perhaps Mathematica already has something built in that I am not aware of.


Here's the most naïve implementation:

window[list_, {xmin_, xmax_}] := list[[LengthWhile[list, # < xmin &] + 1 ;; LengthWhile[list, # <= xmax &]]] 

Summary:

Here are the timing I get for the different solutions for some random data of a million machine reals which also contains duplicates:

  • My original naive solution: 3.85 s

  • Leonid, using binary search: 0.01 s (close to the measurable limit, $\log n$ complexity)

  • R.M., using Clip: 0.59 s (linear time, no sorting required)

  • faleichik, using Nearest: 1.29 s (strangely, this also runs in linear time, by measurement)

  • kguler, using Map (which autocompiles) and Pick: 0.30 s (also linear time, the fastest simple linear time solution so far, does not require sorting either)

For sorted data, the fastest solution is Leonid's, which uses binary search and has logarithmic complexity. For

For unsorted data, the fastest (and also one of the simplest) is kguler's. A not so obvious trick was using Boole on the condition to allow it to be automatically compiled.

Suppose we have a sorted list of values. Let's use list = Sort@RandomReal[1, 1000000]; for this example.

I need a fast function window[list, {xmin, xmax}] which will return all list elements $x$ for which $x_\textrm{min} \le x \le x_\textrm{max}$.

How can this be implemented in Mathematica? I am looking both for fast and for elegant solutions. The direct solution is implementing binary search, but there are several ways to do this, and perhaps Mathematica already has something built in that I am not aware of.


Here's the most naïve implementation:

window[list_, {xmin_, xmax_}] := list[[LengthWhile[list, # < xmin &] + 1 ;; LengthWhile[list, # <= xmax &]]] 

Summary:

Here are the timing I get for the different solutions for some random data of a million machine reals which also contains duplicates:

  • My original naive solution: 3.85 s

  • Leonid, using binary search: 0.01 s (close to the measurable limit, $\log n$ complexity)

  • R.M., using Clip: 0.59 s (linear time, no sorting required)

  • faleichik, using Nearest: 1.29 s (strangely, this also runs in linear time, by measurement)

  • kguler, using Map (which autocompiles) and Pick: 0.30 s (also linear time, the fastest simple linear time solution so far, does not require sorting either)

For sorted data, the fastest solution is Leonid's, which uses binary search and has logarithmic complexity. For unsorted data, the fastest (and also one of the simplest) is kguler's.

Suppose we have a sorted list of values. Let's use list = Sort@RandomReal[1, 1000000]; for this example.

I need a fast function window[list, {xmin, xmax}] which will return all list elements $x$ for which $x_\textrm{min} \le x \le x_\textrm{max}$.

How can this be implemented in Mathematica? I am looking both for fast and for elegant solutions. The direct solution is implementing binary search, but there are several ways to do this, and perhaps Mathematica already has something built in that I am not aware of.


Here's the most naïve implementation:

window[list_, {xmin_, xmax_}] := list[[LengthWhile[list, # < xmin &] + 1 ;; LengthWhile[list, # <= xmax &]]] 

Summary:

Here are the timing I get for the different solutions for some random data of a million machine reals which also contains duplicates:

  • My original naive solution: 3.85 s

  • Leonid, using binary search: 0.01 s (close to the measurable limit, $\log n$ complexity)

  • R.M., using Clip: 0.59 s (linear time, no sorting required)

  • faleichik, using Nearest: 1.29 s (strangely, this also runs in linear time, by measurement)

  • kguler, using Map (which autocompiles) and Pick: 0.30 s (also linear time, the fastest simple linear time solution so far, does not require sorting either)

For sorted data, the fastest solution is Leonid's, which uses binary search and has logarithmic complexity.

For unsorted data, the fastest (and also one of the simplest) is kguler's. A not so obvious trick was using Boole on the condition to allow it to be automatically compiled.

added 603 characters in body
Source Link
Szabolcs
  • 238.9k
  • 32
  • 653
  • 1.3k
Loading
added 3 characters in body
Source Link
Szabolcs
  • 238.9k
  • 32
  • 653
  • 1.3k
Loading
Tweeted twitter.com/#!/StackMma/status/174102051599564800
added 169 characters in body
Source Link
Szabolcs
  • 238.9k
  • 32
  • 653
  • 1.3k
Loading
Source Link
Szabolcs
  • 238.9k
  • 32
  • 653
  • 1.3k
Loading