6
$\begingroup$

In my algorithm I need to maintain a set (an unordered list of distinct elements) of expressions supporting two operations:

  • Test an expression for membership in the set
  • Adding a new expression to the set

Expressions are to be compared using SameQ. The set can have hundreds of thousands of elements and I want it to work as fast as possible. In most programming languages I would use a hash-table or a balanced tree to implement such a set. Is there any better data structure in Mathematica for this purpose than a plain List? Is it worth trying to manually implement a better structure?

$\endgroup$
6
  • $\begingroup$ Try linked lists. $\endgroup$ Commented Jun 7, 2013 at 20:13
  • 2
    $\begingroup$ You can use a hash table implicit in DownValues, just by introducing some symbol (say, presentQ). Starting definition is presentQ[_]=False. Then,adding is as simple as presentQ[expr] = True, and presentQ itself tests for membership. This seems the easiest option. You can also use System`Utilities`HashTable as an alternative. $\endgroup$ Commented Jun 7, 2013 at 20:47
  • 1
    $\begingroup$ @LeonidShifrin Using downvalues is a simple and great idea! I should have realized this myself. $\endgroup$ Commented Jun 7, 2013 at 21:46
  • $\begingroup$ @VladimirReshetnikov This is a standard and most common way to implement this sort of things. Sometimes one can also use SubValues, although the difference is mostly syntactic. But I have not seen a clear exposition in the documentation which would have explained that hash table functionality in Mathematica is most easily achieved via DownValues. $\endgroup$ Commented Jun 7, 2013 at 22:10
  • $\begingroup$ @LeonidShifrin I implemented the approach you suggested, but later found a bug in my implementation: when expr is a pattern, the plain presentQ[expr] = True does not have the intended meaning. The fix is to use presentQ[Verbatim[expr]] = True instead. $\endgroup$ Commented Jun 25, 2013 at 1:26

1 Answer 1

5
$\begingroup$

Per Leonid's comment:

You can use a hash table implicit in DownValues, just by introducing some symbol (say, presentQ). Starting definition is presentQ[_] = False. Then, adding is as simple as presentQ[expr] = True, and presentQ itself tests for membership. This seems the easiest option. You can also use System`Utilities`HashTable as an alternative.

However, Vladimir notes:

When expr is a pattern, the plain presentQ[expr] = True does not have the intended meaning. The fix is to use presentQ[Verbatim[expr]] = True instead.

I would also add that the new Association data structure in Mathematica 10 is likely to be a faster and perhaps more convenient approach than using either downvalues or the System`Utilities`HashTable.

$\endgroup$
1
  • $\begingroup$ How would Association be used? The list in Association has to be in the form of lhs -> rhs. $\endgroup$ Commented Jul 19, 2016 at 17:18

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.