
What are good hashing functions (fast, good distribution, few collisions) for hashing 2d and 3d vectors composed of IEEE 32-bit floats? I assume general 3d vectors, but algorithms assuming normals (always in [-1, 1]) are also welcome. I also do not fear bit manipulation, as IEEE floats are always IEEE floats.

Another, more general problem is hashing an N-dimensional float vector, where N is quite small (3-12) and constant, but not known at compile time. At the moment I just reinterpret these floats as uints and XOR them together, which is probably not the best solution.
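For reference, a minimal sketch of that baseline (reinterpret the bits, XOR them together); the function names are just illustrative:

    #include <cstdint>
    #include <cstring>
    #include <cstddef>

    // View an IEEE 754 32-bit float as its raw bit pattern.
    inline uint32_t float_bits(float f) {
        uint32_t u;
        std::memcpy(&u, &f, sizeof u);
        return u;
    }

    // Current approach: XOR the bit patterns of all N components together.
    // Plain XOR is symmetric, so permuted vectors (and equal components) collide.
    inline uint32_t xor_hash(const float* v, std::size_t n) {
        uint32_t h = 0;
        for (std::size_t i = 0; i < n; ++i)
            h ^= float_bits(v[i]);
        return h;
    }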

  • ...have you tested how well your hashes are being distributed using the plain XOR method? You might be surprised. Commented May 8, 2011 at 16:33
  • @Matti it seems the distribution at least for 3d vectors is not very bad (tested on Stanford bunny 35k verts against hash table of size 65537). I just thought somebody perhaps has a more specialized solution, as I searched the net some time ago and haven't found anything on the topic. Commented May 8, 2011 at 17:31
  • 65537 sounds like one greater than the number you might want to be using (or is a typo) Commented Sep 13, 2013 at 3:12
  • Related: Good way to hash a float vector? Commented Mar 27, 2014 at 10:13
  • @StevenLu: absolutely not. ++ a power of two is a good safe way to almost always get a prime number. Which is necessary to avoid modulo correlations, and as such, makes awesome hash table sizing. Commented Nov 20, 2014 at 2:35

3 Answers


There's a spatial hash function described in Optimized Spatial Hashing for Collision Detection of Deformable Objects. They use the hash function

hash(x, y, z) = (x p1 xor y p2 xor z p3) mod n

where p1, p2, p3 are large prime numbers, in our case 73856093, 19349663, 83492791, respectively. The value n is the hash table size.

In the paper, x, y, and z are the discretized coordinates; you could probably also use the binary values of your floats.
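For illustration (not code from the paper), the formula might be applied to float coordinates roughly like this; the cell size l and table size n here are placeholders you would choose yourself:

    #include <cmath>
    #include <cstddef>
    #include <cstdint>

    // Sketch of the paper's hash for float input: snap each coordinate onto a
    // grid of cell size l, then mix the cell indices with the three primes.
    inline std::size_t spatial_hash(float x, float y, float z, float l, std::size_t n) {
        std::int64_t ix = static_cast<std::int64_t>(std::floor(x / l));
        std::int64_t iy = static_cast<std::int64_t>(std::floor(y / l));
        std::int64_t iz = static_cast<std::int64_t>(std::floor(z / l));
        return (static_cast<std::size_t>(ix * 73856093) ^
                static_cast<std::size_t>(iy * 19349663) ^
                static_cast<std::size_t>(iz * 83492791)) % n;
    }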


7 Comments

Note that 19349663 isn't prime (it's the product of 41 and 471943)
I found that using the prime numbers p1 and p3 for the two-dimensional case results in very good distributions.
When they wrote x p1 xor y p2 xor z p3, did they mean (x*p1) xor (y*p2) xor (z*p3) or x * (p1 xor y) * (p2 xor z) * p3?
@tuple_cat I believe it's (x*p1) xor (y*p2) xor (z*p3)
Very interesting! Is there any implementation around? I am trying to implement this with scipy/numpy. Thanks.

I have two suggestions.

  • Quantize the coordinates and hash the quantized values. If you don't do the quantization, the hash won't be sensitive to closeness (locality).

  • Locality Sensitive Hashing has been mentioned for hashing higher-dimensional vectors. Why not use it for 3d or 2d vectors as well? A variant of LSH adapted for the Euclidean distance metric (which is what we need for 2d and 3d vectors) is called Locality Sensitive Hashing using p-stable distributions. A very readable tutorial is here; a rough sketch of one such hash function follows below.
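A minimal sketch of a single p-stable (Gaussian) hash function for Euclidean distance, h(v) = floor((a·v + b) / w). The bucket width w, the seed handling, and the struct name are assumptions for illustration, not taken from the tutorial:

    #include <cmath>
    #include <cstddef>
    #include <random>
    #include <vector>

    // One p-stable LSH function: project onto a random Gaussian direction a,
    // add a random offset b in [0, w), and cut the line into buckets of width w.
    // Nearby vectors (in Euclidean distance) tend to land in the same bucket.
    struct PStableHash {
        std::vector<double> a;  // direction with i.i.d. N(0,1) entries (2-stable)
        double b;               // offset drawn uniformly from [0, w)
        double w;               // bucket width; smaller w means finer locality

        PStableHash(std::size_t dim, double w_, unsigned seed) : a(dim), w(w_) {
            std::mt19937 gen(seed);
            std::normal_distribution<double> normal(0.0, 1.0);
            std::uniform_real_distribution<double> uniform(0.0, w_);
            for (double& ai : a) ai = normal(gen);
            b = uniform(gen);
        }

        long long operator()(const std::vector<double>& v) const {
            double dot = 0.0;
            for (std::size_t i = 0; i < v.size(); ++i) dot += a[i] * v[i];
            return static_cast<long long>(std::floor((dot + b) / w));
        }
    };

In practice you would concatenate several of these (with different seeds) into one key to sharpen the locality.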



I wrote this in Python based on the comments seen here:

    import numpy as np

    l = 5                      # cell size used for discretization
    n = 5                      # hash table size
    p1, p2, p3 = 73856093, 19349663, 83492791

    x1 = [33, 4, 11]
    x2 = [31, 1, 14]
    x3 = [10, 44, 19]

    def spatial_hash(x):
        ix, iy, iz = np.floor(x[0] / l), np.floor(x[1] / l), np.floor(x[2] / l)
        return (int(ix * p1) ^ int(iy * p2) ^ int(iz * p3)) % n

    print(spatial_hash(x1))
    print(spatial_hash(x2))
    print(spatial_hash(x3))

It gives

    1
    1
    3

It seemed to work.

In C++

    #include <cstdlib>
    #include <iostream>
    #include <unordered_map>
    #include <vector>
    #include <random>
    #include <eigen3/Eigen/Dense>

    using namespace Eigen;
    using namespace std;

    const int HASH_SIZE = 200;   // hash table size (n)
    const float L = 0.2f;        // cell size used for discretization
    const float mmin = -1.f;     // range of the sampled coordinates (for reference)
    const float mmax = 1.f;

    // buckets: hash value -> points that landed in that bucket
    unordered_map<int, vector<Vector3d>> grid;

    // Discretize each coordinate (shifted to be non-negative) and combine the
    // cell indices with the three primes.
    inline size_t hasha(const Vector3d &p) {
        int ix = (unsigned int)((p[0] + 2.f) / L);
        int iy = (unsigned int)((p[1] + 2.f) / L);
        int iz = (unsigned int)((p[2] + 2.f) / L);
        return (unsigned int)((ix * 73856093) ^ (iy * 19349663) ^ (iz * 83492791)) % HASH_SIZE;
    }

    int main(int argc, char** argv) {
        std::default_random_engine generator;
        std::uniform_real_distribution<double> distribution(-1.0, 1.0);

        for (size_t i = 0; i < 300; i++) {
            double x = distribution(generator);
            double y = distribution(generator);
            double z = distribution(generator);
            Vector3d v(x, y, z);
            std::cout << hasha(v) << " " << v[0] << " " << v[1] << " " << v[2] << std::endl;
            grid[hasha(v)].push_back(v);
            std::cout << "size " << grid[hasha(v)].size() << std::endl;
        }

        // For every bucket, print the pairwise distances of the points it holds,
        // to eyeball how close the points sharing a bucket are.
        for (const auto & [key, value] : grid) {
            cout << key << std::endl;
            for (size_t i = 0; i < value.size(); i++) {
                for (size_t j = 0; j < value.size(); j++) {
                    if (i != j) {
                        std::cout << " dist " << (value[i] - value[j]).norm() << std::endl;
                    }
                }
            }
        }
        return 0;
    }

2 Comments

Hmm. These results seem wrong jsfiddle.net/uwg0ehs6
Or jsfiddle.net/spowz758/2 , very not correct
