2

I know how to iterate over a HashMap in Rust, however, I am a little confused about how this works in memory. How do we iterate over values that are not stored sequentially in memory? A detailed explanation of the code below at the heap and stack level would be much appreciated.

use std::collections::HashMap; let name = vec![String::from("Charlie"), String::from("Winston"), String::from("Brian"), String::from("Jack")]; let age = vec![50, 5, 7, 21]; let mut people_ages: HashMap<String, i32> = name.into_iter().zip(age.into_iter()).collect(); for (key, value) in &people_ages { println!("{}: {}", key, value); } 

1 Answer 1

2

At the end of the intro of the documentation, it is mentioned that the implementation relies on a C++ implementation of SwissTables. This page contains illustrations about two variants: « flat » and « node » based.

The main difference between these two variants is pointer stability. In the « node » based version, the key-value pairs, once inserted, keep their address in memory even if the hash is reorganised. In the « flat » version, some insertions/removals can make the previous key-value pairs be moved in memory.

When it comes to the Rust implementation, I am not experienced enough to be certain of any specific detail, but I tried this simple example based on yours.

use std::collections::HashMap; fn main() { let name = vec![ String::from("Charlie"), String::from("Winston"), String::from("Brian"), String::from("Jack"), ]; let age = vec![50, 5, 7, 21]; let mut people_ages: HashMap<String, i32> = name.into_iter().zip(age.into_iter()).collect(); let mut keys = Vec::new(); let mut values = Vec::new(); for (key, value) in &people_ages { keys.push(key); values.push(value); let key_addr = key as *const String as usize; let value_addr = value as *const i32 as usize; println!("{:x} {:x} {}: {}", key_addr, value_addr, key, value); } // people_ages.insert("Bob".to_owned(), 4); // mutable and immutable borrow println!("keys: {:?}", keys); println!("values: {:?}", values); } /* 55e08ff8bd40 55e08ff8bd58 Brian: 7 55e08ff8bd20 55e08ff8bd38 Charlie: 50 55e08ff8bd00 55e08ff8bd18 Winston: 5 55e08ff8bce0 55e08ff8bcf8 Jack: 21 keys: ["Brian", "Charlie", "Winston", "Jack"] values: [7, 50, 5, 21] */ 

The commented out line (insertion) is rejected because we cannot alter the hashmap while keeping references to its content. Thus, I guess (I'm not certain) that the implementation does not rely on the « node » based variant since we cannot take benefit of the pointer stability it provides (due to the ownership model in Rust), and probably it relies on the « flat » variant.

This means that we can expect that the key-value pairs associated with the same hash are tightly packed in memory, and iterating over them should be very similar to iterating over a vector: regular progression (with some skips however) very friendly with cache prefetch. Printing the addresses tends to confirm that guess (however the test is not complete enough), and shows a backward progression.

Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.