A ScyllaDB Community Designing an Energy-efficient Architecture for Geo Databases Yichen Wei Engineer Manager
Yichen Wei Engineer Manager ■ HA Distributed Platform domains with machine learning and GenAI ■ Evangelist of HPC, HTAP, Rust and FP/CT ■ WASM/WASI, FaaS
Geo service
Challenge ■ Critical Service ● Low latency ● High availability ● Backward and forward compatibility ■ Cost ● Data size ● Query performance
Geo service workflow
Why not existing local databases? Why not use SQLite, DuckDB, or RocksDB? ■ Different use cases ● Daily/weekly full-size dump ● No update operations ■ Performance and cost concerns ● Read/write amplification and memory footprint ● Long-tail lookups
Solution
Geo service workflow
Optimization 1: Loading
Optimization 2: Parsing ■ Encoding ● Dictionary encoding - Country, Region, City, Metro, Zip… ● Delta encoding - longitude and latitude
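The two encodings named on this slide can be sketched in Go as follows. This is a minimal illustration, not the speaker's implementation: the `Dict` type, the `uint16` code width, and the fixed-point coordinate representation (degrees multiplied by 10000) are all assumptions made for the example.

```go
package main

import "fmt"

// Dict is a hypothetical dictionary encoder: each distinct string
// (country, region, city, ...) is stored once and replaced by a small code.
type Dict struct {
	codes  map[string]uint16
	values []string
}

func NewDict() *Dict { return &Dict{codes: make(map[string]uint16)} }

// Encode returns the existing code for s, or assigns the next free one.
func (d *Dict) Encode(s string) uint16 {
	if c, ok := d.codes[s]; ok {
		return c
	}
	c := uint16(len(d.values))
	d.codes[s] = c
	d.values = append(d.values, s)
	return c
}

// Decode returns the string that a code stands for.
func (d *Dict) Decode(c uint16) string { return d.values[c] }

// DeltaEncode keeps the first value and then only successive differences,
// which are small when neighbouring records are geographically close.
func DeltaEncode(vals []int32) []int32 {
	out := make([]int32, len(vals))
	var prev int32
	for i, v := range vals {
		out[i] = v - prev
		prev = v
	}
	return out
}

// DeltaDecode reverses DeltaEncode with a running sum.
func DeltaDecode(deltas []int32) []int32 {
	out := make([]int32, len(deltas))
	var acc int32
	for i, d := range deltas {
		acc += d
		out[i] = acc
	}
	return out
}

func main() {
	d := NewDict()
	for _, c := range []string{"US", "CA", "US", "US"} {
		fmt.Print(d.Encode(c), " ")
	}
	fmt.Println()

	// Latitudes as fixed-point integers (degrees * 10000), an assumed format.
	lats := []int32{407128, 407130, 407125}
	fmt.Println(DeltaEncode(lats))
}
```

Repeated strings collapse to two-byte codes, and the deltas after the first coordinate are small enough that a later stage could store them in fewer bits.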
Optimization 2: Parsing ■ Memory layout and alignment
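To illustrate why memory layout and alignment matter, the sketch below compares two Go structs holding the same fields in different orders. The record shape (`Start`, `End`, `City`, `Country`, `Region`) is hypothetical and chosen only for the example; the slides do not show the actual record format.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Loose interleaves 1-byte and 4-byte fields, so the compiler must insert
// padding before each uint32 to keep it 4-byte aligned.
type Loose struct {
	Country uint8
	Start   uint32
	Region  uint8
	End     uint32
	City    uint16
}

// Packed holds the same fields ordered from largest to smallest alignment,
// which leaves no padding between them.
type Packed struct {
	Start   uint32
	End     uint32
	City    uint16
	Country uint8
	Region  uint8
}

// LooseSize and PackedSize report the in-memory size of one record.
func LooseSize() uintptr  { return unsafe.Sizeof(Loose{}) }
func PackedSize() uintptr { return unsafe.Sizeof(Packed{}) }

func main() {
	fmt.Println("loose:", LooseSize(), "bytes, packed:", PackedSize(), "bytes")
}
```

With millions of records, the bytes saved per record translate directly into a smaller memory footprint and more records per cache line.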
Optimization 3: Lookup
Optimization 3: Lookup Chunk Size ■ Graviton instance ● L1: 64 KB ● L2: 1 MB
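One way to make use of the cache sizes above is a two-level search: a small array of chunk heads selects a chunk, and the second search touches only that chunk. The sketch below is an assumption about how such a lookup could be organized, not the design from the talk; `Index`, `chunkEntries`, and the 16 KiB chunk size are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// chunkEntries is a hypothetical chunk size: 4096 uint32 keys = 16 KiB,
// picked to stay well inside a 64 KB L1 data cache.
const chunkEntries = 4096

// Index holds sorted range-start keys plus the first key of every chunk.
type Index struct {
	keys  []uint32 // sorted ascending
	heads []uint32 // heads[i] == keys[i*chunkEntries]
}

// NewIndex builds the chunk-head array over an already sorted key slice.
func NewIndex(sorted []uint32) *Index {
	idx := &Index{keys: sorted}
	for i := 0; i < len(sorted); i += chunkEntries {
		idx.heads = append(idx.heads, sorted[i])
	}
	return idx
}

// Find returns the position of the last key <= ip, or -1 if ip is smaller
// than every key.
func (x *Index) Find(ip uint32) int {
	// Step 1: pick the chunk using the small heads array.
	c := sort.Search(len(x.heads), func(i int) bool { return x.heads[i] > ip }) - 1
	if c < 0 {
		return -1
	}
	// Step 2: search only inside that chunk.
	lo := c * chunkEntries
	hi := lo + chunkEntries
	if hi > len(x.keys) {
		hi = len(x.keys)
	}
	chunk := x.keys[lo:hi]
	j := sort.Search(len(chunk), func(i int) bool { return chunk[i] > ip }) - 1
	return lo + j
}

func main() {
	keys := make([]uint32, 10000)
	for i := range keys {
		keys[i] = uint32(i+1) * 10 // synthetic sorted keys: 10, 20, 30, ...
	}
	x := NewIndex(keys)
	fmt.Println(x.Find(5), x.Find(25), x.Find(40970))
}
```

The right chunk size depends on the record width and on how much of the cache other data needs, so it is worth measuring rather than deriving from the cache size alone.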
Optimization 3: Lookup
Optimization 3: Lookup SIMD
Optimization 3: Lookup SIMD ■ Truth table ● 16-byte vector - 4 IPv4 values

IP1  IP2  IP3  IP4
T    T    T    T
T    T    T    F
T    T    F    F
T    F    F    F
F    F    F    F
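The truth table lists the only five outcomes a 4-lane compare can produce when the four IPv4 values are sorted. A scalar Go emulation of that idea is sketched below. It assumes that T means "lane value <= target", which the slide does not state, and `CountLE` is a hypothetical helper; a real SIMD version would do the four compares in a single instruction.

```go
package main

import "fmt"

// CountLE emulates one 4-lane compare in scalar code. The lanes hold four
// sorted IPv4 values (16 bytes in total). Because they are sorted, the
// per-lane results always form a run of T followed by a run of F, so the
// number of T lanes identifies where target falls inside the group.
func CountLE(lanes [4]uint32, target uint32) int {
	n := 0
	for _, v := range lanes {
		if v <= target { // assumed meaning of "T" in the truth table
			n++
		}
	}
	return n
}

func main() {
	lanes := [4]uint32{10, 20, 30, 40}
	for _, t := range []uint32{5, 10, 25, 35, 100} {
		fmt.Println(t, "->", CountLE(lanes, t))
	}
}
```

A count of 4 corresponds to the T T T T row and a count of 0 to the F F F F row; the rows in between map to counts 3, 2, and 1.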
Optimization 3: Lookup SIMD ■ Does Golang have SIMD support? ● Yes and No
Optimization 3: Lookup SIMD ■ Does Golang have SIMD support?
Optimization 3: Lookup SIMD ■ Cgo is not Go!
Optimization 3: Lookup SIMD ■ https://github.com/alivanz/go-simd ■ Uses the `linkname` directive to call C code, bypassing cgo
Optimization 3: Lookup SIMD
Optimization 3: Lookup - One More Thing "Premature optimization is the root of all evil"
Optimization 3: Lookup - One More Thing
Optimization 3: Lookup - One More Thing Understand the compiler first, before optimizing!
Looking Forward ■ Compression & encoding ■ Histogram Distribution ■ Data prefetching ■ Change Language
Takeaways ■ No one knows your data better than you ■ Don’t be afraid to build the infrastructure for your data ■ Optimization is fun ● … and hard ■ Golang doesn’t support SIMD well
Thank you! Let’s connect. Yichen Wei yichen.wei@disney.com
