LiquidCache

LiquidCache understands both your data and your query.

  • It transcodes storage data into an optimized, cache-only format, so you can keep using your favorite formats without worrying about performance.
  • It keeps the data that matters in memory and uses modern SSDs efficiently. For example, if your query groups by year, LiquidCache stores only the year in memory and keeps the full timestamp on disk.
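The year example above can be sketched with plain Rust. This is an illustrative, std-only sketch of the idea (caching only the derived column a query needs, while the full data stays on disk); the names `year_of`, `on_disk_timestamps`, and `in_memory` are hypothetical and not part of the LiquidCache API:

```rust
use std::collections::HashMap;

/// Extract the calendar year from Unix epoch seconds
/// (Howard Hinnant's civil-from-days algorithm).
fn year_of(epoch_secs: i64) -> i64 {
    let days = epoch_secs.div_euclid(86_400);
    let z = days + 719_468;
    let era = z.div_euclid(146_097);
    let doe = z - era * 146_097;
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
    let mp = (5 * doy + 2) / 153;
    let m = if mp < 10 { mp + 3 } else { mp - 9 };
    yoe + era * 400 + if m <= 2 { 1 } else { 0 }
}

fn main() {
    // Full 8-byte timestamps live "on disk" (a Vec stands in here).
    let on_disk_timestamps: Vec<i64> = vec![0, 951_782_400, 1_600_000_000];

    // Only the small derived column the query groups by is cached in
    // memory: 2 bytes per row instead of 8.
    let mut in_memory: HashMap<&str, Vec<i16>> = HashMap::new();
    in_memory.insert(
        "year",
        on_disk_timestamps.iter().map(|&t| year_of(t) as i16).collect(),
    );

    assert_eq!(in_memory["year"], vec![1970, 2000, 2020]);
}
```

If a later query needs the full timestamps, they are still available on disk; only the hot, query-relevant projection occupies memory.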

LiquidCache is a research project funded by InfluxData, SpiralDB, and Bauplan.

You may want to consider Foyer if you're looking for a black-box cache: it is easier to set up, but not as "smart" as LiquidCache.

Quick start

This quick start uses the core cache API from src/core. Add these dependencies to your project: liquid-cache, arrow, and datafusion. The example below shows insert, get, get with selection, and get with predicate pushdown.

```rust
use arrow::array::{BooleanArray, UInt64Array};
use arrow::buffer::BooleanBuffer;
use datafusion::logical_expr::Operator;
use datafusion::physical_plan::PhysicalExpr;
use datafusion::physical_plan::expressions::{BinaryExpr, Column, Literal};
use datafusion::scalar::ScalarValue;
use liquid_cache::cache::{EntryID, LiquidCacheBuilder};
use std::sync::Arc;

tokio_test::block_on(async {
    let cache = LiquidCacheBuilder::new().build().await;
    let entry_id = EntryID::from(1);
    let values = Arc::new(UInt64Array::from(vec![10, 11, 12, 13, 14, 15]));

    // 1) insert
    cache.insert(entry_id, values.clone()).await;

    // 2) get
    let all_rows = cache.get(&entry_id).await.expect("entry should exist");

    // 3) get filtered (selection pushdown): keep rows 0, 2, 4
    let selection = BooleanBuffer::from(vec![true, false, true, false, true, false]);
    let selected_rows = cache
        .get(&entry_id)
        .with_selection(&selection)
        .await
        .expect("entry should exist");

    // 4) get with predicate pushdown: col > 12
    let predicate: Arc<dyn PhysicalExpr> = Arc::new(BinaryExpr::new(
        Arc::new(Column::new("col", 0)),
        Operator::Gt,
        Arc::new(Literal::new(ScalarValue::UInt64(Some(12)))),
    ));
    let predicate_mask = cache
        .eval_predicate(&entry_id, &predicate)
        .await
        .expect("entry should exist")
        .expect("predicate should be evaluated in cache");

    // Conceptual expectations:
    assert_eq!(all_rows.as_ref(), values.as_ref()); // [10, 11, 12, 13, 14, 15]
    assert_eq!(selected_rows.as_ref(), &UInt64Array::from(vec![10, 12, 14]));
    assert_eq!(
        predicate_mask,
        BooleanArray::from(vec![
            Some(false),
            Some(false),
            Some(false),
            Some(true),
            Some(true),
            Some(true),
        ]),
    );
});
```
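The two pushdown modes can be mimicked with plain vectors to make their semantics concrete. This is an illustrative, std-only sketch (the helper names `apply_selection` and `eval_gt` are hypothetical, not LiquidCache API): a selection mask materializes only the chosen rows, while a predicate evaluates inside the cache and returns a boolean mask rather than rows.

```rust
/// Selection pushdown: materialize only the rows whose mask bit is true.
fn apply_selection(values: &[u64], mask: &[bool]) -> Vec<u64> {
    values
        .iter()
        .zip(mask)
        .filter_map(|(&v, &keep)| keep.then_some(v))
        .collect()
}

/// Predicate pushdown: evaluate `v > threshold` per row, returning a mask.
fn eval_gt(values: &[u64], threshold: u64) -> Vec<bool> {
    values.iter().map(|&v| v > threshold).collect()
}

fn main() {
    let values = [10, 11, 12, 13, 14, 15];

    // Keep rows 0, 2, 4 — mirrors step 3 of the quick start.
    let selected = apply_selection(&values, &[true, false, true, false, true, false]);
    assert_eq!(selected, vec![10, 12, 14]);

    // col > 12 — mirrors step 4; note the result is a mask, not rows.
    let mask = eval_gt(&values, 12);
    assert_eq!(mask, vec![false, false, false, true, true, true]);
}
```

Returning a mask from predicate evaluation lets the caller combine several predicates before materializing any rows.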

Development

See dev/README.md

Benchmark

See benchmark/README.md

Performance troubleshooting

Use LiquidCache with DataFusion

LiquidCache requires a few non-default DataFusion configurations:

ListingTable:

```rust
let (ctx, _) = LiquidCacheLocalBuilder::new().build(config).await?;
let listing_options = ParquetReadOptions::default()
    .to_listing_options(&ctx.copied_config(), ctx.copied_table_options());
ctx.register_listing_table("default", &table_path, listing_options, None, None)
    .await?;
```

Or register Parquet directly:

```rust
let (ctx, _) = LiquidCacheLocalBuilder::new().build(config).await?;
ctx.register_parquet("default", "examples/nano_hits.parquet", Default::default())
    .await?;
```

Disable background transcoding

For performance testing, disable background transcoding:

```rust
let (ctx, _) = LiquidCacheLocalBuilder::new()
    .with_squeeze_policy(Box::new(squeeze_policies::Evict))
    .build(config)
    .await?;
```

x86-64 optimization

LiquidCache is optimized for x86-64 using architecture-specific instructions. On ARM (e.g., Apple Silicon), portable fallback implementations are used instead. Contributions are welcome.
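Such dispatch typically looks like the sketch below: check for a CPU feature at runtime on x86-64 and otherwise take a portable path. This is illustrative only; `sum_kernel` is a hypothetical function, not one of LiquidCache's actual kernels, and the portable loop stands in for real AVX2 intrinsics so the sketch stays self-contained.

```rust
/// Pick a kernel at runtime: prefer the x86-64 SIMD path when AVX2 is
/// available, otherwise fall back to a portable implementation.
fn sum_kernel(data: &[u64]) -> (u64, &'static str) {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // A real kernel would use AVX2 intrinsics here.
            return (data.iter().sum(), "x86-64 (avx2)");
        }
    }
    (data.iter().sum(), "portable fallback")
}

fn main() {
    let (total, path) = sum_kernel(&[1, 2, 3, 4]);
    assert_eq!(total, 10);
    println!("summed via {path}");
}
```

Both paths must produce identical results; the feature check only selects how fast they get there.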

FAQ

Can I use LiquidCache in production today?

Not yet. Production readiness is our goal, but we are still implementing features and polishing the system. LiquidCache began as a research project exploring new approaches to cost-effective caching. Like most research projects, it takes time to mature—we welcome your help.

How does LiquidCache work?

See our paper for details. We are also working on a technical blog to introduce LiquidCache in a more accessible way.

How can I get involved?

We are always looking for contributors. Feedback and improvements are welcome—explore the issue list and contribute where you can. If you want to get involved in the research side, reach out.

Who is behind LiquidCache?

LiquidCache is a research project funded by InfluxData, SpiralDB, and Bauplan.

LiquidCache is and will remain open source and free to use.

Your support for science is greatly appreciated!

License

Apache License 2.0
