For an example of string literal types, see this TypeScript documentation: https://www.typescriptlang.org/docs/handbook/2/everyday-types.html#literal-types

For an example use case, I'd like to be able to create a library for performing operations on tabular data, where the table's columns are named and have heterogeneous types.

Supposing I have a table like this:

| name    | age | quiz1 | quiz2 | midterm | quiz3 | quiz4 | final |
| ------- | --- | ----- | ----- | ------- | ----- | ----- | ----- |
| "Bob"   | 12  | 8     | 9     | 77      | 7     | 9     | 87    |
| "Alice" | 17  | 6     | 8     | 88      | 8     | 7     | 85    |
| "Eve"   | 13  | 7     | 9     | 84      | 8     | 8     | 77    |

I'd like to be able to get compile time guarantees like this:

    let result1: Vec<String> = table.get_column("name"); // ["Bob", "Alice", "Eve"]
    let result2: Vec<usize> = table.get_column("age"); // [12, 17, 13]

I know I can represent a generic Table of heterogeneous types using something like Table<(String, usize, usize, ...)> but it's not clear how I would embed the column names, while also allowing the type to be able to change dynamically in the case of operations like:

    table.add_column("quiz5", column_values); // returns a new Table that is typed with one more column

Is there a way to do this in Rust?

One idea I had was to use macros. Are macros capable of producing type errors, so that perhaps...

    let result = get_column!(table, "column_name");

...gets typed as Vec<String> or Vec<usize> (or as an error) depending on which string literal is passed?

  • You may want to try the polars crate. Commented Apr 10, 2022 at 6:53
  • Reading the documentation of Literal Types in TS, I would say that the closest things are enums. Maybe you should have an enum enumerating all the possible column keys. Commented Apr 10, 2022 at 8:37

2 Answers

You can achieve the guarantees you're looking for, but not via string literals.

To get compile-time checks that the column type matches the column that you're asking for, you'd need to create marker types (the name of the column) with associated types (the type of the data). Something like this is feasible:

    use std::any::Any;
    use std::collections::HashMap;

    trait Column {
        type Data: 'static;
        const NAME: &'static str;
    }

    struct Name;

    impl Column for Name {
        type Data = String;
        const NAME: &'static str = "name";
    }

    struct Age;

    impl Column for Age {
        type Data = usize;
        const NAME: &'static str = "age";
    }

    struct Table {
        data: HashMap<&'static str, Box<dyn Any>>,
    }

    impl Table {
        fn new() -> Table {
            Table { data: HashMap::new() }
        }

        fn set_column<C: Column>(&mut self, data: Vec<C::Data>) {
            self.data.insert(C::NAME, Box::new(data));
        }

        fn get_column<C: Column>(&self) -> &Vec<C::Data> {
            self.data
                .get(C::NAME)
                .and_then(|data| data.downcast_ref::<Vec<C::Data>>())
                .expect("table does not have that column")
        }
    }

    fn main() {
        let mut table = Table::new();
        table.set_column::<Name>(vec!["Bob".to_owned(), "Alice".to_owned()]);
        table.set_column::<Age>(vec![12, 17]);

        dbg!(table.get_column::<Name>());
        dbg!(table.get_column::<Age>());
    }

Output:

    [src/main.rs:50] table.get_column::<Name>() = [
        "Bob",
        "Alice",
    ]
    [src/main.rs:51] table.get_column::<Age>() = [
        12,
        17,
    ]

One flaw with this implementation is that it doesn't guarantee at compile time that the Table actually contains the column you're looking for. For that, you need to encode the column types into the table type like you suggested: Table<(Name, Age, ...)>. The type also needs to support compile-time lookup (does (Name, Age, ...) contain Age?) and extension ((Name,) + Age => (Name, Age)). Doing this by hand takes a daunting amount of generic type juggling, but there are crates that provide this kind of functionality.
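To give a feel for what "compile-time lookup" and "extending the type" involve, here is a minimal hand-rolled sketch of a type-level list. It is only an illustration under my own assumptions: the names `Cons`, `Nil`, `Here`, `There`, and `Lookup` are made up for this example and are not taken from any particular crate, and a real library would offer far more than this.

```rust
use std::marker::PhantomData;

// A column marker: a zero-sized type that fixes the column's data type.
trait Column {
    type Data;
}

struct Name;
impl Column for Name {
    type Data = String;
}

struct Age;
impl Column for Age {
    type Data = usize;
}

// Type-level list: `Nil` is empty, `Cons<C, Rest>` stores column `C` in front of `Rest`.
struct Nil;
struct Cons<C: Column, Rest> {
    head: Vec<C::Data>,
    tail: Rest,
}

// Type-level positions. They keep the two `Lookup` impls below from overlapping.
struct Here;
struct There<I>(PhantomData<I>);

// "This list contains column `C` at position `I`."
trait Lookup<C: Column, I> {
    fn lookup(&self) -> &Vec<C::Data>;
}

// Base case: the column is at the head of the list.
impl<C: Column, Rest> Lookup<C, Here> for Cons<C, Rest> {
    fn lookup(&self) -> &Vec<C::Data> {
        &self.head
    }
}

// Recursive case: the column is somewhere in the tail.
impl<C: Column, H: Column, Rest, I> Lookup<C, There<I>> for Cons<H, Rest>
where
    Rest: Lookup<C, I>,
{
    fn lookup(&self) -> &Vec<C::Data> {
        <Rest as Lookup<C, I>>::lookup(&self.tail)
    }
}

struct Table<Columns> {
    columns: Columns,
}

impl Table<Nil> {
    fn new() -> Self {
        Table { columns: Nil }
    }
}

impl<Columns> Table<Columns> {
    // Consumes the table and returns one whose type has one more column.
    fn add_column<C: Column>(self, data: Vec<C::Data>) -> Table<Cons<C, Columns>> {
        Table { columns: Cons { head: data, tail: self.columns } }
    }

    // `I` is inferred by the compiler. If the column is absent, no impl applies
    // and the call is rejected at compile time.
    fn get_column<C: Column, I>(&self) -> &Vec<C::Data>
    where
        Columns: Lookup<C, I>,
    {
        <Columns as Lookup<C, I>>::lookup(&self.columns)
    }
}

fn main() {
    let table = Table::new()
        .add_column::<Name>(vec!["Bob".to_owned(), "Alice".to_owned()])
        .add_column::<Age>(vec![12, 17]);

    let names: &Vec<String> = table.get_column::<Name, _>();
    let ages: &Vec<usize> = table.get_column::<Age, _>();
    println!("{:?} {:?}", names, ages);
}
```

In this sketch, adding the same marker type twice makes the position ambiguous, so lookups of that column stop compiling. That is one of the rough edges a dedicated crate would be expected to handle more gracefully.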

Here's a working example using lhlist (not necessarily advocating for it, it was just a crate I found that works well enough for demonstration purposes). It has a similar API to what we have above and not only has the expressiveness we need, but also allows us to associate data with the individual column types:

    #[macro_use]
    extern crate lhlist;

    use lhlist::{Label, LVCons, LookupElemByLabel, LabeledValue, Value, Nil};

    new_label!(Name: Vec<String>);
    new_label!(Age: Vec<usize>);
    new_label!(Grade: Vec<usize>);

    struct Table<Columns> {
        columns: Columns
    }

    impl Table<Nil> {
        fn new() -> Table<Nil> {
            Table { columns: Nil::default() }
        }
    }

    impl<Columns> Table<Columns> {
        fn add_column<C>(self, data: C::AssocType) -> Table<LVCons<C, Columns>>
        where
            C: Label + 'static
        {
            Table { columns: lhlist::cons(lhlist::labeled_typearg::<C>(data), self.columns) }
        }

        fn get_column<C>(&self) -> &C::AssocType
        where
            C: Label + 'static,
            Columns: LookupElemByLabel<C, Elem = LabeledValue<C>>,
        {
            self.columns.elem().value_ref()
        }
    }

    fn main() {
        let table = Table::new();
        let table = table.add_column::<Name>(vec!["Bob".to_owned(), "Alice".to_owned()]);
        let table = table.add_column::<Age>(vec![12, 17]);

        dbg!(table.get_column::<Name>());
        dbg!(table.get_column::<Age>());
        // dbg!(table.get_column::<Grade>()); // compile-time error
    }

Output:

    [src\main.rs:42] table.get_column::<Name>() = [
        "Bob",
        "Alice",
    ]
    [src\main.rs:43] table.get_column::<Age>() = [
        12,
        17,
    ]

This could probably be made more ergonomic in a few regards, but I hope it shows how this could be done. Rust obviously does not have string literal types (I don't think anything can match the type flexibility that TypeScript has), but it's not too much of a stretch to use more traditional struct types to achieve your goal.
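On the macro idea from the question: a `macro_rules!` macro can match on specific string literal tokens, so it can translate a literal into one of the marker types. The following is a rough sketch of that, assuming the `HashMap`-based `Table` from the first example. The `get_column!` macro and its fallback arm are hypothetical names for illustration, and the set of known columns is fixed where the macro is defined.

```rust
use std::any::Any;
use std::collections::HashMap;

trait Column {
    type Data: 'static;
    const NAME: &'static str;
}

struct Name;
impl Column for Name {
    type Data = String;
    const NAME: &'static str = "name";
}

struct Age;
impl Column for Age {
    type Data = usize;
    const NAME: &'static str = "age";
}

struct Table {
    data: HashMap<&'static str, Box<dyn Any>>,
}

impl Table {
    fn new() -> Table {
        Table { data: HashMap::new() }
    }

    fn set_column<C: Column>(&mut self, data: Vec<C::Data>) {
        self.data.insert(C::NAME, Box::new(data));
    }

    fn get_column<C: Column>(&self) -> &Vec<C::Data> {
        self.data
            .get(C::NAME)
            .and_then(|data| data.downcast_ref::<Vec<C::Data>>())
            .expect("table does not have that column")
    }
}

// Hypothetical macro: each arm matches one exact string literal token and
// expands to a lookup with the corresponding marker type.
macro_rules! get_column {
    ($table:expr, "name") => { $table.get_column::<Name>() };
    ($table:expr, "age") => { $table.get_column::<Age>() };
    // Any other literal is rejected while the macro is being expanded.
    ($table:expr, $other:literal) => { compile_error!("unknown column name") };
}

fn main() {
    let mut table = Table::new();
    table.set_column::<Name>(vec!["Bob".to_owned(), "Alice".to_owned()]);
    table.set_column::<Age>(vec![12, 17]);

    // The result type depends on which literal was passed.
    let names: &Vec<String> = get_column!(table, "name");
    let ages: &Vec<usize> = get_column!(table, "age");
    println!("{:?} {:?}", names, ages);

    // let grades = get_column!(table, "grade"); // would not compile
}
```

The literal here is just another spelling of the marker type, so the macro adds convenience rather than new guarantees. With this `Table`, a known column that was never set still panics at runtime.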

You can use downcasting to get the same behavior. Your requirements do not seem possible without it, because the compiler cannot know what will happen at runtime:

    use std::any::Any;
    use std::collections::HashMap;
    use std::ops::{Deref, DerefMut, Index, IndexMut};

    pub struct Table {
        columns: HashMap<String, Box<dyn AnyColumn>>,
    }

    impl Table {
        pub fn new() -> Self {
            Self {
                columns: HashMap::new(),
            }
        }

        // Returns the previous column stored under the same name, if any.
        pub fn add_column(&mut self, column: impl AnyColumn) -> Option<Box<dyn AnyColumn>> {
            self.columns.insert(column.name().to_string(), Box::new(column))
        }

        // Returns None if the column is missing or is not of type `T`.
        // Note: casting `&dyn AnyColumn` to `&dyn Any` is trait upcasting,
        // which is stable since Rust 1.86.
        pub fn get_column<E, T: Column<E>>(&self, name: &str) -> Option<&T> {
            self.columns.get(name).and_then(|c| (c.deref() as &dyn Any).downcast_ref::<T>())
        }

        pub fn get_column_mut<E, T: Column<E>>(&mut self, name: &str) -> Option<&mut T> {
            self.columns.get_mut(name).and_then(|c| (c.deref_mut() as &mut dyn Any).downcast_mut::<T>())
        }
    }

    pub struct VecColumn<T> {
        name: String,
        data: Vec<T>,
    }

    impl<T: 'static> AnyColumn for VecColumn<T> {
        fn name(&self) -> &str {
            &self.name
        }
    }

    impl<T: 'static> Column<T> for VecColumn<T> {}

    impl<T> Index<usize> for VecColumn<T> {
        type Output = T;

        fn index(&self, index: usize) -> &Self::Output {
            &self.data[index]
        }
    }

    impl<T> IndexMut<usize> for VecColumn<T> {
        fn index_mut(&mut self, index: usize) -> &mut Self::Output {
            &mut self.data[index]
        }
    }

    // Type-erased column: anything with a name that can be downcast via `Any`.
    pub trait AnyColumn: Any {
        fn name(&self) -> &str;
    }

    // Typed column: indexable storage of `T` values.
    pub trait Column<T>: IndexMut<usize, Output = T> + AnyColumn {}
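To show what the runtime check looks like from the caller's side, here is a pared-down sketch of the same downcasting idea. It is a simplification rather than the code above: `DynTable` is a hypothetical type that stores each column as a type-erased `Vec<T>`, so a wrong name or a wrong element type shows up as `None` at runtime instead of a compile error.

```rust
use std::any::Any;
use std::collections::HashMap;

// Hypothetical, simplified table: each column is a type-erased Vec<T>.
struct DynTable {
    columns: HashMap<String, Box<dyn Any>>,
}

impl DynTable {
    fn new() -> Self {
        DynTable { columns: HashMap::new() }
    }

    fn add_column<T: 'static>(&mut self, name: &str, data: Vec<T>) {
        self.columns.insert(name.to_string(), Box::new(data));
    }

    // None if the name is unknown or the stored column is not a Vec<T>.
    fn get_column<T: 'static>(&self, name: &str) -> Option<&Vec<T>> {
        self.columns
            .get(name)
            .and_then(|column| column.downcast_ref::<Vec<T>>())
    }
}

fn main() {
    let mut table = DynTable::new();
    table.add_column("name", vec!["Bob".to_string(), "Alice".to_string()]);
    table.add_column("age", vec![12usize, 17]);

    // Correct name and element type: the column is returned.
    assert_eq!(table.get_column::<usize>("age"), Some(&vec![12, 17]));

    // Wrong element type: compiles fine, but yields None at runtime.
    assert!(table.get_column::<String>("age").is_none());

    // Unknown column name: also None at runtime.
    assert!(table.get_column::<usize>("quiz5").is_none());
}
```

The caller has to name the expected type at every lookup, and a mismatch is only discovered when the code runs. That is the trade-off compared with the marker-type approach.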

Comments

The OP wants compile time safety.
OP is really contradicting himself as he wants columns to be added dynamically but also a compile-time error when the column does not exist. When you know your columns at compile time you should write a struct of vectors. What's the point of converting get!(table, "column") to table.column?
Indeed. Then "it is not possible" is a good answer. But "you can use reflection" - no, you can't (and by the way, Rust does not have reflection in the usual sense).
Good suggestion, I changed the description of the answer. What is the dynamic type inference called then in Rust?
Downcasting? Reflection is usually about retrieving type metadata at runtime, like field names/types, methods, code generation, etc.