Skip to content

Move the math expression parser from Pest to Chumsky and add more features#2685

Open
urisinger wants to merge 18 commits intoGraphiteEditor:masterfrom
urisinger:Math
Open

Move the math expression parser from Pest to Chumsky and add more features#2685
urisinger wants to merge 18 commits intoGraphiteEditor:masterfrom
urisinger:Math

Conversation

@urisinger
Copy link
Contributor

Part of #2026

Switches to the chumsky parser for performance, remove Unit support until #586 is resolved.

Improves parser performance by about 40%, while also reducing compile times and improving readability.

@Keavon Keavon changed the title Add if statments to math parser and switch to chumsky parser Add if statements to math parser and switch to Chumsky parser Jun 5, 2025
@Keavon Keavon changed the title Add if statements to math parser and switch to Chumsky parser Move the math expression parser from Pest to Chumsky and add more features Jun 9, 2025
@Keavon Keavon marked this pull request as draft June 14, 2025 22:14
@Keavon Keavon force-pushed the master branch 2 times, most recently from 34dced3 to 5cacab2 Compare June 20, 2025 06:47
@Keavon
Copy link
Member

Keavon commented Jun 20, 2025

I have rebased this branch to remove merge commits, so you'll need to git reset --hard to its new ref here on origin.

@Keavon Keavon force-pushed the master branch 2 times, most recently from 316a737 to 532e913 Compare June 28, 2025 13:25
@Keavon Keavon force-pushed the master branch 4 times, most recently from ec51271 to e025103 Compare July 10, 2025 05:48
@RegenJacob
Copy link

How far is this?

I might be interested in implementing some math expressions from #2026
I figured this PR would be a blocking factor before I could do that.

@urisinger
Copy link
Contributor Author

How far is this?

I might be interested in implementing some math expressions from #2026
I figured this PR would be a blocking factor before I could do that.

Its ready for review, its just not being merged becuase without variadics of some form this node is pretty useless, creating the custom error message for the node is also a difficult task.

@Keavon Keavon force-pushed the master branch 4 times, most recently from 3697b7d to 90533e6 Compare March 11, 2026 12:15
@Keavon Keavon marked this pull request as ready for review March 16, 2026 21:32
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

13 issues found across 14 files

Prompt for AI agents (unresolved issues)
 Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately. <file name="libraries/math-parser/src/executer.rs"> <violation number="1" location="libraries/math-parser/src/executer.rs:28"> P2: Binary-operator failures are misreported as function-call `TypeError`, conflating distinct error kinds and producing misleading diagnostics.</violation> <violation number="2" location="libraries/math-parser/src/executer.rs:47"> P2: Conditional truthiness treats `NaN` numeric results as true, allowing invalid conditions to execute the `if` branch.</violation> </file> <file name="frontend/wasm/src/editor_api.rs"> <violation number="1" location="frontend/wasm/src/editor_api.rs:931"> P2: Parser error diagnostics are dropped from logs before the API collapses failures to `None`, reducing debuggability of parse regressions.</violation> </file> <file name="libraries/math-parser/src/lexer.rs"> <violation number="1" location="libraries/math-parser/src/lexer.rs:173"> P2: Unchecked numeric accumulation and exponent casting can overflow for long literals/exponents, causing panics in debug builds or incorrect numeric values in release builds.</violation> </file> <file name="libraries/math-parser/benches/bench.rs"> <violation number="1" location="libraries/math-parser/benches/bench.rs:12"> P2: Parsing benchmark ignores parse errors, allowing parser regressions to be benchmarked silently as error cases.</violation> </file> <file name="libraries/math-parser/src/value.rs"> <violation number="1" location="libraries/math-parser/src/value.rs:170"> P1: Factorial uses an unbounded linear loop over `n`, enabling very large inputs to cause extreme CPU consumption even after `f64` result has already overflowed.</violation> </file> <file name="libraries/math-parser/src/lib.rs"> <violation number="1" location="libraries/math-parser/src/lib.rs:44"> P2: Leftover `dbg!` in shared end-to-end test helper causes unconditional debug output and avoidable per-test overhead.</violation> </file> <file name="libraries/math-parser/src/parser.rs"> <violation number="1" location="libraries/math-parser/src/parser.rs:146"> P1: Chained comparison logic is ineffective: `cmp` consumes the full chain first, so `chained_cmp` never multiplies pairwise comparisons as intended.</violation> </file> <file name="node-graph/gmath-nodes/src/lib.rs"> <violation number="1" location="node-graph/gmath-nodes/src/lib.rs:474"> P1: GCD/LCM nodes accept `i32` but `binary_gcd` assumes non‑negative values. For negative inputs, the shift/subtract loop can cycle forever (e.g., -2 and 4), hanging evaluation or producing wrong results. Normalize to absolute values or remove `i32` support before calling `binary_gcd`.</violation> </file> <file name="libraries/math-parser/src/diagnostic.rs"> <violation number="1" location="libraries/math-parser/src/diagnostic.rs:24"> P2: `CompileError::print()` panics on `term::emit` failures by unwrapping the `Result`, so diagnostic reporting can crash on ordinary I/O or file span errors. Propagate or handle the error instead of panicking in this library helper.</violation> </file> <file name="libraries/math-parser/src/constants.rs"> <violation number="1" location="libraries/math-parser/src/constants.rs:336"> P1: `root` uses `powf(1/n)` for real inputs, causing odd roots of negative reals (e.g. `root(-8, 3)`) to return `NaN` instead of the expected real value.</violation> <violation number="2" location="libraries/math-parser/src/constants.rs:366"> P2: Complex log2 implementation is incorrect: it divides the input by ln(2) instead of computing ln(z)/ln(2) (or `log2()`), yielding wrong results for all complex inputs.</violation> <violation number="3" location="libraries/math-parser/src/constants.rs:524"> P1: `gcd`/`lcm` use unchecked float→`i64` conversion and integer arithmetic, allowing non-finite/huge inputs to produce invalid results or overflow paths.</violation> </file> 

Since this is your first cubic review, here's how it works:

  • cubic automatically reviews your code and comments on bugs and improvements
  • Teach cubic by replying to its comments. cubic learns from your replies and gets better over time
  • Add one-off context when rerunning by tagging @cubic-dev-ai with guidance or docs links (including llms.txt)
  • Ask questions if you need clarification on any suggestion

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment on lines +170 to +172
for k in 1..=n {
acc *= k as f64;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Factorial uses an unbounded linear loop over n, enabling very large inputs to cause extreme CPU consumption even after f64 result has already overflowed.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At libraries/math-parser/src/value.rs, line 170: <comment>Factorial uses an unbounded linear loop over `n`, enabling very large inputs to cause extreme CPU consumption even after `f64` result has already overflowed.</comment> <file context> @@ -107,15 +155,37 @@ impl Number { +	} +	let n = truncated as u64; +	let mut acc = 1.0_f64; +	for k in 1..=n { +	acc *= k as f64; +	} </file context> 
Suggested change
for k in 1..=n {
acc *= k as f64;
}
if n > 170 {
return Number::Real(f64::INFINITY);
}
for k in 1..=n {
acc *= k as f64;
}

// Chain comparisons like `a < b < c` by multiplying the boolean
// (1.0 / 0.0) results, preserving the existing semantics.
let chained_cmp = cmp.clone().foldl(cmp.repeated(), |lhs, rhs| Node::BinOp {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Chained comparison logic is ineffective: cmp consumes the full chain first, so chained_cmp never multiplies pairwise comparisons as intended.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At libraries/math-parser/src/parser.rs, line 146: <comment>Chained comparison logic is ineffective: `cmp` consumes the full chain first, so `chained_cmp` never multiplies pairwise comparisons as intended.</comment> <file context> @@ -1,316 +1,174 @@ + +	// Chain comparisons like `a < b < c` by multiplying the boolean +	// (1.0 / 0.0) results, preserving the existing semantics. +	let chained_cmp = cmp.clone().foldl(cmp.repeated(), |lhs, rhs| Node::BinOp { +	lhs: Box::new(lhs), +	op: BinaryOp::Mul, </file context> 
fn greatest_common_divisor<T: num_traits::int::PrimInt + std::ops::ShrAssign<i32> + std::ops::SubAssign>(
_: impl Ctx,
/// One of the two numbers for which the GCD will be calculated.
#[implementations(u32, u64, i32)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: GCD/LCM nodes accept i32 but binary_gcd assumes non‑negative values. For negative inputs, the shift/subtract loop can cycle forever (e.g., -2 and 4), hanging evaluation or producing wrong results. Normalize to absolute values or remove i32 support before calling binary_gcd.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At node-graph/gmath-nodes/src/lib.rs, line 474: <comment>GCD/LCM nodes accept `i32` but `binary_gcd` assumes non‑negative values. For negative inputs, the shift/subtract loop can cycle forever (e.g., -2 and 4), hanging evaluation or producing wrong results. Normalize to absolute values or remove `i32` support before calling `binary_gcd`.</comment> <file context> @@ -0,0 +1,780 @@ +fn greatest_common_divisor<T: num_traits::int::PrimInt + std::ops::ShrAssign<i32> + std::ops::SubAssign>( +	_: impl Ctx, +	/// One of the two numbers for which the GCD will be calculated. +	#[implementations(u32, u64, i32)] +	value: T, +	/// The other of the two numbers for which the GCD will be calculated. </file context> 
@@ -1,14 +1,22 @@
use crate::value::{Number, Value};
use lazy_static::lazy_static;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: gcd/lcm use unchecked float→i64 conversion and integer arithmetic, allowing non-finite/huge inputs to produce invalid results or overflow paths.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At libraries/math-parser/src/constants.rs, line 524: <comment>`gcd`/`lcm` use unchecked float→`i64` conversion and integer arithmetic, allowing non-finite/huge inputs to produce invalid results or overflow paths.</comment> <file context> @@ -107,12 +148,486 @@ lazy_static! { +	if x == 0 && y == 0 { +	return Some(Value::Number(Number::Real(0.0))); +	} +	x = x.abs(); +	y = y.abs(); +	while y != 0 { </file context> 
"root",
Box::new(|values| match values {
[Value::Number(Number::Real(x)), Value::Number(Number::Real(n))] => {
Some(Value::Number(Number::Real(x.powf(1.0 / *n))))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: root uses powf(1/n) for real inputs, causing odd roots of negative reals (e.g. root(-8, 3)) to return NaN instead of the expected real value.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At libraries/math-parser/src/constants.rs, line 336: <comment>`root` uses `powf(1/n)` for real inputs, causing odd roots of negative reals (e.g. `root(-8, 3)`) to return `NaN` instead of the expected real value.</comment> <file context> @@ -107,12 +148,486 @@ lazy_static! { +	"root", +	Box::new(|values| match values { +	[Value::Number(Number::Real(x)), Value::Number(Number::Real(n))] => { +	Some(Value::Number(Number::Real(x.powf(1.0 / *n)))) +	} +	[Value::Number(Number::Complex(x)), Value::Number(Number::Real(n))] => { </file context> 
let mut v = 0u64;
let mut digits = 0;
while let Some(d) = self.peek().and_then(|c| c.to_digit(10)) {
v = v * 10 + d as u64;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Unchecked numeric accumulation and exponent casting can overflow for long literals/exponents, causing panics in debug builds or incorrect numeric values in release builds.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At libraries/math-parser/src/lexer.rs, line 173: <comment>Unchecked numeric accumulation and exponent casting can overflow for long literals/exponents, causing panics in debug builds or incorrect numeric values in release builds.</comment> <file context> @@ -0,0 +1,372 @@ +	let mut v = 0u64; +	let mut digits = 0; +	while let Some(d) = self.peek().and_then(|c| c.to_digit(10)) { +	v = v * 10 + d as u64; +	digits += 1; +	self.bump(); </file context> 
c.bench_function(concat!("parse ", $input), |b| {
b.iter(|| {
let _ = black_box(ast::Node::try_parse_from_str($input)).unwrap();
let _ = black_box(ast::Node::try_parse_from_str($input));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Parsing benchmark ignores parse errors, allowing parser regressions to be benchmarked silently as error cases.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At libraries/math-parser/benches/bench.rs, line 12: <comment>Parsing benchmark ignores parse errors, allowing parser regressions to be benchmarked silently as error cases.</comment> <file context> @@ -9,15 +9,21 @@ macro_rules! generate_benchmarks {	c.bench_function(concat!("parse ", $input), |b| {	b.iter(|| { -	let _ = black_box(ast::Node::try_parse_from_str($input)).unwrap(); +	let _ = black_box(ast::Node::try_parse_from_str($input));	});	}); </file context> 
panic!("failed to parse `{input}`");
}
};
dbg!(&expr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Leftover dbg! in shared end-to-end test helper causes unconditional debug output and avoidable per-test overhead.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At libraries/math-parser/src/lib.rs, line 44: <comment>Leftover `dbg!` in shared end-to-end test helper causes unconditional debug output and avoidable per-test overhead.</comment> <file context> @@ -3,147 +3,280 @@ +	panic!("failed to parse `{input}`"); +	} +	}; +	dbg!(&expr); +	let context = EvalContext::default(); </file context> 
let mut writer = StandardStream::stderr(ColorChoice::Auto);
let config = term::Config::default();
for diag in &self.diagnostics {
term::emit(&mut writer.lock(), &config, &self.file, diag).unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: CompileError::print() panics on term::emit failures by unwrapping the Result, so diagnostic reporting can crash on ordinary I/O or file span errors. Propagate or handle the error instead of panicking in this library helper.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At libraries/math-parser/src/diagnostic.rs, line 24: <comment>`CompileError::print()` panics on `term::emit` failures by unwrapping the `Result`, so diagnostic reporting can crash on ordinary I/O or file span errors. Propagate or handle the error instead of panicking in this library helper.</comment> <file context> @@ -0,0 +1,159 @@ +	let mut writer = StandardStream::stderr(ColorChoice::Auto); +	let config = term::Config::default(); +	for diag in &self.diagnostics { +	term::emit(&mut writer.lock(), &config, &self.file, diag).unwrap(); +	} +	writer.flush(); </file context> 
"log2",
Box::new(|values| match values {
[Value::Number(Number::Real(real))] => Some(Value::Number(Number::Real(real.log2()))),
[Value::Number(Number::Complex(complex))] => Some(Value::Number(Number::Complex(complex / LN_2))),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Complex log2 implementation is incorrect: it divides the input by ln(2) instead of computing ln(z)/ln(2) (or log2()), yielding wrong results for all complex inputs.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At libraries/math-parser/src/constants.rs, line 366: <comment>Complex log2 implementation is incorrect: it divides the input by ln(2) instead of computing ln(z)/ln(2) (or `log2()`), yielding wrong results for all complex inputs.</comment> <file context> @@ -107,12 +148,486 @@ lazy_static! { +	"log2", +	Box::new(|values| match values { +	[Value::Number(Number::Real(real))] => Some(Value::Number(Number::Real(real.log2()))), +	[Value::Number(Number::Complex(complex))] => Some(Value::Number(Number::Complex(complex / LN_2))), +	_ => None, +	}), </file context> 
@Keavon Keavon force-pushed the master branch 7 times, most recently from 9b97ab7 to 2e842cb Compare March 19, 2026 11:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

3 participants