Move the math expression parser from Pest to Chumsky and add more features#2685
Move the math expression parser from Pest to Chumsky and add more features#2685urisinger wants to merge 18 commits intoGraphiteEditor:masterfrom
Conversation
34dced3 to 5cacab2 Compare implicit multiplication
| I have rebased this branch to remove merge commits, so you'll need to |
316a737 to 532e913 Compare ec51271 to e025103 Compare | How far is this? I might be interested in implementing some math expressions from #2026 |
Its ready for review, its just not being merged becuase without variadics of some form this node is pretty useless, creating the custom error message for the node is also a difficult task. |
124235a to a42cad8 Compare 3697b7d to 90533e6 Compare There was a problem hiding this comment.
13 issues found across 14 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately. <file name="libraries/math-parser/src/executer.rs"> <violation number="1" location="libraries/math-parser/src/executer.rs:28"> P2: Binary-operator failures are misreported as function-call `TypeError`, conflating distinct error kinds and producing misleading diagnostics.</violation> <violation number="2" location="libraries/math-parser/src/executer.rs:47"> P2: Conditional truthiness treats `NaN` numeric results as true, allowing invalid conditions to execute the `if` branch.</violation> </file> <file name="frontend/wasm/src/editor_api.rs"> <violation number="1" location="frontend/wasm/src/editor_api.rs:931"> P2: Parser error diagnostics are dropped from logs before the API collapses failures to `None`, reducing debuggability of parse regressions.</violation> </file> <file name="libraries/math-parser/src/lexer.rs"> <violation number="1" location="libraries/math-parser/src/lexer.rs:173"> P2: Unchecked numeric accumulation and exponent casting can overflow for long literals/exponents, causing panics in debug builds or incorrect numeric values in release builds.</violation> </file> <file name="libraries/math-parser/benches/bench.rs"> <violation number="1" location="libraries/math-parser/benches/bench.rs:12"> P2: Parsing benchmark ignores parse errors, allowing parser regressions to be benchmarked silently as error cases.</violation> </file> <file name="libraries/math-parser/src/value.rs"> <violation number="1" location="libraries/math-parser/src/value.rs:170"> P1: Factorial uses an unbounded linear loop over `n`, enabling very large inputs to cause extreme CPU consumption even after `f64` result has already overflowed.</violation> </file> <file name="libraries/math-parser/src/lib.rs"> <violation number="1" location="libraries/math-parser/src/lib.rs:44"> P2: Leftover `dbg!` in shared end-to-end test helper causes unconditional debug output and avoidable per-test overhead.</violation> </file> <file name="libraries/math-parser/src/parser.rs"> <violation number="1" location="libraries/math-parser/src/parser.rs:146"> P1: Chained comparison logic is ineffective: `cmp` consumes the full chain first, so `chained_cmp` never multiplies pairwise comparisons as intended.</violation> </file> <file name="node-graph/gmath-nodes/src/lib.rs"> <violation number="1" location="node-graph/gmath-nodes/src/lib.rs:474"> P1: GCD/LCM nodes accept `i32` but `binary_gcd` assumes non‑negative values. For negative inputs, the shift/subtract loop can cycle forever (e.g., -2 and 4), hanging evaluation or producing wrong results. Normalize to absolute values or remove `i32` support before calling `binary_gcd`.</violation> </file> <file name="libraries/math-parser/src/diagnostic.rs"> <violation number="1" location="libraries/math-parser/src/diagnostic.rs:24"> P2: `CompileError::print()` panics on `term::emit` failures by unwrapping the `Result`, so diagnostic reporting can crash on ordinary I/O or file span errors. Propagate or handle the error instead of panicking in this library helper.</violation> </file> <file name="libraries/math-parser/src/constants.rs"> <violation number="1" location="libraries/math-parser/src/constants.rs:336"> P1: `root` uses `powf(1/n)` for real inputs, causing odd roots of negative reals (e.g. `root(-8, 3)`) to return `NaN` instead of the expected real value.</violation> <violation number="2" location="libraries/math-parser/src/constants.rs:366"> P2: Complex log2 implementation is incorrect: it divides the input by ln(2) instead of computing ln(z)/ln(2) (or `log2()`), yielding wrong results for all complex inputs.</violation> <violation number="3" location="libraries/math-parser/src/constants.rs:524"> P1: `gcd`/`lcm` use unchecked float→`i64` conversion and integer arithmetic, allowing non-finite/huge inputs to produce invalid results or overflow paths.</violation> </file> Since this is your first cubic review, here's how it works:
- cubic automatically reviews your code and comments on bugs and improvements
- Teach cubic by replying to its comments. cubic learns from your replies and gets better over time
- Add one-off context when rerunning by tagging
@cubic-dev-aiwith guidance or docs links (includingllms.txt) - Ask questions if you need clarification on any suggestion
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| for k in 1..=n { | ||
| acc *= k as f64; | ||
| } |
There was a problem hiding this comment.
P1: Factorial uses an unbounded linear loop over n, enabling very large inputs to cause extreme CPU consumption even after f64 result has already overflowed.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At libraries/math-parser/src/value.rs, line 170: <comment>Factorial uses an unbounded linear loop over `n`, enabling very large inputs to cause extreme CPU consumption even after `f64` result has already overflowed.</comment> <file context> @@ -107,15 +155,37 @@ impl Number { + } + let n = truncated as u64; + let mut acc = 1.0_f64; + for k in 1..=n { + acc *= k as f64; + } </file context> | for k in 1..=n { | |
| acc *= k as f64; | |
| } | |
| if n > 170 { | |
| return Number::Real(f64::INFINITY); | |
| } | |
| for k in 1..=n { | |
| acc *= k as f64; | |
| } |
| | ||
| // Chain comparisons like `a < b < c` by multiplying the boolean | ||
| // (1.0 / 0.0) results, preserving the existing semantics. | ||
| let chained_cmp = cmp.clone().foldl(cmp.repeated(), |lhs, rhs| Node::BinOp { |
There was a problem hiding this comment.
P1: Chained comparison logic is ineffective: cmp consumes the full chain first, so chained_cmp never multiplies pairwise comparisons as intended.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At libraries/math-parser/src/parser.rs, line 146: <comment>Chained comparison logic is ineffective: `cmp` consumes the full chain first, so `chained_cmp` never multiplies pairwise comparisons as intended.</comment> <file context> @@ -1,316 +1,174 @@ + + // Chain comparisons like `a < b < c` by multiplying the boolean + // (1.0 / 0.0) results, preserving the existing semantics. + let chained_cmp = cmp.clone().foldl(cmp.repeated(), |lhs, rhs| Node::BinOp { + lhs: Box::new(lhs), + op: BinaryOp::Mul, </file context> | fn greatest_common_divisor<T: num_traits::int::PrimInt + std::ops::ShrAssign<i32> + std::ops::SubAssign>( | ||
| _: impl Ctx, | ||
| /// One of the two numbers for which the GCD will be calculated. | ||
| #[implementations(u32, u64, i32)] |
There was a problem hiding this comment.
P1: GCD/LCM nodes accept i32 but binary_gcd assumes non‑negative values. For negative inputs, the shift/subtract loop can cycle forever (e.g., -2 and 4), hanging evaluation or producing wrong results. Normalize to absolute values or remove i32 support before calling binary_gcd.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At node-graph/gmath-nodes/src/lib.rs, line 474: <comment>GCD/LCM nodes accept `i32` but `binary_gcd` assumes non‑negative values. For negative inputs, the shift/subtract loop can cycle forever (e.g., -2 and 4), hanging evaluation or producing wrong results. Normalize to absolute values or remove `i32` support before calling `binary_gcd`.</comment> <file context> @@ -0,0 +1,780 @@ +fn greatest_common_divisor<T: num_traits::int::PrimInt + std::ops::ShrAssign<i32> + std::ops::SubAssign>( + _: impl Ctx, + /// One of the two numbers for which the GCD will be calculated. + #[implementations(u32, u64, i32)] + value: T, + /// The other of the two numbers for which the GCD will be calculated. </file context> | @@ -1,14 +1,22 @@ | |||
| use crate::value::{Number, Value}; | |||
| use lazy_static::lazy_static; | |||
There was a problem hiding this comment.
P1: gcd/lcm use unchecked float→i64 conversion and integer arithmetic, allowing non-finite/huge inputs to produce invalid results or overflow paths.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At libraries/math-parser/src/constants.rs, line 524: <comment>`gcd`/`lcm` use unchecked float→`i64` conversion and integer arithmetic, allowing non-finite/huge inputs to produce invalid results or overflow paths.</comment> <file context> @@ -107,12 +148,486 @@ lazy_static! { + if x == 0 && y == 0 { + return Some(Value::Number(Number::Real(0.0))); + } + x = x.abs(); + y = y.abs(); + while y != 0 { </file context> | "root", | ||
| Box::new(|values| match values { | ||
| [Value::Number(Number::Real(x)), Value::Number(Number::Real(n))] => { | ||
| Some(Value::Number(Number::Real(x.powf(1.0 / *n)))) |
There was a problem hiding this comment.
P1: root uses powf(1/n) for real inputs, causing odd roots of negative reals (e.g. root(-8, 3)) to return NaN instead of the expected real value.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At libraries/math-parser/src/constants.rs, line 336: <comment>`root` uses `powf(1/n)` for real inputs, causing odd roots of negative reals (e.g. `root(-8, 3)`) to return `NaN` instead of the expected real value.</comment> <file context> @@ -107,12 +148,486 @@ lazy_static! { + "root", + Box::new(|values| match values { + [Value::Number(Number::Real(x)), Value::Number(Number::Real(n))] => { + Some(Value::Number(Number::Real(x.powf(1.0 / *n)))) + } + [Value::Number(Number::Complex(x)), Value::Number(Number::Real(n))] => { </file context> | let mut v = 0u64; | ||
| let mut digits = 0; | ||
| while let Some(d) = self.peek().and_then(|c| c.to_digit(10)) { | ||
| v = v * 10 + d as u64; |
There was a problem hiding this comment.
P2: Unchecked numeric accumulation and exponent casting can overflow for long literals/exponents, causing panics in debug builds or incorrect numeric values in release builds.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At libraries/math-parser/src/lexer.rs, line 173: <comment>Unchecked numeric accumulation and exponent casting can overflow for long literals/exponents, causing panics in debug builds or incorrect numeric values in release builds.</comment> <file context> @@ -0,0 +1,372 @@ + let mut v = 0u64; + let mut digits = 0; + while let Some(d) = self.peek().and_then(|c| c.to_digit(10)) { + v = v * 10 + d as u64; + digits += 1; + self.bump(); </file context> | c.bench_function(concat!("parse ", $input), |b| { | ||
| b.iter(|| { | ||
| let _ = black_box(ast::Node::try_parse_from_str($input)).unwrap(); | ||
| let _ = black_box(ast::Node::try_parse_from_str($input)); |
There was a problem hiding this comment.
P2: Parsing benchmark ignores parse errors, allowing parser regressions to be benchmarked silently as error cases.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At libraries/math-parser/benches/bench.rs, line 12: <comment>Parsing benchmark ignores parse errors, allowing parser regressions to be benchmarked silently as error cases.</comment> <file context> @@ -9,15 +9,21 @@ macro_rules! generate_benchmarks { c.bench_function(concat!("parse ", $input), |b| { b.iter(|| { - let _ = black_box(ast::Node::try_parse_from_str($input)).unwrap(); + let _ = black_box(ast::Node::try_parse_from_str($input)); }); }); </file context> | panic!("failed to parse `{input}`"); | ||
| } | ||
| }; | ||
| dbg!(&expr); |
There was a problem hiding this comment.
P2: Leftover dbg! in shared end-to-end test helper causes unconditional debug output and avoidable per-test overhead.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At libraries/math-parser/src/lib.rs, line 44: <comment>Leftover `dbg!` in shared end-to-end test helper causes unconditional debug output and avoidable per-test overhead.</comment> <file context> @@ -3,147 +3,280 @@ + panic!("failed to parse `{input}`"); + } + }; + dbg!(&expr); + let context = EvalContext::default(); </file context> | let mut writer = StandardStream::stderr(ColorChoice::Auto); | ||
| let config = term::Config::default(); | ||
| for diag in &self.diagnostics { | ||
| term::emit(&mut writer.lock(), &config, &self.file, diag).unwrap(); |
There was a problem hiding this comment.
P2: CompileError::print() panics on term::emit failures by unwrapping the Result, so diagnostic reporting can crash on ordinary I/O or file span errors. Propagate or handle the error instead of panicking in this library helper.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At libraries/math-parser/src/diagnostic.rs, line 24: <comment>`CompileError::print()` panics on `term::emit` failures by unwrapping the `Result`, so diagnostic reporting can crash on ordinary I/O or file span errors. Propagate or handle the error instead of panicking in this library helper.</comment> <file context> @@ -0,0 +1,159 @@ + let mut writer = StandardStream::stderr(ColorChoice::Auto); + let config = term::Config::default(); + for diag in &self.diagnostics { + term::emit(&mut writer.lock(), &config, &self.file, diag).unwrap(); + } + writer.flush(); </file context> | "log2", | ||
| Box::new(|values| match values { | ||
| [Value::Number(Number::Real(real))] => Some(Value::Number(Number::Real(real.log2()))), | ||
| [Value::Number(Number::Complex(complex))] => Some(Value::Number(Number::Complex(complex / LN_2))), |
There was a problem hiding this comment.
P2: Complex log2 implementation is incorrect: it divides the input by ln(2) instead of computing ln(z)/ln(2) (or log2()), yielding wrong results for all complex inputs.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At libraries/math-parser/src/constants.rs, line 366: <comment>Complex log2 implementation is incorrect: it divides the input by ln(2) instead of computing ln(z)/ln(2) (or `log2()`), yielding wrong results for all complex inputs.</comment> <file context> @@ -107,12 +148,486 @@ lazy_static! { + "log2", + Box::new(|values| match values { + [Value::Number(Number::Real(real))] => Some(Value::Number(Number::Real(real.log2()))), + [Value::Number(Number::Complex(complex))] => Some(Value::Number(Number::Complex(complex / LN_2))), + _ => None, + }), </file context> 9b97ab7 to 2e842cb Compare
Part of #2026
Switches to the chumsky parser for performance, remove Unit support until #586 is resolved.
Improves parser performance by about 40%, while also reducing compile times and improving readability.