fix(bigquery): escape apostrophes in filter values using standard SQL quoting #38835
… quoting

The sqlalchemy-bigquery dialect uses Python's repr() to render string literals when literal_binds=True. repr() switches to double-quote delimiters when the string contains an apostrophe (e.g. repr("O'Brien") produces "O'Brien"). In BigQuery SQL, double-quoted tokens are identifiers, not string literals, so any filter containing an apostrophe causes a syntax error.

This patch monkey-patches the BigQuery dialect's colspecs to use a custom string type whose literal_processor always produces single-quoted literals with properly doubled internal quotes (standard SQL escaping).

Fixes apache#35857

Sequence Diagram

This PR updates BigQuery SQL compilation so text filter values with apostrophes are always rendered as standard single-quoted SQL literals. The flow highlights the new dialect patch and how query compilation now produces BigQuery-safe filter SQL.

sequenceDiagram
    participant Superset
    participant BigQueryEngineSpec
    participant BigQueryDialect
    participant SafeStringType
    participant BigQuery
    Superset->>BigQueryEngineSpec: Load BigQuery engine spec
    BigQueryEngineSpec->>BigQueryDialect: Patch string literal handling
    Superset->>BigQueryDialect: Compile query with literal binds
    BigQueryDialect->>SafeStringType: Process text filter value
    SafeStringType-->>BigQueryDialect: Return escaped single quoted literal
    BigQueryDialect->>BigQuery: Execute query with valid filter literal
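The repr() behaviour described in the commit message can be reproduced in plain Python. The sketch below is only an illustration: `quote_sql_string` is a hypothetical helper name, not the function this PR adds, and it covers only the apostrophe-doubling step.

```python
def quote_sql_string(value: str) -> str:
    """Render value as a single-quoted SQL literal, doubling embedded quotes.

    Hypothetical helper for illustration; not the function added by the PR.
    """
    return "'" + value.replace("'", "''") + "'"


# repr() chooses its delimiter from the string's contents.
plain = repr("Smith")      # no apostrophe: single-quoted
tricky = repr("O'Brien")   # apostrophe present: double-quoted

print(plain)                        # 'Smith'
print(tricky)                       # "O'Brien"
print(quote_sql_string("Smith"))    # 'Smith'
print(quote_sql_string("O'Brien"))  # 'O''Brien'
```

Unlike repr(), the helper never changes delimiter, so the rendered literal has the same shape whether or not the value contains an apostrophe.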
Code Review Agent Run #2b0fe4
Actionable Suggestions - 1
- superset/db_engine_specs/bigquery.py - 1
- Incorrect % escaping in string literals · Line 100-100
Review Details
- Files reviewed - 2 · Commit Range: 470b90a..470b90a
  - superset/db_engine_specs/bigquery.py
  - tests/unit_tests/db_engine_specs/test_bigquery.py
- Files skipped - 0
- Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
- MyPy (Static Code Analysis) - ✔︎ Successful
- Astral Ruff (Static Code Analysis) - ✔︎ Successful
    This helper always produces a single-quoted literal with properly doubled
    internal quotes.
    """
    escaped = value.replace("'", "''").replace("%", "%%")
The _process_string_literal function incorrectly escapes % characters to %%, but in BigQuery SQL string literals, % does not require escaping. This causes strings containing % to be misrepresented in queries (e.g., '100%' becomes '100%%').
Code suggestion
Check the AI-generated fix before applying
-    escaped = value.replace("'", "''").replace("%", "%%")
+    escaped = value.replace("'", "''")
Code Review Run #2b0fe4
User description
SUMMARY
Fixes #35857
BigQuery errors when dashboard filters on text columns contain apostrophes (e.g. O'Brien, Fernando's).

Root cause: The sqlalchemy-bigquery dialect's process_string_literal function uses Python's repr() to render string literals when literal_binds=True is used during query compilation. When the string contains an apostrophe, repr() wraps the value in double quotes (e.g. repr("O'Brien") -> "O'Brien"). In BigQuery SQL, double-quoted tokens are identifiers (like column or table names), not string literals, so the query fails with a syntax error.

Fix: Monkey-patch the BigQuery dialect's colspecs to use a custom TypeDecorator whose literal_processor always produces single-quoted literals with properly doubled internal quotes ('O''Brien'), which is the standard SQL escaping convention that BigQuery expects. This approach follows the same pattern used for the Databricks engine spec (superset/db_engine_specs/databricks.py).

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
Before: Filter with Fernando's generates SQL WHERE name = "Fernando's" (double-quoted identifier, causes BigQuery syntax error)

After: Filter with Fernando's generates SQL WHERE name = 'Fernando''s' (properly escaped single-quoted literal)

TESTING INSTRUCTIONS
O'Brien, Fernando's)

Unit tests are included:
- test_string_literal_with_apostrophe - verifies apostrophe escaping
- test_string_literal_without_apostrophe - verifies normal strings unaffected
- test_string_literal_in_filter_with_apostrophe - verifies IN clause escaping

ADDITIONAL INFORMATION
CodeAnt-AI Description
Escape BigQuery filter values that contain apostrophes
What Changed
O'Brien now run in BigQuery instead of failing with a syntax error
Impact
✅ Fewer BigQuery filter errors
✅ Clearer text filtering in dashboards
✅ Reliable filters for names with apostrophes