Skip to content

Conversation

@sirreal
Copy link
Member

@sirreal sirreal commented Sep 18, 2025

Ensure that font-family names in the font face class are normalized and correctly encoded for CSS.

This prevents names like O'Reilly Sans or Suisse BP Int'l are correctly encoded and will not break the printed CSS.

This includes tests to verify correct and expected behavior. It's not as straightforward to verify that the normalized font works correctly in the browser.

  1. " rel="nofollow">I've prepared a demonstration to test and compare how these encodings work. It includes some examples from the test suite and can be used to verify that, for example, these font families match:

    @font-face { font-family:"BS\\Quot\"Apos'Semi;Comma,Newline\ALT<Oh😵My!"; } * { font-family: "BS\5C Quot\22 Apos\'Semi;Comma\2C Newline\A LT\3C Oh😵My!"; }

    The behavior can be tested for regressions with the playground. The default twentytwentyfive theme will use the font families Manrope and Fira Code that can be tested on the playground.

    Before (unquoted): https://playground.wordpress.net/?wp=6.8
    After (quoted): https://playground.wordpress.net/?core-pr=9951

    -@font-face{font-family:Manrope;font-style:normal;font-weight:200 800;font-display:fallback;src:url('h/wp-content/themes/twentytwentyfive/assets/fonts/manrope/Manrope-VariableFont_wght.woff2') format('woff2');} +@font-face{font-family:"Manrope";font-style:normal;font-weight:200 800;font-display:fallback;src:url('/wp-content/themes/twentytwentyfive/assets/fonts/manrope/Manrope-VariableFont_wght.woff2') format('woff2');} @font-face{font-family:"Fira Code";font-style:normal;font-weight:300 700;font-display:fallback;src:url('/wp-content/themes/twentytwentyfive/assets/fonts/fira-code/FiraCode-VariableFont_wght.woff2') format('woff2');}

    Trac ticket: https://core.trac.wordpress.org/ticket/63568
    This builds on work and ideas in #8982.


    This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.

@github-actions
Copy link

Test using WordPress Playground

The changes in this pull request can previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

  • The Plugin and Theme Directories cannot be accessed within Playground.
  • All changes will be lost when closing a tab with a Playground instance.
  • All changes will be lost when refreshing the page.
  • A fresh instance is created each time the link below is clicked.
  • Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance,
    it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

@sirreal sirreal marked this pull request as ready for review September 19, 2025 16:28
@github-actions
Copy link

github-actions bot commented Sep 19, 2025

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Core Committers: Use this line as a base for the props when committing in SVN:

Props sandeepdahiya, jonsurrell, zieladam. 

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

Comment on lines 397 to 408
if (
strlen( $font_family ) > 1 &&
( '"' === $font_family[0] && '"' === $font_family[ strlen( $font_family ) - 1 ] ) ||
( "'" === $font_family[0] && "'" === $font_family[ strlen( $font_family ) - 1 ] )
) {
_doing_it_wrong(
__METHOD__,
__( 'Font font-family should not be wrapped in quotes; they will be added automatically.' ),
'6.9.0'
);
$font_family = substr( $font_family, 1, -1 );
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, I don't think this is a good idea and it should not be checking for quoted font families.

An already quoted font family suggests that it's been encoded for CSS already. In that case, the "normalization" here is dangerous.

The original implementation tried to cover both cases:

/*  * Wrap font-family in quotes if it contains spaces  * and is not already wrapped in quotes.  */ if ( str_contains( $font_face['font-family'], ' ' ) &&	! str_contains( $font_face['font-family'], '"' ) &&	! str_contains( $font_face['font-family'], "'" ) ) { $font_face['font-family'] = '"' . $font_face['font-family'] . '"'; }

If a quoted string were found, normalization would need to parse it as a CSS string first, then perform normalization on the parsed result.

#7857 includes some CSS parsing functionality, but that's not ready to land yet. I think at this time the best this can do is call _doing_it_wrong and not perform normalization.

I'd love to hear thoughts from other folks on this.

Copy link
Contributor

@adamziel adamziel Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's CSSProcessor escapes strings:

https://github.com/WordPress/php-toolkit/blob/5303dfbb02d5e347fd03fa8ea207b653f18dbdbf/components/DataLiberation/CSS/class-cssprocessor.php#L876-L923

(Note the method is misnamed, it's called escape_url_value but it should really be called escape_string).

It also parses strings, maybe that would be useful here:

https://github.com/WordPress/php-toolkit/blob/5303dfbb02d5e347fd03fa8ea207b653f18dbdbf/components/DataLiberation/CSS/class-cssprocessor.php#L998

Copy link
Contributor

@adamziel adamziel Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An already quoted font family suggests that it's been encoded for CSS already. In that case, the "normalization" here is dangerous.

Agreed. What are the scenarios where we'd receive an already quoted value? Should we just tokenize our CSS data source?

A safe thing to do would be to assume it may not be encoded right and reject it right away. Alternatively, we could recognize it's a quoted value and try to parse it, but something about that feels off and I worry we'd end up with double-encoded values sooner or later. Sticking to a single accepted input format sounds great.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for asking this, you encouraged me to dig deeper and research. I'll summarize that I think this change is doing the right thing based on usage I've seen, but documentation should be updated to indicate that plain text (unquoted, unescaped) should be provided and the system will handle transformation to a CSS font family name.


I believe this typically comes from JSON in a theme.json file. It's not CSS, it's a JSON string like in this example:

{ "fontFamily": "Open Sans", "fontWeight": "300 800", "fontStyle": "italic", "fontStretch": "normal", "src": [ "file:./assets/fonts/open-sans-italic.woff2" ] }

Each object accepts these properties that map to descriptors for the @font-face CSS at-rule:

  • fontFamily: A valid CSS font-family descriptor.

The documentation suggests that the value should be valid CSS, so someone passing in { "fontFamily": "\"O'Reilly Sans\"" } is doing the right thing according to documentation. At the same time, it would be good to quote Open Sans in the documented example, suggesting it's actually better to pass an unquoted, unescaped, plain JSON string and let the system handle it (much more robust with this PR).

Passing a correctly escaped CSS identifier for the font-family name would regress with this PR, like { "fontFamily": "Open\20Sans" }. However, I think it's very unlikely anyone is doing that.


I did some searching to describe what's currently being done.

I was surprised to see that twentytwentyfive quotes the font face family:

But twentytwentythree does not:



I did a theme search and looked through the top themes:

  • Twenty Twenty-Three uses unquoted families with spaces
  • Twenty Twenty-Two uses unquoted families with spaces "fontFamily": "Source Serif Pro"
  • Extendable uses unquoted families with spaces "fontFamily": "Baloo Tamma 2"
  • Kubio uses unquoted families with spaces
  • YITH Wonder uses unquoted families with spaces
  • Raft uses unquoted families with spaces
  • Variations uses unquoted families with spaces
  • Neve FSE uses unquoted families with spaces
  • Mindscape uses unquoted families with spaces
  • Vertice uses unquoted families with spaces

I looked through the themes with a minimum of 5,000 installs and found many examples of unquoted families with spaces, but no examples of identifiers with CSS Unicode escape sequences.

Copy link
Contributor

@adamziel adamziel Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice research! And I agree with your conclusion. We already have real inputs in both forms out there, continuing to support them is kind. Hopefully we won't see many/any escape sequence. Actually, that makes me wonder...

Why couldn't we support unicode escape sequences? If we treated that string value from theme.json as untokenized CSS, we could easily consume it using the CSSProcessor:

<?php use WordPress\DataLiberation\CSS\CSSProcessor; require_once __DIR__ . '/vendor/autoload.php'; $fontFamilies = [ "\"Fira Code\"", "Source Serif Pro", "Open\\20Sans" ]; foreach ( $fontFamilies as $family ) { $processor = CSSProcessor::create( $family ); while ( $processor->next_token() ) { if($processor->get_token_type() === CSSProcessor::TOKEN_WHITESPACE) { echo ' ';	} else { echo $processor->get_token_value();	}	} echo "\n"; }

Outputs:

Fira Code Source Serif Pro Open Sans 
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be perfectly clear, my familiarity with the font system in WordPress is limited to this PR. I can't speak to the goals and design of the system.

My intuitive understanding is that it's difficult and error prone to work with a CSS value here instead of a JSON string. Maybe it's the fact that the JSON string has quotes and you need strong familiarity with the system to understand that {"fontFamily": "\"O'Reilly Sans\""} is expected. It's also confusing that font families can be CSS identifiers or CSS strings, so Helvetica and "Helvetica" are equally valid and correct ways of referencing the same font family.

It still really feels like this PR is helpful and will mostly do what folks want, but I may be wrong! I'm happy to be challenged on this.

The other font family

Of course, there's the other fontFamily usage that's also a string of CSS like this (again from the example):

{ "name": "Primary", "slug": "primary", "fontFamily": "Charter, 'Bitstream Charter', 'Sitka Text', Cambria, serif" }

Is seems like this could have been an array of unescaped, unquoted, JSON strings and it would be simpler and might avoid the split/sanitize/join issues:

{ // Wrong! This is not how the system works, but maybe it could be? "fontFamily": [ "Charter", "Bitstream Charter", "Sitka Text", "Cambria", "serif", // Uh-oh, that's a problem ], }

The problem there would the generic font family serif. As a generic font family, it must be unquoted. But it's also possible to add a font family named "serif" (quoted) which is different from the generic. CSS semantics handle this by differentiating between the string "serif" and the identifier serif.

I'm not sure how to handle that in an array of JSON strings without making things… messy.

Copy link
Contributor

@adamziel adamziel Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean this PR makes things better than they were before, perhaps aside of an unlikely escape (\20). All I'm saying is tokenizing CSS would likely be nearly as fast and would free us from tiptoeing around implicit restrictions around commas, quotes, etc. It might not be the thing to do for 6.9, given we're in beta cycle now, but perhaps for 7.0?

"fontFamily": "Charter, 'Bitstream Charter', 'Sitka Text', Cambria, serif"

A CSS Tokenizer would solve for that one as well and correctly distinguish between strings and identifiers. I don't think we can support arbitrarily complex values without a tokenizer – every time we improve how we massage that string value we effectively move closer to tokenization.

The problem there would the generic font family serif

Again, a tokenizer would help us distinguish between a string "serif" and an identifier serif in regular CSS values, although not in a JSON array.

Lacking one, we could reject "serif" as a custom font-family name and always treat it as an identifier. Perhaps it won't even ever come up? I'd guess it's an unlikely name, especially all lowercase. It seems like there's around 10 generic font family names. We could just list them all and treat them in a special way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My position is that it's actually confusing to work with these snippets of CSS text, and we're bettor off using plain JSON strings, at least in this case of the font-face font family.

In that case, CSS tokenizing isn't helpful. We need to encode for CSS.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sirreal oops! I've edited my comment to address JSON syntax more and I've submitted that edit right before you've submitted your comment.

I have mixed feelings about using a JSON array. On one hand, sure, if we're committing to the JSON format let's just embrace it completely. It's less awkward than mixing CSS and JSON syntaxes. On the other, I'm not sure how well that would generalize to other CSS rules with their own subsyntax in the value. If it wouldn't, we'd end up with some keys requiring a CSS string value and some keys requiring a JSON array. I'm also worried about things like calc(), --main-font-family, etc. assuming we want to support them. 🤔

__( 'Font font-family should not be wrapped in quotes; they will be added automatically.' ),
'6.9.0'
);
return $font_family;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% satisfied with this. A broken family name could be used, like """, which would result in broken CSS output. The existing implementation has the same issue. This is likely good enough for anything but the most malicious font name.

The font family name """ could be safely used with this implementation as '"""' or preferably something like "\22\22\22".

One improvement here is that the string must start and end with a matching quote character "…" or '…' in order to be treated as a quoted string. This allows fonts to contain those characters without issue and they'll be normalized properly.


It would be nice if the system knew that plain strings were provided and all quoting and normalization of the font family name were handled by the system. I'm not sure that's possible while maintaining backwards compatibility.

@sirreal sirreal requested a review from Copilot September 19, 2025 17:14

This comment was marked as outdated.

@sirreal sirreal requested a review from Copilot September 19, 2025 17:21

This comment was marked as outdated.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@sirreal
Copy link
Member Author

sirreal commented Sep 23, 2025

I'd love to get feedback from folks that worked on Font Face CSS support. Based on history, @hellofromtonya and @oandregal may be candidates.

*/
"\n" => '\\A ',
'\\' => '\\5C ',
',' => '\\2C ',
Copy link
Contributor

@adamziel adamziel Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this one necessary? Not that it's wrong, but we're producing a quoted value anyway. It shouldn't derail any syntax parsing and, within a string, it's seen as a comma either way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When printing CSS, the font system checks for , and does a split/join in an effort to normalize the font family names:

if ( str_contains( $output, ',' ) ) {
$items = explode( ',', $output );
foreach ( $items as $item ) {
$formatted_item = self::maybe_add_quotes( $item );
if ( ! empty( $formatted_item ) ) {
$formatted_items[] = $formatted_item;
}
}
return implode( ', ', $formatted_items );
}

It seemed preferable to escape , in the single font family name here to try to avoid being mangled later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah so it's a workaround for the deficiencies of another part of the system. Could we go the other way around and parse the value instead of exploding by ,? The CSS Processor has some code that could help with this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Core doesn't have a CSS parser, does it?

This Unicode escaping is at worst harmless and at best better than requiring full blown CSS parsing (even if that would be the most correct thing to do later).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't today, but it could. CSS Processor is just a tokenizer, not a full-blown parser. Or you might want to reuse just the part of it that parses strings, just to avoid the blanket explode() call.

@sirreal
Copy link
Member Author

sirreal commented Nov 5, 2025

I've been doing some testing and I have concerns about this approach. This has focused on fonts registered via theme.json, however after reviewing the issue and testing I've realized that font uploads are not processed by the functions that this PR touches.

I need to investigate how font uploads are processed. I've tested uploading a font named O"Reilly,Sans and it breaks the styling because of the quote wrapping. I'd expect that to work in this PR if set in a theme.json file, but it is not handled correctly as an uploaded font.

That case is interesting because we know it's a plain font name (not CSS) so there's no question of whether or not the value has been quoted or escaped already (see #9951 (comment)).

@sirreal sirreal marked this pull request as draft November 5, 2025 10:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

3 participants