Fonts: Normalize font face font-family #9951
@@ -307,17 +307,7 @@ private function order_src( array $font_face ) {
     private function build_font_face_css( array $font_face ) {
         $css = '';

-        /*
-         * Wrap font-family in quotes if it contains spaces
-         * and is not already wrapped in quotes.
-         */
-        if (
-            str_contains( $font_face['font-family'], ' ' ) &&
-            ! str_contains( $font_face['font-family'], '"' ) &&
-            ! str_contains( $font_face['font-family'], "'" )
-        ) {
-            $font_face['font-family'] = '"' . $font_face['font-family'] . '"';
-        }
+        $font_face['font-family'] = $this->normalize_css_font_family( $font_face['font-family'] );

         foreach ( $font_face as $key => $value ) {
             // Compile the "src" parameter.
@@ -338,6 +328,69 @@ private function build_font_face_css( array $font_face ) {
         return $css;
     }

+    /**
+     * Normalizes a font-face name for use in CSS.
+     *
+     * Add quotes to the font-face name and escape problematic characters.
+     *
+     * @see https://www.w3.org/TR/css-fonts-4/#font-family-desc
+     *
+     * @since 6.9.0
+     *
+     * @param string $font_family The font-face name to normalize.
+     * @return string The normalized font-face name.
+     */
+    protected function normalize_css_font_family( string $font_family ): string {
+        $font_family = trim( $font_family, " \t\r\f\n" );
+
+        if (
+            strlen( $font_family ) > 1 &&
+            ( ( '"' === $font_family[0] && '"' === $font_family[ strlen( $font_family ) - 1 ] ) ||
+            ( "'" === $font_family[0] && "'" === $font_family[ strlen( $font_family ) - 1 ] ) )
+        ) {
+            _doing_it_wrong(
+                __METHOD__,
+                __( 'Font family should not be wrapped in quotes; they will be added automatically.' ),
+                '6.9.0'
+            );
+            return $font_family;
+        }
Comment on lines +346 to +357
Member
Author
On second thought, I don't think this is a good idea, and it should not be checking for quoted font families. An already quoted font family suggests that it has already been encoded for CSS. In that case, the "normalization" here is dangerous. The original implementation tried to cover both cases:

/*
 * Wrap font-family in quotes if it contains spaces
 * and is not already wrapped in quotes.
 */
if (
    str_contains( $font_face['font-family'], ' ' ) &&
    ! str_contains( $font_face['font-family'], '"' ) &&
    ! str_contains( $font_face['font-family'], "'" )
) {
    $font_face['font-family'] = '"' . $font_face['font-family'] . '"';
}

If a quoted string were found, normalization would need to parse it as a CSS string first, then perform normalization on the parsed result. #7857 includes some CSS parsing functionality, but that's not ready to land yet. I think at this time the best this can do is call I'd love to hear thoughts from other folks on this.
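As a rough illustration of the original check quoted above, the following standalone sketch runs it against a few sample names. The function name `legacy_quote_font_family` and the sample inputs are hypothetical, and `str_contains` assumes PHP 8.0 or later (or the WordPress polyfill).

```php
<?php
// Hypothetical stand-in for the removed inline check; not part of the PR.
function legacy_quote_font_family( string $font_family ): string {
    if (
        str_contains( $font_family, ' ' ) &&
        ! str_contains( $font_family, '"' ) &&
        ! str_contains( $font_family, "'" )
    ) {
        return '"' . $font_family . '"';
    }
    // Anything containing a quote character, or no space, passes through untouched.
    return $font_family;
}

$samples = array( 'Open Sans', '"Fira Code"', "Rock 'n' Roll", 'Comma,Name' );
foreach ( $samples as $sample ) {
    echo $sample, ' => ', legacy_quote_font_family( $sample ), "\n";
}
```

Only `Open Sans` gains quotes here. A name with an embedded apostrophe or a comma is returned as-is, which is the kind of input the new normalization is meant to handle.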
Contributor
Here's how CSSProcessor escapes strings: (Note the method is misnamed, it's called It also parses strings, maybe that would be useful here:
Contributor
Agreed. What are the scenarios where we'd receive an already quoted value? Should we just tokenize our CSS data source? A safe thing to do would be to assume it may not be encoded right and reject it right away. Alternatively, we could recognize it's a quoted value and try to parse it, but something about that feels off and I worry we'd end up with double-encoded values sooner or later. Sticking to a single accepted input format sounds great.
Member
Author
Thank you for asking this, you encouraged me to dig deeper and research. In summary, I think this change is doing the right thing based on the usage I've seen, but the documentation should be updated to indicate that plain text (unquoted, unescaped) should be provided and that the system will handle the transformation to a CSS font family name.

I believe this typically comes from JSON in a theme.json file. It's not CSS, it's a JSON string like in this example:

{
"fontFamily": "Open Sans",
"fontWeight": "300 800",
"fontStyle": "italic",
"fontStretch": "normal",
"src": [ "file:./assets/fonts/open-sans-italic.woff2" ]
}
The documentation suggests that the value should be valid CSS, so someone passing in

Passing a correctly escaped CSS identifier for the font-family name would regress with this PR, like

I did some searching to describe what's currently being done. I was surprised to see that twentytwentyfive quotes the font face family:
But twentytwentythree does not:
I did a theme search and looked through the top themes:
I looked through the themes with a minimum of 5,000 installs and found many examples of unquoted families with spaces, but no examples of identifiers with CSS Unicode escape sequences.
Contributor
Nice research! And I agree with your conclusion. We already have real inputs in both forms out there, so continuing to support them is kind. Hopefully we won't see many (or any) escape sequences. Actually, that makes me wonder: why couldn't we support Unicode escape sequences? If we treated that string value from theme.json as untokenized CSS, we could easily consume it using the CSSProcessor:

<?php
use WordPress\DataLiberation\CSS\CSSProcessor;
require_once __DIR__ . '/vendor/autoload.php';
$fontFamilies = [
"\"Fira Code\"",
"Source Serif Pro",
"Open\\20Sans"
];
foreach ( $fontFamilies as $family ) {
$processor = CSSProcessor::create( $family );
while ( $processor->next_token() ) {
if($processor->get_token_type() === CSSProcessor::TOKEN_WHITESPACE) {
echo ' ';
} else {
echo $processor->get_token_value();
}
}
echo "\n";
}

Outputs:
Member
Author
To be perfectly clear, my familiarity with the font system in WordPress is limited to this PR. I can't speak to the goals and design of the system. My intuitive understanding is that it's difficult and error-prone to work with a CSS value here instead of a JSON string. Maybe it's the fact that the JSON string has quotes and you need strong familiarity with the system to understand that

It still really feels like this PR is helpful and will mostly do what folks want, but I may be wrong! I'm happy to be challenged on this.

The other font family

Of course, there's the other

{
"name": "Primary",
"slug": "primary",
"fontFamily": "Charter, 'Bitstream Charter', 'Sitka Text', Cambria, serif"
}

It seems like this could have been an array of unescaped, unquoted JSON strings, which would be simpler and might avoid the split/sanitize/join issues:

{
// Wrong! This is not how the system works, but maybe it could be?
"fontFamily": [
"Charter",
"Bitstream Charter",
"Sitka Text",
"Cambria",
"serif", // Uh-oh, that's a problem
],
}

The problem there would be the generic font family I'm not sure how to handle that in an array of JSON strings without making things… messy.
Contributor
I mean, this PR makes things better than they were before, perhaps aside from an unlikely escape (
A CSS tokenizer would solve for that one as well and correctly distinguish between strings and identifiers. I don't think we can support arbitrarily complex values without a tokenizer: every time we improve how we massage that string value, we effectively move closer to tokenization.
Again, a tokenizer would help us distinguish between a string Lacking one, we could reject
Member
Author
My position is that it's actually confusing to work with these snippets of CSS text, and we're better off using plain JSON strings, at least in this case of the font-face font family. In that case, CSS tokenizing isn't helpful. We need to encode for CSS.
Contributor
@sirreal oops! I edited my comment to address JSON syntax more, and I submitted that edit right before you submitted your comment. I have mixed feelings about using a JSON array. On one hand, sure, if we're committing to the JSON format, let's just embrace it completely. It's less awkward than mixing CSS and JSON syntaxes. On the other, I'm not sure how well that would generalize to other CSS rules with their own subsyntax in the value. If it wouldn't, we'd end up with some keys requiring a CSS string value and some keys requiring a JSON array. I'm also worried about things like
+
+        return '"' . strtr(
+            $font_family,
+            array(
+                /*
+                 * Normalize preprocessed whitespace.
+                 * https://www.w3.org/TR/css-syntax-3/#input-preprocessing
+                 */
+                "\r" => '\\A ',
+                "\f" => '\\A ',
+                "\r\n" => '\\A ',
+
+                /*
+                 * CSS Unicode escaping for problematic characters.
+                 * https://www.w3.org/TR/css-syntax-3/#escaping
+                 *
+                 * These characters are not required by CSS but may be problematic in WordPress:
+                 *
+                 * - "<" is replaced to prevent issues with KSES and other sanitization when
+                 *   printing CSS later.
+                 * - "," is replaced to prevent issues where multiple font family names may be
+                 *   split, sanitized, and joined on the `,` character (regardless of quoting
+                 *   or escaping).
+                 *
+                 * Note that the Unicode escape sequences are used rather than backslash-escaping.
+                 * This also helps to prevent issues with problematic characters.
+                 */
+                "\n" => '\\A ',
+                '\\' => '\\5C ',
+                ',' => '\\2C ',
Contributor
Is this one necessary? Not that it's wrong, but we're producing a quoted value anyway. It shouldn't derail any syntax parsing and, within a string, it's seen as a comma either way.
Member
Author
When printing CSS, the font system checks for
wordpress-develop/src/wp-includes/fonts/class-wp-font-utils.php, lines 67 to 76 in 4d43703
It seemed preferable to escape
Contributor
Ah, so it's a workaround for the deficiencies of another part of the system. Could we go the other way around and parse the value instead of exploding by
Member
Author
Core doesn't have a CSS parser, does it? This Unicode escaping is at worst harmless and at best better than requiring full-blown CSS parsing (even if that would be the most correct thing to do later).
Contributor
It doesn't today, but it could. CSS Processor is just a tokenizer, not a full-blown parser. Or you might want to reuse just the part of it that parses strings, just to avoid the blanket
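To make the comma concern in this thread concrete, here is a minimal sketch of a split, trim, and join pass that ignores quoting. The helper `naive_rejoin_font_families` is hypothetical and is not the code in class-wp-font-utils.php; it only assumes that some later step splits the value on `,`.

```php
<?php
// Hypothetical helper: splits on "," without regard to quoting, then rejoins.
function naive_rejoin_font_families( string $font_family ): string {
    $parts = array_map( 'trim', explode( ',', $font_family ) );
    return implode( ', ', $parts );
}

$quoted  = '"Foo,Bar"';      // One family whose name contains a literal comma.
$escaped = '"Foo\\2C Bar"';  // Same name with the comma written as a CSS escape.

echo naive_rejoin_font_families( $quoted ), "\n";  // The name is altered.
echo naive_rejoin_font_families( $escaped ), "\n"; // The name survives intact.
```

Under that assumption, the quoted name is cut in two and rejoined with an extra space, while the escaped form contains no literal comma and passes through unchanged.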
+                '"' => '\\22 ',
+                '<' => '\\3C ',
+            )
+        ) . '"';
+    }
+
     /**
      * Compiles the `src` into valid CSS.
      *
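The normalization added in the diff above can be sketched as a standalone function for experimentation outside WordPress. The name `sketch_normalize_font_family` is hypothetical, the `_doing_it_wrong()` notice is omitted, and applying the length check to both quote styles is an assumption about the intended behavior.

```php
<?php
// Standalone sketch of the escaping in the diff; not the PR's actual method.
function sketch_normalize_font_family( string $font_family ): string {
    $font_family = trim( $font_family, " \t\r\f\n" );
    $length      = strlen( $font_family );

    // Values already wrapped in matching quotes are returned untouched.
    if (
        $length > 1 &&
        in_array( $font_family[0], array( '"', "'" ), true ) &&
        $font_family[0] === $font_family[ $length - 1 ]
    ) {
        return $font_family;
    }

    // strtr() with an array prefers the longest matching key, so "\r\n" wins over "\r".
    return '"' . strtr(
        $font_family,
        array(
            "\r\n" => '\\A ',
            "\r"   => '\\A ',
            "\f"   => '\\A ',
            "\n"   => '\\A ',
            '\\'   => '\\5C ',
            ','    => '\\2C ',
            '"'    => '\\22 ',
            '<'    => '\\3C ',
        )
    ) . '"';
}

foreach ( array( 'Open Sans', 'A,B', 'Say "hi"', '"""' ) as $name ) {
    echo $name, ' => ', sketch_normalize_font_family( $name ), "\n";
}
```

In this sketch, plain names are quoted and escaped, while `"""` is treated as an already quoted value and returned unchanged.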
I'm not 100% satisfied with this. A broken family name could be used, like """, which would result in broken CSS output. The existing implementation has the same issue. This is likely good enough for anything but the most malicious font name.

The font family name """ could be safely used with this implementation as '"""' or preferably something like "\22\22\22".

One improvement here is that the string must start and end with a matching quote character, "…" or '…', in order to be treated as a quoted string. This allows fonts to contain those characters without issue, and they'll be normalized properly.

It would be nice if the system knew that plain strings were provided and all quoting and normalization of the font family name were handled by the system. I'm not sure that's possible while maintaining backwards compatibility.