converted fractional_part_rounding_thresholds to use an array of u32s by Cazadorro · Pull Request #4719 · fmtlib/fmt

Cazadorro · 2026-03-18T21:30:48Z

converted fractional_part_rounding_thresholds to use an array of u32s instead of a unicode literal character array for compatibility with CUDA and non CUDA platforms, based on discussions in #4167

… u32s instead of a unicode literal character array for compatibliity with CUDA and non CUDA platforms

Cazadorro · 2026-03-18T21:34:30Z

Note that this works with vcpkg as well (tested with overlay port), ran tests locally all appeared to pass (note however I'm on MSVC 2026). I did use the special CUDA compiler arguments for UTF8 in the separate project I used to test VCPKG compatibliity, dealing with that is a whole other can of worms I didn't want to include in this fix, I just wanted to make sure I could use the escape hatch with those compiler arguments as with out this change you can't use a fmt at all no matter what you do with vcpkg and cuda.

below are the arguments I used in the separate project

target_compile_options(cuda_experiments PRIVATE
        $<$<COMPILE_LANGUAGE:CUDA>:-Xcompiler=-utf-8 -Xcudafe=--unicode_source_kind=UTF-8 -Xcicc=--unicode_source_kind=UTF-8>)

soumik15630m · 2026-03-24T17:27:05Z

The PR appears clean to me - Please run clang-format to fix the failing lint check...

…gnostic_fractional_part_rounding_thresholds

vitaut · 2026-03-28T18:09:09Z

include/fmt/format.h

+static constexpr uint32_t utf8_raw_rounding_thresholds[8] = {
+    0x9999999au, 0x828f5c29u, 0x80418938u, 0x80068db9,
+    0x8000a7c6u, 0x800010c7u, 0x800001aeu, 0x8000002bu};


This will be duplicated in every TU which is undesirable. Another workaround is to use a UTF-16 literal and reconstruct a constant from two 16-bit halves.

Also utf8_raw_ prefix is not needed (and incorrect) since these constants have nothing to do with UTF-8 or Unicode in general. We only use Unicode literals to workaround language limitations here.

Okay, I had to look up rules on C++ on how this works, I didn't realize static constexpr outside of a function basically has the exact opposite effect in terms of linkage that static constexpr inside a constexpr function has. However, I believe if I change this to an inline constexpr variable, it will fix this issue: https://stackoverflow.com/a/57407675 The address will be the same across translation units if I understand this correctly. Inline is implied with constexpr functions, but not with constexpr variables.

inline won't work because it's C++17.

For some reason I thought the library required at least c++17, not c++11 I'll see about trying utf16 but if there's any weird platform limitations with that too I won't have the bandwidth to track that down. Also won't there be endian-ness concerns with using utf16 litterals? And at that point why even bother with those if you could just use a char array? Though I guess the previous version must have also had endian-ness issues as well if I read the documentation correctly.

…eclaration of thresholds to inline constexpr, this should mean same address for every translation unit see: https://stackoverflow.com/a/57407675

… on the compiler cannonizing char litteral arrays in order to avoid duplicating the array definition per translation unit

…x variables in fractional_part_rounding_thresholds, and applied clang format to function

…ing_thresholds

vitaut · 2026-03-30T18:25:56Z

include/fmt/format.h

+
+  // recombine as uint32, this should eliminate endian issues, as now we are
+  // shifting bytes as uint32 which should match platform endian.
+  return byte_3 << 24u | byte_2 << 16u | byte_1 << 8u | byte_0;


I suggest using 16-bit (u) literals to simplify this.

I'm really hesitant to do that, won't it actually make things more complicated? I'm still going to have to recombine byte-wise in the final step, and I'm also going to have to extract the bytes of the uint16s in order to do this. There's no way to guarantee that utf16 would match platform endian right? The standard does not appear guarantee utf16 big endian or little endian, nor does it appear to guarantee it matches platform endianness/ endianness of std::uint16_t.

It would look something like:

const uint32_t first_bytes = static_cast<uint16_t>(u"..." u"..."[index]); const uint32_t second_bytes = static_cast<uint16_t>(u"..." u"..."[index]); const uint32_t byte_3 = first_bytes >> 8; const uint32_t byte_2 = first_bytes & 0xFF; const uint32_t byte_1 = second_bytes >> 8; const uint32_t byte_0 = second_bytes & 0xFF; // recombine as uint32, this should eliminate endian issues, as now we are // shifting bytes as uint32 which should match platform endian. return byte_3 << 24u | byte_2 << 16u | byte_1 << 8u | byte_0;

converted formafractional_part_rounding_thresholds to use an array of…

3a3c645

… u32s instead of a unicode literal character array for compatibliity with CUDA and non CUDA platforms

Cazadorro added 2 commits March 24, 2026 18:15

fixed clang format issue

9661f89

Merge branch 'master' of https://github.com/Cazadorro/fmt into cuda_a…

34fcab5

…gnostic_fractional_part_rounding_thresholds

vitaut requested changes Mar 28, 2026

View reviewed changes

Cazadorro and others added 5 commits March 28, 2026 14:12

changed naming of rounding thresholds, and changed static constexpr d…

432f589

…eclaration of thresholds to inline constexpr, this should mean same address for every translation unit see: https://stackoverflow.com/a/57407675

Removed inline constexpr as it doesn't work in C++11, instead relying…

dedbeee

… on the compiler cannonizing char litteral arrays in order to avoid duplicating the array definition per translation unit

fixed spelling errors in recent comments

38d164d

Fixed minor formatting issues in comments, added const to local byte_…

60fec10

…x variables in fractional_part_rounding_thresholds, and applied clang format to function

Merge branch 'fmtlib:master' into cuda_agnostic_fractional_part_round…

5d044de

…ing_thresholds

vitaut requested changes Mar 30, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

converted fractional_part_rounding_thresholds to use an array of u32s#4719

converted fractional_part_rounding_thresholds to use an array of u32s#4719
Cazadorro wants to merge 8 commits intofmtlib:masterfrom
Cazadorro:cuda_agnostic_fractional_part_rounding_thresholds

Cazadorro commented Mar 18, 2026

Uh oh!

Cazadorro commented Mar 18, 2026 •

edited

Loading

Uh oh!

soumik15630m commented Mar 24, 2026 •

edited

Loading

Uh oh!

vitaut Mar 28, 2026

Uh oh!

Cazadorro Mar 28, 2026

Uh oh!

vitaut Mar 28, 2026

Uh oh!

Cazadorro Mar 28, 2026

Uh oh!

vitaut Mar 30, 2026

Uh oh!

Cazadorro Mar 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

Cazadorro commented Mar 18, 2026

Uh oh!

Cazadorro commented Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

soumik15630m commented Mar 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vitaut Mar 28, 2026

Choose a reason for hiding this comment

Uh oh!

Cazadorro Mar 28, 2026

Choose a reason for hiding this comment

Uh oh!

vitaut Mar 28, 2026

Choose a reason for hiding this comment

Uh oh!

Cazadorro Mar 28, 2026

Choose a reason for hiding this comment

Uh oh!

vitaut Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

Cazadorro Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Cazadorro commented Mar 18, 2026 •

edited

Loading

soumik15630m commented Mar 24, 2026 •

edited

Loading