Skip to content

converted fractional_part_rounding_thresholds to use an array of u32s#4719

Open
Cazadorro wants to merge 8 commits intofmtlib:masterfrom
Cazadorro:cuda_agnostic_fractional_part_rounding_thresholds
Open

converted fractional_part_rounding_thresholds to use an array of u32s#4719
Cazadorro wants to merge 8 commits intofmtlib:masterfrom
Cazadorro:cuda_agnostic_fractional_part_rounding_thresholds

Conversation

@Cazadorro
Copy link
Copy Markdown

converted fractional_part_rounding_thresholds to use an array of u32s instead of a unicode literal character array for compatibility with CUDA and non CUDA platforms, based on discussions in #4167

… u32s instead of a unicode literal character array for compatibliity with CUDA and non CUDA platforms
@Cazadorro
Copy link
Copy Markdown
Author

Cazadorro commented Mar 18, 2026

Note that this works with vcpkg as well (tested with overlay port), ran tests locally all appeared to pass (note however I'm on MSVC 2026). I did use the special CUDA compiler arguments for UTF8 in the separate project I used to test VCPKG compatibliity, dealing with that is a whole other can of worms I didn't want to include in this fix, I just wanted to make sure I could use the escape hatch with those compiler arguments as with out this change you can't use a fmt at all no matter what you do with vcpkg and cuda.

below are the arguments I used in the separate project

target_compile_options(cuda_experiments PRIVATE
        $<$<COMPILE_LANGUAGE:CUDA>:-Xcompiler=-utf-8 -Xcudafe=--unicode_source_kind=UTF-8 -Xcicc=--unicode_source_kind=UTF-8>)

@soumik15630m
Copy link
Copy Markdown
Contributor

soumik15630m commented Mar 24, 2026

The PR appears clean to me - Please run clang-format to fix the failing lint check...

Comment on lines +3154 to +3156
static constexpr uint32_t utf8_raw_rounding_thresholds[8] = {
0x9999999au, 0x828f5c29u, 0x80418938u, 0x80068db9,
0x8000a7c6u, 0x800010c7u, 0x800001aeu, 0x8000002bu};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be duplicated in every TU which is undesirable. Another workaround is to use a UTF-16 literal and reconstruct a constant from two 16-bit halves.

Also utf8_raw_ prefix is not needed (and incorrect) since these constants have nothing to do with UTF-8 or Unicode in general. We only use Unicode literals to workaround language limitations here.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I had to look up rules on C++ on how this works, I didn't realize static constexpr outside of a function basically has the exact opposite effect in terms of linkage that static constexpr inside a constexpr function has. However, I believe if I change this to an inline constexpr variable, it will fix this issue: https://stackoverflow.com/a/57407675 The address will be the same across translation units if I understand this correctly. Inline is implied with constexpr functions, but not with constexpr variables.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inline won't work because it's C++17.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason I thought the library required at least c++17, not c++11 I'll see about trying utf16 but if there's any weird platform limitations with that too I won't have the bandwidth to track that down. Also won't there be endian-ness concerns with using utf16 litterals? And at that point why even bother with those if you could just use a char array? Though I guess the previous version must have also had endian-ness issues as well if I read the documentation correctly.

Cazadorro and others added 5 commits March 28, 2026 14:12
…eclaration of thresholds to inline constexpr, this should mean same address for every translation unit see: https://stackoverflow.com/a/57407675
… on the compiler cannonizing char litteral arrays in order to avoid duplicating the array definition per translation unit
…x variables in fractional_part_rounding_thresholds, and applied clang format to function

// recombine as uint32, this should eliminate endian issues, as now we are
// shifting bytes as uint32 which should match platform endian.
return byte_3 << 24u | byte_2 << 16u | byte_1 << 8u | byte_0;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest using 16-bit (u) literals to simplify this.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really hesitant to do that, won't it actually make things more complicated? I'm still going to have to recombine byte-wise in the final step, and I'm also going to have to extract the bytes of the uint16s in order to do this. There's no way to guarantee that utf16 would match platform endian right? The standard does not appear guarantee utf16 big endian or little endian, nor does it appear to guarantee it matches platform endianness/ endianness of std::uint16_t.

It would look something like:


const uint32_t first_bytes =
      static_cast<uint16_t>(u"..."
                            u"..."[index]);
const uint32_t second_bytes =      
        static_cast<uint16_t>(u"..."
                              u"..."[index]);
                              
const uint32_t byte_3 = first_bytes >> 8; 
const uint32_t byte_2 = first_bytes & 0xFF; 

const uint32_t byte_1 = second_bytes >> 8; 
const uint32_t byte_0 = second_bytes & 0xFF; 

// recombine as uint32, this should eliminate endian issues, as now we are
// shifting bytes as uint32 which should match platform endian.
return byte_3 << 24u | byte_2 << 16u | byte_1 << 8u | byte_0;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants