converted fractional_part_rounding_thresholds to use an array of u32s#4719
converted fractional_part_rounding_thresholds to use an array of u32s#4719Cazadorro wants to merge 8 commits intofmtlib:masterfrom
Conversation
… u32s instead of a unicode literal character array for compatibliity with CUDA and non CUDA platforms
|
Note that this works with vcpkg as well (tested with overlay port), ran tests locally all appeared to pass (note however I'm on MSVC 2026). I did use the special CUDA compiler arguments for UTF8 in the separate project I used to test VCPKG compatibliity, dealing with that is a whole other can of worms I didn't want to include in this fix, I just wanted to make sure I could use the escape hatch with those compiler arguments as with out this change you can't use a fmt at all no matter what you do with vcpkg and cuda. below are the arguments I used in the separate project |
|
The PR appears clean to me - Please run |
…gnostic_fractional_part_rounding_thresholds
include/fmt/format.h
Outdated
| static constexpr uint32_t utf8_raw_rounding_thresholds[8] = { | ||
| 0x9999999au, 0x828f5c29u, 0x80418938u, 0x80068db9, | ||
| 0x8000a7c6u, 0x800010c7u, 0x800001aeu, 0x8000002bu}; |
There was a problem hiding this comment.
This will be duplicated in every TU which is undesirable. Another workaround is to use a UTF-16 literal and reconstruct a constant from two 16-bit halves.
Also utf8_raw_ prefix is not needed (and incorrect) since these constants have nothing to do with UTF-8 or Unicode in general. We only use Unicode literals to workaround language limitations here.
There was a problem hiding this comment.
Okay, I had to look up rules on C++ on how this works, I didn't realize static constexpr outside of a function basically has the exact opposite effect in terms of linkage that static constexpr inside a constexpr function has. However, I believe if I change this to an inline constexpr variable, it will fix this issue: https://stackoverflow.com/a/57407675 The address will be the same across translation units if I understand this correctly. Inline is implied with constexpr functions, but not with constexpr variables.
There was a problem hiding this comment.
inline won't work because it's C++17.
There was a problem hiding this comment.
For some reason I thought the library required at least c++17, not c++11 I'll see about trying utf16 but if there's any weird platform limitations with that too I won't have the bandwidth to track that down. Also won't there be endian-ness concerns with using utf16 litterals? And at that point why even bother with those if you could just use a char array? Though I guess the previous version must have also had endian-ness issues as well if I read the documentation correctly.
…eclaration of thresholds to inline constexpr, this should mean same address for every translation unit see: https://stackoverflow.com/a/57407675
… on the compiler cannonizing char litteral arrays in order to avoid duplicating the array definition per translation unit
…x variables in fractional_part_rounding_thresholds, and applied clang format to function
|
|
||
| // recombine as uint32, this should eliminate endian issues, as now we are | ||
| // shifting bytes as uint32 which should match platform endian. | ||
| return byte_3 << 24u | byte_2 << 16u | byte_1 << 8u | byte_0; |
There was a problem hiding this comment.
I suggest using 16-bit (u) literals to simplify this.
There was a problem hiding this comment.
I'm really hesitant to do that, won't it actually make things more complicated? I'm still going to have to recombine byte-wise in the final step, and I'm also going to have to extract the bytes of the uint16s in order to do this. There's no way to guarantee that utf16 would match platform endian right? The standard does not appear guarantee utf16 big endian or little endian, nor does it appear to guarantee it matches platform endianness/ endianness of std::uint16_t.
It would look something like:
const uint32_t first_bytes =
static_cast<uint16_t>(u"..."
u"..."[index]);
const uint32_t second_bytes =
static_cast<uint16_t>(u"..."
u"..."[index]);
const uint32_t byte_3 = first_bytes >> 8;
const uint32_t byte_2 = first_bytes & 0xFF;
const uint32_t byte_1 = second_bytes >> 8;
const uint32_t byte_0 = second_bytes & 0xFF;
// recombine as uint32, this should eliminate endian issues, as now we are
// shifting bytes as uint32 which should match platform endian.
return byte_3 << 24u | byte_2 << 16u | byte_1 << 8u | byte_0;
converted fractional_part_rounding_thresholds to use an array of u32s instead of a unicode literal character array for compatibility with CUDA and non CUDA platforms, based on discussions in #4167