When attempting to encode a string containing a lone surrogate (e.g., \ud83d) using the utf-16 codec and the surrogatepass error handler, GraalPy raises a UnicodeEncodeError.
def main():
    text = "\ud83d"
    text.encode("utf-16", "surrogatepass").decode("utf-16", "replace")

if __name__ == "__main__":
    main()
On graalpy-25.0.1 this throws UnicodeEncodeError: 'utf_16' codec can't encode character '\ud83d' in position 0: malformed input
while the same script runs normally on Python.
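For reference, here is a minimal sketch of the behaviour I would expect, based on what CPython 3 does. It uses the explicit "utf-16-le" codec instead of "utf-16" so that the bytes do not depend on platform byte order or include a BOM. That swap and the helper name `roundtrip` are my own choices for illustration and are not part of the original reproducer.

```python
# Expected behaviour as observed on CPython 3 (assumption: GraalPy should match it).
LONE_SURROGATE = "\ud83d"  # high surrogate with no low surrogate following it


def roundtrip(text: str, codec: str = "utf-16-le") -> str:
    """Hypothetical helper: encode and decode with the surrogatepass handler."""
    return text.encode(codec, "surrogatepass").decode(codec, "surrogatepass")


# surrogatepass writes the surrogate code unit as-is instead of raising.
encoded = LONE_SURROGATE.encode("utf-16-le", "surrogatepass")

# Decoding with surrogatepass should give back the original string.
restored = roundtrip(LONE_SURROGATE)

# Decoding with "replace" should substitute U+FFFD for the unpaired surrogate.
replaced = encoded.decode("utf-16-le", "replace")

if __name__ == "__main__":
    print(encoded)
    print(restored == LONE_SURROGATE)
    print(ascii(replaced))
```

If the encode step raises UnicodeEncodeError on GraalPy, none of the later lines run, which matches the failure described above.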
I encountered this issue while trying to build a GraalPy wheel for tiktoken.