  • Custom User Avatar

    Translation: Please be consistent and use the STL throughout. The function can take a std::string_view instead.

  • Custom User Avatar

    char* in C++? :/

  • Custom User Avatar

    The Unicode standard only uses 1- to 4-byte sequences, but the UTF-8 bit pattern technically allows up to 6 bytes per character. Once decoded, though, these longer sequences just won't resolve to any assigned character. The kata doesn't ask for the actual character's representation anywhere in the code, just the code point that it would represent. If you have further questions, please feel free to reply with them.

  • Custom User Avatar

    The number of "ones" or "trues" at the beginning of the byte tells how many bytes will be used. If there are three at the beginning, then the character consists of three bytes. This can range from 2 to 6 in a row before it becomes invalid.

    Since when is it "2 to 6"? AFAIK, it's 2 to 4.
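
    To make the quoted rule concrete, here is a minimal sketch of how the count of leading 1 bits maps to a sequence length. The names `leading_ones` and `sequence_length` are hypothetical and not part of the kata. The sketch assumes the description's 2-to-6 range; a decoder that follows the current Unicode limit would reject anything above 4.

    ```cpp
    #include <cstdint>

    // Hypothetical helper: counts the consecutive 1 bits at the top of a byte.
    constexpr int leading_ones(std::uint8_t byte) {
        int count = 0;
        while (count < 8 && (byte & (0x80u >> count))) ++count;
        return count;
    }

    // Hypothetical helper: sequence length implied by a lead byte, or 0 if the
    // byte cannot start a sequence.
    //   0 leading ones    -> single-byte (ASCII) character
    //   1 leading one     -> continuation byte, invalid as a lead byte
    //   2..6 leading ones -> that many bytes (assumption: the kata's range)
    //   7 or 8            -> invalid
    constexpr int sequence_length(std::uint8_t lead) {
        const int ones = leading_ones(lead);
        if (ones == 0) return 1;
        if (ones >= 2 && ones <= 6) return ones;
        return 0;
    }
    ```

    Changing the upper bound from 6 to 4 in `sequence_length` would give the stricter behaviour argued for above.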

  • Custom User Avatar

    Issue fixed! Tests now include 6 bytes and above, where empty arrays are returned on error. I have also added other randomized error checks, such as continuation byte errors.
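
    For anyone checking a solution against these cases, the following is a rough sketch of the behaviour described here, not the reference solution. The function name `decode_utf8`, its signature, and the `std::uint32_t` result type are assumptions. It accepts lead bytes with up to 6 leading ones, rejects 7 or more, rejects bad or missing continuation bytes, and returns an empty vector on any error. It does not check for overlong encodings or surrogates, and an empty input is indistinguishable from an error.

    ```cpp
    #include <cstddef>
    #include <cstdint>
    #include <string_view>
    #include <vector>

    // Hypothetical decoder: returns the decoded code points, or an empty
    // vector as soon as any error is found.
    std::vector<std::uint32_t> decode_utf8(std::string_view input) {
        std::vector<std::uint32_t> code_points;
        std::size_t i = 0;
        while (i < input.size()) {
            const auto lead = static_cast<std::uint8_t>(input[i]);

            // Count the leading 1 bits of the lead byte.
            int ones = 0;
            while (ones < 8 && (lead & (0x80u >> ones))) ++ones;

            // One leading 1 is a stray continuation byte; more than six is
            // outside the range this sketch accepts.
            if (ones == 1 || ones > 6) return {};

            const std::size_t length = (ones == 0) ? 1 : static_cast<std::size_t>(ones);
            if (length > input.size() - i) return {};  // truncated sequence

            // Keep only the payload bits that follow the length prefix.
            std::uint32_t value = lead & (0x7Fu >> ones);

            for (std::size_t k = 1; k < length; ++k) {
                const auto next = static_cast<std::uint8_t>(input[i + k]);
                if ((next & 0xC0u) != 0x80u) return {};  // continuation byte error
                value = (value << 6) | (next & 0x3Fu);
            }

            code_points.push_back(value);
            i += length;
        }
        return code_points;
    }
    ```

    Taking a `std::string_view` keeps the interface within the standard library and avoids passing a raw `char*`.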

  • Custom User Avatar

    Thanks for the update! At the time I was developing the tests, I was not thinking about anything higher than what the Unicode standard supports (up to 0x10FFFF, although that code point itself is not used). I will keep this thread updated, although it should be as simple as increasing a number in the test code.

  • Custom User Avatar

    This can range from 2 to 6 in a row before it becomes invalid.

    As far as I can tell, the tests do not cover the case of more than 6.