utf

UTF

[Unicode transformation format]

any sequential data representation of Unicode text.

UTF-8

a standard representation of Unicode text as a byte stream, using 8-bit code units. In UTF-8, ASCII characters have the same representation that they have in ASCII, and other characters are represented in sequences of 2 to 4 bytes without using byte values that are used in ASCII.
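
As a rough illustration of the UTF-8 definition, the following Python sketch encodes a few characters and checks the properties described. The sample characters and the names `samples` and `encoded` are chosen here for illustration only and are not part of the glossary entry.

```python
# Sample characters (an assumption for illustration):
# U+0041 (ASCII), U+00E9, U+20AC, U+1F600.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

# Map each character to its UTF-8 byte sequence.
encoded = {ch: ch.encode("utf-8") for ch in samples}

for ch, data in encoded.items():
    print(f"U+{ord(ch):04X} -> {data.hex()} ({len(data)} byte(s))")

# The ASCII character keeps its ASCII byte value.
assert encoded["A"] == b"A"

# Every byte of a multi-byte sequence is 0x80 or above,
# so none of them collides with an ASCII byte value.
assert all(b >= 0x80 for ch in samples[1:] for b in encoded[ch])
```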

UTF-16

1. a standard representation of Unicode text as a sequence of 16-bit code units. In UTF-16, all commonly used characters (those in the Basic Multilingual Plane, U+0000 to U+FFFF) are encoded in a single 16-bit code unit, and additional characters are represented as a pair of 16-bit code units (a surrogate pair).

2. an encoding scheme for UTF-16 that represents each 16-bit code unit in 2 bytes in big-endian sequence (with the most significant byte first) unless a byte order mark indicates otherwise.
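
The first sense above (text as a sequence of 16-bit code units) can be sketched in Python as follows. The helper name `utf16_code_units` and the sample characters are hypothetical, introduced only for this example; the helper goes through the big-endian byte form simply as a convenient way to recover the code units.

```python
import struct

def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of integers.

    Hypothetical helper: encodes to big-endian bytes, then reads
    the bytes back two at a time as unsigned 16-bit values.
    """
    data = text.encode("utf-16-be")
    return list(struct.unpack(f">{len(data) // 2}H", data))

# U+20AC fits in a single 16-bit code unit.
bmp_units = utf16_code_units("\u20ac")

# U+1F600 lies above U+FFFF, so it takes two 16-bit code units.
supp_units = utf16_code_units("\U0001F600")

print([hex(u) for u in bmp_units])
print([hex(u) for u in supp_units])
```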

UTF-16BE

a standard encoding scheme for UTF-16 that represents each 16-bit code unit in 2 bytes in big-endian sequence (with the most significant byte first).

UTF-16LE

a standard encoding scheme for UTF-16 that represents each 16-bit code unit in 2 bytes in little-endian sequence (with the least significant byte first).
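
A minimal Python sketch of the difference between the two byte orders, and of how a byte order mark selects between them. The sample string is an assumption for illustration; `utf-16-be`, `utf-16-le`, and `utf-16` are Python's codec names for these encoding schemes, and the `codecs.BOM_UTF16_*` constants come from the standard library.

```python
import codecs

# Sample text (an assumption for illustration): U+0041 followed by U+20AC.
text = "A\u20ac"

# Same code units, opposite byte order within each unit.
be = text.encode("utf-16-be")  # most significant byte first
le = text.encode("utf-16-le")  # least significant byte first
print(be.hex(), le.hex())

# With a leading byte order mark, the generic "utf-16" codec
# detects the byte order and strips the mark while decoding.
from_be = (codecs.BOM_UTF16_BE + be).decode("utf-16")
from_le = (codecs.BOM_UTF16_LE + le).decode("utf-16")

assert from_be == text
assert from_le == text
```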
