Text to Unicode Converter
Convert text to Unicode code points and back to characters.
Convert each character of a string to its Unicode code point (U+XXXX), or paste code points and reassemble them back to text. Handles emoji sequences, combining marks, and non-BMP characters correctly.
Common use cases: inspecting what a compound emoji is actually made of, generating Unicode escape sequences for JavaScript/CSS/Python code, debugging string-handling bugs around surrogate pairs, and looking up the formal name of a strange character in a pasted string.
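The conversion described above can be sketched in a few lines of Python. This is a minimal illustration, not the converter's actual implementation: the helper names `text_to_code_points`, `code_points_to_text`, and `describe` are hypothetical, and the input format for decoding is assumed to be whitespace-separated `U+XXXX` tokens.

```python
import unicodedata


def text_to_code_points(text: str) -> list[str]:
    """Return one 'U+XXXX' string per code point in text."""
    # Python strings are sequences of code points, so iterating never
    # splits a non-BMP character into surrogate halves.
    return [f"U+{ord(ch):04X}" for ch in text]


def code_points_to_text(code_points: str) -> str:
    """Rebuild text from whitespace-separated 'U+XXXX' tokens (assumed format)."""
    chars = []
    for token in code_points.split():
        hex_digits = token.upper().removeprefix("U+")
        chars.append(chr(int(hex_digits, 16)))
    return "".join(chars)


def describe(text: str) -> list[tuple[str, str]]:
    """Pair each code point with its formal Unicode name, if it has one."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# A compound emoji: man, zero-width joiner, woman, zero-width joiner, girl.
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"
for code_point, name in describe(family):
    print(code_point, name)
```

Running this prints five lines, one per code point, which shows that the single visible family glyph is built from three emoji and two joiners. The sketch needs Python 3.9 or later for `str.removeprefix`.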
Text
Unicode Code Points
Frequently asked questions
What's a Unicode code point?
A unique number assigned to every character in the Unicode standard, which currently covers over 150,000 characters across every modern script, plus emojis, mathematical symbols, and obsolete writing systems. Code points are written as U+XXXX in hexadecimal.
Why are emojis sometimes multiple code points?
Compound emojis (👨‍👩‍👧, 🏳️‍🌈, 👍🏽) are sequences of multiple code points, either joined with a zero-width joiner (U+200D) or combined with a modifier such as a skin tone. Decoding back to characters reassembles the sequence; counting characters in such strings is famously tricky.
What's the difference between a code point and a UTF-8 byte?
A code point is the abstract character identity. UTF-8 encodes that identity into 1 to 4 bytes for storage and transmission. U+1F600 (😀) is one code point, but four UTF-8 bytes (F0 9F 98 80).
How can I use code points in code?
In JavaScript: \u{1F600} or String.fromCodePoint(0x1F600). In CSS content: \1F600. In Python: chr(0x1F600). In HTML: &#x1F600;. The tool outputs each format so you can copy the one you need.
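As a rough sketch of how those formats relate to one another, the Python function below derives each one from a single code point, along with the UTF-8 bytes mentioned earlier. The function name `escape_formats` and the dictionary keys are hypothetical and are not taken from the tool; the input is assumed to be a valid code point that is not a surrogate (surrogates cannot be encoded as UTF-8).

```python
def escape_formats(code_point: int) -> dict[str, str]:
    """Return common source-code spellings of one Unicode code point."""
    hex_digits = f"{code_point:X}"
    return {
        # JavaScript code point escape, usable inside a string literal.
        "javascript": f"\\u{{{hex_digits}}}",
        # CSS escape, e.g. for the `content` property.
        "css": f"\\{hex_digits}",
        # Python expression that evaluates to the character.
        "python": f"chr(0x{hex_digits})",
        # HTML hexadecimal numeric character reference.
        "html": f"&#x{hex_digits};",
        # UTF-8 encoding as space-separated uppercase hex bytes.
        "utf8_bytes": chr(code_point).encode("utf-8").hex(" ").upper(),
    }


for label, value in escape_formats(0x1F600).items():
    print(f"{label}: {value}")
```

For U+1F600 the `utf8_bytes` entry comes out as `F0 9F 98 80`, matching the four bytes quoted in the answer above.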