Text to Unicode Converter

Convert text to Unicode code points and back to characters.

Convert each character of a string to its Unicode code point (U+XXXX), or paste code points and reassemble them into text. Handles emoji sequences, combining marks, and non-BMP characters (code points above U+FFFF) correctly.
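The two directions of the conversion can be sketched in a few lines of Python. This is an illustrative sketch, not the tool's actual implementation; the function names to_code_points and from_code_points are hypothetical, and the input format is assumed to be whitespace-separated tokens.

```python
def to_code_points(text: str) -> str:
    """Return the code points of text in U+XXXX notation, space-separated."""
    # ord() yields the integer code point; pad to at least four hex digits.
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def from_code_points(notation: str) -> str:
    """Rebuild a string from whitespace-separated code point tokens."""
    chars = []
    for token in notation.split():
        # Accept "U+1F600", "u+1f600", or bare hex such as "1F600".
        if token.upper().startswith("U+"):
            token = token[2:]
        chars.append(chr(int(token, 16)))
    return "".join(chars)


print(to_code_points("A\u00e9\U0001F600"))
print(from_code_points("U+0041 U+00E9 U+1F600"))
```

Python strings are sequences of code points, so non-BMP characters need no special handling here. In a language whose strings are UTF-16 code units, the loop would have to iterate by code point rather than by index.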

Common use cases: inspecting what a compound emoji is actually made of, generating Unicode escape sequences for JavaScript/CSS/Python code, debugging string-handling bugs around surrogate pairs, and looking up the formal name of a strange character in a pasted string.
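For the surrogate-pair case, it can help to see how a non-BMP code point is split into two UTF-16 code units. The sketch below follows the standard UTF-16 encoding arithmetic; the helper name utf16_code_units is hypothetical and the function expects a single-character string.

```python
from typing import List


def utf16_code_units(ch: str) -> List[str]:
    """Return the UTF-16 code units of one character as hex strings."""
    cp = ord(ch)
    if cp < 0x10000:
        # BMP characters fit in a single 16-bit code unit.
        return [f"{cp:04X}"]
    # Non-BMP: subtract 0x10000, then split the 20-bit remainder in half.
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits -> low surrogate
    return [f"{high:04X}", f"{low:04X}"]


print(utf16_code_units("\U0001F600"))  # ['D83D', 'DE00']
print(utf16_code_units("A"))           # ['0041']
```

This is why a JavaScript string holding a single U+1F600 reports a length of 2: the length counts code units, not code points.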

Frequently asked questions

What's a Unicode code point?
A unique number assigned to every character in the Unicode standard, which currently defines over 150,000 characters across every modern script, plus emojis, mathematical symbols, and obsolete writing systems. Code points are written as U+XXXX in hexadecimal, using four to six digits (U+0041 for A, U+1F600 for 😀).
Why are emojis sometimes multiple code points?
Compound emojis (👨‍👩‍👧, 🏳️‍🌈, 👍🏽) are sequences of several code points: base emojis joined by a zero-width joiner (U+200D), or a base emoji followed by a modifier such as a skin tone. Decoding back to characters reassembles the sequence. Counting characters in such strings is famously tricky, because one visible emoji can span many code points.
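To see what a compound emoji is made of, each code point can be listed with its formal name. This sketch uses Python's standard unicodedata module as a stand-in for whatever name lookup the tool performs; the describe helper is hypothetical.

```python
import unicodedata


def describe(text: str) -> list:
    """Pair each code point in text with its formal Unicode name."""
    rows = []
    for ch in text:
        # Fall back to a placeholder for code points without a name.
        name = unicodedata.name(ch, "<unnamed>")
        rows.append((f"U+{ord(ch):04X}", name))
    return rows


# Family emoji: man, ZWJ, woman, ZWJ, girl.
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"
for code_point, name in describe(family):
    print(code_point, name)

print(len(family))  # 5 code points, although it renders as one emoji
```

Counting what a reader perceives as one character requires grapheme cluster segmentation, which Python's standard library does not provide.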
What's the difference between a code point and a UTF-8 byte?
A code point is the abstract character identity. UTF-8 encodes that identity into one to four bytes for storage and transmission. U+1F600 (😀) is one code point, but four UTF-8 bytes (F0 9F 98 80).
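The one-to-four-byte range is easy to check by encoding a few characters. A minimal sketch, with utf8_hex as a hypothetical helper name:

```python
def utf8_hex(ch: str) -> str:
    """Return the UTF-8 encoding of ch as space-separated hex bytes."""
    return " ".join(f"{byte:02X}" for byte in ch.encode("utf-8"))


# One sample from each UTF-8 length class.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    print(f"U+{ord(ch):04X}", "->", utf8_hex(ch))
```

The four samples (A, é, € and 😀) encode to one, two, three and four bytes respectively.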
How can I use code points in code?
In JavaScript: \u{1F600} inside a string literal, or String.fromCodePoint(0x1F600). In CSS content: \1F600. In Python: chr(0x1F600). In HTML: &#x1F600;. The tool outputs each format so you can copy the one you need.
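Generating those four formats from a character is mostly string formatting. The sketch below is an assumption about how such output could be produced, not the tool's own code; escape_formats and its dictionary keys are hypothetical names.

```python
def escape_formats(ch: str) -> dict:
    """Return the escape syntax for one character in several languages."""
    cp = ord(ch)
    return {
        "javascript": f"\\u{{{cp:X}}}",  # \u{1F600}
        "css": f"\\{cp:X}",              # \1F600
        "python": f"chr(0x{cp:X})",      # chr(0x1F600)
        "html": f"&#x{cp:X};",           # &#x1F600;
    }


for language, escape in escape_formats("\U0001F600").items():
    print(f"{language}: {escape}")
```

One caveat for the CSS form: if the next character in the stylesheet is a hex digit or a space, a trailing space after the escape is needed to mark where the escape ends.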