Text to Unicode Converter

Convert text to Unicode code points and back to characters.

Convert each character of a string to its Unicode code point (U+XXXX), or paste code points and reassemble them back to text. Handles emoji sequences, combining marks, and non-BMP characters correctly.
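The round trip described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation; the function names text_to_code_points and code_points_to_text are hypothetical, and the parser assumes input written in the U+XXXX form.

```python
import re


def text_to_code_points(text: str) -> list[str]:
    """Return one U+XXXX label per code point, padded to at least 4 hex digits."""
    return [f"U+{ord(ch):04X}" for ch in text]


def code_points_to_text(labels: str) -> str:
    """Rebuild a string from U+XXXX labels separated by spaces, commas, or newlines.

    chr() raises ValueError for values above 0x10FFFF, the top of the Unicode range.
    """
    hex_values = re.findall(r"U\+([0-9A-Fa-f]{1,6})", labels)
    return "".join(chr(int(h, 16)) for h in hex_values)


labels = text_to_code_points("A\u00E9\U0001F600")
print(" ".join(labels))                        # U+0041 U+00E9 U+1F600
print(code_points_to_text(" ".join(labels)))   # prints the original three characters
```

Python strings are sequences of code points, so non-BMP characters such as U+1F600 come through as a single item without any surrogate-pair handling.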

Common use cases: inspecting what a compound emoji is actually made of, generating Unicode escape sequences for JavaScript/CSS/Python code, debugging string-handling bugs around surrogate pairs, and looking up the formal name of a strange character in a pasted string.

Frequently asked questions

What's a Unicode code point?

A code point is the unique number Unicode assigns to each character. The standard currently defines over 150,000 characters, covering modern scripts plus emojis, mathematical symbols, and historic writing systems. Code points are written as U+ followed by four to six hexadecimal digits, such as U+0041 or U+1F600, and range from U+0000 to U+10FFFF.

Why are emojis sometimes multiple code points?

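Compound emojis are sequences of several code points. Some are joined with a zero-width joiner (U+200D), as in 👨‍👩‍👧 and 🏳️‍🌈; others are a base emoji followed by a modifier, as in 👍🏽 with its skin-tone modifier. Decoding back to characters reassembles the sequence. Counting characters in such strings is tricky because one visible emoji can span several code points.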
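To see what a compound emoji is made of, one option is to list each code point with its formal Unicode name. The sketch below uses Python's standard unicodedata module; the helper name describe is hypothetical, and the names returned depend on the Unicode version bundled with the Python build.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str]]:
    """Pair each code point's U+XXXX label with its formal Unicode name."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# Family emoji: man, ZWJ, woman, ZWJ, girl
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"
# Thumbs up followed by a skin-tone modifier (no joiner involved)
thumbs = "\U0001F44D\U0001F3FD"

print(len(family))  # 5 code points for one visible emoji
for label, name in describe(family) + describe(thumbs):
    print(label, name)
```

The family emoji renders as one glyph but reports a length of 5, which is why naive character counts go wrong on such strings.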

What's the difference between a code point and a UTF-8 byte?

A code point is the abstract character identity. UTF-8 encodes that identity into 1–4 bytes for storage and transmission. U+1F600 (😀) is one code point, but four UTF-8 bytes (F0 9F 98 80).
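The gap between a code point and its UTF-8 bytes is easy to check directly. The following sketch encodes a few sample characters and prints their bytes in hex; the helper name utf8_hex is hypothetical.

```python
def utf8_hex(ch: str) -> str:
    """Return the UTF-8 encoding of a string as space-separated hex bytes."""
    return " ".join(f"{b:02X}" for b in ch.encode("utf-8"))


# One sample for each UTF-8 length, from 1 byte to 4 bytes
for ch in ["A", "\u00E9", "\u20AC", "\U0001F600"]:
    print(f"U+{ord(ch):04X}", "->", utf8_hex(ch))
# U+0041 -> 41
# U+00E9 -> C3 A9
# U+20AC -> E2 82 AC
# U+1F600 -> F0 9F 98 80
```

Each input here is a single code point, yet the byte count grows from one to four as the code point value increases.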

How can I use code points in code?

In JavaScript: \u{1F600} or String.fromCodePoint(0x1F600). In CSS content: \1F600. In Python: chr(0x1F600). In HTML: &#x1F600;. The tool outputs each format so you can copy the one you need.
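Generating these formats from a code point is mostly string formatting. The sketch below is one way to do it and is not taken from the tool itself; the function name escapes and the dictionary keys are hypothetical.

```python
def escapes(cp: int) -> dict[str, str]:
    """Format one code point in the notations used by several languages."""
    hex_upper = f"{cp:X}"
    return {
        "javascript": f"\\u{{{hex_upper}}}",  # \u{1F600}
        "css": f"\\{hex_upper}",              # \1F600
        "python": f"chr(0x{hex_upper})",      # chr(0x1F600)
        "html": f"&#x{hex_upper};",           # hexadecimal character reference
    }


for language, text in escapes(0x1F600).items():
    print(f"{language}: {text}")
```

In CSS, an escape followed directly by another hex digit or a space needs a trailing space to terminate it, so a fuller version would account for what comes next in the string.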

Other Encoders

Base64 Encoder/Decoder

Encode and decode Base64 strings

URL Encoder/Decoder

Encode and decode URLs and query parameters

HTML Entities

Encode and decode HTML entities

Text to Binary

Convert text to binary and vice versa

Text to NATO Alphabet

Convert text to NATO phonetic alphabet