How Binary Represents Text: ASCII, UTF-8 and Bits

Understand how computers turn letters into bits, the difference between ASCII and UTF-8, why bytes are 8 bits, and how to convert binary to text and back.

23 June 2026 | 4 min read | By Tools.Town Team | Fact Checked

Key Takeaways

  • Eight bits can represent 256 distinct values (2 to the 8th power), which was enough for the original character sets plus control codes
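  • ASCII covers 128 characters in a single byte; UTF-8 extends this to every Unicode character using one to four bytes
  • Binary text must be a multiple of 8 bits long, because each character is stored as whole bytes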

Everything is numbers

Computers don’t store letters — they store numbers, and those numbers are written in binary. A binary digit, or bit, is a single 0 or 1. String enough bits together and you can represent any number, and once you can represent numbers, you can represent text by agreeing on which number means which character.

That agreement is called a character encoding. It’s the lookup table that says “the number 72 means the capital letter H.” The Binary to Text Converter applies exactly this mapping in both directions: text to bits, and bits back to text.
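As a minimal sketch of that lookup, Python's built-in `ord` and `chr` functions map between a character and its number (strictly its Unicode code point, which matches ASCII for basic English letters). The variable names are illustrative only.

```python
# Look up the number assigned to a character, and the character for a number.
letter = "H"
number = ord(letter)     # character -> number
restored = chr(number)   # number -> character

print(number)    # 72
print(restored)  # H
```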

Why a byte is 8 bits

Bits are usually grouped into bytes of 8. The reason is historical but durable: 8 bits can hold 256 different values (2⁸ = 256), which was enough to cover the uppercase and lowercase alphabet, digits, punctuation, and a set of control codes with room to spare. The 8-bit byte became the standard unit computers use to address memory, and essentially all modern text encodings build on it.

That’s why valid binary text always comes in multiples of 8. The letter H is one byte — 01001000 — and the word Hi is two bytes — 01001000 01101001. If you try to decode a binary string whose length isn’t divisible by 8, something is missing or extra, which is why a good converter flags it instead of guessing.
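The arithmetic above can be checked with a short Python sketch. The helper name `is_whole_bytes` is hypothetical, introduced here only to illustrate the divisible-by-8 rule; it is not taken from any particular converter.

```python
# Number of distinct values an 8-bit byte can hold.
values_per_byte = 2 ** 8

# Write the byte value 72 (the letter H) as exactly 8 binary digits,
# padding with leading zeros.
h_bits = format(72, "08b")


def is_whole_bytes(bits: str) -> bool:
    """Return True if the bit string splits evenly into 8-bit bytes."""
    return len(bits) % 8 == 0


print(values_per_byte)                     # 256
print(h_bits)                              # 01001000
print(is_whole_bytes("0100100001101001"))  # True: 16 bits, two bytes
print(is_whole_bytes("010010000110100"))   # False: 15 bits, one is missing
```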

ASCII: the original 128

The first widely adopted encoding was ASCII (American Standard Code for Information Interchange). It assigns numbers 0 to 127 to a fixed set of characters:

  • 0–31 are control codes (newline, tab, carriage return, and so on).
  • 32–64 cover space, digits, and common punctuation.
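  • 65–90 are the uppercase letters A–Z.
  • 91–96 are further punctuation, such as square brackets and the underscore.
  • 97–122 are the lowercase letters a–z.
  • 123–126 are the braces, vertical bar, and tilde, and 127 is the delete (DEL) control code.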

So A is 65 (01000001), a is 97 (01100001), and the digit 0 is 48 (00110000). Because ASCII only needs numbers up to 127, every ASCII character fits in a single byte with the top bit unused.
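The three examples above can be reproduced with a short Python sketch. It also checks, under the assumption that each code is written as a full 8-bit byte, that the leftmost bit is 0 for every ASCII value.

```python
# Print each character with its ASCII number and 8-bit binary form.
for ch in ["A", "a", "0"]:
    code = ord(ch)
    print(ch, code, format(code, "08b"))
# A 65 01000001
# a 97 01100001
# 0 48 00110000

# Every ASCII code is in the range 0 to 127, so the top bit is never set.
all_top_bits_clear = all(format(code, "08b")[0] == "0" for code in range(128))
print(all_top_bits_clear)  # True
```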

ASCII worked well for English, but it had no room for accented letters, non-Latin scripts, or symbols beyond basic punctuation, let alone emoji. The world needed something bigger.

UTF-8: one encoding for every language

Unicode is the universal catalogue that gives a unique number (a “code point”) to every character in every writing system, plus emoji and symbols. UTF-8 is the most common way of storing those code points as bytes, and it’s the default for the web.

UTF-8’s clever trick is being variable length:

  • Code points 0–127 (all of ASCII) use a single byte, identical to ASCII. This means every ASCII file is already valid UTF-8.
  • Characters like é or ñ use two bytes.
  • Most remaining characters in everyday use, including Chinese, Japanese, and Korean text, use three bytes.
  • Emoji like 🚀 use four bytes.

This is why, in the converter, a string’s byte count can be larger than its character count. The word café is four characters but five bytes, because é takes two. The emoji 🚀 is one character but four bytes. A UTF-8-aware tool handles all of this automatically, which is what lets the Binary to Text Converter round-trip modern text without corrupting it.
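A small Python sketch makes the difference between character count and byte count concrete. The strings are written with escape sequences (`\u00e9` for é, `\U0001F680` for 🚀) so the example does not depend on how the source file itself is saved; the labels and variable names are illustrative.

```python
# Compare character count with UTF-8 byte count for a few sample strings.
samples = {
    "A": "A",
    "e with acute accent": "\u00e9",
    "rocket emoji": "\U0001F680",
    "cafe with accent": "caf\u00e9",
}

byte_counts = {}
for label, text in samples.items():
    byte_counts[label] = len(text.encode("utf-8"))
    print(f"{label}: {len(text)} character(s), {byte_counts[label]} byte(s)")

# Plain ASCII text produces the same bytes under ASCII and UTF-8.
same_bytes = "Hi".encode("ascii") == "Hi".encode("utf-8")
print(same_bytes)  # True
```

Note that Python's `len` on a string counts code points, which matches the "one character" intuition for these samples.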

Encoding: text to binary

Turning text into binary is a two-step process:

  1. Text → bytes. Run the text through UTF-8 to get the sequence of byte values.
  2. Bytes → bits. Write each byte as an 8-digit binary number, padding with leading zeros so every byte is exactly 8 bits.

For Hi: the bytes are 72 and 105, which become 01001000 and 01101001. Join them with spaces for readability — 01001000 01101001 — or run them together if you prefer.
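The two steps above can be sketched in Python as follows. The function name `text_to_binary` and its `separator` parameter are hypothetical, chosen for this illustration; this is one possible approach, not necessarily how the Binary to Text Converter is implemented.

```python
def text_to_binary(text: str, separator: str = " ") -> str:
    """Encode text as UTF-8, then write each byte as 8 binary digits."""
    data = text.encode("utf-8")                   # step 1: text -> bytes
    groups = [format(b, "08b") for b in data]     # step 2: bytes -> bits
    return separator.join(groups)


print(list("Hi".encode("utf-8")))  # [72, 105]
print(text_to_binary("Hi"))        # 01001000 01101001
print(text_to_binary("Hi", ""))    # 0100100001101001
```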

Decoding: binary to text

Decoding reverses it:

  1. Strip the noise. Remove spaces, tabs, newlines, and commas — they’re only there to make the bits readable.
  2. Group into bytes. Slice the bit string into chunks of 8.
  3. Bytes → text. Convert each chunk back to its numeric value, then run the whole byte sequence through UTF-8 decoding to recover the original characters.

The separator-tolerance in step one is important: 01001000 01101001, 0100100001101001, and a version split across multiple lines all decode to the same Hi.
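The three decoding steps can be sketched in Python like this. The function name `binary_to_text` is hypothetical, and raising `ValueError` on bad input is a design choice made for this example rather than a description of any specific tool.

```python
def binary_to_text(bits: str) -> str:
    """Decode a string of 0s and 1s (with optional separators) as UTF-8 text."""
    # Step 1: strip separators that only exist for readability.
    cleaned = "".join(ch for ch in bits if ch not in " \t\r\n,")
    if any(ch not in "01" for ch in cleaned):
        raise ValueError("input contains characters other than 0 and 1")
    if len(cleaned) % 8 != 0:
        raise ValueError("bit count is not a multiple of 8")

    # Step 2: slice the bit string into chunks of 8.
    chunks = [cleaned[i:i + 8] for i in range(0, len(cleaned), 8)]

    # Step 3: convert each chunk to a byte value, then decode as UTF-8.
    data = bytes(int(chunk, 2) for chunk in chunks)
    return data.decode("utf-8")


print(binary_to_text("01001000 01101001"))   # Hi
print(binary_to_text("0100100001101001"))    # Hi
print(binary_to_text("01001000\n01101001"))  # Hi
```

Flagging a length that is not divisible by 8, rather than padding or truncating, keeps the function from silently guessing at missing bits.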

Common questions and pitfalls

  • Spaces are optional. They help humans read binary but aren’t required for decoding.
  • Watch the length. If decoding fails, count your bits — the total must be divisible by 8.
  • Mind the encoding. An ASCII-only tool may reject or garble any character that UTF-8 stores in more than one byte. A UTF-8-aware converter avoids that.
  • It’s not encryption. Binary is just another representation of the same data. Anyone can decode it, so it provides no secrecy.
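The encoding pitfall can be illustrated with a brief Python sketch, using Python's strict `ascii` codec as a stand-in for an ASCII-only tool (an assumption made for this example; real tools may fail in other ways, such as printing wrong characters instead of raising an error).

```python
# "café" written with an escape so the bytes are unambiguous.
original = "caf\u00e9"
data = original.encode("utf-8")   # 5 bytes; the last two encode the accented e

# An ASCII-only decoder cannot interpret bytes above 127.
try:
    data.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# A UTF-8 decoder recovers the original text.
utf8_ok = data.decode("utf-8") == original

print(len(data))  # 5
print(ascii_ok)   # False
print(utf8_ok)    # True
```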

Try it yourself

The fastest way to build intuition is to convert a few words and watch the bytes appear. Open the Binary to Text Converter, type your name, and see each letter become eight bits — then paste the binary back to decode it. For converting between number bases like binary, hex, and decimal, use the Binary Converter — and read How a Binary Converter Works to see how those base conversions are done.

Frequently Asked Questions

Why is a byte 8 bits?
Eight bits can represent 256 distinct values (2 to the 8th power), which was enough for the original character sets plus control codes. The 8-bit byte became the standard unit of addressable memory and has stayed that way.
What's the difference between ASCII and UTF-8?
ASCII covers 128 characters in a single byte. UTF-8 extends this to every Unicode character using one to four bytes, while keeping the first 128 identical to ASCII for backwards compatibility.
Does binary length have to be a multiple of 8?
For text, yes. Each character is stored as whole bytes of 8 bits, so valid binary text always comes in groups of 8. A remainder means digits are missing or extra.
