Underexposed

May 5, 2008 1:35 PM PDT

Google: Unicode conquers ASCII on the Web

by Stephen Shankland

I picture it happening this way. The Roman alphabet is on the run, pursued by a much larger army of Arabic characters with long scimitar-like ligatures, Chinese characters that look like throwing stars, and European peasant letters bristling with umlauts, cedillas, and tildes.

Unicode now is the most common character encoding method on the Web.

(Credit: Google)

Unicode has overtaken ASCII as the most popular character encoding scheme on the World Wide Web, Mark Davis, Google's senior international software architect, said in a blog post. Also vanquished at almost exactly the same time was the Western European encoding.

Unicode is a character encoding standard that gracefully accommodates dozens of languages as well as Roman characters with diacritical marks. ASCII, a tried-and-true, decades-old standard, is limited to 128 characters, or 256 in its 8-bit extensions, and has a hard time extending beyond the range of a century-old Remington typewriter.

Unicode vanquished ASCII and Western European within 10 days in December, Davis said.

"What's more impressive than simply overtaking them is the speed with which this happened," he added, pointing to a graph showing the meteoric rise of Unicode.

Google's a fan of Unicode Web sites. When the company processes data from Web sites, it first converts the data into Unicode if it isn't already encoded that way. That improves international search abilities.
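
Google's actual conversion pipeline isn't described in Davis' post, but the general idea of converting page data to Unicode before processing can be sketched in a few lines of Python. The function name `to_unicode` and the Latin-1 fallback are assumptions made for illustration only, not Google's method.

```python
# Hypothetical sketch: normalize raw page bytes to a Unicode string before
# any further processing. Not Google's actual pipeline.
def to_unicode(raw: bytes, declared_encoding: str = "utf-8") -> str:
    """Decode bytes with the declared encoding, falling back to Latin-1."""
    try:
        return raw.decode(declared_encoding)
    except (UnicodeDecodeError, LookupError):
        # Latin-1 maps every possible byte to a code point, so this decode
        # cannot fail, though it may yield wrong characters for other encodings.
        return raw.decode("latin-1")


western = "café".encode("latin-1")      # Western European bytes: b'caf\xe9'
print(to_unicode(western, "latin-1"))   # decoded with the declared encoding
print(to_unicode(western))              # invalid as UTF-8, so the fallback is used
```

Once everything is a Unicode string, the same search and text-handling code can run regardless of how the original page was encoded.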

"The continued rise in use of Unicode makes it even easier to do the processing for the many languages that we cover," he said.

Google just converted to Unicode 5.1, he added, "so people speaking languages such as Malayalam can now search for words containing the new characters."

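One disadvantage Unicode can have compared with ASCII, though, is size. In the UTF-16 and UTF-32 encodings, a Roman alphabet character takes at least twice as much memory to store because more bytes are used to enumerate Unicode's vastly larger range of symbols. UTF-8, the encoding most common on the Web, avoids that penalty for basic Roman characters by storing each one in a single byte, as ASCII does.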
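
How much memory a character takes depends on which Unicode encoding is used. The following is a minimal Python sketch that measures it for three sample characters; the helper name `encoded_size` is made up for this example.

```python
# Hypothetical helper: count the bytes one string occupies in a given encoding.
def encoded_size(text: str, encoding: str) -> int:
    return len(text.encode(encoding))


# The "-le" variants are used so no byte-order mark is added to the count.
for sample in ("A", "é", "中"):
    sizes = {enc: encoded_size(sample, enc)
             for enc in ("utf-8", "utf-16-le", "utf-32-le")}
    print(sample, sizes)

# A  {'utf-8': 1, 'utf-16-le': 2, 'utf-32-le': 4}
# é  {'utf-8': 2, 'utf-16-le': 2, 'utf-32-le': 4}
# 中 {'utf-8': 3, 'utf-16-le': 2, 'utf-32-le': 4}
```

A plain Roman letter costs one byte in UTF-8, the same as in ASCII, and two or four bytes in the fixed-width forms.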

August 1, 2007 4:39 PM PDT

Turn your world upside-down with Unicode

by Stephen Shankland

I've been a typography buff for years--I even reflexively groan when I see ads that use Helvetica--so this Flip Web site was just too entertaining for me to pass up.

The Flip Web site uses Unicode characters to flip Roman-alphabet text.

(Credit: David Faden)

It relies on a property of the Unicode character-encoding scheme, which has a vast array of letters and glyphs from non-Roman alphabets. Think of it as ASCII on steroids.

Unicode has enough characters that many standard Roman alphabet characters have upside-down equivalents. When you type letters into the upper box, they appear upside down in the lower box via Unicode translation, according to the site's designer, David Faden. Very clever, though Firefox's spell-check wasn't fooled, and capital letters and numbers don't work.
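
The article doesn't publish the site's character table, but the technique can be sketched in Python along these lines. The partial mapping below, the name `flip`, and the decision to reverse the string so it reads correctly when rotated are all assumptions for illustration, not Faden's code.

```python
# Illustrative partial table only; the Flip site's real mapping is not given.
FLIP_TABLE = str.maketrans({
    "a": "\u0250",  # LATIN SMALL LETTER TURNED A
    "e": "\u01dd",  # LATIN SMALL LETTER TURNED E
    "h": "\u0265",  # LATIN SMALL LETTER TURNED H
    "m": "\u026f",  # LATIN SMALL LETTER TURNED M
    "r": "\u0279",  # LATIN SMALL LETTER TURNED R
    "t": "\u0287",  # LATIN SMALL LETTER TURNED T
    "v": "\u028c",  # LATIN SMALL LETTER TURNED V
    "w": "\u028d",  # LATIN SMALL LETTER TURNED W
    "y": "\u028e",  # LATIN SMALL LETTER TURNED Y
    # Some plain letters already look like each other when rotated.
    "b": "q", "q": "b", "d": "p", "p": "d", "n": "u", "u": "n",
})


def flip(text: str) -> str:
    """Swap each mapped letter for its turned look-alike, then reverse the order."""
    return text.translate(FLIP_TABLE)[::-1]


print(flip("may"))  # prints the turned forms of y, a, m in that order
print(flip("X1"))   # unmapped characters pass through unchanged: 1X
```

Characters missing from the table, such as the capitals and digits here, simply pass through unflipped.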

"I can't claim any particular interest in typography," Faden said. "It is fun to dig through the Unicode charts to see what treasures are buried there, though."

About Underexposed

This blog sheds light on digital photography subjects such as cameras, photo editing, and Web sites. Shankland joined CNET News in 1998 after a five-year stint as a science writer. He's a lab rat who grew up in Los Alamos, N.M., and graduated from Harvard.

Contact Stephen at Stephen.Shankland@cnet.com
