
March 23, 2005 4:00 AM PST

Faster XML ahead?


(continued from previous page)

data into a smaller file but require a specific program to view the information. Several companies have already created binary formats to suit their different operating environments or industries.

For example, Expway has created ways of storing XML data in a binary format for the mobile phone and television industries.

In those industries, bulky XML text documents are not suitable, and as a result, XML is not being widely used, said Berjon, the chair of the W3C Working Group on Binary Characterization and a researcher at Expway. Fast performance is essential for sending data to certain devices, such as set-top boxes, because consumers will not tolerate the slow transmission of programming guides or other information, he said.

Similarly, the mobile communications area is ripe for smaller XML files, argue proponents.

Mobile devices are getting more powerful processors to read more data. But all that processing sucks the life out of batteries, which haven't kept pace with chips on the upgrade front, said John Schneider, chief technology officer of AgileDelta, which makes software for devices to compress and more efficiently handle XML data.

By using XML-based protocols, called Web services, mobile carriers could offer more interactive applications than are available today and help meet consumer desire for games, calendars and so on, he said.

Compelling applications "make a huge difference. It adds a lot of value--the more people can access, the more valuable the information becomes," said Schneider.

Meanwhile, Sun Microsystems launched its own project, called Fast InfoSet, which it says can boost the speed of an XML application anywhere from two to 10 times.

Binarians, unitarians and contrarians
It's estimated that there are at least a dozen binary XML formats already in use or in development today. If it does go ahead with an effort to speed up XML, the W3C will seek to create a single group-sanctioned binary format, rather than have several formats for specific purposes, said Quin.

"We hope if we publish something that meets the needs of many of


Comments (12)
It'll still be slow
by David Arbogast March 23, 2005 7:53 AM PST
Compared to proprietary, less-verbose data formats, XML will always be a slow dog, whether it is represented as plain text or in a binary format. XML is nice for collaboration and interoperability between working groups... but if your goal is pure speed, XML is not your best choice by a long shot.
XML Mania
by Andrew J Glina March 23, 2005 8:15 AM PST
I have been very surprised at all of the proposed XML usages. As you say, it is great for transferring data between applications, just as CSV was. But it is too bulky for actual database storage, for example.
Broken apps
by nzamparello March 23, 2005 9:25 AM PST
I'm just worried that this might affect both the OpenOffice project and RSS, since they both use XML. Plus, a binary format might mean increased development times, because now they might have to create an actual programming language, compiler, and an IDE to go along with it... I say just keep it the way it is. Interpreted is fine for most usage anyway. Look at Python and PHP!
you left out the grand daddy of all plain text languages...
by tocam27 March 27, 2005 5:04 PM PST
html... just think how screwed up the web would be if five years ago some bright spark had said "I know, html is too damn slow, let's make it binary"...
Use an existing solid standard - ASN.1
by March 23, 2005 9:49 AM PST
ASN.1 (the protocol behind SNMP) covers all the issues I have seen raised so far. It is a solid industry tested protocol with substantial support. It would be smart to use it instead of creating yet another binary exchange format and making a whole set of new mistakes in an effort to avoid the 'old mistakes'.

Cheers,
Kevin
Trim the format
by March 23, 2005 11:19 AM PST
The format, XML, is just too redundant. Get rid of the duplication and the extras and you would lose about 1/3 of the size and still have the same content. Parsing would be easier, as you would only need to flag events and not 'read' every tag...

For example, instead of:

<books>
<book type='hardback'>
<title>Atlas Shrugged</title>
<author>Ayn Rand</author>
<isbn id='1' rating='good'>0525934189</isbn>
</book>
</books>

Use:

books{
book(type='hardback'){
title{Atlas Shrugged}
author{Ayn Rand}
isbn(id=1,rating=good){0525934189}
}
}

Attributes are obvious because they sit inside (), and dropping the duplicated closing tag saves about as many bytes as the opening tag takes up.

This format works without any line feeds and can be translated into XML if needed.
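
The claim that the brace format can be translated into XML can be checked with a small converter. The following is only a sketch under assumptions the comment does not state: text content never contains braces, attribute values never contain commas or parentheses, and an element holds either text or child elements, not both. The function names are made up for illustration.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Element names: a letter or underscore, then word characters, dots or hyphens.
NAME = re.compile(r"\s*([A-Za-z_][\w.-]*)\s*")


def _matching_brace(src, open_pos):
    """Return the index of the '}' that closes the '{' at open_pos."""
    depth = 0
    for i in range(open_pos, len(src)):
        if src[i] == "{":
            depth += 1
        elif src[i] == "}":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced braces")


def _element(src, pos):
    """Parse one element starting at src[pos]; return (xml_text, next_pos)."""
    m = NAME.match(src, pos)
    if not m:
        raise ValueError(f"expected an element name at offset {pos}")
    name, pos = m.group(1), m.end()

    attrs = ""
    if src.startswith("(", pos):
        close = src.index(")", pos)
        for pair in src[pos + 1:close].split(","):
            if not pair.strip():
                continue
            key, _, value = pair.partition("=")
            value = value.strip().strip("'")  # quotes are optional in the example
            attrs += f" {key.strip()}={quoteattr(value)}"
        pos = close + 1
    while pos < len(src) and src[pos].isspace():
        pos += 1
    if not src.startswith("{", pos):
        raise ValueError(f"expected '{{' at offset {pos}")

    end = _matching_brace(src, pos)
    body = src[pos + 1:end]
    if "{" in body:
        # Assumption: a body containing '{' holds only child elements.
        children, cur = [], 0
        while body[cur:].strip():
            child, cur = _element(body, cur)
            children.append(child)
        inner = "".join(children)
    else:
        inner = escape(body.strip())
    return f"<{name}{attrs}>{inner}</{name}>", end + 1


def brace_to_xml(src):
    """Translate one brace-format document into an XML string."""
    xml_text, pos = _element(src, 0)
    if src[pos:].strip():
        raise ValueError("unexpected text after the root element")
    return xml_text


sample = """books{
book(type='hardback'){
title{Atlas Shrugged}
author{Ayn Rand}
isbn(id=1,rating=good){0525934189}
}
}"""
print(brace_to_xml(sample))
```

The sketch also shows where the format needs more rules than the comment gives: it has no way to escape a literal brace in text, and no answer for mixed content.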
Simple
by rpmyers1 March 23, 2005 12:59 PM PST
Content-Encoding: gzip

Already supported, and chances are most of the transfers occur over HTTP anyway.
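
As a rough illustration of what gzip does to repetitive XML, here is a minimal sketch using Python's standard gzip module. The sample document is invented for the demonstration, and real-world ratios will depend on the data.

```python
import gzip


def gzip_sizes(xml_bytes):
    """Return (raw_size, compressed_size) for an XML payload."""
    packed = gzip.compress(xml_bytes)
    # Round-trip check: the receiver must get the identical text back.
    assert gzip.decompress(packed) == xml_bytes
    return len(xml_bytes), len(packed)


# Invented sample: one record repeated many times, as in a typical listing.
record = (
    "<book type='hardback'><title>Atlas Shrugged</title>"
    "<author>Ayn Rand</author></book>"
)
document = ("<books>" + record * 200 + "</books>").encode("utf-8")

raw_size, packed_size = gzip_sizes(document)
print(f"raw: {raw_size} bytes, gzipped: {packed_size} bytes")
```

This only addresses bytes on the wire. The receiver still has to decompress and then parse the full text.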
WBXML is one option for J2ME devices
by Eric Giguere March 24, 2005 6:44 AM PST
Someday wireless networks will be fast enough to not make a difference, but not yet. I wrote a J2ME Tech Tip three years ago that showed how to use WBXML (a binary encoding of XML that is a WAP standard) to compress XML. You can find it on Sun's site at http://developers.sun.com/techtopics/mobility/midp/ttips/compressxml/. This is one possibility for cases where you have to integrate with an existing XML-based back end infrastructure.

That said, I often recommend a custom binary encoding between the J2ME app and a proxy for the app running in a servlet container. Doing Java on either end makes it easy to move data between the two; see my tip at http://developers.sun.com/techtopics/mobility/midp/ttips/clientserv/ for the details.

Eric
http://www.ericgiguere.com/
Processing, not transmission
by filker0 March 24, 2005 8:34 AM PST
Some have suggested that gzipping a pure-text XML document is sufficient. But the trouble is not so much the transmission time required for the XML text (though that is part of the problem in some cases) as the amount of processor power (read: CPU cycles and memory) required to parse, validate, and deserialize the data from an XML document. Gzip encoding actually increases the CPU cycles needed to deserialize an XML document.

I believe that if a binary XML format is standardized, there should be some standard means for the sending and receiving programs to negotiate what they support, so that the binary protocol will only be used when both sides support it. If different binary encodings need to be specified, there should be some way of identifying them, and if two programs don't support any common binary encodings, the text form should always be available as a fallback.

In some 'closed' applications, where the "client" is known to support only one format, a binary only XML engine can be used to save code space and processor cycles. On 'open' systems, most applications would probably have to support text XML parsing as well as zero or more binary encodings.

An advantage to having this standardized is that the XML documents will be in a form that can be decoded by any XML parser that conforms to the standard and supports the specific binary encoding scheme. There may be a few of these different encodings, but they should all be flexible enough that no software vendor feels obligated to create their own proprietary encoding because they think the standardized ones are inadequate.
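
The negotiation described above could look something like the following sketch. The encoding labels ("binary-a", "binary-b", "text-xml") and the function name are hypothetical; no standard identifiers are implied.

```python
# Hypothetical label for plain-text XML, which every party is assumed to support.
TEXT_XML = "text-xml"


def negotiate_encoding(sender_preferences, receiver_supported):
    """Pick the first encoding the sender prefers that the receiver supports.

    Falls back to text XML when the two sides share no binary encoding.
    """
    supported = set(receiver_supported)
    for encoding in sender_preferences:
        if encoding in supported:
            return encoding
    return TEXT_XML


# The sender prefers "binary-a" but the receiver only knows "binary-b".
print(negotiate_encoding(["binary-a", "binary-b"], ["binary-b", TEXT_XML]))
# No binary encoding in common, so the text form is used.
print(negotiate_encoding(["binary-a"], ["binary-c"]))
```

A 'closed' system that ships a single binary engine would skip this step, which is the code-space saving mentioned above.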
INCREMENTAL XML
by Tobias Nijmeijer March 25, 2005 3:02 AM PST
Hello everybody,

I was just thinking about XML and the speed-performance idea, and the impact unreadable binary files could have.

In my opinion, sending complete XML files is just like upgrading software by sending the complete package over and over again.

If it is about sending large files that contain, for a great part, the same information sent previously, why not just send PATCHES (or a nicer word: INCREMENTALS) to that file?

It needs communication protocols. It needs software adaptation. But it brings fabulous sending-speed performance.

It might even increase SECURITY by not sending all information every time.

Please add your comment to this idea.

T. Nijmeijer
Delft, Netherlands
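
A minimal sketch of the incremental idea follows. It assumes the data can be modelled as a flat mapping of keys to values and that the receiver still holds the previous version; a real XML document would need a tree-aware diff. All names and the sample data are invented.

```python
_MISSING = object()  # sentinel so a stored None is not confused with "absent"


def make_patch(old, new):
    """Describe how to turn `old` into `new` using only the differences."""
    return {
        "set": {k: v for k, v in new.items() if old.get(k, _MISSING) != v},
        "remove": [k for k in old if k not in new],
    }


def apply_patch(state, patch):
    """Return a new mapping with the patch applied; `state` is not modified."""
    result = dict(state)
    for key in patch["remove"]:
        result.pop(key, None)
    result.update(patch["set"])
    return result


# Invented programme-guide data: only one slot changes, one is dropped.
yesterday = {"19:00": "News", "20:00": "Film", "22:00": "Sport"}
today = {"19:00": "News", "20:00": "Documentary"}

patch = make_patch(yesterday, today)
print(patch)
print(apply_patch(yesterday, patch) == today)
```

The patch is only meaningful if the receiver's copy matches the sender's idea of it, so some version or checksum exchange would be needed in practice.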
Updates
by filker0 March 25, 2005 5:45 AM PST
Your suggestion (sending only patches) is quite valid for situations where both sides of the transaction (the sender and the receiver) know the state of the data, and this can be done under the current XML specification on an application specific basis, but this does not address the issue that binary encodings of XML are intended to address. It is an application design issue, not a protocol issue.

Though the bandwidth considerations are not inconsiderable, a larger part of the problem is the work required to parse, validate, and deserialize the data from its text form to the binary formats. Some embedded devices (where this is most likely to be best applied) like set-top boxes and mobile phones are limited in the CPU and memory available to handle this process and a standardized binary XML encoding/representation would reduce the headroom required on these devices to handle more sophisticated applications.

I have already seen CDATA and other unrestricted elements being used by some XML applications to pass what is effectively BASE64 encoded binary records around, and this is not decodable by anything other than another application that already knows how to interpret the binary structure. A standardized binary XML encoding would solve this as it would still identify the elements.
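
To make the CDATA point concrete, here is a small sketch of a binary record smuggled through XML as BASE64. The record layout (an unsigned 32-bit id, an unsigned 16-bit channel, a 32-bit float) and the element name are invented for illustration.

```python
import base64
import struct
import xml.etree.ElementTree as ET

# Hypothetical record layout, known only to the two applications involved.
LAYOUT = "<IHf"  # little-endian: uint32 id, uint16 channel, float32 value


def wrap_record(record_id, channel, value):
    """Pack a record and embed it in XML as BASE64 text inside CDATA."""
    blob = base64.b64encode(struct.pack(LAYOUT, record_id, channel, value))
    return f"<reading><![CDATA[{blob.decode('ascii')}]]></reading>"


def unwrap_record(xml_text):
    """Recover the fields; this works only if the caller knows LAYOUT."""
    payload = ET.fromstring(xml_text).text
    return struct.unpack(LAYOUT, base64.b64decode(payload))


doc = wrap_record(1234, 7, 19.5)
# A generic XML parser sees one opaque string, not three named fields.
print(ET.fromstring(doc).text)
print(unwrap_record(doc))
```

Nothing in the document itself tells a third party what the payload contains, which is the interoperability loss described above.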
And stupid CIO comment
by March 27, 2005 2:14 PM PST
Mr. Eric Newcomer, chief technology officer at Iona Technologies says... "reminiscent of the argument about a decade ago, when everyone said the World Wide Web is too slow and it will never take off."


Uhhh, well let's see then I guess then the VARIANT in VB is fast?

How about a universal datatype in SQL Server or ORACLE then?

Should we wait 10 years?

If we fix the problem now, it will be superfast later...

However, anyone who thinks that the W3C has anything to do with making an XML standard is dreaming!!!

W3C did not make HTML a standard as it's not even a business and is mostly clueless intellectual types who have no idea what business is.

A standard becomes a standard when everyone starts using it, NOT when W3C says, "This is the standard and you must conform."

It's amazing the crap I hear on these IT ezines... They get Mr. Preppy with the long-winded resume that says, "I worked for so-and-so but was and am NOT currently in the trenches actually doing the coding or actually using the software or service..."