March 23, 2005 4:00 AM PST

Faster XML ahead?

The Net's top standards body is getting closer to speeding up XML-based software, a move that could benefit everyone from cell phone carriers to television broadcasters to the military.

But critics say the group's favored approach could cause major compatibility problems, among other things.

XML is fast becoming a widely used way of formatting and saving business documents such as purchase orders. But for certain applications--sending data to set-top boxes, for instance, and offering interactive programs on cell phones--representing data using XML is simply too bulky, say proponents of more efficient XML.

Bottom line:
The possibility of the World Wide Web Consortium pursuing more efficient XML through a binary, rather than text, format is causing concerns over interoperability and questions about the future direction of XML.

"XML has been a victim of its success," said Robin Berjon, of standards group the World Wide Web Consortium. "We've started using it in all kinds of situations that it wasn't designed for."

If XML were zippier, some say, cell phone companies could meet consumer demand for more complex programs. The Air Force, too, has expressed interest in using speedier XML formats for embedded computing applications, such as those found in fighter jets.

A W3C committee recently recommended that the group address the problem by moving away from the traditional way of saving XML data--in text format--and instead create a standard for a binary format. W3C working group recommendations are generally taken up as formal standards efforts, which means the group is one step closer to a major change in the XML standard.

The recommendation still has to be approved by the W3C's Advisory Committee and the W3C's director. But a vote to move forward with a binary XML standard could happen late this summer, said Liam Quin, the XML activity lead at the W3C.

Binarians and contrarians
The issue, though, is already causing controversy among the XML cognoscenti, who worry that major changes to the specification could cause compatibility headaches and would face significant hurdles in getting adopted.

Attendees of a February meeting in Boston argued for different technical approaches to speeding up XML. Some questioned the need to take up a binary XML effort at all, according to people present at the meeting.

We shouldn't "mess around with (XML) just for a short-term fix (when) in the long term, the industry is going to fix that problem naturally," said Eric Newcomer, chief technology officer at Iona Technologies and an attendee of the meeting. Newcomer says current XML performance "is not all that bad" and that the controversy is "reminiscent of the argument about a decade ago, when everyone said the World Wide Web is too slow and it will never take off."

Right now, all the information in an XML document, such as a name and address, is represented as text. Binary formats compress the XML

12 comments

It'll still be slow
Compared to proprietary, less-verbose data formats, XML will always be a slow dog, whether it is represented as plain text or in a binary format. XML is nice for collaboration and interoperability between working groups, but if your goal is pure speed, XML is not your best choice by a long shot.
Posted by David Arbogast (1712 comments )
XML Mania
I have been very surprised at all of the proposed XML usages. As you say, it is great for transferring data between applications, just as CSV was. But it is too bulky for actual storage of databases, for example.
Posted by Andrew J Glina (1673 comments )
Broken apps
I'm just worried that this might affect both the OpenOffice project and RSS, since they both use XML. Plus, a binary format might mean increased development times, because now they might have to create an actual programming language, compiler, and an IDE to go along with it. I say just keep it the way it is. Interpreted is fine for most usage anyway. Look at Python and PHP!
Posted by nzamparello (60 comments )
you left out the grand daddy of all plain text languages...
html... just think how screwed up the web would be if five years ago some bright spark had said "I know, html is too damn slow, let's make it binary"...
Posted by tocam27 (16 comments )
Use an existing solid standard - ASN.1
ASN.1 (the protocol behind SNMP) covers all the issues I have seen raised so far. It is a solid industry tested protocol with substantial support. It would be smart to use it instead of creating yet another binary exchange format and making a whole set of new mistakes in an effort to avoid the 'old mistakes'.

Cheers,
Kevin
Posted by (1 comment )
Trim the format
The format, XML, is just too redundant. Get rid of the duplication and the extras and you would lose about 1/3 of the size and still have the same content. Parsing would be easier, as you would only need to flag events and not 'read' every tag...

For example, instead of:

<books>
  <book type='hardback'>
    <title>Atlas Shrugged</title>
    <author>Ayn Rand</author>
    <isbn id='1' rating='good'>0525934189</isbn>
  </book>
</books>

Use:

books{
  book(type='hardback'){
    title{Atlas Shrugged}
    author{Ayn Rand}
    isbn(id=1,rating=good){0525934189}
  }
}

Attributes are obvious because they sit inside (), and dropping the duplicated closing tag saves roughly as many bytes as the opening tag takes.

This format works without any line feeds and can be translated into XML if needed.
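
Editor's sketch, not the commenter's code: one way the translation between the two notations might look, using Python's standard xml.etree.ElementTree. The function name to_braces is hypothetical. The sketch quotes every attribute value, ignores mixed content and tail text, and does not escape braces that appear inside text, so it only illustrates the idea on the sample document above.

```python
import xml.etree.ElementTree as ET

# The commenter's sample document, written without line feeds.
XML_DOC = (
    "<books>"
    "<book type='hardback'>"
    "<title>Atlas Shrugged</title>"
    "<author>Ayn Rand</author>"
    "<isbn id='1' rating='good'>0525934189</isbn>"
    "</book>"
    "</books>"
)


def to_braces(elem):
    """Render one element as name(attr='v',...){text-and-children}."""
    attrs = ",".join(f"{key}='{value}'" for key, value in elem.attrib.items())
    head = f"{elem.tag}({attrs})" if attrs else elem.tag
    body = (elem.text or "").strip() + "".join(to_braces(child) for child in elem)
    return f"{head}{{{body}}}"


brace_doc = to_braces(ET.fromstring(XML_DOC))
print(brace_doc)
print(len(XML_DOC), "characters as XML;", len(brace_doc), "characters in brace form")
```

The saving on any real document depends on how long its tag names are relative to its text content, so the one-third figure should be read as the commenter's estimate rather than a general result.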
Posted by (1 comment )
Simple
Content-Encoding: gzip

Already supported, and chances are most of the transfers occur over HTTP anyway.
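
Editor's sketch of what gzip content encoding does to an XML payload, using Python's standard gzip module. The sample document is an assumption and is deliberately repetitive, so the ratio it prints overstates what a typical document would see.

```python
import gzip

# Hypothetical payload: the same record repeated many times.
record = "<book type='hardback'><title>Atlas Shrugged</title><author>Ayn Rand</author></book>"
xml_text = "<books>" + record * 200 + "</books>"

raw = xml_text.encode("utf-8")
packed = gzip.compress(raw)          # what a server would send with Content-Encoding: gzip
restored = gzip.decompress(packed)   # what the client must do before it can parse

print(len(raw), "bytes raw;", len(packed), "bytes gzipped")
```

Note that the receiver still has to decompress and then parse the full text, so this addresses transfer size only.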
Posted by rpmyers1 (15 comments )
WBXML is one option for J2ME devices
Someday wireless networks will be fast enough to not make a difference, but not yet. I wrote a J2ME Tech Tip three years ago that showed how to use WBXML (a binary encoding of XML that is a WAP standard) to compress XML. You can find it on Sun's site at http://developers.sun.com/techtopics/mobility/midp/ttips/compressxml/. This is one possibility for cases where you have to integrate with an existing XML-based back end infrastructure.

That said, I often recommend a custom binary encoding between the J2ME app and a proxy for the app running in a servlet container. Doing Java on either end makes it easy to move data between the two; see my tip at http://developers.sun.com/techtopics/mobility/midp/ttips/clientserv/ for the details.

Eric
http://www.ericgiguere.com/
Posted by Eric Giguere (13 comments )
Processing, not transmission
Some have suggested that going to a gzipped pure text XML document is sufficient, but the transmission time required for the XML text is not the main reason for the trouble (though it's part of the problem in some cases). Rather, it's the amount of processor power (read: CPU cycles and memory) required to parse, validate, and deserialize the data from an XML document. Gzip encoding actually increases the CPU cycles needed to deserialize an XML document.

I believe that if a binary XML format is standardized, there should be some standard means for the sending and receiving programs to negotiate what they support, so that the binary protocol will only be used when both sides support it. If different binary encodings need to be specified, there should be some way of identifying them, and if two programs don't support any common binary encodings, the text form should always be available as a fallback.
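
Editor's sketch of the negotiation described above, under stated assumptions: the encoding identifiers are made up for illustration (no such names are standardized here), and negotiate is a hypothetical helper. The sender lists what it can produce in order of preference, the receiver lists what it can decode, and text XML is the fallback when nothing else is shared.

```python
TEXT_XML = "text/xml"  # the always-available fallback


def negotiate(sender_supports, receiver_accepts):
    """Pick the sender's most-preferred binary encoding the receiver also accepts."""
    accepted = set(receiver_accepts)
    for encoding in sender_supports:
        if encoding != TEXT_XML and encoding in accepted:
            return encoding
    return TEXT_XML


# Hypothetical identifiers for two different binary encodings.
sender = ["application/x-binxml-a", "application/x-binxml-b"]

print(negotiate(sender, ["application/x-binxml-b", TEXT_XML]))
print(negotiate(sender, [TEXT_XML]))
```

Over HTTP, the same exchange could plausibly ride on the existing Accept and Content-Type headers, though that is a design choice the comment does not specify.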

In some 'closed' applications, where the "client" is known to support only one format, a binary only XML engine can be used to save code space and processor cycles. On 'open' systems, most applications would probably have to support text XML parsing as well as zero or more binary encodings.

An advantage to having this standardized is that the XML documents will be in a form that can be decoded by any XML parser that conforms to the standard and supports the specific binary encoding scheme. There may be a few of these different encodings, but they should all be flexible enough that no software vendor feels obligated to create their own proprietary encoding because they think the standardized ones are inadequate.
Posted by filker0 (4 comments )
INCREMENTAL XML
Hello everybody,

I was just thinking about XML and the speed-performance idea, and the impact unreadable binary files could have.

In my opinion, sending complete XML files is just like upgrading software by sending the complete package over and over again.

If it is about sending large files that contain, for a great part, the same information sent previously, why not just send PATCHES (or a nicer word: INCREMENTALS) to that file?

It needs communication protocols. It needs software adaptation. But it brings fabulous sending-speed performance.

It might even increase SECURITY by not sending all information every time.
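
Editor's sketch of the incremental idea, assuming both sides hold the same previous version of the document as a list of lines. It uses difflib.SequenceMatcher from the Python standard library; make_patch and apply_patch are hypothetical helper names, and the price list is invented sample data.

```python
import difflib


def make_patch(old_lines, new_lines):
    """Return only the changed regions as (start, end, replacement_lines) triples."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    return [
        (i1, i2, new_lines[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_patch(old_lines, patch):
    """Rebuild the new document from the receiver's old copy plus the patch."""
    result = list(old_lines)
    # Apply from the end so that earlier indices stay valid.
    for start, end, replacement in reversed(patch):
        result[start:end] = replacement
    return result


old = ["<prices>"] + [f"<item id='{n}'>10.00</item>" for n in range(100)] + ["</prices>"]
new = list(old)
new[43] = "<item id='42'>12.50</item>"  # a single price changes

patch = make_patch(old, new)
rebuilt = apply_patch(old, patch)
full_size = sum(len(line) for line in new)
patch_size = sum(len(line) for _, _, lines in patch for line in lines)
print(full_size, "characters in the full document;", patch_size, "characters of changed lines")
```

The patch is only usable if the receiver's copy matches the sender's idea of the old version, so a real protocol would also need some way to identify versions.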

Please add your comment to this idea.

T. Nijmeijer
Delft, Netherlands
Posted by Tobias Nijmeijer (2 comments )
Updates
Your suggestion (sending only patches) is quite valid for situations where both sides of the transaction (the sender and the receiver) know the state of the data, and this can be done under the current XML specification on an application specific basis, but this does not address the issue that binary encodings of XML are intended to address. It is an application design issue, not a protocol issue.

Though the bandwidth considerations are not inconsiderable, a larger part of the problem is the work required to parse, validate, and deserialize the data from its text form to the binary formats. Some embedded devices (where this is most likely to be best applied) like set-top boxes and mobile phones are limited in the CPU and memory available to handle this process and a standardized binary XML encoding/representation would reduce the headroom required on these devices to handle more sophisticated applications.

I have already seen CDATA and other unrestricted elements being used by some XML applications to pass what is effectively BASE64 encoded binary records around, and this is not decodable by anything other than another application that already knows how to interpret the binary structure. A standardized binary XML encoding would solve this as it would still identify the elements.
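
Editor's sketch of the opacity problem described above, using Python's standard base64, struct, and xml.etree.ElementTree modules. The record layout and the element names are assumptions made up for the illustration.

```python
import base64
import struct
import xml.etree.ElementTree as ET

# Hypothetical private layout known only to the two cooperating applications:
# little-endian unsigned 32-bit id followed by a 64-bit float price.
LAYOUT = "<Id"

blob = base64.b64encode(struct.pack(LAYOUT, 42, 12.5)).decode("ascii")
opaque = ET.fromstring(f"<record><![CDATA[{blob}]]></record>")
tagged = ET.fromstring("<record><id>42</id><price>12.5</price></record>")

# A generic XML parser sees one undifferentiated text node in the opaque form...
print(opaque.text)
# ...while the tagged form exposes its fields by name.
fields = {child.tag: child.text for child in tagged}
print(fields)

# Recovering the fields from the opaque form requires knowing the private layout.
item_id, price = struct.unpack(LAYOUT, base64.b64decode(opaque.text))
print(item_id, price)
```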
Posted by filker0 (4 comments )
And stupid CIO comment
Mr. Eric Newcomer, chief technology officer at Iona Technologies says... "reminiscent of the argument about a decade ago, when everyone said the World Wide Web is too slow and it will never take off."

Uhhh, well, let's see: then I guess the VARIANT in VB is fast?

How about a universal datatype in SQL Server or Oracle, then?

Should we wait 10 years?

If we fix the problem now, it will be superfast later...

However, anyone who thinks the W3C has anything to do with making an XML standard is dreaming!

The W3C did not make HTML a standard; it's not even a business, and it is mostly clueless intellectual types who have no idea what business is.

A standard becomes a standard when everyone starts using it, NOT when W3C says, "This is the standard and you must conform."

It's amazing the crap I hear in these IT ezines. They get Mr. Preppy with the long-winded resume that says, "I worked for so-and-so but was and am NOT currently in the trenches actually doing the coding or actually using the software or service..."
Posted by (8 comments )