
March 23, 2005 4:00 AM PST

Faster XML ahead?


(continued from previous page)

these people, they will switch to whatever we publish. And indeed many have said they would do that," said Quin.

Others argue that multiple binary formats are needed. Michael Rys, a program manager for Microsoft's SQL Server database and a member of the W3C's XML Query Working Group, said that Microsoft does not favor the W3C creating a single binary XML format.

In a blog posting following the Boston meeting, Rys said that if there must be a binary XML standard, then the W3C should "do it right."

"There will be more than one binary XML format," Rys wrote. The W3C is "very unlikely to define a format that optimizes a dozen, partially conflicting goals."

Another concern facing the W3C is whether a substantial change to the XML standard, such as a binary XML format, will be widely adopted--or ignored.

To process XML data sent over the Internet, devices need a software program called an XML parser. Existing parsers would have to be upgraded, in the form of a patch or service pack, to make sure computers can read both text and binary formats. Without broad adoption of the specification, software developers will be less likely to make use of the speedier XML.

XML proponents note that the latest XML 1.1 specification has been adopted more slowly than hoped. Microsoft, for example, has decided not to support that specification for fear of causing compatibility problems with applications that use the XML 1.0 standard, according to Rys.

Iona's Newcomer noted that there are several different options for making XML go faster. Some approaches would require a complete rewrite of existing parsers, rather than relatively minor changes with a simpler upgrade path, he noted.

If the W3C votes in favor of pursuing a binary XML standard, a working group could be formed as early as this summer and take up to three years to complete a specification, Quin said. To address concerns and solicit feedback, the W3C is scheduling public hearings at various conferences worldwide.

"This one has some controversial aspects," Quin said, "so I would not like to predict the outcome at the moment."


12 comments

It'll still be slow
Compared to proprietary, less-verbose data formats, XML will always be a slow dog, whether it is represented as plain text or in a binary format. XML is nice for collaboration and interoperability between working groups... but if your goal is pure speed, XML is not your best choice by a long shot.
Posted by David Arbogast (1712 comments )
XML Mania
I have been very surprised at all of the proposed uses of XML. As you say, it is great for transferring data between applications, much as CSV was. But it is too bulky for the actual storage of databases, for example.
Posted by Andrew J Glina (1673 comments )
Broken apps
I'm just worried that this might affect both the OpenOffice project and RSS, since they both use XML. Plus, a binary format might mean increased development times, because now they might have to create an actual programming language, compiler, and an IDE to go along with it... I say just keep it the way it is. Interpreted is fine for most usage anyway. Look at Python and PHP!
Posted by nzamparello (60 comments )
you left out the grand daddy of all plain text languages...
html... just think how screwed up the web would be if five years ago some bright spark had said "I know, html is too damn slow, let's make it binary"...
Posted by tocam27 (16 comments )
Use an existing solid standard - ASN.1
ASN.1 (the protocol behind SNMP) covers all the issues I have seen raised so far. It is a solid, industry-tested protocol with substantial support. It would be smart to use it instead of creating yet another binary exchange format and making a whole set of new mistakes in an effort to avoid the 'old mistakes'.

Cheers,
Kevin
Posted by (1 comment )
Trim the format
The format, XML, is just too redundant. Get rid of the duplication and the extras and you would lose about 1/3 of the size and still have the same content. Parsing would be easier, as you would only need to flag events and not 'read' every tag...

For example, instead of:

<books>
<book type='hardback'>
<title>Atlas Shrugged</title>
<author>Ayn Rand</author>
<isbn id='1' rating='good'>0525934189</isbn>
</book>
</books>

Use:

books{
book(type='hardback'){
title{Atlas Shrugged}
author{Ayn Rand}
isbn(id=1,rating=good){0525934189}
}
}

Attributes are obvious because they are in (), and dropping the duplicated closing tag saves roughly as many bytes as the opening tag takes.

This format works without any line feeds and can be translated into XML if needed.
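
As a rough illustration of the translation described above, here is a minimal Python sketch that converts the sample document into the brace notation. The function name `to_braces` is made up for this example, and several things are assumptions: every attribute value is quoted (the example above quotes some and not others), text mixed in alongside child elements is ignored, and braces inside text are not escaped. It is a sketch of the idea, not a specification of the format.

```python
import xml.etree.ElementTree as ET

def to_braces(elem):
    """Render an element in the brace notation (hypothetical helper)."""
    # Attributes go inside (), comma-separated; values are always quoted here.
    attrs = ",".join(f"{k}='{v}'" for k, v in elem.attrib.items())
    head = f"{elem.tag}({attrs})" if attrs else elem.tag
    children = list(elem)
    if not children:
        # Leaf element: tag{text}
        text = (elem.text or "").strip()
        return f"{head}{{{text}}}"
    # Element with children: tag{ ...children... }, one per line.
    lines = [f"{head}{{"]
    lines += [to_braces(child) for child in children]
    lines.append("}")
    return "\n".join(lines)

XML = """<books>
<book type='hardback'>
<title>Atlas Shrugged</title>
<author>Ayn Rand</author>
<isbn id='1' rating='good'>0525934189</isbn>
</book>
</books>"""

brace = to_braces(ET.fromstring(XML))
print(brace)
print(len(XML.encode("utf-8")), "bytes as XML,",
      len(brace.encode("utf-8")), "bytes in brace form")
```

The byte counts printed at the end give a rough sense of the saving for this one document; they say nothing about parsing cost.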
Posted by (1 comment )
Simple
Content-Encoding: gzip

Already supported, and chances are most of the transfers occur over HTTP anyway.
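
To make the size effect concrete, here is a small Python sketch using the standard `gzip` module on a repetitive XML payload. The payload is invented for illustration, and the sketch shows only the compression step, not an HTTP server or client setting the header. Note that the receiver still has to decompress the bytes and then parse the text.

```python
import gzip

# A made-up, repetitive XML payload of the kind that compresses well.
record = ("<book type='hardback'><title>Atlas Shrugged</title>"
          "<author>Ayn Rand</author></book>")
xml_doc = ("<books>" + record * 200 + "</books>").encode("utf-8")

compressed = gzip.compress(xml_doc)      # what would travel over the wire
restored = gzip.decompress(compressed)   # what the receiver parses afterwards

print(len(xml_doc), "bytes as text,", len(compressed), "bytes gzipped")
```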
Posted by rpmyers1 (15 comments )
WBXML is one option for J2ME devices
Someday wireless networks will be fast enough to not make a difference, but not yet. I wrote a J2ME Tech Tip three years ago that showed how to use WBXML (a binary encoding of XML that is a WAP standard) to compress XML. You can find it on Sun's site at http://developers.sun.com/techtopics/mobility/midp/ttips/compressxml/. This is one possibility for cases where you have to integrate with an existing XML-based back end infrastructure.

That said, I often recommend a custom binary encoding between the J2ME app and a proxy for the app running in a servlet container. Doing Java on either end makes it easy to move data between the two; see my tip at http://developers.sun.com/techtopics/mobility/midp/ttips/clientserv/ for the details.

Eric
http://www.ericgiguere.com/
Posted by Eric Giguere (13 comments )
Processing, not transmission
Some have suggested that gzipping plain-text XML is sufficient. Transmission time is part of the problem in some cases, but the bigger issue is the amount of processor power (CPU cycles and memory) required to parse, validate, and deserialize the data from an XML document. Gzip encoding actually increases the CPU cycles needed to deserialize an XML document, because the receiver must decompress the text before parsing it.

I believe that if a binary XML format is standardized, there should be some standard means for the sending and receiving programs to negotiate what they support, so that the binary protocol will only be used when both sides support it. If different binary encodings need to be specified, there should be some way of identifying them, and if two programs don't support any common binary encodings, the text form should always be available as a fallback.
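
The negotiation described above could look something like the following Python sketch. Everything in it is hypothetical: the function name `choose_encoding` and the encoding labels such as `application/x-binxml-a` are invented for illustration, since no standard binary encodings or identifiers exist yet. The one fixed rule is the fallback to text XML when the two sides share nothing else.

```python
TEXT_XML = "text/xml"  # the always-available fallback

def choose_encoding(sender_supports, receiver_accepts):
    """Return the sender's most-preferred encoding that the receiver accepts.

    sender_supports: list of encoding labels, most preferred first.
    receiver_accepts: set of encoding labels the receiver can decode.
    Falls back to text XML when there is no common binary encoding.
    """
    for encoding in sender_supports:
        if encoding in receiver_accepts:
            return encoding
    return TEXT_XML

# Hypothetical labels for two binary encodings.
sender = ["application/x-binxml-a", "application/x-binxml-b", TEXT_XML]

shared = choose_encoding(sender, {"application/x-binxml-b", TEXT_XML})
fallback = choose_encoding(sender, {"application/x-other"})
print(shared)    # the common binary encoding
print(fallback)  # no overlap, so text XML
```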

In some 'closed' applications, where the "client" is known to support only one format, a binary only XML engine can be used to save code space and processor cycles. On 'open' systems, most applications would probably have to support text XML parsing as well as zero or more binary encodings.

An advantage to having this standardized is that the XML documents will be in a form that can be decoded by any XML parser that conforms to the standard and supports the specific binary encoding scheme. There may be a few of these different encodings, but they should all be flexible enough that no software vendor feels obligated to create their own proprietary encoding because they think the standardized ones are inadequate.
Posted by filker0 (4 comments )
INCREMENTAL XML
Hello everybody,

I was just thinking about XML and the speed-performance idea. And the impact unreadable binary files could have.

In my opinion, sending complete XML files is just like upgrading software by sending the complete package over and over again.

If the goal is to send large files that consist largely of information sent previously, why not just send PATCHES (or a nicer word: INCREMENTALS) to that file?

It needs communication protocols. It needs software adaptation. But it brings a fabulous improvement in sending speed.
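
A minimal Python sketch of the idea, assuming both sides already hold the previous version of the document and that each record carries a unique `id` attribute. The document layout and the helper names (`index_by_id`, `make_patch`, `apply_patch`) are invented for this example; a real system would also need a way to agree on which version each side holds.

```python
import xml.etree.ElementTree as ET

def index_by_id(xml_text):
    """Map each child element's id attribute to its text."""
    root = ET.fromstring(xml_text)
    return {item.get("id"): (item.text or "") for item in root}

def make_patch(old_xml, new_xml):
    """Describe only what changed between two versions."""
    old, new = index_by_id(old_xml), index_by_id(new_xml)
    return {
        "set": {k: v for k, v in new.items() if old.get(k) != v},
        "delete": [k for k in old if k not in new],
    }

def apply_patch(state, patch):
    """Apply a patch to the receiver's copy of the previous state."""
    updated = {k: v for k, v in state.items() if k not in patch["delete"]}
    updated.update(patch["set"])
    return updated

old_xml = ("<prices><item id='a'>1</item><item id='b'>2</item>"
           "<item id='c'>3</item></prices>")
new_xml = ("<prices><item id='a'>1</item><item id='b'>5</item>"
           "<item id='d'>4</item></prices>")

patch = make_patch(old_xml, new_xml)
receiver_state = apply_patch(index_by_id(old_xml), patch)
print(patch)
```

Only the changed, added, and removed records travel; the unchanged record `a` is never resent.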

It might even increase SECURITY by not sending all information every time.

Please add your comment to this idea.

T. Nijmeijer
Delft, Netherlands
Posted by Tobias Nijmeijer (2 comments )
Updates
Your suggestion (sending only patches) is quite valid for situations where both sides of the transaction (the sender and the receiver) know the state of the data, and this can be done under the current XML specification on an application specific basis, but this does not address the issue that binary encodings of XML are intended to address. It is an application design issue, not a protocol issue.

Though the bandwidth considerations are not inconsiderable, a larger part of the problem is the work required to parse, validate, and deserialize the data from its text form to the binary formats. Some embedded devices (where this is most likely to be best applied) like set-top boxes and mobile phones are limited in the CPU and memory available to handle this process and a standardized binary XML encoding/representation would reduce the headroom required on these devices to handle more sophisticated applications.

I have already seen CDATA and other unrestricted elements being used by some XML applications to pass what is effectively BASE64 encoded binary records around, and this is not decodable by anything other than another application that already knows how to interpret the binary structure. A standardized binary XML encoding would solve this as it would still identify the elements.
Posted by filker0 (4 comments )
And stupid CIO comment
Mr. Eric Newcomer, chief technology officer at Iona Technologies says... "reminiscent of the argument about a decade ago, when everyone said the World Wide Web is too slow and it will never take off."


Uhhh, well let's see then I guess then the VARIANT in VB is fast?

How about a universal datatype in SQL Server or ORACLE then?

Should we wait 10 years?

If we fix the problem now, it will be superfast later...

However, anyone who thinks that the W3C has anything to do with making an XML standard is dreaming!!!

W3C did not make HTML a standard as it's not even a business and is mostly clueless intellectual types who have no idea what business is.

A standard becomes a standard when everyone starts using it, NOT when W3C says, "This is the standard and you must conform."

It's amazing the crap I hear on these IT ezines... They get Mr. Preppy with the long-winded resume that says, "I worked for so-and-so but was and am NOT currently in the trenches actually doing the coding or actually using the software or service..."
Posted by (8 comments )