
March 23, 2005 4:00 AM PST

Faster XML ahead?

The Net's top standards body is getting closer to speeding up XML-based software, a move that could benefit everyone from cell phone carriers to television broadcasters to the military.

But critics say the group's favored approach could cause major compatibility problems, among other things.

XML is fast becoming a widely used way of formatting and saving business documents such as purchase orders. But for certain applications--sending data to set-top boxes, for instance, and offering interactive programs on cell phones--representing data using XML is simply too bulky, say proponents of more efficient XML.

Bottom line:
The possibility of the World Wide Web Consortium pursuing more efficient XML through a binary, rather than text, format is causing concerns over interoperability and questions about the future direction of XML.


"XML has been a victim of its success," said Robin Berjon, of standards group the World Wide Web Consortium. "We've started using it in all kinds of situations that it wasn't designed for."

If XML were zippier, some say, cell phone companies could meet consumer demand for more complex programs. The Air Force, too, has expressed interest in using speedier XML formats for embedded computing applications, such as those found in fighter jets.

A W3C committee recently recommended that the group address the problem by moving away from the traditional way of saving XML data--in text format--and instead create a standard for a binary format. W3C working group recommendations are generally taken up as formal standards efforts, which means the group is one step closer to a major change in the XML standard.

The recommendation still has to be approved by the W3C's Advisory Committee and the W3C's director. But a vote to move forward with a binary XML standard could happen late this summer, said Liam Quin, the XML activity lead at the W3C.

Binarians and contrarians
The issue, though, is already causing controversy among the XML cognoscenti, who worry that significant changes to the specification could cause compatibility headaches and face significant hurdles in getting adopted.

Attendees of a February meeting in Boston argued for different technical approaches to speeding up XML. Some questioned the need to take up a binary XML effort at all, according to people present at the meeting.

We shouldn't "mess around with (XML) just for a short-term fix (when) in the long term, the industry is going to fix that problem naturally," said Eric Newcomer, chief technology officer at Iona Technologies and an attendee of the meeting. Newcomer says current XML performance "is not all that bad" and that the controversy is "reminiscent of the argument about a decade ago, when everyone said the World Wide Web is too slow and it will never take off."

Right now, all the information in an XML document, such as a name and address, is represented as text. Binary formats compress the XML

CONTINUED: ...
It'll still be slow
by David Arbogast March 23, 2005 7:53 AM PST
Compared to proprietary, less-verbose data formats, XML will always be a slow dog, whether it is represented as plain text or in a binary format. XML is nice for collaboration and interoperability between working groups... but if your goal is pure speed, XML is not your best choice by a long shot.
XML Mania
by Andrew J Glina March 23, 2005 8:15 AM PST
I have been very surprised at all of the proposed XML usages. As you say, it is great to transfer between applications, just like CSV was used for. But it is too bulky for actual storage of databases for example.
Broken apps
by nzamparello March 23, 2005 9:25 AM PST
I'm just worried that this might affect both the OpenOffice project and RSS, since they both use XML. Plus, a binary format might mean increased development times, because now they might have to create an actual programming language, compiler, and an IDE to go along with it... I say just keep it the way it is. Interpreted is fine for most usage anyway. Look at Python and PHP!
you left out the grand daddy of all plain text languages...
by tocam27 March 27, 2005 5:04 PM PST
html... just think how screwed up the web would be if five years ago some bright spark had said "I know, html is too damn slow, let's make it binary"...
Use an existing solid standard - ASN.1
by March 23, 2005 9:49 AM PST
ASN.1 (the protocol behind SNMP) covers all the issues I have seen raised so far. It is a solid industry tested protocol with substantial support. It would be smart to use it instead of creating yet another binary exchange format and making a whole set of new mistakes in an effort to avoid the 'old mistakes'.

Cheers,
Kevin
Trim the format
by March 23, 2005 11:19 AM PST
The format, XML, is just too redundant. Get rid of the duplication and the extras and you would lose about 1/3 of the size and still have the same content. Parsing would be easier as you would only need to flag events and not 'read' every tag...

For example, instead of:

<books>
<book type='hardback'>
<title>Atlas Shrugged</title>
<author>Ayn Rand</author>
<isbn id='1' rating='good'>0525934189</isbn>
</book>
</books>

Use:

books{
book(type='hardback'){
title{Atlas Shrugged}
author{Ayn Rand}
isbn(id=1,rating=good){0525934189}
}
}

Attributes are obvious because they sit inside (), and dropping the duplicated closing tag saves about as many bytes as the opening tag takes.

This format works without any line feeds and can be translated into XML if needed.
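
The claim that the brace notation can be translated into XML can be sketched in a few lines. This is only an illustration under stated assumptions, not the commenter's implementation: the function name brace_to_xml is made up here, an element body is assumed to hold either child elements or plain text (not text followed by children), and attribute values are assumed to contain no commas, parentheses, or braces.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# An element head: a name, an optional (attribute list), then an opening brace.
_HEAD = re.compile(r"\s*([A-Za-z_][\w.-]*)\s*(?:\(([^)]*)\))?\s*\{")


def _parse_attrs(raw):
    """Turn "id=1,rating=good" or "type='hardback'" into (name, value) pairs."""
    attrs = []
    for part in filter(None, (p.strip() for p in (raw or "").split(","))):
        name, _, value = part.partition("=")
        attrs.append((name.strip(), value.strip().strip("'\"")))
    return attrs


def _parse_element(src, pos):
    """Parse one element starting at pos; return (xml_string, next_pos)."""
    m = _HEAD.match(src, pos)
    if not m:
        raise ValueError(f"expected an element at offset {pos}")
    name, attrs = m.group(1), _parse_attrs(m.group(2))
    pos = m.end()
    open_tag = "<" + name + "".join(f" {k}={quoteattr(v)}" for k, v in attrs) + ">"
    children = []
    while True:
        # Assumption: anything that looks like "name{" or "name(...){" is a child.
        if _HEAD.match(src, pos):
            child, pos = _parse_element(src, pos)
            children.append(child)
            continue
        close = src.find("}", pos)
        if close == -1:
            raise ValueError("unbalanced braces")
        text = src[pos:close].strip()
        return f"{open_tag}{''.join(children)}{escape(text)}</{name}>", close + 1


def brace_to_xml(src):
    """Translate one brace-notation document into an XML string."""
    xml, pos = _parse_element(src, 0)
    if src[pos:].strip():
        raise ValueError("trailing content after the root element")
    return xml


sample = (
    "books{book(type='hardback'){title{Atlas Shrugged}"
    "author{Ayn Rand}isbn(id=1,rating=good){0525934189}}}"
)
print(brace_to_xml(sample))
```

The sketch also shows where the notation needs rules the comment does not spell out, such as how to escape a literal brace inside text.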
Simple
by rpmyers1 March 23, 2005 12:59 PM PST
Content-Encoding: gzip

Already supported, and chances are most of the transfers occur over HTTP anyway.
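
What a server does to the body when it sends Content-Encoding: gzip can be sketched with Python's standard gzip module. The document below is synthetic and highly repetitive (an assumption made for illustration), so the size reduction it shows is likely larger than a real payload would see.

```python
import gzip

# Synthetic, repetitive XML; real documents will usually compress less well.
record = (
    "<book type='hardback'><title>Atlas Shrugged</title>"
    "<author>Ayn Rand</author></book>"
)
document = "<books>" + record * 200 + "</books>"

raw = document.encode("utf-8")
packed = gzip.compress(raw)  # the bytes that would travel over the wire

# The receiver recovers the original text exactly before parsing it.
assert gzip.decompress(packed) == raw

print(f"plain: {len(raw)} bytes, gzipped: {len(packed)} bytes")
```

Note that the receiver still has to decompress and then parse the full text, so this addresses transfer size only.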
WBXML is one option for J2ME devices
by Eric Giguere March 24, 2005 6:44 AM PST
Someday wireless networks will be fast enough to not make a difference, but not yet. I wrote a J2ME Tech Tip three years ago that showed how to use WBXML (a binary encoding of XML that is a WAP standard) to compress XML. You can find it on Sun's site at http://developers.sun.com/techtopics/mobility/midp/ttips/compressxml/. This is one possibility for cases where you have to integrate with an existing XML-based back end infrastructure.

That said, I often recommend a custom binary encoding between the J2ME app and a proxy for the app running in a servlet container. Doing Java on either end makes it easy to move data between the two; see my tip at http://developers.sun.com/techtopics/mobility/midp/ttips/clientserv/ for the details.

Eric
http://www.ericgiguere.com/
Processing, not transmission
by filker0 March 24, 2005 8:34 AM PST
Some have suggested that going to a gzipped pure text XML document is sufficient, but it's not so much the transmission time required for the XML text that is the reason for the trouble (though it's part of the problem in some cases); rather, it's the amount of processor power (read CPU cycles and memory) required to parse, validate, and deserialize the data from an XML document. Gzip encoding actually increases the CPU cycles needed to deserialize an XML document.

I believe that if a binary XML format is standardized, there should be some standard means for the sending and receiving programs to negotiate what they support, so that the binary protocol will only be used when both sides support it. If different binary encodings need to be specified, there should be some way of identifying them, and if two programs don't support any common binary encodings, the text form should always be available as a fallback.
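
The negotiation described above, with text as the guaranteed fallback, might look roughly like the following. The encoding names are hypothetical placeholders (no standardized binary XML identifiers existed at the time of this discussion), and choose_encoding is an illustrative name, not part of any real protocol.

```python
TEXT_XML = "text/xml"  # the always-available fallback


def choose_encoding(sender_prefs, receiver_supported):
    """Pick the first binary encoding both sides support, else plain text XML.

    sender_prefs is ordered from most to least preferred.
    """
    supported = set(receiver_supported)
    for encoding in sender_prefs:
        if encoding != TEXT_XML and encoding in supported:
            return encoding
    return TEXT_XML


# Hypothetical encoding identifiers, for illustration only.
print(choose_encoding(["binxml-a", "binxml-b"], ["binxml-b", TEXT_XML]))
print(choose_encoding(["binxml-a"], [TEXT_XML]))
```

In this sketch the text form is never listed as optional: a peer that advertises nothing at all still gets text XML.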

In some 'closed' applications, where the "client" is known to support only one format, a binary only XML engine can be used to save code space and processor cycles. On 'open' systems, most applications would probably have to support text XML parsing as well as zero or more binary encodings.

An advantage to having this standardized is that the XML documents will be in a form that can be decoded by any XML parser that conforms to the standard and supports the specific binary encoding scheme. There may be a few of these different encodings, but they should all be flexible enough that no software vendor feels obligated to create their own proprietary encoding because they think the standardized ones are inadequate.
INCREMENTAL XML
by Tobias Nijmeijer March 25, 2005 3:02 AM PST
Hello everybody,

I was just thinking about XML and the speed-performance idea, and the impact unreadable binary files could have.

In my opinion, sending complete XML files is just like upgrading software by sending the complete package over and over again.

If it is about sending large files that contain, for a great part, the same information sent previously, why not just send PATCHES (or a nicer word: INCREMENTALS) to that file?

It needs communication protocols. It needs software adaptation. But it brings fabulous sending-speed performance.

It might even increase SECURITY by not sending all information every time.

Please add your comment to this idea.

T. Nijmeijer
Delft, Netherlands
Updates
by filker0 March 25, 2005 5:45 AM PST
Your suggestion (sending only patches) is quite valid for situations where both sides of the transaction (the sender and the receiver) know the state of the data, and this can be done under the current XML specification on an application specific basis, but this does not address the issue that binary encodings of XML are intended to address. It is an application design issue, not a protocol issue.
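
As a rough application-level illustration of sending patches when both sides already hold the same earlier version, the sketch below uses Python's difflib to compute line-based edits and replay them. The patch format (a list of start, end, replacement triples) and the function names are assumptions made for this example, not a standard.

```python
from difflib import SequenceMatcher


def make_patch(old_lines, new_lines):
    """Return the changed spans as (start, end, replacement_lines) triples."""
    matcher = SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    return [
        (i1, i2, new_lines[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_patch(old_lines, patch):
    """Rebuild the new document from the old one plus a patch."""
    result = list(old_lines)
    # Apply from the end so earlier indices are not shifted by later edits.
    for start, end, replacement in reversed(patch):
        result[start:end] = replacement
    return result


old = ["<prices>"] + [f"<item id='{n}'>10</item>" for n in range(100)] + ["</prices>"]
new = list(old)
new[43] = "<item id='42'>12</item>"  # a single price changes

patch = make_patch(old, new)
assert apply_patch(old, patch) == new
print(f"{len(new)} lines in the full document, {len(patch)} span(s) in the patch")
```

The receiver can only apply the patch if its copy of the old document matches the sender's, which is the shared-state requirement noted above.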

Though the bandwidth considerations are not inconsiderable, a larger part of the problem is the work required to parse, validate, and deserialize the data from its text form to the binary formats. Some embedded devices (where this is most likely to be best applied) like set-top boxes and mobile phones are limited in the CPU and memory available to handle this process and a standardized binary XML encoding/representation would reduce the headroom required on these devices to handle more sophisticated applications.

I have already seen CDATA and other unrestricted elements being used by some XML applications to pass what is effectively BASE64 encoded binary records around, and this is not decodable by anything other than another application that already knows how to interpret the binary structure. A standardized binary XML encoding would solve this as it would still identify the elements.
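
The practice described above can be reproduced in a few lines, which makes the interoperability problem concrete. The record layout and element name below are hypothetical; the point is that a generic XML parser sees only an opaque string, and only code that already knows the layout can decode it.

```python
import base64
import struct
import xml.etree.ElementTree as ET

# Hypothetical record layout: uint16 id, uint32 count, float32 value.
RECORD = struct.Struct("<HIf")


def wrap(record_id, count, value):
    """Hide a packed binary record inside a CDATA section as Base64 text."""
    payload = base64.b64encode(RECORD.pack(record_id, count, value)).decode("ascii")
    return f"<reading><![CDATA[{payload}]]></reading>"


def unwrap(xml_text):
    """Decode the record; this works only because the layout is known here."""
    text = ET.fromstring(xml_text).text
    return RECORD.unpack(base64.b64decode(text))


doc = wrap(7, 1000, 1.5)
print(ET.fromstring(doc).text)  # all a generic parser can see: opaque text
print(unwrap(doc))
```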
And stupid CIO comment
by March 27, 2005 2:14 PM PST
Mr. Eric Newcomer, chief technology officer at Iona Technologies says... "reminiscent of the argument about a decade ago, when everyone said the World Wide Web is too slow and it will never take off."


Uhhh, well let's see then I guess then the VARIANT in VB is fast?

How about a universal datatype in SQL Server or ORACLE then?

Should we wait 10 years?

If we fix the problem now, it will be superfast later...

However, anyone who thinks that the W3C has anything to do with making an XML standard is dreaming!!!

W3C did not make HTML a standard as it's not even a business and is mostly clueless intellectual types who have no idea what business is.

A standard becomes a standard when everyone starts using it, NOT when W3C says, "This is the standard and you must conform."

It's amazing the crap I hear on these IT ezines... They get Mr. Preppy with the long-winded resume that says, "I worked for so-and-so but was and am NOT currently in the trenches actually doing the coding or actually using the software or service..."