January 13, 2005 4:00 AM PST

Putting XML in the fast lane

(continued from previous page)

encoding is necessary because it can greatly improve performance, which is critical in certain situations.

In initial tests, they found that applications perform two or three times faster when using the software. The goal of the Fast Infoset project is to generate interest among developers and eventually create a standardized binary format.

Manufacturers of consumer devices such as Canon, as well as mobile-phone companies such as Nokia, have argued for a binary XML format. Without it, large files such as images will take too long to download to devices such as cell phones, they argue.

The primary concern is interoperability. Potentially, several different binary formats for specific purposes could emerge, which are not universally understood. For example, there may be a method for encoding images sent to consumer electronics, which may differ substantially from others.

Bray is skeptical of the entire notion of converting XML to any format other than text.

"The fact that XML is ordinary plain text that you can pull into Notepad...has turned out to be a boon, in practice," he said. "Any time you depart from that straight-and-narrow path, you risk loss of interoperability. Experience with interoperability via XML as it is, has been excellent. Why take chances?"

Bray noted that there are methods for speeding up XML traffic other than creating a binary format. Advances in networking and processing power go a long way in addressing performance concerns, though perhaps not on battery-constrained mobile phones, he said.

Janet Perna, the general manager of IBM's information management group, said one alternative to binary XML is to handle the mushrooming in XML traffic with faster networking. Five or six years ago, people thought that the Internet would be too slow for doing online commerce, but the industry eventually overcame those barriers, she said.

"I don't see (growing XML traffic) as a limitation here. I think we'll keep up with it," she said.

ZapThink, a research firm specializing in XML and Web services, echoed concerns over binary XML, notably the possibility of proprietary implementations. ZapThink analysts also noted that an XML message can touch several different pieces of software and hardware, such as security systems, all of which would have to support any binary XML standard.

ZapThink's Ron Schmelzer said binary XML may be limited to niche uses such as high-volume applications, which demand the best performance.

Leader Technologies' Lamb supports the idea of binary XML but with one important caveat--that it be standardized.

"The amount of transactions that contain XML continues to exponentially expand, so we don't want to get caught behind the problem," he said. "But if we can't achieve a standard (binary XML), then my support would go way down."

Comments (17)
Why Not Compress
by catmando January 13, 2005 6:46 AM PST
If the transmission of XML is the concern, why not compress the data during the transmission process? That way it can be stored in its native format for editing and searching, but compressed to a small size for transmission.

Some storage systems like Oracle allow for behind the scenes compression so compression could even spill over to storage systems and be transparent to the consumers and producers of XML.
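
As a rough illustration of the idea above, here is a minimal Python sketch that compresses an XML document for transmission and restores it unchanged on the receiving side. The sample document and its element names are made up for illustration, and gzip is only one possible choice of compression scheme; the comment does not name one.

```python
import gzip

# Hypothetical sample document; repetitive tags are typical of XML payloads.
xml_text = "<orders>" + "".join(
    f"<order><id>{i}</id><customer>Smith</customer></order>" for i in range(1000)
) + "</orders>"

raw = xml_text.encode("utf-8")          # the stored, editable text form
compressed = gzip.compress(raw)         # what would travel over the wire
restored = gzip.decompress(compressed)  # the receiver recovers the same text

# Print the sizes before and after compression.
print(len(raw), len(compressed))
```

The text form stays available for editing and searching at both ends; only the bytes in transit change.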
Binary implies compression...
by Johnny Mnemonic January 13, 2005 2:20 PM PST
In order to maintain compatibility among various
systems, everyone must agree as to how the bits
are to be transformed. When we discuss binary formats
it implies compression. But, what form of
compression? The entire packet, header, body, etc.? At
what point do you compress? At what level of the
stack? Everyone must agree so everyone can understand
the transmission.
That doesn't really solve speed issues
by January 17, 2005 4:14 PM PST
You have to factor in the time to compress and decompress the data.
compression doesn't speed parsing
by stumiller January 18, 2005 3:49 AM PST
Compression will only save transmission times, but it will not speed the parsing of the data, which I imagine is at least 50% of the bottleneck. Binary XML would help here.
HTML was verbose too...
by Jean-Paul Figer January 13, 2005 8:22 AM PST
...and was the winner. The HTTP protocol also exchanges text information. Tim Bray is right: the benefits of text information are so huge that binary XML is a bad idea. Optimizations can always be done locally if needed. Same story as assembler compared to high-level languages.
Coherent Informatics (CI)
by January 13, 2005 9:35 AM PST
The company Exos Services Inc. in Denver has already developed a very efficient XML platform. It prioritizes internet stream packets, which causes no bogging down. Check them out: http://www.exosservices.com/

http://www.som.tulane.edu/tccep/documents/CI_Defined.pdf

It's amazing.
CI
by January 18, 2005 10:12 AM PST
Nick, how do you know about CI?
Bray is just wrong
by January 13, 2005 9:55 AM PST
First, look at a binary format such as CORBA: it's much more interoperable than XML these days. Just go to the Axis forum and see how many people are having problems between .NET and Java.

Get real. CORBA has both communication interoperability and source code interoperability. SOAP is truly a step backward in these regards.

I work on both and I know the heart of it. I have developed enterprise applications using both, and I know: CORBA is 5 to 10 times faster. It's much more scalable.

So, what are the problems with CORBA? These are addressed by web services/SOAP, and that's why it shines, not because of text-based transport:

1) It's not embraced by all vendors.
2) It does not use the web as the means of transport.
3) The addressing scheme is really stupid (not a simple HTTP URL, but a weird and long IOR string).
4) Lookup goes through JNDI or a name service. Most of the time, people know exactly what the address is and do not need a repository server to find the service.
5) The programming model in CORBA is screwed up. In short: it's stupid. a) It's like a black hole, sucking all your programming into it. There's an article talking about this. b) It forces you to use CORBA objects when you don't need or want to. c) The architecture is too complex, and vendors hardly ever get it right. For example, take well-known, mature libraries such as ACE/TAO and test them in a real-world app: they would not handle TCP/IP packet corruption. Or MICO, for example: it had no timeout option (in the version I last checked a while ago). This shows how hard you have to work to get these right.

So the advantage of SOAP/WS is not that it is text-based (text helps a little). CORBA's problems have nothing to do with binary. Interoperability does not depend on text or not; it comes from being an open standard.

Another misleading point: it is not only the network that slows down the connection. For example, take out the network and run the client and the server on the same computer. It's still very slow. Why? CPU cycles. XML consumes so many CPU cycles that the CPU is 100% utilized. This is not the case for CORBA.

My advice is to look at the problem seriously. Work with it like I do. Find the exact problem and fix it. Do not speculate blindly. Bray is just too blind, and biased, because it's his baby.

Here are some similarities between SOAP/WS and CORBA that expose some of the myths around SOAP/WS:

1) Both use a separate declaration language: WSDL for SOAP and IDL for CORBA. IDL is much easier to master.

2) The programming style most vendors support is to generate the stub and skeleton from the declaration language. So no difference here. Both are cumbersome and take the same amount of work.
Except...
by Johnny Mnemonic January 13, 2005 2:28 PM PST
CORBA was the dream, but, Microsoft killed it.
Corba was way ahead of its time. But, it offers
far more than just data and services
interoperability. XML offers a simple data transport
whereas CORBA offers full data/object/method
distribution. Far more complex and difficult to
integrate, and, currently, far too much overhead.
What you gain in data transport you lose in
processing speed.
Excellent comparison of WS vs. CORBA
by January 17, 2005 10:23 AM PST
I couldn't agree more. When I brief people on Web Services, I tell them it's next generation CORBA.

My company uses the IFX XML standard in financial applications. The documents can be quite large and intricate. They are typically used between dedicated clients and servers, where a substantial investment is made in their development. Performance is ALWAYS an issue. As you pointed out, and as the Sun article showed, the XML-to-language object binding can be the dominant cost. This is network independent, as you pointed out with the intraprocess example.

Sun article: http://java.sun.com/developer/technicalArticles/WebServices/fastWS/
and
http://java.sun.com/developer/technicalArticles/xml/fastinfoset/
Transparent compression when/if needed
by My-Self January 13, 2005 10:27 AM PST
There is no reason to create a new standard / fragment XML compatibility. Transparent compression when/if needed is the way to go.

HTML is verbose too, and the proper way to handle it is to set up on-the-fly compression (such as Apache's mod_deflate).
http://httpd.apache.org/docs-2.0/mod/mod_deflate.html

It significantly cuts bandwidth usage and speeds up transmission, with very little impact on processing power. The best part is that once it is set up, you just forget about it; it's completely transparent.
performance is never an issue (for long)
by stumiller January 14, 2005 4:45 AM PST
If initial research puts the performance boost at 3x for binary XML, then at the rate at which hardware performance grows, this will be matched in about 2 years. XML's usage will surely grow, but XML itself will not get more complex. A few years from now, even the most constrained devices (PDAs, cameras, personal devices) will likely have ample computing power and storage to handle today's text-based XML. Why bother with what will undoubtedly become a fragmented market of open and proprietary binary-XML specs?
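
The arithmetic behind that estimate can be sketched as follows. The comment does not state a hardware growth rate, so the 18-month doubling period used here is purely an assumption for illustration; a different period shifts the answer proportionally.

```python
import math

def years_to_match(speedup: float, doubling_period_years: float) -> float:
    """Years of hardware growth needed to equal a given software speedup,
    assuming performance doubles once every `doubling_period_years`."""
    return math.log2(speedup) * doubling_period_years

# Assumed doubling period of 1.5 years (not a figure from the article).
print(round(years_to_match(3.0, 1.5), 2))  # 2.38
```

Under that assumption a 3x gain takes a little under two and a half years of hardware improvement, in the same range as the roughly two years suggested above.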
XML tokenization?
by January 14, 2005 5:27 PM PST
Why not simply add a header tag section to the top of an XML document that defines what 'short tags' (tokens) in the rest of the document should expand to? It's the old data dictionary song, with a faster beat... very easy to automate prior to transmission.
<root>
<dictionary>
<define token="a1" fullname="LastName"/>
</dictionary>
<a1>Smith</a1>
</root>
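
A minimal Python sketch of how a receiver might expand such a document back to full tag names. It assumes the exact layout shown above (a `dictionary` element directly under the root, holding `define` entries with `token` and `fullname` attributes); the function name `expand_tokens` is hypothetical, and this is not an existing standard or library feature.

```python
import xml.etree.ElementTree as ET

def expand_tokens(xml_text: str) -> str:
    """Replace short tag names with the full names listed in <dictionary>."""
    root = ET.fromstring(xml_text)
    dictionary = root.find("dictionary")
    if dictionary is None:
        return xml_text  # nothing to expand
    # Build the token -> full name map from the <define> entries.
    mapping = {d.get("token"): d.get("fullname")
               for d in dictionary.findall("define")}
    root.remove(dictionary)  # the dictionary is not part of the payload
    for elem in root.iter():
        if elem.tag in mapping:
            elem.tag = mapping[elem.tag]
    return ET.tostring(root, encoding="unicode")

doc = ('<root><dictionary><define token="a1" fullname="LastName"/>'
       '</dictionary><a1>Smith</a1></root>')
print(expand_tokens(doc))  # <root><LastName>Smith</LastName></root>
```

Note that the tokenized document is still ordinary text XML, so this shrinks the payload but does not remove the parsing cost discussed elsewhere in the thread.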
People are already using binary formats for XML interchange
by January 14, 2005 7:05 PM PST
Whether Tim Bray likes it or not, people are already using binary interchange formats.

The question before the W3C is whether we should try to gain support for a single such format, to promote interoperability, or whether people should go off and use the 80 or so formats in widespread use today.

Liam
(W3C XML Activity Lead)
Fast streaming XML documents
by May 24, 2005 6:41 PM PDT
Many concerns about integration and compatibility are being resolved by companies implementing XML and service-oriented architectures. At the same time, due to a lack of proper standards and improper implementations, it is still hard to bundle large amounts of information within XML documents for transactions. There is a need to create compression standards and robust XML architectures. However, a single vendor controlling and pushing a format may ultimately prove to be the downfall of the rapid growth in XML adoption due to competing formats.