September 7, 2004 4:00 AM PDT

XML: Too much of a good thing?

Despite rumors to the contrary, the adult entertainment industry is not developing its own dialect of Extensible Markup Language dubbed XXXML.

Aside from that, it's hard to find an industry or interest that isn't taking advantage of the fast-growing standard for Web services and data exchange. In the six years since the main XML specification was first published, it's spawned hundreds of dialects, or schemas, benefiting everyone from butchers to bulldozer operators wishing to easily exchange information electronically.


What's new:
In the six years since the main XML specification was first published, it's spawned hundreds of dialects, benefiting everyone from butchers to bulldozer operators.

Bottom line:
The proliferation could mean the standard is a success or could be the start of a new headache.

While some industry observers worry proliferation has gone too far, potentially creating new instances of the interoperability problems that XML was meant to solve, proponents say the explosion of schemas is a testament to the format's success.

Tim Bray, co-inventor of the main XML specification, said the proliferation of special-interest XML dialects validates what he and his colleagues set out to achieve.

"The idea from the start was to make it as easy as possible for people to come up with their own special languages for their specific problems," Bray said. "In the big picture, I think XML is more successful than any of us who designed it ever thought it would be."

XML is most often lauded as a foundation for delivering Web services and is the base for plans from Microsoft and other software makers to ease the development and maintenance of business programs. Web services and XML are also major components of Indigo, a new communications subsystem that's slated to be part of Longhorn, the next major release of Windows. Microsoft recently revised its plans for Longhorn and said it will make Indigo available for Windows XP and other current versions of Windows, meaning that it should soon become even easier to exchange XML data between computers.

Also, XML data exchange is a must for companies wishing to join the growing movement toward building new business software using a more flexible model called a "services-oriented architecture." Proponents say SOAs can make software easier to reconfigure as needs change and that they're cheaper to maintain in the long run.

As a vehicle for describing complex sets of data in a globally comprehensible way that works smoothly across the Internet, however, XML is already there. Just ask your local chicken farmer, who is, or soon will be, benefiting from Meat and Poultry XML (mpXML), an offshoot of the Global Standards Management Process that is designed to meet the special needs of producers, retailers and distributors of flesh food.

Turns out meat is a classic example of an industry with agreed-on data sets (Prime or Choice? Wing or drumstick? Fresh or frozen?) where speedy electronic transmission of data can be a major asset, said Blake Ashby, executive vice president of

"Anything our people can do to move that product through the supply chain faster pays off for them in less shrinkage" and spoilage, he said. "Without a system, the managers of these (grocery store) meat departments have to spend time walking the aisle and seeing what they have too much of and when it expires."

Relatively speedy industry agreement on XML has helped producers and sellers boost business and prepare for new challenges, Ashby said. "The need for a global standard has really increased, especially now that Congress is pushing for country-of-origin labeling," he said.

The benefits of XML were similarly obvious for the newspapers and other media outlets that need to deal with voluminous and often inconsistently formatted statistics reported on sports pages, said Alan Karben, chairman of the SportsML Working Group, a branch of the International Press Telecommunications Council that oversees Sports Markup Language.

"Because people's appetites for esoteric sports statistics are so insatiable, the data reports that get exchanged and formatted for display are often incredibly intricate," Karben wrote in an e-mail exchange. "For our industry, the benefits of XML are clear: consistent input no matter what the provider, what the sport, what the native language."

XML has succeeded, co-creator Bray said, because it has solved several of the more vexing challenges for electronic data exchange, including the growing need to deal with diverse languages and character sets.

"One of the big problems is internationalization," Bray said. "One of the reasons XML took off is because it solved a lot of those issues with Unicode, which was fairly new at that point."
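To make the internationalization point concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The <result> and <team> element names and the team names are made up for illustration and are not taken from SportsML or any other real schema. It writes text in two different scripts into one document, serializes it as UTF-8 and reads it back unchanged.

```python
import xml.etree.ElementTree as ET

# Hypothetical sports-results record; element names and values are illustrative only.
result = ET.Element("result")
ET.SubElement(result, "team").text = "Bayern München"
ET.SubElement(result, "team").text = "横浜F・マリノス"

# Serialize to UTF-8 bytes with an explicit XML declaration, then parse the bytes back.
data = ET.tostring(result, encoding="utf-8", xml_declaration=True)
parsed = ET.fromstring(data)

# The text survives the round trip without any per-language handling.
teams = [team.text for team in parsed.findall("team")]
print(teams)
```

The only encoding decision in the sketch is the encoding argument; the parser picks it up from the declaration on the way back in.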

How much is too much?
While XML makes it easy to create special-purpose dialects, the privilege shouldn't be abused, Bray warned. Competing schemas handling similar tasks create the potential for confusion and broken connections. Consider musical notation, where there are at least a half-dozen projects to apply XML to standardizing music scores. Similarly, the seemingly arcane field of cave exploration has inspired at least three attempts at XML data standards.

"There's an incentive to create a language to solve your specific problem," Bray said. "But if there's something out there already that might serve your need, you should consider using it."

Ron Schmelzer, an analyst for research company ZapThink, says industry leaders typically have little trouble agreeing on what data needs to be represented in an XML schema but get hung up on how to do it--sometimes creating conflicting specifications.

"When you have two different organizations trying to push two different vocabularies for solving the same problem, it doesn't help the supply chain," Schmelzer said. "If you're a small guy, supporting a bunch of different schemas gets difficult."

XML everywhere

Extensible Markup Language offshoots such as Web logging foundation Really Simple Syndication and new Microsoft Office formats get all the attention, but XML is transforming the way people exchange information in countless areas.

Some examples you might have missed:

• LandXML is a format for arranging data on terrain. It's most commonly used to feed data from engineering applications that design roads, construction sites and other projects directly into navigation systems. Bulldozers and other construction vehicles use the systems to eliminate most of the need to have surveyors on site during construction.

• Karst Markup Language is one of several efforts to develop an XML schema optimized for sharing data from cave surveys and maps.

• Recipe Markup Language uses XML to create a standardized format for organizing and presenting cooking directions.

• MusicXML is one of several efforts to create an XML format for expressing music and notations. Among the potential benefits, scores could be fed directly into MIDI systems for playback.

• Theological Markup Language is meant to standardize scriptural citations and other references to theological documents.

• Mind Reading Markup Language is an apparently farcical and now abandoned project to mess with your head.

Source: CNET research

But proliferating schemas are more often a reflection of the complexity of the data that needs to be described, said Chuck Allen, director of the HR-XML Consortium, a human resources trade group shepherding more than a dozen XML offshoots to standardize data formats in areas such as payroll and stock-incentive plans.

"There has been some concern about hundreds of standards groups duplicating efforts, and there are cases where some of these groups could look over the other's shoulders more closely," Allen said. "But it gets complicated when you're trying to draft metadata standards to capture all this very complex domain knowledge."

Allen said his group employs sensible standards to ensure new XML projects truly serve a purpose. "We need at least three organizational sponsors and 10 participants," he said. "The main criteria are 'Is it in our domain?' and 'Is anybody else doing something about it?'"

Likewise, it would have been easy for the insurance industry to spawn a wealth of standards specialized for everything from boat coverage to reinsurance. But Lloyd Chumbley, assistant vice president of standards for trade group Acord, said the industry had a head start because it had already centralized on common paper forms, mainly to ensure agents could easily share data with insurers.

"When you're trying to do quotes for a policy, the last thing you need is to have to talk several different languages to communicate with several different insurers," he said. "The insurance industry for the most part has been using standardized forms generated by Acord since the 1960s, and that helped us maintain a single point of reference as everything got digitized."

Chumbley said the main proliferation challenge in the insurance industry is the localized schemas that have emerged to reflect changes in national laws. "We deal with a lot of different organizations internationally to consolidate XML schemas and definitions," he said. "When you're dealing across different cultures and legal systems, it takes time, but we're making progress."

Allen also expects consolidation of XML dialects. "There's been a lot of speculation as to whether there'll be more convergence, and I think that is going to be the case," he said. "I think it'll be because of IP (intellectual property) issues...which are sometimes more costly than the actual development. It takes a lot of resources to review the patent libraries, police the group's IP policies. If you have fewer organizations, there's fewer IP agreements."

John Simpson, author of several XML-related books, said the proliferation of XML dialects to describe similar data sets isn't the chaos machine one might assume, thanks to the ease of translating from one dialect to another.

"The fact there are different standards is ... it's almost trivial to get it from one dialect into another," Simpson said, crediting the simplicity and integrity of the main XML specification.

"They came up with really simple rules for how the XML spec is going to develop, and those have allowed tremendous flexibility," said Simpson, who created his own schema for classifying "B" movies. "People refer to XML as a language, but it's really a grammar for inventing new languages or describing ones that already exist. The XML spec itself is this kind of wonderful chameleon."
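Simpson's "grammar" description can be illustrated with a short sketch, assuming Python's standard xml.etree.ElementTree parser. The two documents below use made-up vocabularies (they are not the real Recipe Markup Language or any of the chess projects). The same generic function reads both, because it relies only on the rules every XML document shares, and it rejects a document that breaks those rules.

```python
import xml.etree.ElementTree as ET

def outline(xml_text):
    """Return (depth, tag) pairs for every element, whatever the vocabulary."""
    root = ET.fromstring(xml_text)
    rows = []

    def walk(elem, depth):
        rows.append((depth, elem.tag))
        for child in elem:
            walk(child, depth + 1)

    walk(root, 0)
    return rows

# Two made-up dialects, used only for illustration.
recipe = "<recipe><title>Toast</title><step>Toast the bread.</step></recipe>"
chess = "<game><move side='white'>e4</move><move side='black'>e5</move></game>"

print(outline(recipe))
print(outline(chess))

# A document that breaks the shared grammar is rejected regardless of dialect.
try:
    outline("<game><move>e4</game>")
except ET.ParseError as err:
    print("not well-formed:", err)
```

What the parser cannot tell you is what <move> or <step> means; that is the part each dialect's schema has to supply.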

Stephen O'Grady, an analyst at researcher RedMonk, agreed that the simplicity of the base XML standard makes it easy to accommodate multiple dialects, but he envisions a kind of Darwinian selection for competing schemas: Multiple approaches to similar problems blossom, the market states a preference and supporting software is tweaked to push data from one XML dialect into another.

"Because XML is the way it is, it's usually not intrinsically difficult to extract information," O'Grady said. "The situation with (Web log formats) RSS and Atom is a good example. I think it's likely the market will end up deciding one is the way to go over another, and then it's a pretty easy task to consolidate."

High-tech chess players, meanwhile, have a bounty of options. With five projects and counting underway to develop an XML-based system for describing chess moves, about the only apparent agreement is that one side has to be white and the other black.


Join the conversation!
Too many XML "standards"
I agree, there are way too many XML "standards" around. In order for the use of XML to overtake the use of previous standards there must be a consolidation in each sector's use of XML to one standard. Without this, XML will still be too difficult to use because each user will have to translate it to every other user's specification, defeating the purpose and ease of use XML can bring to industries.
Posted by Vader1809 (1 comment )
The Tower is Falling! The Tower is Falling!
How many human dialects are there? We still manage to do business and have a very complex civilization to show for it.

XML is just bits on the wire. It isn't the best design one could come up with, but it improved over the original inventor's work. There is no news content in this article.
Posted by (101 comments )
This article misses the point
XML is not a standard for exchanging information, it's a template for creating application specific messaging standards.

In order to exchange data, standard application data field definitions have to be agreed upon, e.g., what exactly do I mean by my "TimeStamp" and "OrderId"? You do that by creating a "Document Type Definition" (DTD).

To call each instantiation of XML a "dialect" is just wrong and confusing. They're all just different DTDs within the XML standard.
Posted by (1 comment )
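The commenter's point about agreeing on field definitions can be sketched in a few lines. This is an illustration under stated assumptions, not DTD validation: Python's standard xml.etree.ElementTree does not validate against a DTD, so the agreement is written here as a plain dictionary. The <Order> element, the ORDER_FIELDS table, the read_order helper and the value formats are all hypothetical; only the "TimeStamp" and "OrderId" names come from the comment above.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical agreement between trading partners: which child elements an
# <Order> must carry and how each value is interpreted. A DTD or schema would
# state the structure declaratively; this dict is only a stand-in.
ORDER_FIELDS = {
    "OrderId": int,                       # assumed: integer text
    "TimeStamp": datetime.fromisoformat,  # assumed: ISO 8601 text
}

def read_order(xml_text):
    """Parse an <Order> message and convert each agreed field, or raise ValueError."""
    root = ET.fromstring(xml_text)
    if root.tag != "Order":
        raise ValueError(f"unexpected root element: {root.tag}")
    order = {}
    for name, convert in ORDER_FIELDS.items():
        value = root.findtext(name)
        if value is None:
            raise ValueError(f"missing field: {name}")
        order[name] = convert(value.strip())
    return order

message = (
    "<Order><OrderId>1042</OrderId>"
    "<TimeStamp>2004-09-07T04:00:00</TimeStamp></Order>"
)
order = read_order(message)
print(order)
```

A message that is perfectly well-formed XML but leaves out an agreed field, or uses a different name for it, fails here, which is the gap the shared definition is meant to close.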
The only thing I've read so far that makes sense
I found the lack of even the most basic understanding of XML in this article very disturbing. Your comments on this are the only ones that have made some sense so far.

Schemas are not "dialects" or "flavors" of XML. XML is a language to mark up documents; basically, it defines the STRUCTURE of what information looks like. A Schema (or DTD) is a particular instance of a document type; it gives the structure a MEANING. In other words, XML tells you what an element is: a name between '<' and '>' (<tag-name>), for example. A schema fills in the tag name and defines its relation to other tags. So, you might have a schema that defines <paragraph> and the rules for what a paragraph contains.

Now, what about more than one schema for the same topic area? Let's say 6 schemas for music notation. This is potentially redundant, but possibly not, depending on the scope of what the schema is trying to solve. At any rate, it is a moot point because of the related XML standard called XSL (I won't get into XSL versus XSLT, I'll keep it simple). XSL is a stylesheet language that allows one to transform one XML document to any other kind of output (another XML document, HTML, plain text, etc.). This means you can write a stylesheet to translate schema A to schema B. So, in our music example, XSL allows you to say something like <note> in schema A translates to <ANote> in schema B. Moreover, you could even create an XSL that would translate some music schema into an actual music format to play on your computer.

Also note that the Schema definition language itself and XSL are both XML! XML is how information is structured. Schema defines that structure and XSL defines how to translate from one format to another.
Posted by (1 comment )
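The <note>-to-<ANote> translation described in the comment above can be sketched as follows. This is not XSL: Python's standard library has no XSLT processor, so the sketch swaps in a plain recursive rename using xml.etree.ElementTree. The A_TO_B table, the translate helper, the <score> element and the pitch attribute are hypothetical, and neither vocabulary is a real music format such as MusicXML.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from "schema A" element names to "schema B" names.
A_TO_B = {"score": "AScore", "note": "ANote"}

def translate(elem):
    """Return a copy of elem with tags renamed per A_TO_B; unknown tags pass through."""
    out = ET.Element(A_TO_B.get(elem.tag, elem.tag), dict(elem.attrib))
    out.text = elem.text
    out.tail = elem.tail
    for child in elem:
        out.append(translate(child))
    return out

source = ET.fromstring(
    '<score><note pitch="C4">quarter</note><note pitch="E4">half</note></score>'
)
target = translate(source)
print(ET.tostring(target, encoding="unicode"))
```

A pure rename is the easy case. Two dialects that structure the same information differently need real restructuring rules, which is the job a stylesheet language is designed for.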
Just the way MOST things go
This is just the way things go. The Tower of Babble is exactly what happens with things that are open and free. They cannot by their nature comply with strict standards (it's too easy not to). When it gets too complex something new will be there to replace it. I guess things like XML give the Microsofts of the world great advantage. It's amazing to watch.
Posted by Stan Johnson (322 comments )
XML for robots/Knowledge
So XML, defining the attributes of all objects in the world following an object-oriented method, can be useful for any system or program to derive useful information about all things (objects).

The last XML standard that needs to be made is for human behavior... (it's too dangerous; systems will predict our behavior)
Posted by (1 comment )
That is the whole point
As I recall, the XML standard was designed to more
easily allow a diversity of knowledge representation.
The "semantic web" is an example of extending the old
hypertext idea to encompass actual knowledge that can
more easily be shared. One can imagine a time where
this would allow a system to grow and acquire
knowledge by simply plugging in. The diversity of
DTDs is necessary in order to find a data
representation that might be used for this.
Posted by Johnny Mnemonic (374 comments )
WebServices is facing the same problem of increasing complexity
XML was supposed to be simple, a carrier of data and some logic. It has become way too complicated for an average IT user. This wiki page has some discussion around various merits/demerits.

On a different note, the same thing is happening with WebServices--'Uncontrolled Proliferation of Specifications'. See my blog entry, which touches on this topic briefly.
Posted by xbhatti (2 comments )
What Am I Missing Here?
The article seems to suggest that XML is a way of standardizing communications by diversifying communications. At one point it even suggests XML is the Lingua Franca. I once thought the point was that HTML was the Lingua Franca ... at least it was until Microsoft came along with their own non-conforming version that I lovingly call MSML.

It seems to me that having 5000+ variations of a communications standard will make things LESS compatible and LESS standardized.

It's like allowing anyone on the planet to make up custom languages so that we can all learn to communicate with each other. ***?

What am I missing?

- Sheldon
Posted by (1 comment )
Anybody ever use EDI?
XML is just the latest and greatest version...

I like XML better, but it is STILL EDI.
Posted by jwhirsch (2 comments )
