Aside from that, it's hard to find an industry or interest that isn't taking advantage of the fast-growing standard for Web services and data exchange. In the six years since the main XML specification was first published, it's spawned hundreds of dialects, or schemas, benefiting everyone from butchers to bulldozer operators wishing to easily exchange information electronically.
What's new:
In the six years since the main XML specification was first published, it's spawned hundreds of dialects, benefiting everyone from butchers to bulldozer operators.
Bottom line:
The proliferation could mean the standard is a success or could be the start of a new headache.
While some industry observers worry proliferation has gone too far, potentially creating new instances of the interoperability problems that XML was meant to solve, proponents say the explosion of schemas is a testament to the format's success.
Tim Bray, co-inventor of the main XML specification, said the proliferation of special-interest XML dialects validates what he and his colleagues set out to achieve.
"The idea from the start was to make it as easy as possible for people to come up with their own special languages for their specific problems," Bray said. "In the big picture, I think XML is more successful than any of us who designed it ever thought it would be."
XML is most often lauded as a foundation for delivering Web services and is the base for plans from Microsoft and other software makers to ease the development and maintenance of business programs. Web services and XML are also major components of Indigo, a new communications subsystem that's slated to be part of Longhorn, the next major release of Windows. Microsoft recently revised its plans for Longhorn and said it will make Indigo available for Windows XP and other current versions of Windows, meaning that it should soon become even easier to exchange XML data between computers.
Also, XML data exchange is a must for companies wishing to join the growing movement toward building new business software using a more flexible model called a "services-oriented architecture." Proponents say SOAs can make software easier to reconfigure as needs change and that they're cheaper to maintain in the long run.
As a vehicle for describing complex sets of data in a globally comprehensible way that works smoothly across the Internet, however, XML is already there. Just ask your local chicken farmer, who is, or soon will be, benefiting from Meat and Poultry XML (mpXML), an offshoot of the Global Standards Management Process that is designed to meet the special needs of producers, retailers and distributors of flesh food.
"Anything our people can do to move that product through the supply chain faster pays off for them in less shrinkage" and spoilage, he said. "Without a system, the managers of these (grocery store) meat departments have to spend time walking the aisle and seeing what they have too much of and when it expires."
Relatively speedy industry agreement on XML has helped producers and sellers boost business and prepare for new challenges, Ashby said. "The need for a global standard has really increased, especially now that Congress is pushing for country-of-origin labeling," he said.
The benefits of XML were similarly obvious for the newspapers and other media outlets that need to deal with voluminous and often inconsistently formatted statistics reported on sports pages, said Alan Karben, chairman of the SportsML Working Group, a branch of the International Press Telecommunications Council that oversees Sports Markup Language.
"Because people's appetites for esoteric sports statistics are so insatiable, the data reports that get exchanged and formatted for display are often incredibly intricate," Karben wrote in an e-mail exchange. "For our industry, the benefits of XML are clear: consistent input no matter what the provider, what the sport, what the native language."
XML has succeeded, co-creator Bray said, because it has solved several of the more vexing challenges for electronic data exchange, including the growing need to deal with diverse languages and character sets.
"One of the big problems is internationalization," Bray said. "One of the reasons XML took off is because it solved a lot of those issues with Unicode, which was fairly new at that point."
How much is too much?
While XML makes it easy to create special-purpose dialects, the privilege shouldn't be abused, Bray warned. Competing schemas handling similar tasks create the potential for confusion and broken connections. Consider musical notation, where there are at least a half-dozen projects to apply XML to standardizing music scores. Similarly, the seemingly arcane field of cave exploration has inspired at least three attempts at XML data standards.
"There's an incentive to create a language to solve your specific problem," Bray said. "But if there's something out there already that might serve your need, you should consider using it."
Ron Schmelzer, an analyst for research company ZapThink, says industry leaders typically have little trouble agreeing on what data needs to be represented in an XML schema but get hung up on how to do it--sometimes creating conflicting specifications.
"When you have two different organizations trying to push two different vocabularies for solving the same problem, it doesn't help the supply chain," Schmelzer said. "If you're a small guy, supporting a bunch of different schemas gets difficult."
XML everywhere
Extensible Markup Language offshoots such as Web logging foundation Really Simple Syndication and new Microsoft Office formats get all the attention, but XML is transforming the way people exchange information in countless areas.
Some examples you might have missed:
LandXML is a format for arranging data on terrain. It's most commonly used to feed data from engineering applications that design roads, construction sites and other projects directly into navigation systems. Bulldozers and other construction vehicles use the systems to eliminate most of the need to have surveyors on site during construction.
Karst Markup Language is one of several efforts to develop an XML schema optimized for sharing data from cave surveys and maps.
Recipe Markup Language uses XML to create a standardized format for organizing and presenting cooking directions.
MusicXML is one of several efforts to create an XML format for expressing music and notations. Among the potential benefits, scores could be fed directly into MIDI systems for playback.
Theological Markup Language is meant to standardize scriptural citations and other references to theological documents.
Mind Reading Markup Language is an apparently farcical and now abandoned project to mess with your head.
Source: CNET News.com research
But proliferating schemas are more often a reflection of the complexity of the data that needs to be described, said Chuck Allen, director of the HR-XML Consortium, a human resources trade group shepherding more than a dozen XML offshoots to standardize data formats in areas such as payroll and stock-incentive plans.
"There has been some concern about hundreds of standards groups duplicating efforts, and there are cases where some of these groups could look over the other's shoulders more closely," Allen said. "But it gets complicated when you're trying to draft metadata standards to capture all this very complex domain knowledge."
Allen said his group employs sensible standards to ensure new XML projects truly serve a purpose. "We need at least three organizational sponsors and 10 participants," he said. "The main criteria are 'Is it in our domain?' and 'Is anybody else doing something about it?'"
Likewise, it would have been easy for the insurance industry to spawn a wealth of standards specialized for everything from boat coverage to reinsurance. But Lloyd Chumbley, assistant vice president of standards for trade group Acord, said the industry had a head start because it had already centralized on common paper forms, mainly to ensure agents could easily share data with insurers.
"When you're trying to do quotes for a policy, the last thing you need is to have to talk several different languages to communicate with several different insurers," he said. "The insurance industry for the most part has been using standardized forms generated by Acord since the 1960s, and that helped us maintain a single point of reference as everything got digitized."
Chumbley said the main proliferation challenge in the insurance industry is the localized schemas that have emerged to reflect changes in national laws. "We deal with a lot of different organizations internationally to consolidate XML schemas and definitions," he said. "When you're dealing across different cultures and legal systems, it takes time, but we're making progress."
Allen also expects consolidation of XML dialects. "There's been a lot of speculation as to whether there'll be more convergence, and I think that is going to be the case," he said. "I think it'll be because of IP (intellectual property) issues...which are sometimes more costly than the actual development. It takes a lot of resources to review the patent libraries, police the group's IP policies. If you have fewer organizations, there's fewer IP agreements."
John Simpson, author of several XML-related books, said the proliferation of XML dialects to describe similar data sets isn't the chaos machine one might assume, thanks to the ease of translating from one dialect to another.
"The fact there are different standards is immaterial...it's almost trivial to get it from one dialect into another," Simpson said, crediting the simplicity and integrity of the main XML specification.
"They came up with really simple rules for how the XML spec is going to develop, and those have allowed tremendous flexibility," said Simpson, who created his own schema for classifying "B" movies. "People refer to XML as a language, but it's really a grammar for inventing new languages or describing ones that already exist. The XML spec itself is this kind of wonderful chameleon."
Stephen O'Grady, an analyst at researcher RedMonk, agreed that the simplicity of the base XML standard makes it easy to accommodate multiple dialects, but he envisions a kind of Darwinian selection for competing schemas: Multiple approaches to similar problems blossom, the market states a preference and supporting software is tweaked to push data from one XML dialect into another.
"Because XML is the way it is, it's usually not intrinsically difficult to extract information," O'Grady said. "The situation with (Web log formats) RSS and Atom is a good example. I think it's likely the market will end up deciding one is the way to go over another, and then it's a pretty easy task to consolidate."
High-tech chess players, meanwhile, have a bounty of options. With five projects and counting underway to develop an XML-based system for describing chess moves, about the only apparent agreement is that one side has to be white and the other black.

XML is just bits on the wire. It isn't the best design one could come up with, but it improved over the original inventor's work. There is no news content in this article.
In order to exchange data, standard application data field definitions have to be agreed upon. E.g., what exactly do I mean by my "TimeStamp" and "OrderId"? You do that by creating a "Document Type Definition" (DTD).
To call each instantiation of XML a "dialect" is just wrong and confusing. They're all just different DTDs within the XML standard.
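To make the point about agreed field definitions concrete, here is a minimal Python sketch. It is not a DTD validator (Python's standard library does not validate against DTDs); it only hand-checks that a document carries the fields two partners agreed on. The <Order> wrapper element, the sample values and the function name are assumptions made up for illustration; only "TimeStamp" and "OrderId" come from the comment above.

```python
import xml.etree.ElementTree as ET

# Hypothetical agreement between two trading partners: every <Order>
# must carry these child fields. A DTD could declare the same rule as
# <!ELEMENT Order (OrderId, TimeStamp)>.
REQUIRED_FIELDS = ("OrderId", "TimeStamp")


def missing_fields(xml_text):
    """Return the agreed field names that are absent from an <Order> document."""
    order = ET.fromstring(xml_text)
    return [name for name in REQUIRED_FIELDS if order.find(name) is None]


# Illustrative documents; the values are invented.
good = "<Order><OrderId>A-1001</OrderId><TimeStamp>2004-08-30T09:15:00Z</TimeStamp></Order>"
bad = "<Order><OrderId>A-1002</OrderId></Order>"

print(missing_fields(good))
print(missing_fields(bad))
```

Both documents are well-formed XML; only the first one satisfies the agreement. That is the difference between XML itself and a document type defined on top of it.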
Schemas are not "dialects" or "flavors" of XML. XML is a language for marking up documents; basically, it defines the STRUCTURE of what information looks like. A schema (or DTD) defines a particular document type; it gives the structure a MEANING. In other words, XML tells you what an element looks like (<tag-name>), for example. A schema fills in the tag name and defines its relation to other tags. So, you might have a schema that defines <paragraph> and the rules for what a paragraph contains.
Now, what about more than one schema for the same topic area? Let's say 6 schemas for music notation. This is potentially redundant, but possibly not, depending on the scope of what the schema is trying to solve. At any rate, it is a moot point because of the related XML standard called XSL (I won't get into XSL versus XSLT, I'll keep it simple). XSL is a stylesheet language that allows one to transform one XML document to any other kind of output (another XML document, HTML, plain text, etc.). This means you can write a stylesheet to translate schema A to schema B. So, in our music example, XSL allows you to say something like <note> in schema A translates to <ANote> in schema B. Moreover, you could even create an XSL that would translate some music schema into an actual music format to play on your computer.
Also note that the schema definition language itself and XSL are both XML! XML is how information is structured. Schema defines that structure and XSL defines how to translate from one format to another.
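As a rough illustration of the schema A to schema B translation described above, here is a Python sketch. It does not use XSL; it stands in for a stylesheet by renaming elements with the standard library's ElementTree. The <score> wrapper, the "pitch" attribute and the mapping table are hypothetical; only <note> and <ANote> come from the comment.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from schema A element names to schema B names.
A_TO_B = {"score": "AScore", "note": "ANote"}


def translate(xml_text):
    """Rename every mapped element; attributes, text and unmapped elements are kept."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        element.tag = A_TO_B.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")


# Illustrative schema A document (invented for this example).
schema_a = '<score><note pitch="C4"/><note pitch="E4"/></score>'
print(translate(schema_a))
```

A real translation between two music schemas would usually need more than renaming (restructuring, unit conversions, defaults), which is where a full XSL stylesheet earns its keep.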
The last XML standard that needs to be made is for human behavior... (it's too dangerous; systems will predict our behavior)
easily allow a diversity of knowledge representation. The "semantic web" is an example of extending the old hypertext idea to encompass actual knowledge that can more easily be shared. One can imagine a time where this would allow a system to grow and acquire knowledge by simply plugging in. The diversity of DTDs is necessary in order to find a data representation that might be used for this purpose.
On a different note, the same thing is happening with Web services: "Uncontrolled Proliferation of Specifications." See my blog entry at http://www.khaitan.org/mt/archives/000020.html, which touches on this topic briefly.
It seems to me that having 5000+ variations of a communications standard will make things LESS compatible and LESS standardized.
It's like allowing anyone on the planet to make up custom languages so that we can all learn to communicate with each other. ***?
What am I missing?
- Sheldon
- Anybody ever use EDI?
- by jwhirsch July 15, 2005 11:45 AM PDT
- XML is just the latest and greatest version...
- it is STILL EDI
- by alek_nedic May 18, 2007 5:47 AM PDT
- http://www.analogstereo.com/vacuum/miele_vacuum_convenience.htm
I like XML better, but it is STILL EDI.