The XML Revolution

Publishing anywhere on any device

Tom Arah investigates the importance of XML and Adobe's Network Publishing initiative.

Things have certainly changed a lot in the world of publishing. Two years ago Adobe, the company that kicked off the DTP revolution, was in dire straits (see RW 51). By concentrating so blindly on its central PDF (portable document format) strategy it had failed to see the full significance of HTML and the Web. Its share price had plunged accordingly and Quark was threatening an unwelcome takeover.

Now, with a revamped range of programs, the situation has been transformed. In fact none of the company's new applications, InDesign, LiveMotion and GoLive, can claim to be best-of-breed compared to XPress, Flash or Dreamweaver but, by working together with the rejuvenated flagships Illustrator, Acrobat and Photoshop, they combine to offer the best all-round solution for publishing to both page and screen. The end result is a revived company and a booming share price - a fact well illustrated by the recent press trip to Berlin.

Normally the price for such lavish hospitality is to be bombarded with new product launches and strategy announcements. In this case, however, Adobe had nothing to push. This certainly made the trip enjoyable but was also slightly worrying. My unease grew as I pondered the front-page article in my complimentary copy of the International Herald Tribune (you know how it is). The heading pretty much said it all, "XML and the Net: Better than Sliced Bread", though this proved pretty tame compared to the Microsoft spokesman's quote comparing the invention of XML to the invention of writing! If even a tenth of the predictions come true, it's clear that XML is going to revolutionize all aspects of computing - including publishing.

So just what is XML, where does it come from and what does it do? The article rightly gave the credit for XML (extensible markup language) to the World Wide Web Consortium (W3C). In particular it quoted Tim Bray, one of the co-editors of the specification, saying that XML was designed by "a bunch of people with a track record in the publishing industry… who thought we were building the document format of the future. The enemy, explicitly was Microsoft Word, FrameMaker, Quark - all these fragile, proprietary binary file formats that lock up the inventory of human knowledge." In other words XML was explicitly invented to revolutionize the world of publishing!

The W3C is the home of XML development.

So how exactly is it going to achieve this feat? The answer is frighteningly simple. Essentially XML boils down to a method of labeling alphanumeric content with user-definable tags. Just as with HTML these tags take the form of angle-bracketed text surrounding the content. The difference is that the tags themselves aren't defined as part of the language, hence the all-important extensibility. In other words XML is a markup language for defining other markup languages.

An example shows how the system works. While in HTML you might surround an address with the in-built <address> tag, in XML you could take things several stages further with a <contact> tag that included <firstname>, <surname>, <company>, <address> and <postcode> tagged information. Or a <pricelist> could contain <product>, <quantity>, <colour>, <size> and <price> tags.
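
To make the <contact> example concrete, here is a minimal sketch in Python using the standard library's ElementTree parser. The names, the address and the helper function contact_to_dict are invented purely for illustration; the point is simply that once content is labeled, a program can pick out each field by name rather than guessing from layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical contact record using the tag names from the text above.
# The values are invented sample data.
CONTACT_XML = """
<contact>
  <firstname>Jane</firstname>
  <surname>Smith</surname>
  <company>Example Press</company>
  <address>1 Sample Street</address>
  <postcode>AB1 2CD</postcode>
</contact>
"""


def contact_to_dict(xml_text):
    """Return a dict mapping each child tag name to its text content."""
    root = ET.fromstring(xml_text.strip())
    return {child.tag: (child.text or "").strip() for child in root}


contact = contact_to_dict(CONTACT_XML)
# Each field is now addressable by the name its author gave it.
print(contact["surname"], "-", contact["company"])
```

The same few lines would work unchanged for a <pricelist> or any other vocabulary, which is really what the "extensible" in XML means in practice.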

The underlying principle isn't that different to HTML but the benefits that flow are staggering as they enable XML to act as a cross between HTML, a structured database and a dynamic application. The most obvious benefits are for e-commerce where XML-aware search engines will be able to intelligently compile buying choices from across the Web. Even better, to then make the purchase, all you will need to do is click on the price that takes your fancy and all billing, delivery and stock handling can be managed automatically through XML.

The implications for business are hard to overestimate, but what about for design? In fact the XML specification has absolutely nothing to say about design at all. Paradoxically though it's exactly because XML is only concerned with the structuring of content that it ultimately also proves so revolutionary in terms of design. In particular, because the appearance of an XML document isn't determined by the XML itself, this can be left to other dedicated markup languages.

Currently the most common of these is CSS (cascading style sheets), which Web designers will already be familiar with. CSS works by defining the appearance of any given tag so that an <H1> heading tag can be set to a particular font, point-size and colour and every instance is automatically formatted accordingly. This is intrinsically more efficient than formatting each instance individually in HTML, and CSS also enables greater formatting control with the ability to manage borders, spacing, ruling lines, background colours and images and so on.

An XML file in IE5 with and without CSS styling.

What really sets the semi-detached style sheet approach apart though is its re-styling capability. By editing a single shared CSS file, for example, the look and feel of an entire Web site can be instantly changed. Even more exciting is the potential to change style sheets on the fly. If a viewer prefers a serif to a sans serif, or wants to make the text larger, for example, this can easily be accommodated. Because the appearance is completely separate from the content, the developer or user can simply swap design "skins". Even more importantly this opens up the ability to tailor designs to given devices such as differing screen resolutions, or print, or even audio.
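
The idea of swappable "skins" can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the skin names, the <heading> and <body> tag names, the property values and the build_stylesheet helper are all hypothetical. It generates one CSS rule per XML tag, so the same content file can be paired with whichever stylesheet suits the device.

```python
# Hypothetical skins: each maps an XML tag name to CSS property/value pairs.
SKINS = {
    "screen": {
        "heading": {"font-family": "sans-serif", "font-size": "24pt"},
        "body": {"font-family": "sans-serif", "font-size": "12pt"},
    },
    "print": {
        "heading": {"font-family": "serif", "font-size": "18pt"},
        "body": {"font-family": "serif", "font-size": "10pt"},
    },
}


def build_stylesheet(skin):
    """Return CSS text with one rule per tag in the given skin."""
    rules = []
    for tag, declarations in skin.items():
        body = " ".join(f"{prop}: {value};" for prop, value in declarations.items())
        # XML elements have no built-in display type, so make each a block.
        rules.append(f"{tag} {{ display: block; {body} }}")
    return "\n".join(rules)


# The content never changes; only the stylesheet served with it does.
print(build_stylesheet(SKINS["screen"]))
print(build_stylesheet(SKINS["print"]))
```

Adding a large-type or handheld skin would mean adding one more entry to the table, with no change to the content at all.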

CSS is certainly powerful and has the advantage of familiarity but there's another styling language that offers even more. XSL (extensible stylesheet language) is itself based on XML and offers CSS-style formatting through XSLFO (XSL Formatting Objects) but adds in a whole new area of restyling capability through XSLT (XSL Transformations). What makes XSLT so different is that it is able to work hand in hand with the content information in the XML document to transform it before display.

Using XSLT it is possible, for example, to automatically pull out all the <heading> tags in an XML file as a table of contents at the top of the file, or to pull out the relevant stories from a number of XML files to produce a completely customized document. Throw in some XML-aware scripting, for example to convert <price> tag figures according to their currency attribute, and you can automatically produce documents tailor-made to each reader. Compared to such dynamic, personalized, on-the-fly generation, traditional, static, one-off publishing will soon seem very crude.
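
To give a feel for what such a transformation does, here is a rough sketch in Python rather than in XSLT itself, so treat it as a stand-in for the technique and not as XSLT syntax. The sample document, the <para> tag, the helper names and above all the exchange rates are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document; tag names follow the examples in the text.
DOC_XML = """
<article>
  <heading>What is XML?</heading>
  <para>Content labeled with tags.</para>
  <heading>Styling with XSL</heading>
  <para>Appearance kept separate.</para>
  <price currency="USD">10.00</price>
</article>
"""

# Made-up conversion rates into pounds, purely for illustration.
RATES_TO_GBP = {"GBP": 1.0, "USD": 0.5}


def table_of_contents(root):
    """Collect the text of every <heading> element, in document order."""
    return [heading.text for heading in root.iter("heading")]


def price_in_gbp(price_element):
    """Convert a <price> figure according to its currency attribute."""
    rate = RATES_TO_GBP[price_element.get("currency")]
    return round(float(price_element.text) * rate, 2)


root = ET.fromstring(DOC_XML.strip())
print(table_of_contents(root))
print(price_in_gbp(root.find("price")))
```

Neither function knows anything about how the document will look. They work purely from the structure the tags provide, which is what makes this kind of automatic repurposing possible.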

In most cases the generated output file will itself be XML-based but, thanks to template matching, XSLT can also translate from XML to other formats, for example converting each instance of the XML document's <heading> tag to HTML-compliant <H1> tags. Eventually of course this stage shouldn't be necessary as browsers will deal with XML directly - or at least XHTML, the XML-compliant version of HTML - but in the meantime XSLT transformations enable XML to be converted to HTML on the fly. Alternatively it can be converted to WML (wireless markup language) for the current generation of WAP phones, or to whatever replaces it.
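
The template-matching idea can be mimicked in Python as a simple lookup table from source tags to output tags. Again this is a sketch of the principle rather than real XSLT, and the TEMPLATES table, the <para> tag and the to_html helper are my own hypothetical names. The output tags are lower case, as XHTML requires.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical template table: XML source tag -> HTML output tag.
TEMPLATES = {"heading": "h1", "para": "p"}


def to_html(xml_text):
    """Convert each top-level child with a matching template to HTML."""
    root = ET.fromstring(xml_text)
    parts = []
    for element in root:
        html_tag = TEMPLATES.get(element.tag)
        if html_tag is None:
            continue  # no template matches this tag, so it is left out
        # Re-escape the text so characters such as & stay valid in HTML.
        parts.append(f"<{html_tag}>{escape(element.text or '')}</{html_tag}>")
    return "\n".join(parts)


print(to_html("<doc><heading>The XML Revolution</heading><para>Any device.</para></doc>"))
```

Swapping in a different table, say one mapping to WML tags, would retarget the same content to another device without touching the source file.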

And of course this is where all the restyling and transformation benefits of XML have really been leading. The absolutely fundamental change that is going to occur over the next five years is the shift from dial-up access to the internet via PC to always-on broadband access via handheld devices. And what this means for the design and publishing industry is another medium, the Web handheld, that will be just as important to cater for as the PC and print.

In fact, if the analysts are right, the handheld will be the most important medium of all. According to predictions, in five years' time you are likely to be reading PC Pro on a handheld tablet. Or rather you'll be browsing a unique customized digest of all your areas of interest from across the Dennis stable of XML-based publications (no doubt interspersed with the odd carefully targeted ad and special offer).

Suddenly all the hype about XML seems very understandable. XML is the underlying technology that will turn the Internet into a giant database, that will underpin the next wave of e-commerce, and that will enable dynamic customized publishing to any device. And of course this time round Microsoft has spotted the potential almost immediately. With its .NET strategy the company has embraced the concept of XML and extended it with its vision of a Microsoft-dominated, XML-based computing platform in which Net, computer, application and content merge.

It's clear then that one of Tim Bray's original targets - the ubiquitous Word DOC (and with it the rest of the Office apps' proprietary binary formats) - will soon fall to XML, but what about the two other formats he singled out? Have the two publishing giants behind the XPress and FrameMaker formats been as quick to recognize the way the wind is blowing? Or are they going to be left trailing again?

Based on past experience, I wasn't expecting too much of Quark. After all XPress hasn't exactly been at the forefront of the Web revolution. Quark's vision of the Web wasn't based on HTML but rather on its proprietary, binary and completely misguided Immedia technology. The result was a huge number of very unhappy high-end publishers whose print-based assets simply couldn't be realized on the Web without redesigning from scratch. Visit Quark's site today, however, and you'll find that Quark has seen the open standard light and is a born-again XML evangelist.

Quark has become an XML evangelist.

It's not just talk either. The avenue.quark product, which looks set to be merged into XPress 5, already lets you take XPress-based publications and set up simple rules for converting stylesheets to XML-based tags. Once each XPress document (including all that legacy material) has been converted into an XML master, it can then be translated to HTML for access today using dedicated XML translation software such as StoryServer. For the typical regular, high-end XPress publication this system is absolutely ideal. Invest in the right set-up and XML-based infrastructure and you have instant high quality repurposing that can be customized to individual devices and individual requirements. No wonder Quark is excited.

So what about Adobe? The third target Tim Bray singled out was FrameMaker and it's easy to see why. With the program's tag-based, text-flow architecture FrameMaker could almost have been designed as the archetypal XML publisher. With this open goal sitting in front of it, however, Adobe has somehow taken its eye off the ball. While investing all its R & D into the heavily PDF and print-oriented InDesign, the natural cross-publishing tie-in of FrameMaker and XML seems to have been missed completely. With FrameMaker 6, Adobe was even reduced to bundling in a third-party add-on, WebWorks Publisher, to paper over the lack of any serious in-built XML development. The contrast with Quark's total commitment to XML could hardly be clearer.

Oh dear, this is worryingly familiar. By concentrating on PDF it looks like Adobe has again missed the full significance of the Web - only this time it's XML rather than HTML that it has underestimated. Even worse, with its decision to start again from scratch with InDesign, it looks like the company has managed to come up with a "next-generation publishing solution" that is already hopelessly out-of-date! So is the current Adobe revival premature? Has the company missed the boat again? Should I forget about future European jaunts?

Hopefully not. While there weren't any major announcements on the press trip, Adobe made up for it in early November with a whole host of releases involving HP, Nokia, RealNetworks and Interwoven, and all centred around the company's new "Network Publishing" initiative. This is being billed as the third wave of publishing following on from DTP-based print and HTML-based traditional Web publishing. And the goal sounds very familiar: "making visually rich, personalized content available anytime, anywhere on any device."

Adobe's vision for the future is Network Publishing.

Adobe hasn't been asleep then, but the obvious question is how is the vision going to be achieved? After Microsoft and Quark's conversion, I expected a wholesale championing of XML, but XML is only the first of "the industry standards on which Network Publishing is based" followed by PDF, SVG (scalable vector graphics), SMIL (synchronized multimedia integration language) and WML. The supporting announcements are hardly any more enlightening with two GoLive-related announcements of a WML emulator in association with Nokia and a SMIL extension in association with RealNetworks - all very worthy but hardly earth-shattering. Ultimately the only real practical clue is that "content created with Adobe software will be meta-tagged for management, distribution and display".

So is Network Publishing little more than a public relations exercise to cover up the fact that Adobe is still betting on the wrong horse, choosing PDF rather than XML? Again I don't think so. It's important to realize that while the XML and XSL combination is hugely powerful it's not a panacea. To begin with, while the central tagging idea in XML is very simple, when you get down to handling the necessary DTDs, namespaces, XSLT and so on it's a different matter. That's fine for the organizational publishing that Microsoft and Quark cater to, but it's just not feasible lower down the scale. Even more fundamental is XML's in-built focus on alphanumeric content. CSS and XSLFO can dress that up to an extent, but by its nature XML is not a design-intensive solution and was never intended to be.

In particular of course it leaves graphical content out in the cold - something Adobe could hardly be expected to endorse. This is where Adobe's huge investment in developing SVG over the last few years comes in. Almost unnoticed Adobe has completely remodelled its approach to vector graphics, using the lessons learnt from XML to entirely separate the content (shapes, paths and text) and the styled appearance. Without a widely available SVG-viewing platform, the benefits are still notional rather than practical, but the joint announcement with RealNetworks of improved SMIL support is doubly important. To begin with it promises to take SVG out of simple restylable Web button territory into full interactive multimedia. Just as importantly the SVG viewer will be automatically available to over 150 million RealPlayer users.

SVG is an open XML-based standard - but Adobe is its major champion.

Perhaps what's most significant about both SVG and SMIL initiatives though is that they aren't competitors to XML - they are XML! As discussed earlier XML is a markup language for producing other markup languages and SVG and SMIL are the W3C-endorsed open standards for describing vector graphics and multimedia in XML. This means that the text content within the SVG remains accessible and that the SVG can even be automatically generated via XML translation. Suddenly with a scalable, restylable, dynamic, interactive, open, multimedia Web-based SVG/SMIL/XML solution, vanilla-XML looks considerably less compelling. There's certainly plenty of rich ground for Illustrator, GoLive and LiveMotion to exploit over the next few years.
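
Because SVG is just XML, generating a graphic from tagged content needs nothing more exotic than an XML library. The following Python sketch turns a hypothetical <label> element into a simple button graphic. The <label> tag, the label_to_svg helper, the sizes and the colours are my own assumptions for illustration; the namespace is the standard SVG one.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"  # the standard SVG namespace


def label_to_svg(xml_text):
    """Turn a hypothetical <label>text</label> element into an SVG button."""
    label = ET.fromstring(xml_text)
    svg = ET.Element("svg", {"xmlns": SVG_NS, "width": "120", "height": "40"})
    # The shape and its appearance are ordinary elements and attributes.
    ET.SubElement(svg, "rect", {"width": "120", "height": "40", "fill": "navy"})
    text = ET.SubElement(svg, "text", {"x": "10", "y": "25", "fill": "white"})
    # The wording stays as real text, so it remains searchable and editable.
    text.text = label.text
    return ET.tostring(svg, encoding="unicode")


button = label_to_svg("<label>Buy now</label>")
print(button)
```

Change the source label and the graphic changes with it, with the text inside the SVG still readable by any XML-aware tool.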

But what about PDF? Surely such a print-oriented technology must have had its day in the era of the handheld screen? Again don't be too hasty. To begin with of course it's important not to underestimate print. Reading material onscreen is all very well, but when you come across something you want to keep, it's natural to print it out, or these days to WebCapture it as PDF. As such PDF remains very much at the heart of Adobe's Network Publishing vision with the supporting promotional video showing users happily printing at Web kiosks and loading Acrobat Reader to view PDFs on their PalmPilot.

But hang on. It's bad enough trying to read PDFs on a full size screen but does Adobe really expect users to peer at a multi-column A4 page through a tiny window? I think it's safe to assume not, which can only mean one thing - the PDF is going to be given a dedicated onscreen reading mode. In fact there's no reason to see this as that fundamental a change. Many users already commonly copy the text from a PDF into Word as RTF as the text immediately becomes much easier to read onscreen when separated from its print-oriented layout.

A dedicated screen view would revolutionize the currently print-based Acrobat PDF.

Of course in a dedicated onscreen viewing mode, Acrobat would offer far greater control than this - but how might it be handled? Presumably this is where that meta-tagging of content comes in. The text in a PDF is already accessible so it's a relatively small step to shift from RTF-formatted text to XML-structured text and, as we've seen, the ensuing benefits in terms of data management and restyling are huge. Even better is the way that, by building in the repurposing intelligence within the PDF/Reader combination, the necessary complicated XML-based content and XSL-based styling and translations can be simply and neatly wrapped up in the file itself.

Even so it's clear that somehow the user would have to be put in control of the meta-tagging process to design the screen layout in exactly the same way that they currently design its printed layout. But how? Step forward the most likely candidate - InDesign. If the program was always intended to fill this dual role, controlling the design of PDFs for screen-based viewing as much as for print-based reading, it would certainly explain the next-generation hype that surrounded the program's launch. And explain why Adobe insiders are so much keener on InDesign than any of its current users.

By now though this is pure speculation. More importantly of course, without any actual products to test and compare, it's impossible to predict which technologies and approaches will prove successful in practice. Frankly though I don't think that many of the developers in Microsoft, Quark or Adobe would claim that they know exactly how things will turn out either, let alone Tim Bray and the development team that came up with XML in the first place.

Two things are certain though. XML is here to stay in one form or another and the world of publishing is going to be transformed again as a result.

Tom Arah

January 2001

Copyright 1995-2005, Tom Arah.