Hixie's Natural Log: Error handling and Web language design

2004-01-22 00:09 UTC Error handling and Web language design

I've been following, with some amusement, the recent burst of posts about whether XML should have required that Web browsers stop processing upon hitting an error (as it does) or should have let them recover from errors in vendor-specific ways (as HTML does). Asking the question in this yes/no form misses the point:

There is a third, better option.

Since a lot of people don't really understand the problem here, I'm going to give some background.

What's the point of a specification? It is to ensure interoperability, so that authors get the same results on every product that supports the technology.

Why would we ever have to worry about document errors? Murphy said it best:

If there are two or more ways to do something, and one of those ways can result in a catastrophe, then someone will do it.

Authors will write invalid documents. This is something that most Web developers, especially developers who understand the specs well enough to understand what makes a document invalid, do not really understand. Ask someone who does HTML/CSS quality assurance (QA) for a Web browser, or who has written code for a browser's layout engine. They'll go on at length about the insanities that they have seen, but the short version is that pretty much any random stream of characters has been written by someone somewhere and been labelled as HTML.

Why is this a problem? Because Tim Berners-Lee, and later Dan Connolly, when they wrote the original specs for HTML and HTTP, did not specify what should happen with invalid documents. This wasn't a problem for the first five or so years of the Web.

At the start, there was no really dominant browser, so browsers presumably just implemented the specs and left the error handling to chance or convenience of the implementor. After a few years, though, when the Web started taking off, Netscape's browser soared to a dominant position. The result was that Web authors all pretty much wrote their documents using Netscape. Still no problem really though: Netscape's engineers didn't need to spend much time on error handling, so long as they didn't change it much between releases.

Then, around the mid-nineties, Microsoft entered the scene. In order to get users, they had to make sure that their browser rendered all the Web pages in the World Wide Web. Unfortunately, at this point, it became obvious that a large number of pages (almost all of them in fact) relied in some way on the way Netscape handled errors.

Why did pages depend on Netscape's error handling? Because Web developers changed their page until it looked right in Netscape, with absolutely no concern for whether the page was technically correct or not. I did this myself, back when I made my first few sites. I remember reading about HTML4 shortly after it became a W3C Recommendation and being shocked at my ignorance.

So, Microsoft reverse engineered Netscape's error handling. They did a ridiculously good job of it. The sheer scale of this feat is awe-inspiring. Internet Explorer reproduces aspects of Netscape's error handling which nobody at Netscape ever knew existed. Think about this for a minute.

Shortly after, Microsoft's browser became dominant and Netscape's browser was reduced to a minority market share. Other browsers entered the scene: Opera, Mozilla (the rewrite of the Netscape codebase), and Konqueror (later to be used as the base for Safari) come to mind, as they are still in active development. And in order to be usable, these browsers have to make sure they render Web pages just like Internet Explorer, which means handling the errors in the same way.

Browser developers and layout engine QA engineers spend probably more than half their total work hours debugging invalid markup, trying to work out what obscure aspect of the de facto error handling rules is being used to obtain the desired rendering. More than half!

It's easy to see why Web browser developers tend to be of the opinion that for future specifications, instead of having to reverse engineer the error handling behaviour of whatever browser happens to be the majority browser, errors should just cause the browser to abort processing.

Summary of the argument so far: Authors will write invalid content regardless. If the specification doesn't say what should happen, then once there is a dominant browser, its error handling (whether intentionally designed or just a side-effect of the implementation) will become the de facto standard. At this point, there is no going back: any new product that wants to interoperate has to support those rules.

So what is the better solution? Specifications should explicitly state what the error recovery rules are. They should state what the authors must not do, and then tell implementors what they must do when an author does it anyway.

This is what CSS1 did, to a large extent (although it still leaves much undefined, and I've been trying to make the rules for handling those errors clearer in CSS2.1 and CSS3). This is what my Web Forms 2.0 proposal does. Specifications should ensure that compliant implementations interoperate, whether the content is valid or not.
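To make the idea of specified error recovery concrete, here is a minimal sketch in Python. It is loosely modelled on the CSS rule that a declaration with an unknown property or a malformed shape is ignored while the rest of the block is still processed. The names `KNOWN_PROPERTIES` and `parse_declarations` are hypothetical, and splitting on semicolons is a simplification that a real CSS parser could not get away with (it ignores strings, comments and nested blocks).

```python
# Hypothetical, deliberately tiny set of properties this toy "browser" supports.
KNOWN_PROPERTIES = {"color", "margin", "font-size"}


def parse_declarations(block):
    """Parse 'property: value; ...' text into a dict.

    Recovery rule (fixed in advance, not left to chance): any declaration
    that is malformed or names an unknown property is dropped, and parsing
    continues with the next declaration.
    """
    accepted = {}
    for chunk in block.split(";"):
        if not chunk.strip():
            continue  # empty declaration, e.g. a trailing semicolon
        name, sep, value = chunk.partition(":")
        name, value = name.strip().lower(), value.strip()
        if not sep or not value or name not in KNOWN_PROPERTIES:
            continue  # defined recovery: skip this declaration only
        accepted[name] = value
    return accepted


if __name__ == "__main__":
    # "colr" is a typo and "margin 4px" lacks a colon; both are skipped.
    print(parse_declarations("color: red; colr: blue; margin 4px; font-size: 12px"))
```

Because the recovery rule is written down, two independent implementations of this function should accept and reject exactly the same declarations, which is the interoperability property argued for above.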

Note that all this is moot if you use XML 1.x, because XML specifies that well-formedness errors should be fatal. So if you don't want to have this behaviour in your language, don't use XML.
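The difference between the two policies can be seen with Python's standard library, as a rough illustration only. `xml.etree.ElementTree` rejects the whole document on a well-formedness error, while `html.parser.HTMLParser` keeps reporting tags from the same broken input. Note the assumption: `HTMLParser` is just a lenient tokenizer, so it stands in for "keep going" behaviour in general and does not reproduce any browser's actual recovery rules. The helper names below are hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records every start tag the lenient HTML tokenizer reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parse_as_xml(markup):
    """Return the root element's tag, or None if parsing failed fatally."""
    try:
        return ET.fromstring(markup).tag
    except ET.ParseError:
        return None  # the whole document is rejected, nothing is salvaged


def parse_as_html(markup):
    """Return the start tags seen, carrying on past any nesting errors."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


# Misnested tags: </b> closes before </i>.
BROKEN = "<p><b>bold <i>both</b> italic</i></p>"

if __name__ == "__main__":
    print("XML :", parse_as_xml(BROKEN))
    print("HTML:", parse_as_html(BROKEN))
```

With the misnested input, the XML path yields nothing at all, while the HTML path still reports the `p`, `b` and `i` start tags and leaves it to the caller to decide what the tree should look like.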
