loge.hixie.ch

Hixie's Natural Log

2002-11-15 22:19 UTC When specifications collide

What's the charset of the following entity, assuming it is labelled as text/xml?

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
      "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
 <head>
  <title>What Am I?</title>
  <meta http-equiv="Content-Type" content="text/xml;charset=iso-8859-1"/>
 </head>
 <body>
  <p>What charset is this document?</p>
 </body>
</html>

Take a guess.

Ok, I'll tell you.

If you are an XHTML UA, then the <meta> element overrides the Content-Type, making the document ISO-8859-1. If you are just a normal XML UA receiving this over HTTP, then, because the text/xml label carries no charset parameter, RFC 3023 says (section 3.1, paragraph 3) that the charset must be treated as US-ASCII. And if you are an XML UA reading this from your local file system, then, with no encoding declaration in the document and no external encoding information, the XML spec says that it must be treated as UTF-8.
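To make the collision concrete, here is a minimal Python sketch of the three cases above. It is not how any real UA is implemented: the names `meta_charset` and `effective_charset`, and the "xhtml"/"xml" and "http"/"file" labels, are made up for illustration, and the sketch assumes the text/xml label carries no charset parameter and the document has no XML encoding declaration, as in the example.

```python
import re

# The <meta> line from the example document.
META_LINE = '<meta http-equiv="Content-Type" content="text/xml;charset=iso-8859-1"/>'


def meta_charset(markup):
    """Return the charset named in a <meta> content attribute, or None."""
    match = re.search(r'charset=([\w.:-]+)', markup, re.IGNORECASE)
    return match.group(1).lower() if match else None


def effective_charset(consumer, transport, markup=META_LINE):
    """Hypothetical helper: which charset does each kind of UA end up with?

    consumer  -- "xhtml" (understands <meta>) or "xml" (generic XML UA)
    transport -- "http" (labelled text/xml, no charset) or "file" (local disk)
    """
    if consumer == "xhtml":
        # The XHTML UA takes the charset from the <meta> element.
        return meta_charset(markup)
    if transport == "http":
        # RFC 3023, section 3.1: text/xml without a charset is US-ASCII.
        return "us-ascii"
    # XML spec default when nothing else declares the encoding.
    return "utf-8"


for consumer, transport in [("xhtml", "http"), ("xml", "http"), ("xml", "file")]:
    print(consumer, transport, effective_charset(consumer, transport))
```

Same bytes, three different answers, depending only on who is reading and how the document arrived.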
