2002-11-15 22:19 UTC When specifications collide
What's the charset of the following entity, assuming it is labelled as text/xml?
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
 <head>
  <title>What Am I?</title>
  <meta http-equiv="Content-Type" content="text/xml;charset=iso-8859-1"/>
 </head>
 <body>
  <p>What charset is this document?</p>
 </body>
</html>
Take a guess.
Ok, I'll tell you.
If you are an XHTML UA, then the <meta> element overrides the Content-Type, making the document ISO-8859-1.
However, if you are just a normal XML UA, receiving this over HTTP, then
RFC 3023 says (section 3.1, paragraph 3) that the
charset must be treated as US-ASCII, because the text/xml label carries no
charset parameter. And if you are an XML UA reading
this from your local file system, then the XML
spec says that it must be treated as UTF-8, because the document has neither
an encoding declaration nor a byte order mark.
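
To make the three conflicting answers concrete, here is a minimal Python sketch of the decision as this post describes it. The names guess_charset, META_CHARSET, and DOC are hypothetical, the regular expression only handles a <meta> element written like the one above, and the rules encode this post's reading of the specs rather than a complete implementation of any of them.

```python
import re

# Hypothetical helper: finds the charset in a <meta http-equiv="Content-Type">
# element written in the same attribute order as the example document.
META_CHARSET = re.compile(
    r'<meta\s+http-equiv="Content-Type"\s+content="[^"]*charset=([^"\s;]+)',
    re.IGNORECASE,
)

# A shortened copy of the example entity (no XML declaration, no BOM).
DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><head>'
    '<meta http-equiv="Content-Type" content="text/xml;charset=iso-8859-1"/>'
    '</head><body><p>What charset is this document?</p></body></html>'
)


def guess_charset(document: str, ua: str, source: str) -> str:
    """Return the charset each kind of UA would assume.

    ua is "xhtml" or "xml"; source is "http" or "file".
    Assumes the entity is labelled text/xml with no charset parameter
    and carries no encoding declaration or byte order mark.
    """
    if ua == "xhtml":
        match = META_CHARSET.search(document)
        if match:
            # XHTML UA: the <meta> element wins.
            return match.group(1).lower()
    if source == "http":
        # Plain XML UA over HTTP: RFC 3023 section 3.1 default for text/xml.
        return "us-ascii"
    # Plain XML UA reading a local file: the XML spec's default.
    return "utf-8"


for ua, source in [("xhtml", "http"), ("xml", "http"), ("xml", "file")]:
    print(ua, source, guess_charset(DOC, ua, source))
```

The same bytes get three different answers depending only on who is reading them and how they arrived, which is the collision.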