This is why XSLT on the client side is a bad idea. Try opening that page in a UA that doesn't support XSLT: all you get is garbage, because the content is not in a language with well-known semantics (it is, in fact, in a proprietary encoding of something vaguely resembling HTML, with metadata encoded in a proprietary language written in XML).
Transformations should be done on the server side, so that what is sent over the wire is in a well-known format (HTML, MathML, etc). The UA can then decide whether to display the content using the author's styling hints (CSS) or to display the content using its own rules (as Lynx does, as Opera typically does on hand-held devices such as my mobile phone, as voice-based browsers do, etc).
Accessibility is not just about blind people. It's about a growing set of devices with esoteric characteristics. Don't turn your readers away!
Imagine you have a file you know is in UTF-8, and that you are viewing the raw bytes of this file in a text editor which displays high-bit bytes as octal sequences. For example, you could have the string:
Escamillo\342\200\231s supporters
How can you work out what the corresponding Unicode codepoint is?
Write out each digit of each octal sequence like this:
3 4 2   2 0 0   2 3 1
Then, below each digit, write out the corresponding binary using the table below, remembering to pad the results so that the first digit corresponds to two bits and the second and third digits correspond to three bits each.
Octal  Binary
0      0
1      1
2      10
3      11
4      100
5      101
6      110
7      111
So now your notes look like:
 3   4   2    2   0   0    2   3   1
11 100 010   10 000 000   10 011 001
Rewrite the binary string in groups of eight (i.e. in bytes):
11100010 10000000 10011001
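If you would rather not do this step by hand, it can be sketched in a few lines of Python. This is only an illustration: the helper name `octal_escapes_to_bytes` is made up for this example, and it assumes the editor always writes exactly three octal digits after each backslash.

```python
import re

def octal_escapes_to_bytes(text: str) -> bytes:
    """Collect every backslash + three-octal-digit escape and return the bytes they stand for."""
    return bytes(int(digits, 8) for digits in re.findall(r"\\([0-7]{3})", text))

# The example string, exactly as the editor displays it.
raw = r"Escamillo\342\200\231s supporters"
data = octal_escapes_to_bytes(raw)

# Show each byte as eight binary digits, as in the hand-written notes.
bits = " ".join(format(byte, "08b") for byte in data)
print(bits)  # 11100010 10000000 10011001
```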
Here, a vague understanding of how UTF-8 works helps. Count how many of the most significant bits of the first byte are on before you reach the first zero bit. This tells you how many bytes your character takes. In this case, we have three:
11100010 10000000 10011001
...which is lucky since we do indeed have three bytes. The two other bytes start with a single high bit followed by a zero, which means they are continuation bytes. To get the actual bits that form your character, take from each byte only the bits that come after its first zero bit. Here that is the last four bits of the first byte and the last six bits of each continuation byte.
11100010 10000000 10011001
Take these bits and stick them together:
0010000000011001
...and then group them in fours:
0010 0000 0001 1001
In this case we happen to have a multiple of four bits, but sometimes you don't have such a convenient number of bits, so start counting at the least significant end (the right hand side) and then pad the most significant end with zero bits.
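The bit-picking and padding steps above can also be sketched in Python. The function names `utf8_payload_bits` and `nibbles` are invented for this illustration, and the sketch assumes the input is the bytes of exactly one character; it does not try to catch every malformed sequence.

```python
def utf8_payload_bits(data: bytes) -> str:
    """Return the payload bits of one UTF-8 encoded character as a string of 0s and 1s."""
    lead = format(data[0], "08b")
    # The number of leading 1 bits in the first byte is the sequence length in bytes.
    length = len(lead) - len(lead.lstrip("1"))
    if length == 0:
        # A first byte starting with 0 is a one-byte (ASCII) character.
        if len(data) != 1:
            raise ValueError("expected a single byte")
        return lead[1:]
    if length == 1 or length > 4 or length != len(data):
        raise ValueError("not a single well-formed UTF-8 sequence")
    # Skip the run of 1 bits and the zero that ends it.
    payload = lead[length + 1:]
    for byte in data[1:]:
        cont = format(byte, "08b")
        if not cont.startswith("10"):
            raise ValueError("expected a continuation byte")
        payload += cont[2:]  # keep the six bits after the "10" marker
    return payload

def nibbles(bits: str) -> list[str]:
    """Pad on the most significant end to a multiple of four bits, then split into fours."""
    padded = bits.zfill(-(-len(bits) // 4) * 4)
    return [padded[i:i + 4] for i in range(0, len(padded), 4)]

payload = utf8_payload_bits(b"\xe2\x80\x99")
print(payload)           # 0010000000011001
print(nibbles(payload))  # ['0010', '0000', '0001', '1001']
```

A two-byte character such as the bytes C3 A9 gives eleven payload bits, which is the inconvenient case: `nibbles` pads them with one zero bit on the left before grouping.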
Next, you convert each of these nibbles to hexadecimal:
0010 0000 0001 1001
   2    0    1    9
And finally you look up your character, in this case U+2019, in the Unicode names list, which in this case gives us "RIGHT SINGLE QUOTATION MARK".
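As a cross-check on the hand calculation, Python's standard `unicodedata` module can do the final lookup, and the built-in UTF-8 decoder should agree with the result. This is just a sanity check on the example above, not a replacement for understanding the steps.

```python
import unicodedata

# The payload bits worked out by hand above.
bits = "0010000000011001"

codepoint = int(bits, 2)
label = f"U+{codepoint:04X}"
name = unicodedata.name(chr(codepoint))
print(label, name)  # U+2019 RIGHT SINGLE QUOTATION MARK

# Decoding the original three bytes directly should give the same character.
check = b"\xe2\x80\x99".decode("utf-8")
print(check == chr(codepoint))  # True
```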
I've been quite busy recently, mainly with stuff I'm not at liberty to talk about, and with other stuff that I don't really want to talk about. All in all, though, things are looking good at the moment. I might even rebind my F2 key at some point... last week I unbound it after bug 00002 was invalidated, but the bug was reopened and even reconfirmed, so who knows.
One of the things I've been working with is ImageMagick with ActivePerl on Windows. I have a large bitmap in memory (acquired through some Win32 APIs), I pass it to ImageMagick, and I tell it to save it to disk as a PNG file. For some strange reason, the PNG file comes out with a green tint, where it should be pure white. I've tried playing with the gamma settings, the rendering intent settings, and some other stuff, but nothing seems to have changed the result. I'll have to look at this more tomorrow.
I've also been occasionally updating test cases to match the new CSS2.1 text, and also doing some CSS2.1 test suite work. It's slow going. There comes a point where you have so many tests that before writing any new tests you have to first look at the existing ones to make sure you haven't covered it already... Otherwise you end up doing what I did the other day, which is write a test case which is almost byte-for-byte identical with an existing one. So I'm being careful when it comes to the 2.1 tests.
As part of the 2.1 test suite I'm going to have to write some scripts to help me name the files, import files from other test suites, and convert them to the new format. I'm not going to be doing that any time soon, though. I have a TODO list the size of a small planetoid at the moment.
I've also been involved with some forms stuff recently, one example of which is a thread in www-international about what to do when the user has entered characters into a form that can't accept those characters. I'm hoping we can either get an erratum into HTML4 for this or maybe we can get an all-out update to the HTML form controls.
While relaxing, I've been reading the MYTH series. So far I'm loving it. I've also started playing in a concert band, although I'm reserving judgement on that for now. Since I don't speak a word of Norwegian, it's a somewhat weird experience. Still, it gives me an excuse to do something else on the weekends, which I'm told is good for slowing life down and preventing panics. I hope it works; it's not something I know much about, really. Although I'm doing my best to learn as fast as I can.