Hixie's Natural Log

2004-01-18 23:00 UTC Void filling: Web Applications Language

About 11 months ago, I mentioned that the W3C had so far failed to address a need in the Web community: There is no language for Web applications. There is a language for hypertext documents (HTML), there is a language for vector graphic images (SVG), there is a vocabulary for embedding Math into both of those (MathML), and there are lots of support technologies (DOM, ECMAScript, CSS, SMIL)... But there is no language designed for writing applications, like Voidwars (a game) or Bugzilla (an issue tracking system) or for that matter the Mozillazine Forums or eBay auctions. What is needed is one (or maybe more) markup languages specifically designed to allow the semantics of sites like the above to be marked up, thus allowing for improvements in the accessibility of such sites.

It's been nearly a year since I first mentioned this, and the only group that seems to have done anything about this is Microsoft, with their worryingly comprehensive set of proprietary technologies (Avalon, XAML, WVG, etc) that appear designed to ensure vendor lock-in.

I intend to do something about this (hopefully within a W3C context, although that will depend on the politics of the situation). If you write Web-based applications, I would be interested in hearing about what your needs are. Please let me know: webapps@hixie.ch

2004-01-13 10:59 UTC Confusing spam

Maybe I've missed something. I don't know. Or maybe this is a joke. I just got a spam with the subject line "This letter can only define Nigeria Scam, a.k.a 419", which starts off explaining what 419 spam is, saying that much of Nigeria's government is corrupt, and so forth. Fair enough, I thought (curious as to the goal of a spam that explained some of the story behind 419 fraud, even if this wasn't even close to an accurate explanation). Maybe this is ironic educational spam from some well-meaning, although confused, spam fighter.

Then I read paragraph 5:

The point I am making is nothing more than asking you to handle a pure deal of approximately USD$50,000,000.00, which will take approximately two weeks to conclude from here. Then the funds clear in any account of yours after 72hrs upon the remittance.

What? I'm confused. I thought you just said this was a scam?

Maybe they are trying to increase the bar, so that only very gullible people fall for these scams?

2004-01-10 11:39 UTC Mad people, Tim, and the Groom Lake facility

On my way to the office (which is the staging point for my mission to today's primary objective, central Oslo) I passed an old lady who appeared to be muttering to herself, and it struck me: I can no longer tell the difference between insane people, and people on hands-free mobile phones. Literally. I have no idea if she was on the phone or not. And she definitely wasn't speaking to anyone physically near her.

Later today, Tim will be arriving for a few months. I haven't seen him since August. Hopefully he'll be encouraging me to get to work slightly, ah, earlier, than I have been.

Last night I finished reading a series of seven books by Robert Doherty which I started over the new year. I bought Area 51 around the 23rd of December, finished it that day or the next, spent a few days itching to buy Area 51: The Reply, which I finally did around the 26th, along with Area 51: The Mission. I then spent about 2 days reading and about 8 days itching to buy Area 51: The Sphinx, which I finally did on Monday (the 5th), along with Area 51: The Grail, Area 51: Excalibur, and Area 51: The Truth. There appear to be no real analysis sites on the Web for this series, which surprises me. (Is The Lurker's Guide an anomaly, or what? I made my entry into science fiction fandom with Babylon 5, which, at the aforementioned site, has incredibly detailed analysis of every scene of every show, cross-referenced across episodes with detailed plot descriptions, director's comments, and so forth. Did other series not cause that kind of response? Even Stargate SG-1 doesn't really seem to have that kind of detailed analysis. Although, having tried writing one for some episodes myself, I can understand that, I guess. Good analysis is long, hard work.)

Turns out there is another book, Area 51: Nosferatu, now available, with yet another (Legend (Area 51)) coming in "March" (quotemarks because I've become rather familiar with projected publication dates what with my involvement with software development, specification editing, and book proof-reading). I also noticed, while buying those books, that one of my favourite authors, Peter F. Hamilton (author of the simply stunning Night's Dawn trilogy) has some more books on sale now.

However, no more books for me for at least a week. Reading does terrible things to my productivity. I have an addictive personality and very little self-control (which is why I don't drink) so when I start reading, I have to finish, even if it is past 5am. Not something I want to keep doing for extended periods of time, really.

I'd better be off now, my exfil window is closing.

2004-01-06 15:23 UTC Reminder of some notes from a Winter in Oslo

It seems I forgot about this, so I'll just re-mention it in the hopes that I'll remember it now: When cycling in the snow,

  1. You have no brakes, and
  2. You can only go in a straight line.

2004-01-03 00:48 UTC Unicode decoder tools

To help with debugging of Unicode and UTF-8 related problems, I've written two tools:


Paste in some UTF-8 bytes (either as hexadecimal, decimal, octal, or binary numbers, or as a hex dump, or as raw bytes in the form of Windows-1252 or ISO-8859-1 characters) and this script will tell you what the characters are, including UTF-8 decoding diagnostics.

For example, if you are viewing a UTF-8 encoded file in a raw Emacs buffer, and your buffer contains \342\200\253\330\263\331\204\330\247\331\205, and you want to know what on earth that is, you just need to select that exact string, paste it into the script's input field, and click the submit button. It will then tell you the characters are:


This can be very useful, especially since the first one above (the RLE) is not a visible character! The script also includes some other useful information, such as the binary representation of each input byte and the entities you would use to include the characters in a US-ASCII HTML or XML file.
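To give a rough idea of the decoding step involved, here is a minimal Python sketch that handles only the octal-escape form from the Emacs example. It is not the script's actual source: the function names decode_octal_escapes and describe are made up for this illustration, and the real tool accepts many more input formats and reports far more detail.

```python
import re
import unicodedata


def decode_octal_escapes(text):
    """Turn a run of \\NNN octal escapes into the bytes they denote.

    Hypothetical helper for this sketch; anything between the escapes
    is ignored.
    """
    return bytes(int(m, 8) for m in re.findall(r"\\([0-3][0-7]{2})", text))


def describe(raw):
    """Decode UTF-8 bytes and return (code point, character name) pairs."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in raw.decode("utf-8")
    ]


sample = r"\342\200\253\330\263\331\204\330\247\331\205"
for code_point, name in describe(decode_octal_escapes(sample)):
    print(code_point, name)
# U+202B RIGHT-TO-LEFT EMBEDDING
# U+0633 ARABIC LETTER SEEN
# U+0644 ARABIC LETTER LAM
# U+0627 ARABIC LETTER ALEF
# U+0645 ARABIC LETTER MEEM
```

Note that raw.decode("utf-8") simply raises UnicodeDecodeError on malformed input, so this sketch offers none of the decoding diagnostics mentioned above.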


This little script will simply search for the characters you specify in the Unicode NamesList.txt file, giving the information for each character you selected.

For example, if you enter ☺ into the input field and submit the form, it will tell you, amongst other things:

Character number 1 is decimal 9786, hex 0x263A, octal \23072, binary 10011000111010

	= have a nice day!
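The numeric part of that output is easy to reproduce. The following Python sketch is only an approximation of the idea, not the tool itself: the identify function is a hypothetical name, and it takes the character name from Python's built-in unicodedata module rather than searching NamesList.txt, so annotations such as "= have a nice day!" are not available to it.

```python
import unicodedata


def identify(ch):
    """Return the code point of one character in several bases, plus its name.

    Hypothetical helper for this sketch; the name comes from Python's bundled
    Unicode database, not from NamesList.txt.
    """
    cp = ord(ch)
    return {
        "decimal": cp,
        "hex": f"0x{cp:04X}",
        "octal": f"\\{cp:o}",
        "binary": f"{cp:b}",
        "name": unicodedata.name(ch, "<unnamed>"),
    }


info = identify("\u263A")
print(
    f"decimal {info['decimal']}, hex {info['hex']}, "
    f"octal {info['octal']}, binary {info['binary']}"
)
# decimal 9786, hex 0x263A, octal \23072, binary 10011000111010
print(info["name"])
# WHITE SMILING FACE
```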

Full source code is of course available.