loge.hixie.ch

Hixie's Natural Log

2004-01-03 00:48 UTC Unicode decoder tools

To help with debugging of Unicode and UTF-8 related problems, I've written two tools:

utf8-decoder

Paste in some UTF-8 bytes (either as hexadecimal, decimal, octal, or binary numbers, or as a hex dump, or as raw bytes in the form of Windows-1252 or ISO-8859-1 characters) and this script will tell you what the characters are, including UTF-8 decoding diagnostics.

For example, if you are viewing a UTF-8 encoded file in a raw Emacs buffer, and your buffer contains \342​\200​\253​\330​\263​\331​\204​\330​\247​\331​\205, and you want to know what on earth that is, you just need to select that exact string, paste it into the script's input field, and click the submit button. It will then tell you the characters are:

202B	RIGHT-TO-LEFT EMBEDDING
0633	ARABIC LETTER SEEN
0644	ARABIC LETTER LAM
0627	ARABIC LETTER ALEF
0645	ARABIC LETTER MEEM

This can be very useful, especially since the first one above (the RLE) is not a visible character! The script also includes some other useful information, such as the binary representation of each input byte and the entities you would use to include the characters in a US-ASCII HTML or XML file.
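The core of that lookup can be approximated in a few lines of Python. This is only a minimal sketch, not the utf8-decoder script itself: the function name `decode_octal_escapes` is made up for illustration, and it assumes the input consists of backslash-octal escapes like the Emacs example above, ignoring anything between them.

```python
import re
import unicodedata


def decode_octal_escapes(text: str) -> list[tuple[str, str]]:
    """Turn Emacs-style octal escapes into (code point, character name) pairs.

    Hypothetical helper for illustration only; not the real tool's code.
    """
    # Collect every \NNN escape; any other characters in the input are ignored.
    raw = bytes(int(octal, 8) for octal in re.findall(r"\\([0-7]{1,3})", text))
    # Strict decoding: malformed UTF-8 raises UnicodeDecodeError.
    chars = raw.decode("utf-8")
    return [(f"{ord(c):04X}", unicodedata.name(c, "<unnamed>")) for c in chars]


sample = r"\342\200\253\330\263\331\204\330\247\331\205"
for code_point, name in decode_octal_escapes(sample):
    print(code_point, name)
```

Run on the sample string, this prints the same five code points and names listed above. Unlike the real tool it gives no per-byte diagnostics; it simply fails on invalid input.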

character-identifier

This little script will simply search for the characters you specify in the Unicode NamesList.txt file, giving the information for each character you selected.

For example, if you enter ☺ into the input field and submit the form, it will tell you, amongst other things:

Character number 1 is decimal 9786, hex 0x263A, octal \23072, binary 10011000111010

U+263A	WHITE SMILING FACE
	= have a nice day!
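A rough equivalent of the numeric part of that output can be sketched with Python's standard `unicodedata` module. The function name `identify` is an assumption for illustration, and the sketch does not read NamesList.txt, so annotations such as "= have a nice day!" are not reproduced.

```python
import unicodedata


def identify(text: str) -> list[str]:
    """Describe each character in several bases, plus its Unicode name.

    Hypothetical helper; the real script searches NamesList.txt instead.
    """
    lines = []
    for index, char in enumerate(text, start=1):
        cp = ord(char)
        lines.append(
            f"Character number {index} is decimal {cp}, hex 0x{cp:04X}, "
            f"octal \\{cp:o}, binary {cp:b}"
        )
        lines.append(f"U+{cp:04X}\t{unicodedata.name(char, '<no name>')}")
    return lines


print("\n".join(identify("\u263a")))
```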

Full source code is of course available.

2003-12-24 11:15 UTC More notes from a Winter in Oslo

If, after snowing, it rains, the result will be a ground covered by inviting patches of snow that turn into deadly puddles of cold water on contact. After areas of the ground have been subjected to this treatment for a few microseconds, they will turn into sludge, a grey oozing solid not unlike a slush drink made from dirt. This substance has the consistency of water, and splashes like it, but is able to remain in piles where it has been pushed by heavy wheels.

2003-12-23 14:42 UTC Some notes from a Winter in Oslo

Two key things to remember when cycling on icy snow are:

  1. You have no brakes, and
  2. You can only go in a straight line.

2003-12-04 16:34 UTC Extending HTML4 Forms

For the past few months I've been working on a proposal to extend HTML4 Forms in a backwards compatible manner, to address the needs that were not covered by XForms 1.0.

And thus the Proposed XHTML Module: XForms Basic draft specification came to be!

Please send me your comments at the e-mail address given in the draft. This specification is not just blue-sky work; there is a good chance large parts of it will be implemented in user agents in the medium-term future. Your input is wanted!


2003-12-02 17:14 UTC The mystery of why only four properties apply to table columns

Have you ever wondered why columns in CSS can only have four properties applied to them?

On the face of it, it seems strange — if you wanted to make all the cells in a column of prices be red and right-aligned, the following snippet of CSS would seem to make sense:

col.prices { color: red; text-align: right; }

...and indeed, IE6 supports this. So why does the spec say it shouldn't?

The answer to this question lies in another question: How would you implement it?

The colour of text is dependent on the 'color' property of its element. Unless specified, the 'color' property (basically) defaults to 'inherit', which means "take the value of the parent element".

So for some text in a cell, the colour is determined by the 'color' property of the cell, which is taken from the row, which is taken from the table, which is taken from the table's parent, and so on.

What about the column? Well, the column isn't one of the cell's ancestors, so it never gets a look-in! And therein lies the problem.
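The ancestor walk can be made concrete with a small model. This is a sketch under simplifying assumptions: the `Element` class and `inherited_value` function are invented for illustration, and the cascade is reduced to a dictionary of specified values per element.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Element:
    """Toy document node: a name, a parent, and its specified properties."""
    name: str
    parent: Optional["Element"] = None
    specified: dict[str, str] = field(default_factory=dict)


def inherited_value(element: Element, prop: str, initial: str) -> str:
    """Walk up the parent chain until some ancestor specifies a value."""
    node: Optional[Element] = element
    while node is not None:
        value = node.specified.get(prop, "inherit")
        if value != "inherit":
            return value
        node = node.parent
    return initial  # nobody set it: fall back to the initial value


table = Element("table", specified={"color": "black"})
col = Element("col.prices", parent=table, specified={"color": "red"})
row = Element("tr", parent=table)
cell = Element("td", parent=row)

# The walk visits td, tr, table. The col is a sibling branch and is never seen.
print(inherited_value(cell, "color", initial="UA default"))  # prints "black"
```

The column's red is set correctly on the column itself, but nothing in the lookup for the cell ever reaches it.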

Now, while that explains why it doesn't apply in the current model, it doesn't actually explain why the model couldn't be changed a little. For example you could say that if a cell has no value set, it should inherit from the column instead of the row.

Unfortunately, you then run into two problems. The first is that in CSS, everything is always set. That isn't a huge problem, because you could always say that the initial value of certain properties was 'auto' or 'normal', and have that value Do The Right Thing when inherited. It wouldn't be pretty, but it could be done. (This is the approach I used to solve some similar problems in the CSS3 Generated and Replaced Content module.)

The second problem is rather more fundamental, and to explain it we'll take a look at the overall processing model for CSS.

As you read this, remember that what we want to do is compute the value of 'color' on cells from the appropriate table column.

Here is how CSS works, at a very high level:

  1. Parse the stylesheets and the document.
  2. For each element in the document:
    1. Decide which CSS rules apply.
    2. Perform the CSS cascade with those rules.
    3. Perform inheritance of properties if the result of the cascade is the keyword 'inherit' (or if an inherited property has no specified value).
    4. Perform computations (turn 'em's into 'px's, etc). According to CSS2.1, the getComputedStyle() DOM method returns these values.
    At this point, every element has a value for every property ('display', 'color', etc).
  3. Lay out the document.
  4. Paint the document.

Now, columns are only columns because their 'display' property is set to 'table-column', and a cell is only a cell because its 'display' is 'table-cell', and the exact relationship between a cell and a column can only be calculated when laying out the document, since you have to take into account which cells span several columns, etc.

So you know which cell is in which column during stage 3, and not earlier. But the stage where you would work out the value of a cell's 'color' property is stage 2! And at that stage, you don't yet know which column applies to which cell, so you can't go and ask the column for the value.
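To see why the cell-to-column mapping depends on the output of the style stage, consider a deliberately simplified column assignment for a single row. The function `assign_columns` is hypothetical; it ignores row spans and anonymous table objects, and simply skips anything whose computed 'display' is not 'table-cell'.

```python
from typing import Optional


def assign_columns(row: list[tuple[str, int]]) -> list[Optional[range]]:
    """Map each child of a row to the column indexes it occupies.

    Each entry is (computed 'display' value, column span). Both inputs are
    only known once the cascade and inheritance have already run.
    """
    result: list[Optional[range]] = []
    next_column = 0
    for display, span in row:
        if display != "table-cell":
            result.append(None)  # simplification: non-cells get no column
            continue
        result.append(range(next_column, next_column + span))
        next_column += span
    return result


row = [("table-cell", 2), ("table-cell", 1), ("inline", 1)]
print(assign_columns(row))  # [range(0, 2), range(2, 3), None]
```

The function's inputs are computed values, so it cannot run before the style stage has finished, which is exactly the stage that would need its answer.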

That is why the spec limits the properties that apply to columns.

So why does it work in IE6? Well, it can do this despite what I said above because it doesn't support explicit CSS inheritance (the 'inherit' keyword), it doesn't support getComputedStyle(), and it doesn't support 'display: table-column' and the other table display types. In IE6, the model is probably more like:

  1. Parse the stylesheets and the document.
  2. For each element in the document:
    1. If it's a table, map out its structure.
    2. Decide which CSS rules apply.
    3. Perform the CSS cascade.
    4. Perform inheritance of properties, inheriting magically from table columns if the column isn't inheriting its style itself.
  3. Lay out the document.
  4. Paint the document.

As you can see, this is rather different from what the specs say, and is only possible because the model has been quite radically changed, at the expense of a number of useful features.

By the way, if anyone has a way to solve this problem, the working group would probably be very interested in hearing it. The www-style@w3.org mailing list would be the place to make such suggestions.
