∅ the empty set

Safari UTF-8 rendering glitch fixed in WebKit nightly build

Safari v2.0.4 (419.3) and previous versions have this annoying habit of not displaying utf-8 encoded characters properly under certain conditions:

If you use the css 'content' property to display a utf-8 encoded string, it won't be properly rendered in Safari. I wanted to add typographic quotes before and after my citation block with the following css code:


blockquote p.cite:before { content:"“ "; }
blockquote p.cite:after  { content:" ”"; }

Firefox renders it properly, but Safari sadly garbles it. While trying to find a workaround, I was pleased to see that the latest WebKit nightly build didn't exhibit this issue and rendered the text correctly:

Incorrectly rendered
Correctly rendered

A similar problem arises if you use certain objects of the script.aculo.us javascript library, the Ajax.InPlaceEditor for instance. The string returned doesn't display correctly in Safari if it contains accented characters, whatever the page encoding is (note: all my pages are utf-8 encoded; moreover, the form submits utf-8 encoded values whatever the page encoding is).

I managed a workaround by testing the browser version and applying an extra utf-8 decoding step to the string before returning it to the client for display. Far from perfect, but it works for me.


if ((browser_detection('browser') == 'saf') &&
    (browser_detection('math_number') <= $safari_bugged_version)) {

    $new_title = (strlen($new_title) > 0 ) ? utf8_decode($new_title) : " ";
}
echo $new_title;
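
To see what that extra decoding step does to the bytes, here is a rough sketch in Python (used for illustration only; the site itself runs PHP). It assumes utf8_decode behaves as documented: it converts a UTF-8 string to ISO-8859-1 and substitutes "?" for any character that has no ISO-8859-1 equivalent. The function name utf8_to_latin1 is made up for this example.

```python
def utf8_to_latin1(utf8_bytes: bytes) -> bytes:
    """Rough stand-in for PHP's utf8_decode(): UTF-8 in, ISO-8859-1 out."""
    text = utf8_bytes.decode("utf-8")
    # Characters outside ISO-8859-1 become "?", as with utf8_decode().
    return text.encode("latin-1", errors="replace")


title = "Café".encode("utf-8")  # "é" takes two bytes in UTF-8
print(utf8_to_latin1(title))    # "é" is now a single ISO-8859-1 byte

# Curly quotes have no ISO-8859-1 form, so they are lost.
print(utf8_to_latin1("“hi”".encode("utf-8")))
```

This also shows the limit of the workaround: accented Latin letters survive the conversion, but typographic quotes and anything else outside ISO-8859-1 do not.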

I encountered similar glitches with web applications such as Wufoo, the online form builder. If you use accented characters in form fields, they won't be displayed correctly in the current version of Safari, but will in Firefox or the latest WebKit.

Incorrectly rendered
Correctly rendered

Apparently, this issue is solved, and I am looking forward to seeing this build more widely released.

Ø permalink: https://davidroessli.com/logs/2007/03/safari_utf8_rendering_glitch_f/


Responses to “Safari UTF-8 rendering glitch fixed in WebKit nightly build”

#1 by Ryan Campbell

15:13 on 12 March 2007

Thanks for writing this up. I've been bothered by that display issue for the longest time. It's nice to see some explanation behind it.

#2 by Mark Rowe

10:21 on 15 March 2007

From memory, the issue here was that external stylesheets default to the latin-1 text encoding, while users assume they would default to the encoding of the document referencing them. I do not recall what the specification says about this, but I believe WebKit was updated to respect any character set specified in the HTTP headers for the stylesheet (as it always has) and to fall back to the main document's encoding if one is not set. That would explain why you are seeing the correct behaviour now where it failed in the past.
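
The kind of mismatch described here is easy to reproduce outside a browser. A small Python sketch (the language and the variable names are chosen for illustration only): the stylesheet's bytes are UTF-8, but they get read as latin-1, one character per byte.

```python
css_rule = 'blockquote p.cite:before { content:"“ "; }'
raw = css_rule.encode("utf-8")     # bytes as stored in a UTF-8 file

as_latin1 = raw.decode("latin-1")  # wrong guess: every byte becomes a character
as_utf8 = raw.decode("utf-8")      # right guess: round-trips cleanly

# The curly quote is three bytes in UTF-8, so the misread text grows by two
# characters and the quote turns into "â" followed by two control characters.
print(len(css_rule), len(as_latin1))
print(as_utf8 == css_rule)
```

Whether a given browser decodes as strict ISO-8859-1 or as Windows-1252 changes which garbage characters appear, but not the fact that one quote becomes three.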

The JavaScript issue sounds a bit different and I can't think of a reason for the issue off the top of my head. Since it's fixed in WebKit and you have a workaround for your site, I'll leave that as is.

- Mark

#3 by Mark Rowe

10:26 on 15 March 2007

I should mention that the proper solution for the CSS generated content issue would be to serve your stylesheet with a MIME type that mentions the encoding of the content. Content-type: text/css; charset=utf-8 will have your curly-quotes looking splendid in Safari 2.0 as well as the nightly builds, and is safer than relying on the specific heuristic used to determine the encoding in a stylesheet.
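
The fallback order discussed in these comments (use the charset from the stylesheet's Content-type header when there is one, otherwise fall back to the encoding of the referencing document) can be sketched as follows. This is an illustration in Python, not WebKit's actual code; pick_stylesheet_encoding and its parameters are hypothetical names.

```python
from email.message import Message
from typing import Optional


def pick_stylesheet_encoding(content_type: Optional[str], document_encoding: str) -> str:
    """Return the header charset if one is given, else the document's encoding."""
    if content_type:
        msg = Message()
        msg["Content-Type"] = content_type
        charset = msg.get_content_charset()  # None when there is no charset parameter
        if charset:
            return charset
    return document_encoding


print(pick_stylesheet_encoding("text/css; charset=utf-8", "iso-8859-1"))
print(pick_stylesheet_encoding("text/css", "utf-8"))
```

With an explicit charset in the header, the result no longer depends on the second argument at all, which is why the header is the safer option.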

#4 by David Roessli

12:30 on 15 March 2007

Thanks Mark for your insight and comments. I don't know why I didn't think to use the charset property or rule before.

In html file:

<link type="text/css; charset=utf-8" rel="stylesheet" href=".." />

In external stylesheet file:

@charset "utf-8";

Unfortunately, I can't seem to get it to work properly. I added a rule to set the charset information in .htaccess as described in the W3C document FAQ: Setting charset information in .htaccess. The new http headers can be viewed through a web-sniffer report of a css file.

AddType 'text/css; charset=UTF-8' css

The css file shows up correctly: the quotes are curly. Unfortunately, the html page isn't rendered correctly.

I wrapped up a quick set of test files to illustrate the differences with and without the charset property in the main html file and the external stylesheet file, and was surprised to see that none of them work as they should.

  1. Embedded stylesheet: works fine
  2. External stylesheet without @charset rule referenced without the charset property in link
  3. External stylesheet with @charset rule referenced without the charset property in link
  4. External stylesheet without @charset rule referenced with the charset property in link
  5. External stylesheet with @charset rule referenced with the charset property in link

All files are utf-8 encoded. All files are rendered correctly in the latest nightly build, but not in Safari 2.

#5 by Mark Rowe

14:02 on 15 March 2007

David, that's curious! I guess I misremembered the cause for this issue. The fact that @charset isn't working in Safari 2.0 is not surprising -- support for it was only added to WebKit in September last year. The issue I mentioned, which doesn't seem to be the case in the examples you've provided, was about the character set on the main document being ignored when determining the encoding of the stylesheet. If you're really curious and interested in finding a workaround, it should be possible to work out roughly where in the archive of nightly builds the bug went away. This would make it easier to work out what the fix for the problem was and can make it simple to work backwards from there to find a workaround.

David, thanks for testing the nightly builds! If you run into any other issues which are not yet fixed, please don't hesitate to file bug reports so that they can be investigated as soon as possible.

#6 by Adam Dinwoodie

15:10 on 28 May 2007

I've just been having the same problem. My solution was to escape the UTF-8 characters in the CSS file; for example, rather than using

content: "“"

or similar, use
content: "\201C"

It doesn't cure the problem, but it works in all the browsers I've tested without needing to play around with MIME types or conditional rules.
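
For anyone with more than a couple of characters to convert, the escapes can be generated rather than looked up. A minimal Python sketch (css_escape is a made-up helper, not part of any library): it replaces each non-ASCII character with a backslash followed by its code point in hex. It appends one space after each escape, which CSS treats as the end of the escape and does not render, so that a following character such as "a" or "1" is not read as part of the hex number.

```python
def css_escape(text: str) -> str:
    """Replace every non-ASCII character with a CSS hex escape."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        else:
            # The trailing space terminates the escape and is not displayed.
            out.append("\\{:X} ".format(ord(ch)))
    return "".join(out)


print('content: "' + css_escape("“ ") + '";')
print('content: "' + css_escape(" ”") + '";')
```

The resulting stylesheet is pure ASCII, so it reads the same whichever encoding the browser guesses.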

#7 by David Roessli

09:34 on 30 May 2007

Thanks Adam, your tip works like a charm. Actually, I did previously try it but I didn't remove the character 'x' following the backslash and hence it didn't work.
Make sure you use,

content: "\201c"

instead of,
content: "\x201c"

Note: TextMate encodes in hex with the "x"

© 2000-2016 David Roessli | v4.1 | as valid xhtml and css as possible | hosted by pair Networks | RSS feeds.