Safari UTF-8 rendering glitch fixed in WebKit nightly build 

Tags :
WebKit Nightly Build

Safari v2.0.4 (419.3) and previous versions have this annoying habit of not displaying UTF-8 encoded characters properly under certain conditions:

If you use the CSS 'content' property to display a UTF-8 encoded string, it won't be properly rendered in Safari. I wanted to add typographic quotes before and after my citation block with the following CSS code:


blockquote p.cite:before { content:"“ "; }
blockquote p.cite:after  { content:" ”"; }

Firefox renders it properly, but Safari sadly garbles it. While trying to find a workaround, I was pleased to see that the latest WebKit nightly build didn't show this issue and rendered the text correctly:

Incorrectly rendered
Correctly rendered

A similar problem arises if you use certain objects of the script.aculo.us JavaScript library, Ajax.InPlaceEditor for instance. The string returned doesn't display correctly in Safari if it contains accented characters, whatever the page encoding is (note: all my pages are UTF-8 encoded; moreover, the form submits UTF-8 encoded values whatever the page encoding is).

I managed a workaround by testing the browser version and applying an extra utf-8 decoding step to the string before returning it to the client for display. Far from perfect, but it works for me.


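// Affected Safari versions garble accented characters in the returned string,
// so convert it from UTF-8 to ISO-8859-1 for those browsers only.
// $safari_bugged_version must be defined elsewhere.
if ((browser_detection('browser') == 'saf') &&
    (browser_detection('math_number') <= $safari_bugged_version)) {
    $new_title = (strlen($new_title) > 0) ? utf8_decode($new_title) : " ";
}
echo $new_title;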
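
A minimal Python sketch of why the extra decoding step can help, under the assumption (not verified in this post) that the affected Safari versions read the returned bytes as Latin-1 rather than UTF-8. The helper names `php_utf8_decode` and `as_seen_by_latin1_reader` are hypothetical and only serve the illustration; the first is a rough analogue of PHP's utf8_decode, not a port of it.

```python
def php_utf8_decode(text: str) -> bytes:
    """Rough analogue of PHP's utf8_decode (hypothetical helper).

    Re-encodes the text as ISO-8859-1; characters that do not exist
    in that charset become '?'.
    """
    return text.encode("latin-1", errors="replace")


def as_seen_by_latin1_reader(raw: bytes) -> str:
    """What a client shows if it assumes the bytes are Latin-1 (assumption)."""
    return raw.decode("latin-1")


title = "Café"

# UTF-8 bytes read as Latin-1: each accented character turns into two.
garbled = as_seen_by_latin1_reader(title.encode("utf-8"))

# After the extra decoding step, the same reader sees the intended text.
repaired = as_seen_by_latin1_reader(php_utf8_decode(title))

print(garbled)   # CafÃ©
print(repaired)  # Café
```

One consequence of this sketch: characters outside ISO-8859-1, such as curly quotes, come out as '?', which is one more reason the workaround is far from perfect.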

I encountered similar glitches with web applications such as Wufoo, the online form builder. If you use accented characters in form fields, they won't be displayed correctly in the current version of Safari, but they will in Firefox or the latest WebKit nightly.

Incorrectly rendered
Correctly rendered

Apparently, this issue is solved, and I am looking forward to seeing this build more widely released.


Comments and responses

  • 12 Mar 2007

    Thanks for writing this up. I’ve been bothered by that display issue for the longest time. It’s nice to see some explanation behind it.

  • 15 Mar 2007

    From memory, the issue here was that external stylesheets default to the latin-1 text encoding while users assume they would default to the encoding of the document referencing them. I do not recall what the specification says about this, but I believe WebKit was updated to respect any character set specified in the HTTP headers for the stylesheet (as it always has) and fall back to the main document’s encoding if one is not set. That would explain why you are seeing the correct behaviour now where it failed in the past.
    The JavaScript issue sounds a bit different and I can’t think of a reason for the issue off the top of my head. Since it’s fixed in WebKit and you have a workaround for your site, I’ll leave that as is.
    - Mark

  • 15 Mar 2007

    I should mention that the proper solution for the CSS generated content issue would be to serve your stylesheet with a MIME type that mentions the encoding of the content. Content-type: text/css; charset=utf-8 will have your curly-quotes looking splendid in Safari 2.0 as well as the nightly builds, and is safer than relying on the specific heuristic used to determine the encoding in a stylesheet.

  • 15 Mar 2007

    Thanks Mark for your insight and comments. I don’t know why I didn’t think to use the charset property or rule before.
    In the HTML file:
    <link type="text/css; charset=utf-8" rel="stylesheet" href=".." />
    In the external stylesheet file:
    @charset "utf-8";
    Unfortunately, I can’t seem to get it to work properly. I added a rule to set the charset information in .htaccess as described in the W3C document FAQ: Setting charset information in .htaccess. The new HTTP headers can be viewed through a web-sniffer report of a CSS file.
    AddType 'text/css; charset=UTF-8' css
    The css file shows up correctly: the quotes are curly. Unfortunately, the html page isn’t rendered correctly.
    I wrapped up a quick set of test files to illustrate the differences with and without the charset property in the main html file and the external stylesheet file, and was surprised to see that none of them work as they should.

    Embedded stylesheet: works fine
    External stylesheet without charset rule, referenced without the charset property in link
    External stylesheet with charset rule, referenced without the charset property in link
    External stylesheet without charset rule, referenced with the charset property in link
    External stylesheet with charset rule, referenced with the charset property in link

    All files are utf-8 encoded. All files are rendered correctly in the latest nightly build, but not in Safari 2.

  • 15 Mar 2007

    David, that’s curious! I guess I misremembered the cause for this issue. The fact that @charset isn’t working in Safari 2.0 is not surprising — support for it was only added to WebKit in September last year. The issue I mentioned, which doesn’t seem to be the case in the examples you’ve provided, was about the character set on the main document being ignored when determining the encoding of the stylesheet. If you’re really curious and interested in finding a workaround, it should be possible to work out roughly where in the archive of nightly builds the bug went away. This would make it easier to work out what the fix for the problem was and can make it simple to work backwards from there to find a workaround.
    David, thanks for testing the nightly builds! If you run into any other issues which are not yet fixed, please don’t hesitate to file bug reports so that they can be investigated as soon as possible.

  • 28 May 2007

    I’ve just been having the same problem. My solution was to escape the UTF-8 characters in the CSS file; for example, rather than using
    content: "“"
    or similar, use
    content: "\201C"
    It doesn’t cure the problem, but it works in all the browsers I’ve tested without needing to play around with MIME types or conditional rules.

  • 30 May 2007

    Thanks Adam, your tip works like a charm. Actually, I did previously try it but I didn’t remove the character ‘x’ following the backslash and hence it didn’t work.
    Make sure you use
    content: "\201c"
    instead of
    content: "\x201c"
    Note: TextMate encodes in hex with the "x"
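
The escaping tip from the last two comments can be automated. Below is a minimal Python sketch, not something used on this site: `css_escape_non_ascii` is a hypothetical helper that rewrites every non-ASCII character as a CSS hexadecimal escape (a backslash followed by the code point in hex, with no "x"). It always appends one space after each escape, because CSS treats a single whitespace character following a hex escape as a terminator and drops it; without it, a following space or hex digit would be swallowed or misread. It does not escape ASCII quotes or backslashes, so it assumes the input contains none.

```python
def css_escape_non_ascii(text: str) -> str:
    """Replace non-ASCII characters with CSS hex escapes (hypothetical helper)."""
    parts = []
    for ch in text:
        if ord(ch) < 128:
            # ASCII passes through unchanged.
            parts.append(ch)
        else:
            # Backslash + hex code point + one terminating space.
            parts.append("\\{:X} ".format(ord(ch)))
    return "".join(parts)


# The two rules from the post, rewritten with escapes.
print('blockquote p.cite:before { content:"' + css_escape_non_ascii("“ ") + '"; }')
print('blockquote p.cite:after  { content:"' + css_escape_non_ascii(" ”") + '"; }')
```

Because the output is pure ASCII, the stylesheet no longer depends on how the browser guesses its encoding, which matches why the tip worked in Safari 2 for the commenters above.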
