
I'm using Firefox 3.5.4 (EN) under Windows XP SP3 (TR). When I open the web reports page of my company, Turkish characters are not displayed properly, so I manually have to change the Character Encoding setting from Western (Windows-1252) to Turkish (Windows-1254). I don't have this problem with other Turkish sites as they automatically change the encoding to Turkish.

How can I make Firefox automatically find the proper character encoding settings for problematic web sites?

Edit: I've found the following code line in the source code of relevant page:

<META HTTP-EQUIV="CONTENT-TYPE" CONTENT="TEXT/HTML; CHARSET=WINDOWS-1254">

3 Answers


Generally the page's encoding is followed, unless the server specifies an encoding. As the <meta> tag seems to specify what you're expecting, and as manually switching to that value helps, it sounds like the server you're getting the page from is sending an incorrect encoding (Windows-1252) in the headers to the browser.
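To make the mismatch concrete, here is a minimal Python sketch (Python is not part of the original thread, and the helper name `misread` is made up for illustration). It encodes Turkish text as Windows-1254 and then decodes the same bytes as Windows-1252, which is roughly what happens when the browser settles on the wrong encoding:

```python
def misread(text, actual="cp1254", assumed="cp1252"):
    """Encode text in the page's real encoding, then decode it with the wrong one."""
    return text.encode(actual).decode(assumed)

original = "YAYSAT DERGİ RAPORLARI"
garbled = misread(original)

# garbled is "YAYSAT DERGÝ RAPORLARI": byte 0xDD means İ in Windows-1254
# but Ý in Windows-1252. Plain ASCII letters are identical in both.
print(garbled == original)  # False
```

Only the handful of positions where the two code pages differ (Ğ, İ, Ş, ğ, ı, ş) get garbled, which matches a page that is mostly readable except for the Turkish letters.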

The proper way to fix it is to configure the server properly. For a company webserver, this probably means bugging the server admin to do it.

To see the (wrong) headers, if you're familiar with such tools, you can use Firebug's "Net" panel in Firefox, or Web Inspector's "Resources" panel in Chrome or Safari. Or, if you don't know these tools and the web site is publicly accessible, you can easily see the server's headers online using, for example, Web-Sniffer.

Assuming the login page sends the same headers as the actual report pages, this yields:

Content-Type: text/html

...without any value for charset. Not sure if a browser should then still interpret that <meta> tag, but apparently Firefox is ignoring it, and making some best guess.
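If you would rather check a header value in a script than by eye, the following is a small sketch using Python's standard `email.message.Message` class to pull the `charset` parameter out of a Content-Type value. The function name `charset_from_content_type` is my own, and the header strings are just the two values discussed in this thread:

```python
from email.message import Message

def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() lowercases the value and returns None when missing.
    return msg.get_content_charset()

print(charset_from_content_type("text/html"))                         # None
print(charset_from_content_type("text/html; charset=windows-1254"))  # windows-1254
```

A result of `None` for the real response header is the situation described above: the server leaves the decision to the browser.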

Firefox ignoring it might be caused by the HTML source. The <meta> tag should always be specified within <head> before anything else, as it might also apply to the title, scripts, CSS and so on. On this site, it isn't and, even worse, the HTML is a total mess:

<SCRIPT LANGUAGE=JavaScript SRC="/dergi/_ScriptLibrary/pm.js"></SCRIPT>
<SCRIPT LANGUAGE=JavaScript>
  thisPage._location = "/dergi/giris/login.asp";
</SCRIPT>
<FORM name=thisForm METHOD=post>
<HTML>
<style type="text/css">
<!--
  [..]
-->
</style>
<HEAD>
  [..]
  <META HTTP-EQUIV="CONTENT-TYPE" CONTENT="TEXT/HTML; CHARSET=WINDOWS-1254">
  <META NAME="GENERATOR" CONTENT="Microsoft FrontPage 5.0">
  <META NAME="AUTHOR" CONTENT="[removed to protect the innocent...]">  
  <TITLE>YAYSAT DERGİ RAPORLARI</TITLE>
</HEAD>
<BODY>
<center>
[..]
</center>
</body>
<INPUT type=hidden name="_method">
<INPUT type=hidden name="_thisPage_state" value="">
</FORM>
</html>

Huge developer fail.
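As a rough way to spot this kind of problem, here is a hedged Python sketch that reports where the first <meta> charset declaration sits in the raw bytes. It is a simple regular-expression heuristic of my own (the name `find_meta_charset` is hypothetical), not a description of how Firefox actually scans a page, and the sample is an abbreviated version of the markup above:

```python
import re

# Matches a charset declaration inside a <meta> tag, in either the
# http-equiv form or the shorter <meta charset="..."> form.
META_CHARSET = re.compile(rb'<meta[^>]*charset\s*=\s*["\']?([\w-]+)', re.IGNORECASE)

def find_meta_charset(html_bytes):
    """Return (charset, byte_offset) of the first <meta> charset declaration, or None."""
    match = META_CHARSET.search(html_bytes)
    if match is None:
        return None
    return match.group(1).decode("ascii").lower(), match.start()

sample = (
    b'<SCRIPT LANGUAGE=JavaScript SRC="/dergi/_ScriptLibrary/pm.js"></SCRIPT>\n'
    b'<FORM name=thisForm METHOD=post>\n<HTML>\n<HEAD>\n'
    b'<META HTTP-EQUIV="CONTENT-TYPE" CONTENT="TEXT/HTML; CHARSET=WINDOWS-1254">\n'
    b'</HEAD>'
)

charset, offset = find_meta_charset(sample)
print(charset, "declared after", offset, "bytes of other markup")
```

A non-zero offset by itself is not an error, but the later the declaration appears, the more the browser has already had to parse without knowing the encoding. The current HTML specification asks for the declaration to fall within the first 1024 bytes of the document.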

(Incidentally, Web-Sniffer shows <meta http-equiv=content-type content="text/html; charset=ISO-8859-1">, but that is due to its values for Accept-Charset. Firebug shows <META HTTP-EQUIV="CONTENT-TYPE" CONTENT="TEXT/HTML; CHARSET=WINDOWS-1254"> just like in the question.)

Arjan
  • 31,163
quack quixote
  • 42,640
  • The interesting thing is when I open the same page in IE8, it's displayed properly. – Mehper C. Palavuzlar Nov 06 '09 at 13:51
  • If I recall correctly then Internet Explorer was the first to use <meta> as it did not honour the response headers as sent by the server. So, IE8 might very well favour the <meta> and then ignore the value from the response headers. See also "Firefox displays garbage characters in lieu of web page" at http://superuser.com/questions/23777/firefox-displays-garbage-characters-in-lieu-of-web-page/23814#23814 for an example of how things can get messed up... – Arjan Nov 06 '09 at 13:57
  • thx for the edits, arjan. spot on. – quack quixote Nov 06 '09 at 13:59
  • I used Web-Sniffer and it returned the following result:
    Accept-Encoding: gzip[CRLF]
    Accept-Charset: ISO-8859-1,UTF-8;q=0.7,*;q=0.7[CRLF]
    Does that mean the problem definitely comes from the server settings?
    – Mehper C. Palavuzlar Nov 06 '09 at 14:00
  • Thanks very much. I will accept your answer if you please paste it as an answer. – Mehper C. Palavuzlar Nov 06 '09 at 14:29
  • That's ok; I've added it to this answer and cleaned up the comments a bit. Read, and complain to the developer! – Arjan Nov 06 '09 at 14:50
  • arjan, what have you done to my poor answer?? j/k, that's great, above and beyond the call of duty. post it as your own for god's sake so i can vote it up. – quack quixote Nov 06 '09 at 14:56
  • You need the votes, ~quack, with your lousy 4,434 reputation. ;-) – Arjan Nov 06 '09 at 14:59
  • funny, i was gonna say the same to you, seeing 's how i'm about to pass you. – quack quixote Nov 06 '09 at 15:01
  • You will soon pass me, those lousy 4,434 points only gathered in about a month :-) And feel free to remove my opinion on "Huge developer fail." ;-) – Arjan Nov 06 '09 at 15:04
  • no way, that's spot on too! – quack quixote Nov 06 '09 at 15:05

The Firefox add-on Charset Switcher may help you if you don't control the contents of your website.

If you're asking what HTML you should generate, then my first remark is that the text should not be encoded in Windows-1254 at all. HTML pages are better encoded in UTF-8, since this encoding is much more likely to display correctly in all browsers and on all client operating systems.

The tag should then look like:

<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
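Changing the tag alone is not enough; the bytes of the page have to be UTF-8 as well. As a minimal sketch, assuming the existing content really is Windows-1254, converting it in Python could look like this (the function name `reencode_to_utf8` is made up for the example):

```python
def reencode_to_utf8(raw_bytes, source_encoding="cp1254"):
    """Decode bytes from the legacy encoding and re-encode them as UTF-8."""
    return raw_bytes.decode(source_encoding).encode("utf-8")

legacy = "DERGİ".encode("cp1254")     # one byte per character
converted = reencode_to_utf8(legacy)  # İ now takes two bytes

print(len(legacy), len(converted))  # 5 6
```

If the source encoding is guessed wrongly, the conversion will still succeed for most bytes but produce the wrong letters, so it is worth checking a few Turkish words by eye afterwards.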

harrymc
  • 480,290

This bug (Firefox 4.0.1) has been reported: https://bugzilla.mozilla.org/show_bug.cgi?id=651142