[pmwiki-users] Upgrade to 2.2.35 : problem with some page using apostrophe

Petko Yotov 5ko at 5ko.fr
Sun Nov 13 02:40:14 CST 2011

On Sunday 13 November 2011 01:32:40, Petko Yotov wrote :
> There are indeed problems with some characters such as typographical
> apostrophes and dashes, and yes, they are different from normal
> apostrophes.
> For some reason, the browsers don't treat these characters the same way as
> PHP does. The PHP iconv() function, like the `iconv` system program,
> appear unable to convert these characters so that the browsers display
> them correctly.

I should add the utf8_encode() function to that list: it cannot convert these
characters either.

These characters appear to be non-standard, or more precisely, they come from a
different charset.

The code points 128-159 (0x80-0x9F) are not defined in the ISO-8859-1 charset;
they are defined in the Windows-1252 charset:

  https://en.wikipedia.org/wiki/Windows-1252 (the special characters are
    in the cells with thick green borders)

From Wikipedia:
  It is very common to mislabel Windows-1252 text with the charset label
  ISO-8859-1. A common result was that all the quotes and apostrophes
  (produced by "smart quotes" in Microsoft software) were replaced with
  question marks or boxes on non-Windows operating systems, making text
  difficult to read. Most modern web browsers and e-mail clients treat the
  MIME charset ISO-8859-1 as Windows-1252 in order to accommodate such
  mislabeling. This is now standard behavior in the draft HTML 5
  specification, which requires that documents advertised as ISO-8859-1
  actually be parsed with the Windows-1252 encoding.

So, the PHP conversion functions actually follow the standard, but the text
sent by the browsers does not strictly follow it: text labeled ISO-8859-1 may
in fact contain Windows-1252 characters.
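
To illustrate the difference at the byte level, here is a minimal sketch. It is
written in Python rather than PHP only because the bytes are easy to inspect
there; the sample string is made up, and 0x92 is the Windows-1252 byte for the
typographic apostrophe.

```python
# Sample bytes as a browser might send them: 0x92 is the "smart" apostrophe
# in Windows-1252. The string itself is an invented example.
raw = b"l\x92apostrophe"

# Strict ISO-8859-1 decoding: byte 0x92 becomes the control character U+0092.
as_latin1 = raw.decode("iso-8859-1")

# Windows-1252 decoding: the same byte becomes U+2019 (right single quote).
as_cp1252 = raw.decode("cp1252")

print(as_latin1.encode("utf-8"))  # b'l\xc2\x92apostrophe'
print(as_cp1252.encode("utf-8"))  # b'l\xe2\x80\x99apostrophe'
```

The first result is what a standard-following ISO-8859-1 to UTF-8 conversion
produces; only the second one displays as an apostrophe.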

In order to convert these characters, maybe our automatic conversion from
ISO-8859-1 to UTF-8 should do the same: treat the page text as Windows-1252.
Indeed, if the text contains bytes in the 0x80-0x9F range, they can only be
Windows-1252 characters, since ISO-8859-1 has no printable characters there.
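
The idea could be sketched roughly as follows. This is Python for illustration
only, not the PmWiki code; the function name legacy_page_to_utf8 is
hypothetical. It assumes the input really is single-byte text labeled
ISO-8859-1. Five bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) are unassigned in
Python's cp1252 codec, so the sketch falls back to ISO-8859-1 for those.

```python
def legacy_page_to_utf8(raw: bytes) -> bytes:
    """Convert text labeled ISO-8859-1 to UTF-8, reading it as Windows-1252.

    Hypothetical helper for illustration; not part of PmWiki.
    """
    chars = []
    for byte in raw:
        try:
            # Outside 0x80-0x9F, Windows-1252 and ISO-8859-1 agree, so
            # decoding every byte as Windows-1252 is safe for ordinary text.
            chars.append(bytes([byte]).decode("cp1252"))
        except UnicodeDecodeError:
            # Byte unassigned in the cp1252 codec: keep the ISO-8859-1
            # meaning (a control character with the same code point).
            chars.append(chr(byte))
    return "".join(chars).encode("utf-8")


# Invented sample: 0x92 = apostrophe, 0x96 = en dash, 0xE9 = e acute.
converted = legacy_page_to_utf8(b"It\x92s \x96 caf\xe9")
print(converted.decode("utf-8"))
```

Text without any bytes in the 0x80-0x9F range comes out exactly as it would
from a plain ISO-8859-1 conversion, so existing pages would not be affected.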

