[pmwiki-users] Moving a pmwiki installation to a new host
5ko at 5ko.fr
Sat Sep 28 04:29:08 CDT 2013
Unfortunately there is not an easy solution to this problem, see below.
Leandro Fanzone writes:
> Hello, I have an installation of pmwiki on a Fedora Core 4 server, and I
> decided to migrate it to Ubuntu 12.04. As I did not want to install pmwiki
> again, I just copied /var/www to the new machine and installed Apache + PHP.
> As a result, some pages that had titles with Spanish letters (á, ñ, etc.)
> cannot be accessed anymore. I see that the files do exist (albeit they have
> the special letters changed somehow) but when I try to open those pages
> pmwiki cannot find them. For example: a page called "Documentación" exists
> in the filesystem as "Documentaci?n", but pmwiki tries to access it as
> "DocumentaciN". It seems an encoding problem, apparently the contents are
> stored in Latin1 (ISO-8859-1), and in the filenames sometimes the special
> letters were changed with ? and sometimes they keep the Latin1 letter, but
> for some reason pmwiki does not generate the same filename as before to
> access them. I am completely lost, I don't know if this is a configuration
> problem of PHP, of Apache, of the LANG variable...
This is likely a problem of the filesystem encoding (charset). It is
possible that the older server had a different filesystem encoding than the
new one.
A charset (character set) is a set of rules defining the byte or bytes used to
represent different letters, characters and symbols. Different charsets
generally use the same bytes for the plain Roman/Latin letters (ASCII) and
the most used punctuation symbols, but for example international letters
like "ó" may be "tied" to different bytes in different charsets. If your
filenames contain such characters, there is no guarantee that you'll be able
to copy them without errors from one filesystem to another.
PmWiki (actually PHP) doesn't care much about the charset; it just processes
the stream of bytes, whatever the charset.
So if your wiki content is in Latin1 and PmWiki creates a link to a page
"Documentación", it will look for a filename which is the stream of bytes
with positions 68, 111, 99, 117, 109, 101, 110, 116, 97, 99, 105, 243, 110,
where the "ó" character is byte number 243.
If in your directory there is no such filename, PmWiki will show a link
as if the page doesn't exist.
The Unicode/UTF-8 charset defines "ó" as two consecutive bytes, 195 and 179,
which are obviously not the same.
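
These byte values are easy to check from a shell. A minimal sketch, assuming a POSIX shell with printf, od, xargs and iconv installed; the helper name show_bytes is made up for this example. In Latin1 (ISO-8859-1) the letter "ó" is byte 243, written \363 in octal.

```shell
# Print the decimal value of each byte read from stdin on one line.
show_bytes() {
  od -An -v -tu1 | xargs
}

# "Documentación" encoded in Latin1: the "ó" is the single byte \363 (octal)
printf 'Documentaci\363n' | show_bytes
# prints: 68 111 99 117 109 101 110 116 97 99 105 243 110

# The same name recoded to UTF-8: the "ó" becomes two bytes
printf 'Documentaci\363n' | iconv -f ISO-8859-1 -t UTF-8 | show_bytes
# prints: 68 111 99 117 109 101 110 116 97 99 105 195 179 110
```

Running `ls wiki.d | show_bytes` on both servers should show which of the two forms each filesystem actually holds.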
When you copy files from one filesystem to another, there may be two cases -
either (A) your copy program is aware of the two charsets and recodes the
actual letters to the correct byte positions, or (B) it is not aware of the
charsets and tries to copy the files and tells the new filesystem "this file
is named this string of bytes: 68, 111, 99, 117, 109, 101, 110, 116, 97,
99, 105, 243, 110" which (B1) may or (B2) may not be accepted by the new
filesystem, e.g. because that stream of bytes is not valid UTF-8.
In case of (A) you'll be able to see the correct filenames when you browse
your filesystem, but PmWiki may be unable to find the files as it expects
different byte streams/positions.
In case of (B1) PmWiki should be able to find its filenames and it should
work like before, but when you browse your filesystem, you may see weird
characters in the filenames.
In case of (B2) neither you, nor PmWiki see the correct filenames with
international characters. It looks as if you are in this case.
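
To find out which case applies, you can look at the raw bytes of the filenames. The following is only a sketch, assuming a POSIX shell and an iconv that rejects invalid UTF-8 input; the function names is_valid_utf8 and classify_name are invented for this example.

```shell
# Succeed if the string given as $1 is a valid UTF-8 byte sequence.
is_valid_utf8() {
  printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Report whether a filename is plain ASCII, valid UTF-8, or something else.
classify_name() {
  # Count the bytes that remain after deleting every ASCII byte (0-127)
  non_ascii=$(printf '%s' "$1" | LC_ALL=C tr -d '\000-\177' | wc -c | tr -d ' ')
  if [ "$non_ascii" -eq 0 ]; then
    echo "ascii"
  elif is_valid_utf8 "$1"; then
    echo "utf-8"
  else
    echo "not utf-8 (possibly Latin1)"
  fi
}

# Example: classify every file in wiki.d
#   for f in wiki.d/*; do echo "${f##*/}: $(classify_name "${f##*/}")"; done
```

A name where the letter was already replaced by a literal "?" is reported as "ascii", because the original byte is gone. Names reported as "utf-8" point to case (A), and names reported as "not utf-8" point to case (B1).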
Note, Pagelists/searches use a different approach than links. A link to a
page asks if there is such a file, while a pagelist/search will list all
files in the wiki.d directory and will try to process them - if a file is
named "Documentaci?n", the "?" character is not allowed in a pagename so
PmWiki tries to deduce an allowed pagename and it can list "DocumentaciN".
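
The deduction can be imitated roughly like this. It is not PmWiki's actual code, just a sketch that assumes only ASCII letters, digits and "-" are allowed in a pagename, that any other run of characters splits the name into words, and that each word is then capitalised and the words are joined. The function name deduce_pagename is made up.

```shell
# Rough imitation of deducing an allowed pagename from a filename.
deduce_pagename() {
  printf '%s\n' "$1" | LC_ALL=C awk '{
    gsub(/[^A-Za-z0-9-]+/, " ")      # disallowed characters become spaces
    n = split($0, w, " ")            # split into words
    out = ""
    for (i = 1; i <= n; i++)         # capitalise each word and join
      out = out toupper(substr(w[i], 1, 1)) substr(w[i], 2)
    print out
  }'
}

deduce_pagename 'Documentaci?n'
# prints: DocumentaciN
```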
> I think I can just change every filename to match pmwiki,
Try with one or two files first to see if it works, because you'll have the (A)
case above and PmWiki may still not be able to locate them.
> but on one
> hand that implies a lot of work, and on the other, the titles that has
> special characters are changed as well, which looks horrible.
What does "looks horrible" mean? If you rename a file to something that
looks OK in the filesystem, PmWiki may be able to access it and will try to
show these bytes in the Latin1 charset. If the filesystem charset is UTF-8,
pmwiki will show "DocumentaciÃ³n" because the bytes 195 and 179 ("ó" in
UTF-8) are the characters "Ã" and "³" in Latin1.
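
You can reproduce this effect in a shell. A small sketch, assuming iconv is installed and your terminal uses UTF-8; the helper name read_as_latin1 is invented here.

```shell
# Interpret the bytes on stdin as Latin1 and print them as UTF-8,
# so that a UTF-8 terminal shows what a Latin1 page would display.
read_as_latin1() {
  iconv -f ISO-8859-1 -t UTF-8
}

# "Documentación" as UTF-8 bytes: the "ó" is \303\263 (decimal 195, 179)
printf 'Documentaci\303\263n\n' | read_as_latin1
# prints: DocumentaciÃ³n
```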
Some wiki admins restrict pagenames and filenames to ASCII characters, which
are on the same byte positions in most charsets. Then the page is named
"Documentacion" and there is a directive (:title Documentación:) in it so
that it displays correctly. This is generally more migration-proof than
allowing all international characters.
There is a recipe that converts all links to the correct plain letters:
ISO8859MakePageNamePatterns (see below).
If you want to go this way, you just write a small bash script on the old
server (!!BACKUP. Your. Files. Before!!) that will rename the files to ASCII
characters, something like this:
  for filename in * ; do
    # Dry run: transliterate each Latin1 filename to its closest ASCII form
    newfilename=$(printf '%s' "$filename" | iconv -f iso8859-1 -t ascii//TRANSLIT -c -)
    echo "$filename -> $newfilename"
  done
This will just show you if and how your filenames would be renamed. If you
are OK with this, change the script to actually rename the files.
Then install the recipe ISO8859MakePageNamePatterns and test if the wiki
works on the old server. If it does, place (:title Correct title:) in the
pages where the accents were lost, and copy the wiki.d directory and
local/config.php to the new server.
Another note: the encoding of the config.php file also matters - if your
wiki is in iso8859-1, save your file in that encoding and not in, e.g., UTF-8.
You must use a text editor allowing you to select the encoding of the files.
See http://www.pmwiki.org/wiki/PmWiki/LocalCustomizations#encoding .
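
If a config file was already saved as UTF-8 by mistake, iconv may be able to recode it. This is only a sketch, assuming the file really is UTF-8 at the moment and contains nothing outside Latin1; the function name and the output filename are hypothetical. It writes to a new file so the original stays untouched.

```shell
# Recode file $1 from UTF-8 to Latin1, writing the result to file $2.
utf8_to_latin1() {
  iconv -f UTF-8 -t ISO-8859-1 "$1" > "$2"
}

# Example (hypothetical paths):
#   utf8_to_latin1 local/config.php local/config.latin1.php
```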