[pmwiki-users] Page File Format: why? Tools to handle?

Oliver Betz list_ob at gmx.net
Sun Mar 18 09:37:05 CDT 2007


"Patrick R. Michaud" wrote:

>> what were the reasons for the current method to store page markup text
>> (in one line, with newline and percent sign converted)?
>
>1.  It's relatively easy to handle this from PHP, while still being
>    possible to manipulate the files using various other tools.

I already thought so.

>2.  For security reasons, it's very important that some characters
>    be encoded somehow (notably the '<' character).  Since we have to 
>    encode at least some characters, we might as well use an encoding
>    scheme that is easy to handle.

One disadvantage is that line-based tools like diff or SCM systems don't
work very well with this format, and humans can't read the data easily.

>3.  At the time PmWiki was developed (circa 2001-2002), there
>    weren't a lot of standardized libraries or file formats that
>    readily met PmWiki's needs, so I used this one.  (At the time
>    I had several other projects and systems that made use of
>    flat-file storage, so I had familiarity with this format.)
>    
>> It's somewhat hard to access this format from outside, e.g. diff and
>> merge.
>> 
>> The PhpWiki "dump" format (multipart MIME) is much "friendlier" in
>> this respect.
>
>I hadn't seen the PhpWiki format details.

It's like a multipart MIME email message. Metadata in the headers,
markup in the message parts. Likely I wouldn't do it exactly this way,
but I like the idea. Link to examples below.
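
To make the idea concrete, here is a minimal Python sketch that reads a
MIME-style dump with the standard email package. The sample message, the
header names in it (including X-Example-Author) and the function name
read_dump are made up for illustration; they are not the actual PhpWiki
dump layout.

```python
from email import message_from_string

# Hypothetical sample: metadata in headers, markup in the message part.
SAMPLE_DUMP = """\
Subject: SamplePage
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=_sample_boundary"

--=_sample_boundary
Content-Type: text/plain; charset="us-ascii"
X-Example-Author: SomeUser

First line of markup.
Second line of markup.
--=_sample_boundary--
"""


def read_dump(raw):
    """Return a list of (headers, markup) pairs, one per message part."""
    message = message_from_string(raw)
    parts = []
    for part in message.walk():
        if part.is_multipart():
            continue  # skip the container, keep only the leaf parts
        parts.append((dict(part.items()), part.get_payload()))
    return parts


if __name__ == "__main__":
    for headers, markup in read_dump(SAMPLE_DUMP):
        print(headers)
        print(markup)
```

The point is that the markup stays on separate lines, so a normal diff
tool can compare two dumps without any conversion step.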

Please note that I don't know how the pages are stored internally
(usually in a database) by PhpWiki, that's only the dump format used
for export, import, backups.

>> Before I start to hack an import filter for Beyond Compare: Are there
>> tools to convert, compare, edit the pages?
>
>Are you planning to import pages from PhpWiki?  A simpler approach

Maybe later, but the reason why I posted the question is that it's
very hard to check and merge differences between page data files.

For example, when installing an update of PmWiki I would like to merge
the changes of files in wikilib.d with my local files.

Also, for other maintenance purposes, it would be nice to have better
access to the markup text without using PmWiki.

That's an advertised advantage of DokuWiki, but IMNSHO the separate
metadata makes it much harder (nearly impossible) to manipulate
pagedata without the help of DokuWiki.

As a Windows wimp I usually use "Beyond Compare" for diff/merge. BC is
able to compare nearly everything, comes with a binary mode, picture
comparison (including scaling) etc. - really great and worth every
cent. An import filter would be a simple hack; writing changes back to
the page data file is a bit more work.
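
A rough Python sketch of what such a filter might do, in both directions.
It assumes the page file consists of key=value lines, that the markup is
in a "text=" line, and that newline, percent sign and '<' are stored as
%0a, %25 and %3c. Those details and the function names are my
assumptions, not taken from the PmWiki source.

```python
def decode_text(value):
    """Turn the stored one-line form back into multi-line markup."""
    # '%25' is decoded last so that a literal "%0a" in the markup
    # (stored as "%250a") is not mistaken for an encoded newline.
    return value.replace("%0a", "\n").replace("%3c", "<").replace("%25", "%")


def encode_text(markup):
    """Inverse of decode_text: produce the one-line stored form."""
    # '%' is encoded first so the escapes added afterwards stay intact.
    return markup.replace("%", "%25").replace("<", "%3c").replace("\n", "%0a")


def extract_markup(page_data):
    """Return the decoded markup from a page file's contents, or None."""
    for line in page_data.splitlines():
        key, sep, value = line.partition("=")
        if sep and key == "text":
            return decode_text(value)
    return None
```

The diff tool would get the output of extract_markup() for each side.
Writing back means replacing only the "text=" line with encode_text() of
the edited markup and leaving all other lines of the file untouched.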

If such tools already exist, maybe another diff/merge tool capable of
handling PmWiki page data files better, I wouldn't need to write a
converter.

>might be to simply create a new PageStore object that can read files
>from PhpWiki's dump format.  I've been very keen to come up with a
>PhpWiki converter (and will be happy to work on it again), but 
>I really haven't had a decent library of pages to work from.

Let me know if you need more than the samples I put in
http://oliverbetz.de/PhpWikiDump.zip

Since the PhpWiki content I will transfer some day (not so soon) has
to be refactored anyway, I don't depend on automatic import. But there
might be other PhpWiki users interested in such a feature. I don't
know whether it's worth the effort.

Oliver
-- 
Oliver Betz, Muenchen (oliverbetz.de)



