[pmwiki-devel] Subversion PageStore Design

Tue Nov 13 05:43:32 CST 2007

Kathryn Andersen wrote:
> A subversion PageStore would have to implement some or all of the
> PageStore functions, to wit:
> - pagefile
> - read
> - write
> - exists
> - delete
> - ls
> 
> Obviously, since a subversion PageStore would still be file-based, then
> one could still use the normal PageStore versions of 'pagefile',
> 'exists' and 'ls'.
> 
> So that leaves 'read', 'write' and 'delete'.
> 
> One suggestion for a subversion pagestore was to have the page content
> in the file, and all the attributes as subversion properties.
> This would make reading a matter of
> - reading the file
> - listing the svn properties for the file
> 
> Writing would be a matter of
> - writing the file
> - updating the svn properties for the file
> 
> It would probably make sense to have some sort of prefix for the
> properties, such as "pmwiki:".  So the "author" attribute would be
> "pmwiki:author", for example.
> 
> Deletion would be a matter of doing a svn del on the file.

So far, this exactly matches my thinking on the subject.
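
To make the mapping concrete, here is a rough sketch of the svn command lines that read, write and delete would boil down to. It is in Python rather than PmWiki's PHP, purely for illustration, and it only builds the argument lists without running anything. The function names and the page file path in the usage note are made up; the "pmwiki:" prefix is the one suggested above, and `propset`, `proplist` and `delete` are the standard svn subcommands.

```python
# Sketch only: builds svn command lines, does not execute them.
PREFIX = "pmwiki:"  # property prefix suggested in the quoted text


def property_name(attr):
    """Map a page attribute such as 'author' to an svn property name."""
    return PREFIX + attr


def write_commands(pagefile, attrs):
    """Commands to store each attribute as an svn property on the page file.

    Assumes the page text has already been written to `pagefile`.
    """
    return [["svn", "propset", property_name(key), str(value), pagefile]
            for key, value in sorted(attrs.items())]


def read_commands(pagefile):
    """Command to list every property (with its value) set on the page file."""
    return [["svn", "proplist", "--verbose", pagefile]]


def delete_commands(pagefile):
    """Command to schedule the page file for deletion."""
    return [["svn", "delete", pagefile]]
```

For example, `write_commands("wiki.d/Main.HomePage", {"author": "kat"})` yields a single `svn propset pmwiki:author kat wiki.d/Main.HomePage` command. A brand new page would also need an `svn add` before its properties could be set.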

> A few questions remain, however.
> 
> - How would one do diffs for "history" pages?  I'm not quite clear how
>   pmwiki does that stuff, I haven't looked.

Currently pmwiki stores diffs inside the page as attributes. We would want to
use the SVN diff functions instead, as this is one of the main reasons to move
to an SVN pagestore.

> - When would one do updates?
> - When would one do commits?

In order to have diffs created whenever pmwiki would have created them, we
would need to do an update and commit on every single page save. I don't think
this would usually be a problem though, as it would have one of two effects:

A) There had been no remote changes to the wiki since the last update/commit
cycle, so it would just work.

B) There had been remote changes. Since there had been no local changes since
the last update/commit cycle (assuming all changes are going through the
wiki), any incoming updates for pages other than the one currently being
edited would simply succeed. The currently edited page may end up in conflict,
in which case we use the same conflict resolution method already in pmwiki
(which was borrowed from tools like diff and svn in the first place).
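
The update/commit cycle on each page save might look roughly like the following. Again this is a Python sketch, not PmWiki code: `save_cycle`, `is_conflicted` and the `run` callback (which takes an argument list and returns the command's standard output) are hypothetical names. It relies only on the fact that `svn status` prints a `C` in the first column for a conflicted file.

```python
def is_conflicted(status_output):
    """True if any line of `svn status` output has 'C' in its first column."""
    return any(line.startswith("C") for line in status_output.splitlines())


def save_cycle(pagefile, message, run):
    """Update, check for a conflict, then commit.

    Assumes the edited page has already been written to `pagefile`.
    `run` is a caller-supplied function: argv list in, stdout text out.
    """
    # Pull in remote changes; svn merges them into the locally edited file.
    run(["svn", "update", "--non-interactive", pagefile])
    if is_conflicted(run(["svn", "status", pagefile])):
        # Hand over to whatever conflict resolution the wiki already has.
        return "conflict"
    run(["svn", "commit", "--non-interactive", "-m", message, pagefile])
    return "committed"
```

Passing `run` in as a parameter keeps the sketch independent of how the commands are actually executed, and makes cases A and B easy to simulate without a repository.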

With SVK, I worry that there might be far more hairy scenarios, in which two
different and conflicting distributed changes have occurred to the same file,
which we only find out about when we do an update. I'm not sure how to resolve
that, other than have a way to mark a page as 'in conflict' in the pagestore,
and have the next person who looks at it deal with it in some way. I can
imagine some scenarios like:

A) Display all versions (serially, or with change marks or something) and hope
the next editor cleans up the mess.
B) Display the last non-conflicted version, but somehow visually flag the page
as conflicted. The next editor is required to resolve the mess.

I'd have to do some research into SVK to see what other possible problems we
could run into.

> - How would one know what repository to make commits to?

I'd want to make use of the fact that SVN stores the repository URL in the
hidden .svn directories it creates when it manages files. If you try to make a
pagestore in a directory that is already under SVN control, it just uses the
same repository, I would think.

For other cases, we would want there to be an optional extra parameter to the
new pagestore command, just as the current one takes a directory name. I
imagine we'd also have a config variable called something like $SVNRepository
to pass when we create new directories and such.
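
As a sketch of that lookup: `svn info` run in a working copy prints "Key: value" lines, including a `URL:` line, so the pagestore could read the repository from there and only fall back to the configured value when the directory is not under SVN control. This is Python for illustration; `parse_svn_info` and `repository_for` are hypothetical names, and `configured` stands in for something like $SVNRepository.

```python
def parse_svn_info(text):
    """Turn the 'Key: value' lines printed by `svn info` into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # skip blank lines and anything that is not 'Key: value'
            info[key] = value
    return info


def repository_for(info_text, configured=None):
    """Prefer the working copy's own URL; otherwise use the configured one.

    `info_text` is the stdout of `svn info <dir>`, which is empty when the
    directory is not a working copy.
    """
    return parse_svn_info(info_text).get("URL", configured)
```

So a directory that is already a working copy answers the "which repository?" question by itself, and the extra pagestore parameter only matters when creating a new one.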

> - How would one deal with creating/committing new directories (for wiki page
>   formats which have pages in separate directories for example).

This seems very straightforward. Maybe I'm not seeing the problem you are
anticipating. Can you explain?

> The "when would one do commits?" question has a number of
> ramifications, including conflict resolution, and whether or not one
> could operate "detached" on a laptop (as someone suggested in the thread
> on the pmwiki list).  I can see a number of problems arising.
> 
> A. Commits are done immediately after writing and updating properties.
> However, this isn't an atomic action, so how do you prevent someone else
> overwriting your file before you do the commit?  I suppose the standard
> pmwiki conflict-resolution would do for that.

Yup. I think this is the way to go. The standard conflict resolution system
should work fine.

> However, this doesn't allow work to be done detached on a laptop.
> So maybe there would have to be some sort of global option which could
> be turned on to disable commits while that person is working detached,
> and then they would have to commit by hand when they were attached
> again.

If one wants to do detached work on a laptop, then I can think of several
possibilities:

A) Use SVK instead of SVN, as it has the concept of detached commits to a
local version of the repository.

B) The detached flag you mentioned. We would probably need to supply some sort
of automatic update system for resynchronizing though, and it could get hairy.
There is also a question of where we would store the diffs and metadata if we
aren't going through a repository in this mode. PmWiki won't really work if we
don't have things like title, author, etc. stored SOMEWHERE. We'd pretty much
be forced to go back to the old pagestore method when in detached mode.

C) Insist on there being a laptop repository of its own, and provide some
mechanism for a repository-to-repository merge.

> Likewise, "when would one do updates?" could have some conflict problems
> -- does one do an update just before one reads a page?  After one does a
> write?  Both?

I think one does an update just before a write, and just before showing the
contents of a page for editing. If everyone is going through the same
repository, and there are reasonable amounts of editing happening in both
places, this should be sufficient.

Otherwise, I suggest we have some method of determining if any changes have
taken place in the repository since the last local update, and do an update
after displaying a page (as part of the PmWiki shutdown system, like the way
Notify works).

This latter system would mean that if you have one site where all your editors
work, and another copy which everyone else is reading, they would tend to stay
in synch.
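
One possible way to make that "has anything changed?" check, sketched in Python: compare the `Revision:` reported by `svn info` for the working copy with the `Last Changed Rev:` reported by `svn info -r HEAD` for the same path. The function names are hypothetical, and the check is deliberately conservative: after a local commit it may ask for an update that turns out to do nothing.

```python
import re


def revision(info_text, field="Revision"):
    """Extract an integer field such as 'Revision' from `svn info` output."""
    match = re.search(r"^%s: (\d+)$" % re.escape(field), info_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def needs_update(local_info, head_info):
    """True if the repository has changes newer than the working copy.

    `local_info` is the output of `svn info <dir>`, `head_info` the output
    of `svn info -r HEAD <dir>`. Missing data counts as 'no update needed'.
    """
    local = revision(local_info)
    remote = revision(head_info, "Last Changed Rev")
    return local is not None and remote is not None and remote > local
```

The shutdown hook would then run the actual `svn update` only when `needs_update` returns true, so ordinary page views stay cheap.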
-- 
Stirling Westrup -- Visionary, Technology Analyst, Researcher, Software
Engineer, IT Generalist

LinkedIn Profile: https://www.linkedin.com/e/fpf/77228

Website:       http://www.pooq.com
Tech Blog:     http://technaut.livejournal.com
Business Blog: http://willcodeforfood.livejournal.com
--
Spread the word: Its all a HOAX, memes don't exist!