[pmwiki-users] How to generate a list of categories?

christian.ridderstrom at gmail.com christian.ridderstrom at gmail.com
Wed Jul 19 15:19:27 CDT 2006


On Wed, 19 Jul 2006, Patrick R. Michaud wrote:

Oh... I thought there already was a mechanism for this. Guess not. In that 
case I'll take a step back and think out loud. By "category tag" I mean 
something like [[!subject]].

What is a category? I guess it can be thought of as something abstract, or 
simply as a set of pages. Either a many-pages-to-one-category relation, or simply a 
bunch of pages. I'm not sure if this makes a difference here, so I'll just 
assume it's a bunch of pages.

* We mark a page as belonging to a category by adding a category tag to it.
* Several pages can belong to the same category.

The existence of a category doesn't imply that a category page exists. The 
category page is optional.

* In order to create a category it is enough for one page to be tagged.
* In order to delete a category, all corresponding tags must be deleted.

I'm guessing that today some kind of caching happens when a tag is 
added to a page: the tag is recorded either in a cache or in the page file. 
Then when (:pagelist:) looks for a category it scans for this tag, either 
in the index file or in all pages.

How can we generate a list of all categories? A slow way of doing it would 
be to let (:pagelist:) (or whatever command is used) scan *all* pages and 
make a set out of all [[!subject]]. This set would then be the list of 
categories. Talk about not being scalable...
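
To make the slow approach concrete, here is a minimal sketch in Python (not 
PmWiki's own PHP, and not how (:pagelist:) actually works). The dict of page 
texts, the regular expression for the tag, and the function name are all 
assumptions made for illustration.

```python
import re

# Assumed shape of a category tag such as [[!subject]]; not PmWiki's real pattern.
CATEGORY_TAG = re.compile(r"\[\[!([^\]|]+)\]\]")

def list_categories(pages):
    """Scan every page and return the sorted set of category names.

    `pages` maps page name -> page text; it is a hypothetical in-memory
    stand-in for the wiki's page store.
    """
    categories = set()
    for text in pages.values():
        categories.update(name.strip() for name in CATEGORY_TAG.findall(text))
    return sorted(categories)

pages = {
    "Main.HomePage": "Welcome [[!intro]]",
    "Main.Recipes": "Soup [[!food]] and [[!intro]]",
    "Main.Empty": "No tags here",
}
print(list_categories(pages))  # ['food', 'intro']
```

Every call reads every page, so the cost grows with the size of the whole 
wiki, which is exactly the scalability problem.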

It is of course possible to improve the generation of categories by 
noticing whenever a tag is added to a page and then update a cached list 
of categories.

But how can we then "delete" a category? Or rather, how do we know when to 
delete an entry from the cached list of categories?  In principle, pmwiki 
would have to check what categories the page belonged to before the page was 
edited and compare this to the categories it belongs to after it was edited. 
For each category that the page no longer belongs to, pmwiki will then 
have to do a search for pages that belong to the corresponding category.
If only the current page belongs to the category, that category will no
longer exist once the current page has been saved, so the cache needs to 
be updated accordingly.
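
The bookkeeping described above could be sketched roughly as follows, again 
in Python and under assumptions: the cache is a plain set, the page store is 
a dict of page texts, and the names categories_in and update_category_cache 
are hypothetical. This is not an existing PmWiki mechanism.

```python
import re

# Assumed shape of a category tag such as [[!subject]].
CATEGORY_TAG = re.compile(r"\[\[!([^\]|]+)\]\]")

def categories_in(text):
    """Return the set of category names tagged in one page's text."""
    return {name.strip() for name in CATEGORY_TAG.findall(text)}

def update_category_cache(cache, pages, page_name, new_text):
    """Save `new_text` for `page_name` and keep the cached category set in sync."""
    old = categories_in(pages.get(page_name, ""))
    new = categories_in(new_text)
    pages[page_name] = new_text

    # Newly added tags are cheap: just record them.
    cache.update(new - old)

    # Removed tags are the expensive case: search all pages for each one.
    for category in old - new:
        still_used = any(category in categories_in(text) for text in pages.values())
        if not still_used:
            cache.discard(category)
    return cache

pages = {"A": "[[!x]] [[!y]]", "B": "[[!y]]"}
cache = {"x", "y"}
update_category_cache(cache, pages, "A", "[[!z]]")
print(sorted(cache))  # ['y', 'z']
```

The full scan only runs when a tag disappears from the edited page, so an 
ordinary edit that leaves the tags alone costs almost nothing.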

I'm not sure how time consuming this would be. Actually, I started out 
writing it thinking that it'd be too slow, but it might work. Especially 
as the time consuming parts will only happen when a category tag is 
removed (or changed).

Ok, I have to pack as I'm off on vacation for a week. I hope this gave you 
some inspiration Patrick. Now I'll respond to your message :-)

> I've been thinking about adjusting the [[!category]] markup so that it 
> automatically creates (blank) category target pages if they don't 
> already exist.  Then one can easily use (:pagelist:) to generate a list 
> of categories.  This also means we can eliminate the current hack that 
> suppresses Site.PageNotFound when accessing a page in the Category 
> group.

Hmm... for some reason I'm reluctant to automatically have the category 
pages created. Would they also be automatically deleted if a category is 
"deleted"?

Ideally I'd like the "automatic" creation to be separate from the possibility 
of listing categories.

> However, if we want to avoid the creation of empty category pages, 
> there's not currently a way to do that.  When (:pagelist:) becomes smart 
> enough to be able to display missing and orphan pages, then it will be 
> possible with something like:
>
>    (:pagelist select=all group=Category:)

Eh... I don't understand this. Ok, reading further down I understand what 
a missing page is. Never mind.

> Other possibilities will include:
>
>    ##   display all categories w/o Category pages
>    (:pagelist select=missing group=Category:)
>
>    ##   display all Category pages w/o category links
>    (:pagelist select=orphan group=Category:)
>
> One stumbling block I have at the moment is coming up with an
> appropriate name for the "select=" option above; "status=" and
> "type=" have been proposed (but are probably too generic).
> The defined values for the option, whatever it ends up being
> called, will be:
>
>    existing       display only pages that exist
>    missing        display pages that don't exist but have incoming links
>    orphan         display pages that exist but have no incoming links
>    all            display all target and existing pages

Hmm... could we consider this a filter, i.e.
 	filter={existing|missing|orphan|_all_}	_existing_ is default

Should you be able to do something like
 	filter=missing,orphan
to get both missing and orphaned pages?
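
As a rough illustration of how the four values and a comma-separated 
combination might behave, here is a Python sketch. It only models the 
definitions quoted above; the function name select_pages, the set of existing 
pages, and the dict of outgoing links are hypothetical and say nothing about 
how (:pagelist:) would implement it.

```python
def select_pages(existing, links, filters=("existing",)):
    """Return the sorted pages matching any of the given filter names.

    existing: set of page names that exist (assumed input).
    links:    dict mapping a source page to the set of pages it links to.
    filters:  any of "existing", "missing", "orphan", "all".
    """
    targets = set()
    for outgoing in links.values():
        targets.update(outgoing)

    groups = {
        "existing": set(existing),      # pages that exist
        "missing": targets - existing,  # linked to, but do not exist
        "orphan": existing - targets,   # exist, but nothing links to them
        "all": existing | targets,      # all target and existing pages
    }
    result = set()
    for name in filters:
        result |= groups[name]
    return sorted(result)

existing = {"Category.Food", "Category.Old"}
links = {"Main.Recipes": {"Category.Food", "Category.Travel"}}

# Equivalent of filter=missing,orphan
print(select_pages(existing, links, "missing,orphan".split(",")))
# ['Category.Old', 'Category.Travel']
```

Treating the option as a union of named sets makes the combined form fall 
out naturally.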

Maybe this should be related to the syntax for pagelist in general?
What are the other options?

Then when that is known, how about summarizing it in another email and asking 
the list again for suggestions?

regards
/Christian

-- 
Christian Ridderström, +46-8-768 39 44               http://www.md.kth.se/~chr

