[pmwiki-users] transliteration -> unicode markup for Indian languages
Patrick R. Michaud
pmichaud at pobox.com
Wed Aug 24 14:48:45 CDT 2005
On Wed, Aug 24, 2005 at 12:18:24PM -0700, Varadarajan Mani-A19487 wrote:
> [...] What I've tried is the following:
>
> Markup("{T=",'<split','/{T=(.*?)=T}/se', "Tamilize('$1')");
>
> which converts anything in between {T= and =T} into the Unicode
> characters for Tamil. For example:
>
> {T= tivviya pirapan^tham =T}
>
> would become
>
> திவ்விய பிரபந்தம்
>
> It seems to work for the most part, but I'm not sure whether "<split"
> is correct for this type of markup, and whether the markup delimiters
> are advisable.
First, I think the idea and these markup delimiters are excellent.
Seems like a very handy mechanism for writing Tamil.

Where things should be Tamilized is largely a matter of preference
(and probably trial and error). As you have it above, with "<split"
and the "/s" on the pattern, the {T=...=T} conversions will work
across multiple lines of text, as in

    {T= tivviya
    pirapan^tham
    =T}

which may or may not be what you want.
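
To see the effect of the "/s" modifier in isolation, here is a small
standalone sketch. It is not PmWiki code: tamilize_stub() and convert()
are hypothetical helpers, with the stub standing in for the real
Tamilize(), and preg_replace_callback() is used because the "/e"
modifier was removed in PHP 7.

```php
<?php
// Hypothetical stand-in for Tamilize(): it only brackets its input
// so the span that was matched is easy to see.
function tamilize_stub(string $text): string {
    return '[' . trim($text) . ']';
}

// Apply the {T=...=T} pattern with the given PCRE modifiers.
function convert(string $page, string $modifiers): string {
    return preg_replace_callback(
        '/\{T=(.*?)=T\}/' . $modifiers,
        function (array $m): string { return tamilize_stub($m[1]); },
        $page
    );
}

$page = "{T= tivviya\npirapan^tham\n=T}";

// With "s", "." also matches newlines, so the whole block is converted.
echo convert($page, 's'), "\n";

// Without "s", "." stops at each newline, so nothing matches and the
// text comes back unchanged.
echo convert($page, ''), "\n";
```

Dropping the "s" is one way to restrict the markup to a single line, if
that turns out to be the behaviour you prefer.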

You may also want/need to add PSS() into the Markup rule, as in

    Markup("{T=",'<split','/{T=(.*?)=T}/se', "Tamilize(PSS('$1'))");

Otherwise, single and double quotes may end up with unwanted
backslashes in front of them.
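
A rough illustration of the backslash problem follows. The function
pss_sketch() is an assumption about what PSS() does (a one-line
str_replace that drops the backslash in front of double quotes), not a
copy of the PmWiki source, and the sample string is made up to show
what the replacement code would be handed under "/e".

```php
<?php
// Sketch of PSS(), assumed to behave like PmWiki's helper: remove the
// backslash that the /e modifier places before each double quote.
function pss_sketch(string $x): string {
    return str_replace('\\"', '"', $x);
}

// Illustrative value: what Tamilize('$1') would receive under /e when
// the text between {T= and =T} is: say "hi"
$received = 'say \\"hi\\"';

echo $received, "\n";             // say \"hi\"
echo pss_sketch($received), "\n"; // say "hi"
```

Text without double quotes passes through pss_sketch() unchanged, so
wrapping '$1' this way should be harmless for ordinary input.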

Other than those two thoughts, I think it's a terrific idea
and hope to see a Cookbook recipe from it!

Pm