nd set its astgrammar to 'ABC::PAST::Grammar' (from src/PAST/Grammar.tg)
18:59 that new compiler object can then be obtained via $P99 = compreg 'ABC'
18:59 The language method is just a wrapper around the compreg opcode?
18:59 at the moment, yes
18:59 eventually it's likely that one will be able to do the following:
18:59 $P0 = new ['HLLCompiler']; $P0.'init'('language'=>'ABC', 'parsegrammar'=>'ABC::Grammar', 'astgrammar'=>'ABC::PAST::Grammar')
19:00 Is there some reason to be mentioning "grammar" everywhere?
19:00 "parsegrammar" and "astrgrammar" specifically
19:01 allison is making a strong distinction between grammars and the items they grammarize
19:01 er, "astgrammar" even
19:01 besides, we should probably make it clear that what HLLCompiler is expecting here is, in fact, a grammar
19:01 there's one grammar for converting source to a parse tree, and another grammar for converting the parse tree into an ast
19:01 okay
19:02 so, the 'HLLCompiler' object has a built-in 'compile' method, which knows how to call the parser, ast, etc. to compile source into a target
19:03 so, after creating the HLLCompiler object, one could do: $P99 = compreg 'ABC'; $P1 = $P99.'compile'(source)
19:03 and it will compile the source into executable code
19:03 the compile method understands various options, thus
19:03 << someone please summarize this discussion and make sure it ends up in a readme or something. Danke. >>
19:04 $P1 = $P99.'compile'(source, 'target'=>'pir') # compile source into PIR
19:04 $P1 = $P99.'compile'(source, 'target'=>'past') # return the ast from source
19:04 so far so good?
19:04 yep
19:05 okay
19:05 now then, at present the typical invocation for any hll compiler is as follows:
19:05 parrot language.pbc [options] [sourcefile]
19:05 for abc, we typically do
19:05 parrot abc.pbc
19:05 or
19:05 parrot abc.pbc source.abc
19:06 The HLLCompiler object understands standard command line invocation, so that the compiler writer doesn't have to write a command line processor
19:06 this is what is happening in the 'main' subroutine of abc.pir
19:06 er
19:06 hmmm
19:06 there is no main
19:07 and now we come to a very important question ... where is main? :)
19:07 ...hey, what happened to main?
19:08 I'm guessing that since abc.pir doesn't have a :main, it does nothing?
19:08 (other than load stuff)
19:08 I obviously forgot to include :main
19:09 since the tests aren't invoking abc.pir from a parrot command line, we didn't notice :-)
19:09 just a sec, fixing.
19:09 (actually, what happened is I removed :main during the HLLCompiler refactor)
19:11 r15906
19:11 however, :main is just five lines long
19:11 -!- Irssi: Pasting 5 lines to #parrot. Press Ctrl-K if you wish to do this or Ctrl-C to cancel.
19:11 .sub 'main' :main
          .param pmc args
          $P0 = compreg 'ABC'
          .return $P0.'command_line'(args)
19:11 .end
19:11 yeah, I figured it would be short (looking at command_line in Parrot/HLLCompiler.pir)
19:11 grr, irssi paste didn't work
19:11 but you get the point -- we just pass control to the 'command_line' method of HLLCompiler
19:12 (N.B.: 'command_line' is about to go through a significant refactor, but the API remains the same)
19:12 command_line handles a lot of standard options for compilers
19:12 for example, the --target= option allows us to change what abc.pbc does
19:13 ./parrot abc.pbc # invoke an interactive compiler
19:13 ./parrot abc.pbc file.abc # compile and execute file.abc
19:13 ./parrot abc.pbc --target=PIR file.abc # compile file.abc to PIR
19:14 ./parrot abc.pbc --encoding=utf8 file.abc # compile and execute file.abc, file.abc is encoded as utf8
19:14 r15906 | pmichaud++ | trunk:
19:14 : [abc]:
19:14 : * Restore lost :main sub
19:14 ./parrot abc.pbc --combine file1.abc file2.abc file3.abc # combine three files into a single source unit and compile and execute that
19:14 abc doesn't seem to know about # comments
19:15 probably not. needs a grammar change :-)
19:15 (patches welcome!)
19:15 so, that's the high-level view of creating a compiler
19:16 the rest of abc.pir is simply including the various builtin functions, the parse grammar (compiled to pir by pgc), and the ast grammar (compiled to pir by tgc)
19:16 So ... I'm still fuzzy on tge. Where does "node" come from in PASTGrammar.tg?
19:16 it's automatic, kinda like 'self'
19:17 each transform rule automatically initializes 'node' to be the node being transformed, and 'tree' to be the object doing the transforming
19:17 and it appears to be an lvalue? (i.e. I can replace the current node with another)
19:18 actually, it's a transformation, so we return the transformed node
19:18 is there anything else that's just magically there like node?
19:18 just node and tree
19:19 in perl6, it would probably be something like transform past ($tree: ABC::Grammar $node) { ... }
19:19 i.e., tree is our "invocant", and node is the thing being transformed
19:19 so, let's look at a transformation
19:20 yeah, let's look at the ROOT transform :)
19:20 let's start with statement_list
19:20 when we enter this transform, 'node' is initialized to the match object corresponding to a subrule in the grammar
19:20 that rule looks like (from abc.pg)
19:21 rule statement_list { <statement>? [ ; <statement>? ]* }
19:21 so, a statement_list node will have $<statement>, an array of statement nodes
19:22 to transform this statement_list node into past, we transform each of the nodes in $<statement>, and place those as children of a PAST::Stmts node
19:22 And the grammar guarantees that it's an array of statements so there's no error checking in the transform?
19:22 that's right
19:22 is pugs with PGE supposed to crash in the development version?
19:22 cognominal: I have no idea
19:23 cognominal: probably if the pugs people haven't kept up with the PGE changes :-)
19:23 oki
19:23 at the moment, there are just a few basic AST node types: PAST::Node (base), PAST::Var, PAST::Val, PAST::Op, PAST::Stmts, and PAST::Block
19:24 PAST::Node is really an abstract class, so we can discount it
19:25 so, the transform on ABC::Grammar::statement_list is simply creating a new PAST::Stmts node, transforming each child into past and adding it to the PAST::Stmts node. We then return the PAST::Stmts node, which is the ast representation of this particular statement_list
19:26 okay so far?
19:26 yes, I think.
19:26 What's the first arg to tree.'get'() for?
19:26 It's always 'past'
19:26 it's selecting which rule to apply
19:27 And the third arg?
19:27 the third arg is probably char *envp[]
19:27 the third arg is indicating the "type" of the second arg
19:27 why?
19:27 when TGE can't figure it out on its own
19:27 (more detail coming)
19:27 in a parse tree, all of the nodes end up with the same type (ABC::Grammar, in this case)
19:27 the fact that a particular node is a 'statement_list' is actually held in that node's parent
19:28 i.e., any given node in the parse tree doesn't know its own name
19:28 and they're all the same type
19:28 sooooo.....
19:29 in such cases we have to tell TGE the "type" of the second parameter, which we just artificially create by appending the subrule name to the type of the node
19:30 (yes, this feels wrong. The only alternatives I've come up with are to somehow create a separate class for every subrule (bad), or to change/augment the semantics of S05 so that every node can know the name its parent has given it.)
19:30 (This is also why I'm continually harping on mp6 to provide a concrete example of how this sort of transformation would be performed.)
19:30 okay, pmichaud.
19:30 purl, forget This
19:30 pmichaud: I forgot this
19:31 questions, or shall I proceed to another example?
19:32 Why does the ROOT transform do node = node['program'] ?
19:32 because the actual node that it starts with is the node
19:32 so, I just shift down to the <program> node
19:33 so, that's a way to ignore nodes?
19:33 that's probably an artifact of the recent refactors
19:33 better might be to write: pnode = node['program'] and then use pnode through the remainder of the rule
19:33 so it doesn't look like an lvalue
19:34 but yes, I'm essentially skipping the node and jumping directly into the <program> subnode
19:34 okay, keep going.
It's slowly sinking in :)
19:34 okay, let's look at a more complex transform: ABC::Grammar::statement
19:35 the grammar rule is in http://svn.perl.org/parrot/trunk/languages/abc/src/abc.pg
19:35 at "rule statement", and we can see it's a sequence of alternations
19:35 yep
19:35 so, the transform simply checks for each alternative and dispatches to the correct subtransform
19:36 the next one is if_statement
19:37 the parse rule for if_statement is rule if_statement { if \( <expression> \) <statement> [ else <statement> ]? }
19:37 so, $<expression> is an expression, and $<statement> is an array of 1 or 2 statements
19:38 (if we didn't want the array, we could've aliased in the rule...)
19:38 for an if statement, we simply create a PAST::Op node that has
19:38 'pasttype' => 'if'
19:39 child 0 => expression to be evaluated
19:39 child 1 => what to do if child 0 was true
19:39 child 2 => what to do if child 0 was false
19:39 either child 1 or child 2 may be missing/null, in which case the node will do nothing (actually, it returns the result of child 0)
19:40 yeah, the code makes sense. (in PASTGrammar.tg)
19:40 I'll brush past while_statement, because it's cool -- it's the same concept as if_statement, except child 0 is the expression and child 1 is the thing to be done each time child 0 evaluates to true
19:41 we also have pasttype values of 'unless' and 'until', and I'll soon be adding 'repeat_while' and 'repeat_until'
19:42 (an interesting side note is that short-circuiting infix:&& is just 'pasttype'=>'if', and short-circuiting infix:|| is just 'pasttype'=>'unless')
19:43 okay so far? next we delve into expressions
19:44 yeah, I'm already looking there :)
19:45 The ctype stuff is documented where again?
19:45 it's going to change, but it's documented in compilers/past-pm/PAST/Node.pir
19:45 ctype will become 'consttype', and vtype will become 'valuetype'
19:46 okie.
19:46 (I've forgotten what the + means)
19:46 + means numeric, ~ means string
19:46 (think perl6)
19:46 so, i+ is?
19:46 integer or float?
19:46 i+ means the value can be used as-is in any int or numeric context
19:47 n+ would mean the value can be used as-is in any num or numeric context
19:47 Hm. "const type" makes more sense to me as "context type" now :)
19:47 that makes sense
19:48 at any rate, it's just a compiler hint. If no ctype is given, then past-pm will put the constant into a PMC (according to the vtype) before doing anything with it
19:49 per discussions on p2p with allison, we're likely to move ctype completely out of the individual PAST::Val nodes and into a table of type => consttype
19:49 so then past-pm would figure out the ctype based on the vtype
19:50 (i.e., vtype == '.Integer' means it can be a constant in any int or numeric context)
19:50 I'm still figuring out how to get language-specific tables into past-pm
19:52 sounds like you've got the hang of things from here -- any other areas you want me to comment on?
19:52 not at the moment.
19:53 now all I need is a nice smallish language to try this stuff out on so that it can solidify in my head a little better
19:53 the perl6 stuff obviously has a few more intricacies, but thus far it's just abc with more stuff in it
19:53 I was thinking about trying my hand at javascript.
19:54 that would be very cool. I'm curious to see what ruby/Cardinal could do with this
19:55 * pmichaud goes to see if he has a capture log of this somewhere
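
Editor's sketch of the transform dispatch discussed in the log. It is written in Python, not PIR or TGE, and is only a rough model of the idea: every parse-tree node has the same type, so the caller has to pass a type key such as 'ABC::Grammar::statement_list' to select the transform, the way the third argument to tree.'get'() does. The class names Match, Tree, Stmts, Op and Val, the decorator, and the sample input are all hypothetical stand-ins, not the TGE or PAST API.

```python
from dataclasses import dataclass, field


@dataclass
class Match:
    """Parse-tree node. All nodes share one type; subrule names live in the parent."""
    text: str = ""
    subs: dict = field(default_factory=dict)  # subrule name -> Match or list of Match


@dataclass
class Stmts:  # stand-in for PAST::Stmts
    children: list = field(default_factory=list)


@dataclass
class Op:  # stand-in for PAST::Op
    pasttype: str
    children: list = field(default_factory=list)


@dataclass
class Val:  # stand-in for PAST::Val
    value: str


class Tree:
    """Plays the role of 'tree': finds a transform by (rule name, type key)."""

    def __init__(self):
        self.rules = {}

    def rule(self, name, type_key):
        def register(fn):
            self.rules[(name, type_key)] = fn
            return fn
        return register

    def get(self, name, node, type_key):
        # The node cannot say what it is, so the caller supplies type_key.
        return self.rules[(name, type_key)](self, node)


tree = Tree()
NS = "ABC::Grammar"  # the single type shared by every Match node


@tree.rule("past", NS + "::statement_list")
def statement_list(tree, node):
    past = Stmts()
    for stmt in node.subs.get("statement", []):
        past.children.append(tree.get("past", stmt, NS + "::statement"))
    return past


@tree.rule("past", NS + "::statement")
def statement(tree, node):
    # Check which alternative matched and dispatch to its transform.
    if "if_statement" in node.subs:
        return tree.get("past", node.subs["if_statement"], NS + "::if_statement")
    return tree.get("past", node.subs["expression"], NS + "::expression")


@tree.rule("past", NS + "::if_statement")
def if_statement(tree, node):
    # child 0 is the condition; children 1 and 2 are the 1 or 2 branch statements.
    cond = tree.get("past", node.subs["expression"], NS + "::expression")
    branches = [tree.get("past", s, NS + "::statement") for s in node.subs["statement"]]
    return Op("if", [cond] + branches)


@tree.rule("past", NS + "::expression")
def expression(tree, node):
    return Val(node.text)  # expressions are kept opaque in this sketch


# Hand-built parse tree for two statements: "1" and "if (a) 2".
program = Match(subs={"statement": [
    Match(subs={"expression": Match(text="1")}),
    Match(subs={"if_statement": Match(subs={
        "expression": Match(text="a"),
        "statement": [Match(subs={"expression": Match(text="2")})],
    })}),
]})

ast = tree.get("past", program, NS + "::statement_list")
print(ast)
```

The lookup key is the pair (rule name, type string), which is why the transforms never inspect the node to find out what kind of node they were handed.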