coral implementation guide Wei Xiao ---------------------------------------------------------- Table of Contents I. Conversations II. Hacking history ----------------------------------------------------------- I. Conversations Read the following as things you need to figure out. Or questions to ask in order to understand the code. It is truly a log of questions I asked in understanding the system. wei xiao 6/29/94 this document is simply a summary of the conversation between praveen and wei. hopefully it will help future programmers to maintain coral. section 1 and 2 are grammer and parser. section 3 is data structure. section 4 is execution of program. 1. the language and terms used: clauses = rules e.g. p(x,y) :- q(y,z), r(x,z). predicate = relation name e.g. p, q, r literal = predicate ( arglist) e.g. p(1,2) arglist = arg [, arg]* set = { arglist } Note: the class names in the source code is not always consistent, but they are very close to what they should be. 2. arg: the base class of a large number of classes. numberical : 1,2 symbol: "string" Functor: name(arglist) e.g. f(1,2) List: [arglist] e.g. [1,2] = L(1, L(2)) ExprArg: 1+2 VarArg: X 3. internal storage of rules, facts, etc: coral has a list of workspaces(or databases): default ... exodus ... builtin ... each workspace: list of rules each rule: four subclasses, list of 1)builtin relations 2)a simple fact, it is a hash-relation e.g. p(1,2). 3)a derived relation. a module definition, a view of a db. module q(X,Y) :- p(X,Y), r(Y,Z). 4)pipelined relation, same as 3), different eval method 5)exodus relation 4. Execution when coral get started, coral_init is called, buitin relations are inserted first. Then the parser started. After parsing each line or multiple lines(in case of a module): 1)a simple fact(left side only) is inserted into hash-relation(if it is not a builtin, not a query, not a derived relation). e.g. p(1,2). if the relation is builtin, insert calls get_next_tutple, which calls the solver for that builtin. e.g. help(help). a query is converted to a builtin: ? p(x,y) => PrintBindings(x,y) := p(x,y). (look for function BindingsPrintSolver in the source Solver.C) see the following 2) for execute imperative rules. if the relation is a derived relation name, insert generate an err. 2)an imperative rule e.g. q(X,Y) := p(X,Y), r(X,Y). the evaluation sequence on the right side is guranted: set up a bind Env(which is a set of variables for the Value of X and Y, see below) get iterator get_next_tuple of p; returns a changed bind Env with some variables binded. go to the next literal on the right side, which is r repeat till no more on the right. insert the BindEnv into q undo the last change and repeat till no more tuple can be inserted into q [also see 5 for more] if there are more than one literal on the right side, backtrack is used to get all the answers. 3)a module is read in and compiled-- SCC are identified, execution order is optimized, a table of relations in the module is created. If the module specifies magic rewriting, it is done. when get_next_tuple applies to a derived relation, a temporary space is generated and semi-naive eval is used for each relation. no execution sequence can be assumed in a module. thus no imperative rule is allowed. e.g. module p p :- r,q q :- r,p r :- s SCC: 2.(p,q) 1.(r) 0.(s) SN eval starts from 0. new facts s,r,p,q are inserted into temporary workspace. only p is returned as a result of get_next_tuple. 5. Evaluate a rule rule: head = single literal body = list of literal env = list of all variables in the list take first body literal get iterator get_next_tuple - binds the env variable move to next literal in body repeat at end of body, extract head args from env, and insert into head relation. backtrack. Wei Xiao 6/30 data structures of coral: Execution.Env: collects all defaults each module has its own defaults during parsing: clause : if in_query_mode, execAction else add clause to parser.struct in_query_mode is actually not_in_module parser.struct: list of rules: structRule, texture representation collection of defaults execution command: insert, compile magic rewriting takes structRule and unparse it. using the same parser. parser.struct get compiled into QFM or insert fact exec query see previous day's discussion on the three execution option. QFM: lists of SCCs SCC: list of Semi-Naive rules. SN-rules: binding env : value of VarArg 7/5/94 Interface to c++ 1. embeded coral: translator look at embeded code, initialize the parserStruct, fill in execution code, then program is compiled by cc, linked with libcoral. coral code is essentially parsed by translator. init_coral() and exit_coral() must be called 2. user defined predicates: 1) embeded code in user application, within the same file as init_coral. a new solver function is inserted, code inserted after init_coral to init the solver 2) dynamic loading. consult the c code interactively. code is converted, compiled via cc, then dynamically loaded, solver function address resolved and initialized. doesn't work on HP 3. extensibility 1) define new relations. ie. pipline, etc.. 2) define new Arg type. ie. array, NULL, etc.. basically recompile coral. 7/9/94 source guide global/main.C -- the starting point. --------------------------------------------------------------------- II. Hacking history *************************************************************** * A Hacking History * 11pm 7/12 - 7am 7/13, 1994 * * the following is pretty much a nights story of Wei hacking * coral. Wooh! one can never over-emphasize the importance of * proper documentation. *************************************************************** NOTE: The following is not intended to be a documentation. Instead, I would think of them as pointers that you can follow to read the code in order to understand the system in the shortest time. A complicated software system is not economical to be explained in detail. The best way to understand it is trying to figure it out yourself. I recommend the use of emacs TAGS file, because it brings me to the definition of a function among over 100 files not written by me in one command. I have to admit the later part of this article is better written:-) what are: Association--typedef Association (*Func)(Pointer, Association*); Term: Arg + BindEnv exEnv.print_file Gram.y MODULE id PERIOD { parserStruct.init() ; init_collection(parserStruct.rules, 1); parseEnv.C_at_end_module = 0; parseEnv.in_query_loop = 0 ; parserStruct.CurModule.name = $2; strcpy(parseEnv.prompt_name, parseEnv.prompt); strcat(parseEnv.prompt_name, "*"); parseEnv.prompt = parseEnv.prompt_name; } in_query_loop==0:inside a module. void QueryFunction(ParserStruct& parserStruct) { // check to make sure it is not inside a module. if (!parseEnv.in_query_loop) { fprintf(exEnv.error_file, "CORAL::error :"); fprintf(exEnv.error_file, "Imperative statement inside module!\n"); return; } //print out format is selected if (exEnv.C_answer_style_default == COR_EMPTY_ANSWER_PRINT) headSymbol = PrintEmptySymbol; else if (exEnv.C_answer_style_default == COR_TUPLE_ANSWER_PRINT) headSymbol = PrintTupleSymbol; else headSymbol = PrintBindingSymbol; ... PrintTupleSymbol is defined in BuiltinSymbols.h #define DeclareKeyword(x,s) extern Symbol *x; DeclareKeyword(PrintTupleSymbol, "print_tuple") .. typedef struct Symbol : public ConstArg int execute_single_rule(ParserStruct &parserStruct) { int SemiNaive::evaluate(Table *F, RMarkArray *rm_array) { where is prompt for yes/no/all? BindingsPrintSolver(BuiltinRelation& , TupleIterator& iterator) { user_continue() how get down there from top? the global prompt parseEnv.print_prompt(); exEnv: the global options, may be changed via set(), clear() parseEnv: options within a module. initialized according to exEnv, stuff like @pipline, etc when a query is entered, ?p(X,Y), q(X,Y). parserStruct.curRule points to the right side(rhs), and a head (lh) is inserted in QueryFunction , then complete_rule is called to fill in cur_rule. then execute_single_rule is called to evaluate a rule, which creates a SN structure(SN rule) and fills in a table, get ready for SemiNaive::evaluate to do the real job. Note that SemiNaive::evaluat is the same proc that evals a module relation. In a module, there are SCCs, which contains one , or more SN rules. An SN rule has a table of relations. To support suspending SN rule eval, all local variable of SemiNaive::evaluate has to be saved and restored in a table of evaluations which is associated with each workspace. RuleEnv *rule_env; eval options for this rule; init according to parserEnv SemiNaive::evaluate(Table *F, RMarkArray *rm_array) /* it took me ten minutes trying to figure out what F is in a later part of this function -WeiX*/ 1. TupleIterator *tuple_its_index; one TupleIterator created for each literal on the rhs, 2. // Nested Loops Join evaluation using backtracking on failure. basically evaluate each literal(get_next_tuple) on rhs. 3. change order of literals on the rhs if needed: index = predOrders[current]; tuple_its_index = tuple_its[index]; 4.calling get_next, decide if to go forward or backward(backtrack): tuples[index] = tuple_its_index->get_next_tuple(); if (tuple_its_index->no_match() ^ negated[index]) { backtrack_index = btDecision[direction][current]; } else { current++; direction = FORWARD; continue; } 5.once head is satisfied, new bindings are inserted: if ( current >= rInfo_num_literals ) { index = current; ...... if (!CurrModuleInfo->NonGroundFacts) { // Insert the new head fact into the relation if (rel->insert_new( rInfo->arg_list[index], env, &UnusedEnv, tuples[0], 0)) { HANDLE_SUCCESS; answer_count++; } } else { /** NON_GROUND FACTS **/ INSERT_NG_FACT } // Continue with the nested loops join current = rInfo_num_literals -1; backtrack_index = successBacktrack; direction = BACKWARD; continue; } NOTE: For BuiltinRelation, insert_new calls get_next. Otherwise it may be a real insert. int BuiltinRelation::insert_new(ArgList &args, BindEnv *env, BindEnv *, Tuple *, int ) ..... it.get_next(); // Each builtin-relation is associated with a 'solver' // function that is invoked when a get_next() occurs it. class BuiltinRelation : public Relation { public: SolveProc solver; $$ builtin-rel looking at Scanner.l, built-in rel name are not reserved word. look at this tricky code under BuiltinSymbols.C. called by init_coral, called by main. void init_builtin_symbols() { #define DeclareKeyword(x,s) x = EnterSymbol(s); #undef _BuiltinSymbols_H_ #include "BuiltinSymbols.h" #undef DeclareKeyword } In EnterSymbols, after a bunch of TabInsert, which i don't understand, i believe builtins are inserted in SymbolTab, which is HashTable SymbolTab[1] defined in SymbolTable.C Now the conclusion: To add a new symbol, all i need to do is to add it to the "BuiltinSymbols.h", then write a Solver function for it.(this is not right, see below when i discovered initBuiltins. The remaining question is: how does coral tell a builtin from a fact when the user types 'help.' Look at this one: void FactFunction(ParserStruct& parserStruct) it has something to do with builtin, // use the kludge that ensures that all builtins have count = -1 what does kludge mean? -- see below, constructor for BuiltinRelations Coral does not make a difference. just insert the relation! rel = make_relation(parserStruct.cur_rule->head->pred, parserStruct.cur_rule->head->arity()); rel->insert_new(parserStruct.cur_rule->head->args, insert_env) //make_relation look for the head->pred in the table, if not found, then create a new one. Association *ptr = SymbolLookup(CurDB->RelationTable, new_name); if (HashNone(ptr1->arg)) { ptr->arg = new_name; ptr->val = AllocateRelation(new_name, arity, delta_indexed); TabInserted(CurDB->RelationTable, ptr); } there is a function find_r_kind(Name name), which is called by AllocateRelation, RelationKind find_r_kind(Name name) { if (is_magic(name)) return COR_R_MAGIC; else if (is_supp(name)) return COR_R_SUPPLEMENTARY; else return COR_R_ANSWER; } Constructor for a builtin function BuiltinRelation::BuiltinRelation(int a, Name n, SolveProc solve_proc) { count = -1; // NOTE: this is a kludge to easily identify a builtin relation It's interesting now: void initBuiltins() { BuiltinRelation *temp; temp = new BuiltinRelation(-1, EnterSymbol("set_break"), BreakSolver); temp = new BuiltinSetRelation(-1, MemTupleSymbol, MemTupleSolver); temp = new BuiltinRelation(-1, HelpSymbol, HelpSolver); temp = new BuiltinRelation(1, ConsultSymbol, ConsultSolver); temp = new BuiltinRelation(-1, DisplaySymbol, PrintSolver); !look at this grep: global/CoralInit.C: initBuiltins(); // Initialize builtin database. BuiltinDB.name = EnterSymbol("builtin_ws"); initBuiltins(); CurDB = &BuiltinDB; So, to add a builtin, this is another place to go. Now, here is a db // Create default database DatabaseStruct *db = createDB(EnterSymbol(COR_DEFAULT_DB)); guess i can put another field in this DatabaseStruct to store suspended queries. When a query(rule eval) is suspended, the rule, iterator, local vars are stored in this structure called *"SuspendedEvaluation"*. When the builtin pred "resume(id)" is given, it will call the SemiNaive::evaluate with all the appropriate value reset to SuspEval(short for "SuspendedEvaluation"), which is deleted itself. Let's look at the Query yes/no/all part . In BindingSolver: BindEnv* BindingsPrintSolver(BuiltinRelation& , TupleIterator& iterator) { // ... print out the args if (exEnv.C_interactive_mode_current) user_continue(); return iterator.bindenv; } and user_continue: static void user_continue() { char input[100]; fprintf(exEnv.error_file, "\t... next answer ? (y/n/all)[y]"); gets(input); // CHNG::wX if (input[0] == 0) gets(input); if ((input[0] == 'n') || (input[0] == 'N')) { // prevent any further computation exEnv.C_interrupt_raised = -1; } else if ((input[0] == 'a') || (input[0] == 'A')) { // prevent further answer prompting exEnv.C_interactive_mode_current = 0; } } Aha! for suspension, i will save the query data here, and treat it as 'n'. Since evaluation data are not available here, i might need a global signal so that SemiNaive::evaluate knows to save stuff. guess i can test this signal right here: (in SemiNaive::evaluate) // Continue with the nested loops join current = rInfo_num_literals -1; backtrack_index = successBacktrack; direction = BACKWARD; continue; i didn't see code to handle exEnv.C_interrupt_raised. it does not matter. the control has to come back here after a new relation is inserted. /* this is how to insert something(n for name, r is a pointer) to a table(RelationTable) which is created by AllocTab */ void InsertBuiltin(Name n, Relation *r) { if (!BuiltinDB.RelationTable) BuiltinDB.RelationTable = AllocTab(COR_RELATION_TABLE_INCR, NULL); Insert(BuiltinDB.RelationTable, n, r); } /* this is how to lookup a symbol in a table.DatabaseTable is the global for all the databases. each has a table of all the relations. */ DatabaseStruct *lookupDB(Name name) { Association *ptr; if (DatabaseTable == NULL) DatabaseTable = AllocTab(COR_DATABASE_TABLE_INCR, NULL); ptr = SymbolLookup(DatabaseTable, name); if (HashNone(ptr->arg)) return 0; else return (DatabaseStruct *)(ptr->val); } ==================== global variables, it is hard to find what are their types, unless you do grep. Globals.h extern ParserStruct parserStruct; extern ExecutionEnv exEnv; extern ParserEnv parseEnv; Globals.C ParserStruct parserStruct; ExecutionEnv exEnv; ParserEnv parseEnv; =======================