The VSEP SPEC

Section 0: Terminology

The word "tree" is used in this document to mean the "inner representation of
everything associated with an expression". This is because VSEP uses a binary
tree structure to represent expressions.

Section 1: Parsing

VSEP expressions have a very simple syntax. Below are some valid VSEP
expressions. Note that any occurrence of ";" below signifies the start of a
comment, but that VSEP expressions do not have any commenting features (yet).

    foo
    foo + bar               ; Note that there must be whitespace both left and
                            ; right of the '+'. This is for readability.
    (foo)
    'test'                  ; In this case, the string delimiter is "'". The
                            ; string delimiter character is, however,
                            ; configurable.
    foo + 1
    foo + -1.000e1
    foo(bar,1,'test')
    (1 + foo) - (foo + 1)
    (((foo)))

However, these are only examples. Below is a (relatively complex) definition
of the (relatively simple) VSEP expression.

The following defines expr, an expression that is parsable by VSEP. The syntax
used to define expr has some similarities to the syntax of regular
expressions. It has the following rules:

a) When several definitions share the same left-hand value, the left-hand
   value matches any one of the right-hand values. Therefore if a = b and
   a = c then a matches either b or c.
b) \w? means zero or more whitespaces, \w. means one whitespace.
c) The characters '(' and ')' represent the '(' and ')' characters
   respectively.
d) Recursion is allowed. Hence foo = (foo) and foo = 1 would mean foo could
   be 1 or (1) or ((1)) or (((1))) etc.
e) The "'" character represents the string delimiter character, which is
   configurable.
f) Anything after the ';' character is a comment.

expr = strtol       ; where strtol is a string parsable by strtol
expr = strtod       ; where strtod is a string parsable by strtod
expr = 'string'     ; where string is any sequence of characters in which any
                    ; string delimiter is preceded by an escape character.
The default string delimiter is '
The default escape character is |
These can be changed by passing vsep_parse a pointer to a vsep_tree_context
struct whose fields string_delimiter and escape are set to the desired string
delimiter and escape character respectively. Any character except digits,
whitespaces, brackets, and the '.' and ',' characters can be used as the
string delimiter. Any character can be used as the escape character, provided
that it is different from the string delimiter character.

expr = variable     ; where variable is any sequence of characters that does
                    ; not contain whitespaces, operators or string delimiters.
                    ; The first character of variable may be neither a digit
                    ; nor a '.' character.
expr = \w?expr\w?   ; By extension, any amount of padding with whitespaces is
                    ; fine.
expr = variable(function_arguments)     ; function_arguments is defined below.

function_arguments = \w?                        ; empty function arguments are allowed
function_arguments = expr
function_arguments = \w?function_arguments\w?
function_arguments = function_arguments,expr    ; hence function_arguments matches expr,expr,expr

expr = (expr)
expr = (expr\w.operator\w.expr)     ; where operator is one of the operators
                                    ; described below.

The operator characters are: =+-*/><%&|^

Parsing anything else should cause VSEP to give an error. Note that it is a
bug if an invalid string is parsed without an error.

Section 2: VSEP Types

VSEP supports the following types. The code on the left (e.g. VSEP_TYPE_INT)
is the value of the "type" field of the vsep_data C struct for that type.
Hence if one has vsep_data* d and d->type == VSEP_TYPE_INT, then d->data is a
pointer to a long int.

The types are:

Code                Description
VSEP_TYPE_NULL      NULL, represents missing data or an error. Any meaningful
                    operations on NULL should return NULL. However, foo | NULL
                    should return true if foo exists. foo & NULL should
                    return 0.
VSEP_TYPE_INT       Integer. Internally represented by a "long int".
VSEP_TYPE_STRING    String. Internally represented by char*.
VSEP_TYPE_FLOAT     Floating point. Internally represented by a "double".

Section 3: Operators

VSEP has no operator precedence. (a | b | c) is parsed as (a | (b | c)).

For all numeric operators involving both an integer and a floating point
number, the integer is converted to a floating point number and the
calculation is then performed as if there were two floating point numbers.

VSEP checks for overflow before doing calculations. Hence for integer
calculations that result in overflow, or for division by zero, the result of
the calculation is NULL.

For any operator that does not act on strings, passing a string as an lvalue
or rvalue results in NULL.

For the following operators, if NULL is an operand, NULL is also the result of
the operation: =+-*/><%
For the other operators, NULL is equivalent to 0.

Character  Name            Description
=          Comparison      For numbers, the C == operator is used.
                           For strings, !strcmp(first,second).
                           Do not mix types.
+          Addition        For two numbers, the C '+' operator is used.
                           Overflow with two integer types results in NULL.
                           Mixing numbers with strings causes the number to be
                           represented as a string. The string representation
                           of the number is then concatenated with the string.
                           Number->string conversion has the following rules:
                             Preceding '-' character only for negative numbers.
                             Integers are converted using sprintf's %ld.
                             Finite floating point numbers are converted using
                             sprintf's %1.6f, i.e. there are 6 digits after
                             the decimal point and at least one digit before
                             the decimal point.
                             Nonfinite floating point numbers are represented
                             as "nan", "inf" and "-inf".
                           For two strings, the two strings are concatenated.
-          Subtraction     For numbers only. The C '-' operator is used.
*          Multiplication  For numbers only. The C '*' operator is used.
                           See section on overflow above.
/          Division        For numbers only. The C '/' operator is used.
                           See section on overflow above.
>          Greater than    For numbers only. The C '>' operator is used.
<          Less than       For numbers only. The C '<' operator is used.
%          Modulus         For integers only. The C '%' operator is used.
&          Logical AND     Lazy evaluation is supported. For strings,
                           strlen(string) is used as the operand.
|          Logical OR      Works like the Logical AND operator, but does
                           logical OR.
^          Logical XOR     Does logical XOR.

There is no built-in way to override the behaviour of these operators. Anyone
wishing to do so should fork libvsep, as operators do not exist to be
overloaded. The supported way to "overload" operators is to provide functions.
For example, if someone wishes to change the way the + operator works, they
should stop using it and create an "add" function instead.

Section 4: Variables and Functions

VSEP gets the values for the variables and functions in an expression through
C callback functions provided to it by vsep_tree_add_callback. It calls each
of these callbacks, with the arguments described below, in the reverse of the
order in which the callbacks were added, until one of them returns something
that is not the C pointer NULL.

By default, the vsep_std callback function is added to every tree. This
function provides functions such as "!" (Logical NOT) and variables such as
"NULL", a VSEP variable of type VSEP_TYPE_NULL.

Note that, very importantly, VSEP will only fetch variables *once*, whereas
functions are fetched as many times as the function appears. Hence, if one has
a variable called "rand" whose associated callback function provides a random
number, the expression "rand = rand" will certainly return true. However,
"rand() = rand()" will probably not, if "rand()" is implemented the same way
as "rand" was earlier.

Note that, importantly, "rand" and "rand()" are different. The callback
function that provides rand may be a different one from that which provides
rand().

Callback functions should return a pointer to a vsep_data type allocated using
one of the vsep_data_new* functions.
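
The lookup order can be illustrated with a small self-contained model. This is only a sketch and not libvsep code: model_data, model_callback, model_chain, chain_add and chain_fetch are hypothetical stand-ins, and the real declarations of vsep_data and vsep_tree_add_callback are not reproduced here. The callbacks use the four-argument shape that this section describes (name, argc, argv, context), where an argc of -1 marks a variable request. For brevity the model returns pointers to static storage, whereas real callbacks should return data allocated with one of the vsep_data_new* functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for vsep_data; the real struct is not shown here. */
typedef struct { long value; } model_data;

/* Same four-argument shape as a VSEP callback: name, argc, argv, context. */
typedef model_data *(*model_callback)(const char *name, int argc,
                                      model_data **argv, void *context);

#define MODEL_MAX_CALLBACKS 8

typedef struct {
    model_callback callbacks[MODEL_MAX_CALLBACKS];
    size_t count;
} model_chain;

/* Append a callback; callbacks added later are consulted first. */
static void chain_add(model_chain *chain, model_callback cb)
{
    assert(chain->count < MODEL_MAX_CALLBACKS);
    chain->callbacks[chain->count++] = cb;
}

/* Ask the callbacks in reverse order of addition and stop at the first one
 * that returns something other than C NULL. */
static model_data *chain_fetch(const model_chain *chain, const char *name,
                               int argc, model_data **argv)
{
    for (size_t i = chain->count; i > 0; i--) {
        model_data *result = chain->callbacks[i - 1](name, argc, argv, NULL);
        if (result != NULL)
            return result;
    }
    return NULL;
}

/* Provides the variables "foo" (1) and "baz" (7). */
static model_data *first_callback(const char *name, int argc,
                                  model_data **argv, void *context)
{
    static model_data foo = { 1 };
    static model_data baz = { 7 };
    (void)argv;
    (void)context;
    if (argc != -1)
        return NULL;                 /* this callback has no functions */
    if (strcmp(name, "foo") == 0)
        return &foo;
    if (strcmp(name, "baz") == 0)
        return &baz;
    return NULL;
}

/* Provides the variable "foo" (2) and the zero-argument function "foo()" (20).
 * The variable and the function are distinguished only by argc. */
static model_data *second_callback(const char *name, int argc,
                                   model_data **argv, void *context)
{
    static model_data foo_variable = { 2 };
    static model_data foo_function = { 20 };
    (void)argv;
    (void)context;
    if (strcmp(name, "foo") != 0)
        return NULL;
    if (argc == -1)
        return &foo_variable;
    if (argc == 0)
        return &foo_function;
    return NULL;
}
```

If first_callback is added before second_callback, a request for the variable "foo" is answered by second_callback, a request for "baz" falls through to first_callback, and a request that no callback answers yields C NULL.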
The callback functions should take 4 arguments. These are:

Argument #  Name     Type         Description
1           name     const char*  The name of the variable or function that is
                                  being requested.
2           argc     int          If a variable is being requested, this is
                                  -1. Otherwise, it is the number of function
                                  arguments for the function being requested:
                                  0 for 0 arguments, 1 for 1 argument, etc.
3           argv     vsep_data**  The function arguments for the function
                                  being requested. This is undefined if a
                                  variable is being requested. argv[0] is a
                                  pointer to the first function argument,
                                  argv[1] to the second, etc. Do not try to
                                  access argv[i] for any i >= argc.
4           context  void*        A user-defined pointer for this particular
                                  callback function, set by calling
                                  vsep_tree_set_callback_context.

To remove the callback function added last to the tree, call
vsep_tree_pop_callback.

Avoid giving your callbacks names beginning with '_', as this may result in
them being called when you do not expect it.

The function "_init" is requested from any callback function when the
associated "context" pointer is set to a non-NULL value.
The function "_terminate" is requested when the context pointer is unset,
including before it is changed.
The function "_terminate" is also requested when the tree is freed.
These functions are to help callback functions with memory management. VSEP
ignores their return value, hence they should probably return C NULL.

Section 5: Laziness and Optimisation

VSEP is lazy. This means it parses the expression in a way that requires as
little processing as possible.

To help VSEP be lazy, VSEP allows you to specify a "difficulty function" for
every callback function you provide. This function is of the form

    int difficulty_function(const char*,int)

and is passed to VSEP as the second argument of vsep_tree_add_callback.
The first parameter of difficulty_function is the callback name.
The second parameter is 1 if the callback is a variable and 0 if it is a
function.
The larger the return value of difficulty_function, the more VSEP tries to
avoid getting that particular callback. Return values may not be negative,
except for the case below. See Section 6 for side-effects of the difficulty
function returning 0. If no difficulty function returns a nonzero value, VSEP
tries to avoid getting that function or variable by setting the difficulty to
INT_MAX.

Importantly, if the associated callback function for a given difficulty
function does not provide the variable or function of the type being requested
of it, the difficulty function must return -1.

For example, consider the following: there exist a variable foo and a function
bar() within an expression (foo & bar()). If difficulty_function("foo",1) is
greater than difficulty_function("bar",0), then the function "bar()" will be
fetched first. If bar() is 0, NULL, or a 0-length string, then foo will not be
fetched at all, due to the way VSEP lazily evaluates the expression.
If the variable "baz" does not exist, difficulty_function("baz",1) must
return -1.

Section 6: Memory management and Tree Reconstruction

To avoid having to parse and evaluate a given VSEP expression from scratch
again, VSEP allows you to "reconstruct" an expression after it has been
evaluated, so that you can re-evaluate it, perhaps this time with different
return values from the callbacks. The relevant function for this is
vsep_tree_reconstruct.

Note that variables and functions with a difficulty of 0 are not
reconstructed. The way VSEP does this is best illustrated with an example:

    Let "foo" be a variable of difficulty 0 that initially has value 2.
    Let "bar" be a variable of difficulty 1 that initially has value 3.
    Parsing ((foo + 1) + bar), and solving, will result in 6.
    If the expression is then reconstructed, it will reconstruct to 3 + bar.

Of course, if the expression is then evaluated again but with foo as 5, this
change will not be reflected in the answer, as foo is never fetched.
However, note that bar is refetched the second time, as it has a non-zero
difficulty.
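
The example above can be modelled in a few lines of C. This is a minimal sketch under stated assumptions, not the way vsep_tree_reconstruct is actually implemented: the node type and the evaluate, reconstruct and reconstruct_demo functions are hypothetical, only addition is supported, and a variable's "callback" is modelled as a read through a pointer. After an evaluation, every difficulty-0 variable keeps the value it was given, and any addition whose operands have both become constants is folded into a constant as well.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of an expression tree; not libvsep's real layout. */
typedef enum { NODE_CONST, NODE_VAR, NODE_ADD } node_kind;

typedef struct node {
    node_kind kind;
    long value;          /* constant, or the value cached by evaluate() */
    const long *source;  /* NODE_VAR: where the modelled callback reads from */
    int difficulty;      /* NODE_VAR: difficulty of fetching the variable */
    int fetched;         /* NODE_VAR: set once evaluate() has fetched it */
    struct node *left;   /* NODE_ADD operands */
    struct node *right;
} node;

/* Evaluate the tree, fetching every variable that is still a NODE_VAR. */
static long evaluate(node *n)
{
    switch (n->kind) {
    case NODE_CONST:
        break;
    case NODE_VAR:
        n->value = *n->source;
        n->fetched = 1;
        break;
    case NODE_ADD:
        n->value = evaluate(n->left) + evaluate(n->right);
        break;
    }
    return n->value;
}

/* Fold everything that depends only on difficulty-0 variables into
 * constants. Returns 1 if the node is now a constant, 0 otherwise. */
static int reconstruct(node *n)
{
    if (n->kind == NODE_CONST)
        return 1;
    if (n->kind == NODE_VAR) {
        if (n->difficulty != 0)
            return 0;            /* will be fetched again next time */
        assert(n->fetched);      /* reconstruct only after an evaluation */
        n->kind = NODE_CONST;    /* keep the value fetched earlier */
        return 1;
    }
    int left_const = reconstruct(n->left);
    int right_const = reconstruct(n->right);
    if (left_const && right_const) {
        n->value = n->left->value + n->right->value;
        n->kind = NODE_CONST;
        return 1;
    }
    return 0;
}

/* ((foo + 1) + bar) with foo = 2 (difficulty 0) and bar = 3 (difficulty 1).
 * Returns the result of the second evaluation, after foo and bar change. */
static long reconstruct_demo(void)
{
    long foo = 2, bar = 3;
    node n_foo = { .kind = NODE_VAR, .source = &foo, .difficulty = 0 };
    node n_one = { .kind = NODE_CONST, .value = 1 };
    node n_bar = { .kind = NODE_VAR, .source = &bar, .difficulty = 1 };
    node n_inner = { .kind = NODE_ADD, .left = &n_foo, .right = &n_one };
    node n_root = { .kind = NODE_ADD, .left = &n_inner, .right = &n_bar };

    evaluate(&n_root);      /* 6: both variables are fetched */
    reconstruct(&n_root);   /* the tree is now equivalent to 3 + bar */
    foo = 5;                /* ignored: foo is never fetched again */
    bar = 10;               /* picked up: bar has non-zero difficulty */
    return evaluate(&n_root);
}
```

In this model reconstruct_demo() returns 13, that is 3 + 10, rather than the 16 that a full re-evaluation with foo as 5 would give.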