This part of the manual is a somewhat in-depth explanation of the Yacas programming language and environment. It assumes that you have worked through the introductory tutorial. You should consult the function reference about how to use the various Yacas functions mentioned here.
Generally, all core functions have plain names and almost all are not "bodied" or infix operators. The file corefunctions.h in the source tree lists declarations of all kernel functions callable from Yacas; consult it for reference. For many of the core functions, the script library already provides convenient aliases. For instance, the addition operator "+" is defined in the script scripts/standard while the actual addition of numbers is performed through the built-in function MathAdd.
There is one exception to the strategy of delayed loading of the library scripts. Namely, the syntax definitions of infix, prefix, postfix and bodied functions, such as Infix("*",4), cannot be delayed (they are currently in the file stdopers.ys). If they were delayed, the Yacas parser would encounter 1+2 (typed by the user) and generate a syntax error before it had a chance to load the definition of the operator "+".
The type of an object is returned by the built-in function Type, for example:
    In> Type(a);
    Out> "";
    In> Type(F(x));
    Out> "F";
    In> Type(x+y);
    Out> "+";
    In> Type({1,2,3});
    Out> "List";
Internally, atoms are stored as strings and compounds as lists. (The Yacas lexical analyzer is case-sensitive, so List and list are different atoms.) The functions String() and Atom() convert between atoms and strings. A Yacas list {1,2,3} is internally a list (List 1 2 3) which is the same as a function call List(1,2,3) and for this reason the "type" of a list is the string "List". During evaluation, atoms can be interpreted as numbers, or as variables that may be bound to some value, while compounds are interpreted as function calls.
Note that atoms that result from an Atom() call may be invalid and never evaluate to anything. For example, Atom("3X") is an atom with string representation "3X" but with no other properties.
Currently, no other lowest-level objects are provided by the core engine besides numbers, atoms, strings, and lists. There is, however, a possibility to link some externally compiled code that will provide additional types of objects. Those will be available in Yacas as "generic objects." For example, fixed-size arrays are implemented in this way.
Internally, all expressions are either atoms or lists (perhaps nested). Use FullForm() to see the internal form of an expression. A Yacas list expression written as {a, b} is represented internally as (List a b), equivalently to a function call List(a,b).
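As a small illustration of the internal form, the sketch below asks FullForm() to print two expressions. The printed forms in the comments follow the description above; exact spacing may differ between Yacas versions.

```yacas
/* Print the internal (prefix) form of an infix expression.
   Expected to print something like (+ a (* b c )). */
FullForm(a+b*c);

/* A list literal is stored as a call to List.
   Expected to print something like (List a b ). */
FullForm({a, b});
```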
Evaluation of an atom goes as follows: if the atom is bound locally as a variable, the object it is bound to is returned, otherwise, if it is bound as a global variable then that is returned. Otherwise, the atom is returned unevaluated. Note that if an atom is bound to an expression, that expression is considered as final and is not evaluated again.
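The "bound value is final" rule can be tried out with a short sketch like the following. It assumes a fresh session in which a and b are not already bound; the results in the comments are what the rule above predicts, not captured output.

```yacas
a := b;     // b is unbound here, so a is bound to the atom b
b := 5;     // b gets a value only afterwards
a;          // per the rule above this should return b, not 5
Eval(a);    // an explicit Eval evaluates the stored value once more: 5
```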
Internal lists of atoms are generally interpreted in the following way: the first atom of the list is some command, and the atoms following in the list are considered the arguments. The engine first tries to find out if it is a built-in command (core function). In that case, the function is executed. Otherwise, it could be a user-defined function (with a "rule database"), and in that case the rules from the database are applied to it. If none of the rules are applicable, or if no rules are defined for it, the object is returned unevaluated.
Application of a rule to an expression transforms it into a different expression to which other rules may be applicable. Transformation by matching rules continues until no more rules are applicable, or until a "terminating" rule is encountered. A "terminating" rule is one that returns Hold() or UnList() of some expression. Calling these functions gives an unevaluated expression because it terminates the process of evaluation itself.
The main properties of this scheme are the following. When objects are assigned to variables, they generally are evaluated (except if you are using the Hold() function) because assignment var := value is really a function call to Set(var, value) and this function evaluates its second argument (but not its first argument). When referencing that variable again, the object which is its value will not be re-evaluated. Also, the default behavior of the engine is to return the original expression if it could not be evaluated. This is a desired behavior if evaluation is used for simplifying expressions.
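A minimal sketch of these properties, assuming a fresh session in which x and y are unbound and no function named nosuchfn (a hypothetical name) has been defined:

```yacas
x := 1+1;         // Set evaluates its second argument, so x is bound to 2
y := Hold(1+1);   // Hold suppresses that evaluation, so y is bound to 1+1
y;                // the stored expression is returned as is, not re-evaluated
Eval(y);          // forces evaluation of the stored expression
nosuchfn(3);      // no rules apply, so the call is returned unchanged
```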
One major design flaw in Yacas (one that other functional languages like LISP also have) is that when some expression is re-evaluated in another environment, the local variables contained in the expression to be evaluated might have a different meaning. In this case it might be useful to use the functions LocalSymbols and TemplateFunction. Calling
    LocalSymbols(a,b) a*b;
replaces a and b in the body with uniquely generated symbols that cannot clash with variables used elsewhere.
Consider the following example:
    In> f1(x):=Apply("+",{x,x});
    Out> True
The function f1 simply adds its argument to itself. Now calling this function with some argument:
    In> f1(Sin(a))
    Out> 2*Sin(a)
yields the expected result. However, if we pass as an argument an expression containing the variable x, things go wrong:
    In> f1(Sin(x))
    Out> 2*Sin(Sin(x))
This happens because within the function, x is bound to Sin(x), and since it is passed as an argument to Apply it will be re-evaluated, resulting in Sin(Sin(x)). TemplateFunction solves this by making sure the arguments cannot collide like this (by using LocalSymbols):
    In> TemplateFunction("f2",{x}) Apply("+",{x,x});
    Out> True
    In> f2(Sin(a))
    Out> 2*Sin(a)
    In> f2(Sin(x))
    Out> 2*Sin(x)
In general one has to be careful when functions like Apply, Map or Eval (or derivatives) are used.
A function is identified by its name as returned by Type and the number of arguments, or "arity". The same name can be used with different arities to define different functions: f(x) is said to "have arity 1" and f(x,y) has arity 2. Each of these functions may possess its own set of specific rules, which we shall call a "rule database" of a function.
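To make the role of arity concrete, here is a small sketch using a hypothetical function name myfn (not part of the standard scripts). The two definitions live in separate rule databases and do not interfere with each other.

```yacas
myfn(x) := x^2;       // rule database for myfn with arity 1
myfn(x, y) := x + y;  // a separate rule database for myfn with arity 2

myfn(3);              // uses the arity-1 definition: 9
myfn(3, 4);           // uses the arity-2 definition: 7
myfn(1, 2, 3);        // no rule database for arity 3: returned unevaluated
```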
Each function should be first declared with the built-in command RuleBase as follows:
    RuleBase("FunctionName",{argument list});
So, a new (and empty) rule database for f(x,y) could be created by typing RuleBase("f",{x,y}). The names for the arguments "x" and "y" here are arbitrary, but they will be globally stored and must be later used in descriptions of particular rules for the function f. After the new rulebase declaration, the evaluation engine of Yacas will begin to really recognize f as a function, even though no function body or equivalently no rules have been defined for it yet.
The shorthand operator := for creating user functions that we illustrated in the tutorial is actually defined in the scripts and it makes the requisite call to the RuleBase() function. After a RuleBase() call you can specify parsing properties for the function; for example, you could make it an infix or bodied operator.
Now we can add some rules to the rule database for a function. A rule simply states that if a specific function object with a specific arity is encountered in an expression and if a certain predicate is true, then Yacas should replace this function with some other expression. To tell Yacas about a new rule you can use the built-in Rule command. This command is what does the real work for the somewhat more aesthetically pleasing ... # ... <-- ... construct we have seen in the tutorial. You do not have to call RuleBase() explicitly if you use that construct.
Here is the general syntax for a Rule() call:
    Rule("foo", arity, precedence, pred) body;
All rules for a given function can be erased with a call to Retract(funcname, arity). This is useful, for instance, when too many rules have been entered in the interactive mode. This call undefines the function and also invalidates the RuleBase declaration.
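A short sketch of Retract() in use, with a hypothetical function mysq. The comments state the behavior described above rather than captured output.

```yacas
mysq(x) := x*x;       // declares the rule database and adds one rule
mysq(4);              // the rule applies: 16

Retract("mysq", 1);   // erase all rules for mysq with arity 1
mysq(4);              // no longer defined: returned as mysq(4)
```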
You can specify that function arguments are not evaluated before they are bound to the parameter: HoldArg("foo",a) would then declare that the a arguments in both foo(a) and foo(a,b) should not be evaluated before bound to a. Here the argument name a should be the same as that used in the RuleBase() call when declaring these functions. Inhibiting evaluation of certain arguments is useful for procedures performing actions based partly on a variable in the expression, such as integration, differentiation, looping, etc., and will be typically used for functions that are algorithmic and procedural by nature.
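The effect of HoldArg() can be sketched with a hypothetical one-argument function Quoted that simply returns its parameter. This assumes the calling sequence described above (RuleBase first, then HoldArg with the same argument name, then the rule).

```yacas
RuleBase("Quoted", {a});     // declare Quoted with one argument named a
HoldArg("Quoted", a);        // do not evaluate that argument before binding
Rule("Quoted", 1, 0, True) a;

Quoted(1+1);   // the argument is bound unevaluated, so this should give 1+1
```

Without the HoldArg() call, the argument would be evaluated first and the same call would be expected to return 2.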
Rule-based programming normally makes heavy use of recursion and it is important to control the order in which replacement rules are to be applied. For this purpose, each rule is given a precedence. Precedences go from low to high, so all rules with precedence 0 will be tried before any rule with precedence 1.
You can assign several rules to one and the same function, as long as some of the predicates differ. If none of the predicates are true, the function is returned with its arguments evaluated.
This scheme is slightly slower for ordinary functions that just have one rule (with the predicate True), but it is a desired behavior for symbolic manipulation. You can gradually build up your own functions, incrementally testing their properties.
    In> RuleBase("f",{n});
    Out> True;
    In> Rule("f", 1, 10, n=0) 1;
    Out> True;
    In> Rule("f", 1, 20, IsInteger(n) And n>0) n*f(n-1);
    Out> True;
This definition is entirely equivalent to the one in the tutorial. f(4) should now return 24, while f(a) should return just f(a) if a is not bound to any value.
The Rule commands in this example specified two rules for function f with arity 1: one rule with precedence 10 and predicate n=0, and another with precedence 20 and the predicate that returns True only if n is a positive integer. Rules with lowest precedence get evaluated first, so the rule with precedence 10 will be tried before the rule with precedence 20. Note that the predicates and the body use the name "n" declared by the RuleBase() call.
After declaring RuleBase() for a function, you could tell the parser to treat this function as a postfix operator:
    In> Postfix("f");
    Out> True;
    In> 4 f;
    Out> 24;
There is already a function Function defined in the standard scripts that allows you to construct simple functions. An example would be
    Function("FirstOf", {list}) list[1];
which simply returns the first element of a list. This could also have been written as
    Function("FirstOf", {list}) [ list[1]; ];
As mentioned before, the brackets [ ] are also used to combine multiple operations to be performed one after the other. The result of the last performed action is returned.
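For instance, a block with several statements might look like the sketch below. SumSq is a hypothetical helper, not part of the standard scripts.

```yacas
Function("SumSq", {a, b})
[
  Local(s);          // s is visible only inside this block
  s := a^2 + b^2;    // first operation
  s;                 // last expression: its value is the value of the block
];

SumSq(3, 4);         // should give 25
```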
Finally, the function FirstOf could also have been defined by typing
    FirstOf(list):=list[1];
    Function("ForEach",{foreachitem, foreachlist, foreachbody})
    [
      Local(foreachi,foreachlen);
      foreachlen:=Length(foreachlist);
      foreachi:=0;
      While (foreachi < foreachlen)
      [
        foreachi++;
        MacroLocal(foreachitem);
        MacroSet(foreachitem, foreachlist[foreachi]);
        Eval(foreachbody);
      ];
    ];
    Bodied("ForEach");
    UnFence("ForEach",3);
    HoldArg("ForEach",foreachitem);
    HoldArg("ForEach",foreachbody);
Functions like this should probably be defined in a separate file. You can load such a file with the command Load("file"). This is an example of a macro-like function. Let's first look at the last few lines. There is a Bodied(...) call, which states that the syntax for the function ForEach() is ForEach(item,{list}) body; -- that is, the last argument to the command ForEach should be outside its brackets. UnFence(...) states that this function can use the local variables of the calling function. This is necessary, since the body to be evaluated for each item will probably use some local variables from the surrounding scope.
Finally, HoldArg("function",argument) specifies that the argument "argument" should not be evaluated before being bound to that variable. This holds for foreachitem and foreachbody, since foreachitem specifies a variable to be set to that value, and foreachbody is the expression that should be evaluated after that variable is set.
Inside the body of the function definition there are calls to Local(...). Local() declares some local variable that will only be visible within a block [ ... ]. The command MacroLocal() works almost the same. The difference is that it evaluates its arguments before performing the action on it. This is needed in this case, because the variable foreachitem is bound to a variable to be used as the loop iterator, and it is the variable it is bound to that we want to make local, not foreachitem itself. MacroSet() works similarly: it does the same as Set() except that it also first evaluates the first argument, thus setting the variable requested by the user of this function. The Macro... functions in the built-in functions generally perform the same action as their non-macro versions, apart from evaluating an argument it would otherwise not evaluate.
To see the function in action, you could type:
    ForEach(i,{1,2,3}) [Write(i); NewLine();];
Note: the variable names "foreach..." have been chosen so they won't get confused with normal variables you use. This is a major design flaw in this language. Suppose there was a local variable foreachitem, defined in the calling function, and used in foreachbody. These two would collide, and the interpreter would use only the last defined version. In general, when writing a function that calls Eval(), it is a good idea to use variable names that can not collide with user's variables. This is generally the single largest cause of bugs when writing programs in Yacas. This issue should be addressed in the future.
    While (i < 10) [
      Write(i);
      i:=i+1;
    ];
This scheme allows coding the algorithms in an almost C-like syntax.
Strings are generally represented with quotes around them, e.g. "this is a string". Backslash \ in a string will unconditionally add the next character to the string, so a quote can be added with \" (a backslash-quote sequence).
This is accomplished using functions MacroRuleBase, MacroRule, MacroRulePattern. These functions evaluate their arguments (including the rule name, predicate and body) and define the rule that results from this evaluation.
Normal, "non-Macro" calls such as Rule() will not evaluate their arguments and this is a desired feature. For example, suppose we defined a new predicate like this,
    RuleBase("IsIntegerOrString", {x});
    Rule("IsIntegerOrString", 1, 1, True) IsInteger(x) Or IsString(x);
Consider, however, the following situation. Suppose we have a function f(arglist) where arglist is its list of arguments, and suppose we want to define a function Nf(arglist) with the same arguments which will evaluate f(arglist) and return its value only when all arguments from arglist are numbers, and return unevaluated Nf(arglist) otherwise. This can of course be done by a usual rule such as
    Rule("Nf", 3, 0, IsNumericList({x,y,z})) "f" @ {x,y,z};
However, this will have to be done for every function f separately. We would like to define a procedure that will define Nf, given any function f. We would like to use it like this:
    NFunction("Nf", "f", {x,y,z});
Here is how we could naively try to implement NFunction (and fail):
    NFunction(new'name, old'name, arg'list) := [
      MacroRuleBase(new'name, arg'list);
      MacroRule(new'name, Length(arg'list), 0,
        IsNumericList(arg'list)
      ) old'name @ arg'list;
    ];
Now, this just does not do anything remotely right. MacroRule evaluates its arguments. Since arg'list holds a list of symbolic atoms rather than a list of numbers at the time we are defining this, IsNumericList(arg'list) will evaluate to False and the new rule will be defined with a predicate that is always False, i.e. it will never be applied.
The right way to figure this out is to realize that the MacroRule call evaluates all its arguments and passes the results to a Rule call. So we need to see exactly what Rule() call we need to produce and then we need to prepare the arguments of MacroRule so that they evaluate to the right values. The Rule() call we need is something like this:
    Rule("actual new name", <actual # of args>, 0,
      IsNumericList({actual arg list})
    ) "actual old name" @ {actual arg list};
Note that we need to produce expressions such as "old name" @ arg'list and not results of evaluation of these expressions. We can produce these expressions by using UnList(), e.g.
    UnList({Atom("@"), "Sin", {x}})
produces
    "Sin" @ {x};
and
    UnList({IsNumericList, {1,2,x}})
produces the expression
    IsNumericList({1,2,x});
Here is a second version of NFunction() that works:
    NFunction(new'name, old'name, arg'list) := [
      MacroRuleBase(new'name, arg'list);
      MacroRule(new'name, Length(arg'list), 0,
        UnList({IsNumericList, arg'list})
      ) UnList({Atom("@"), old'name, arg'list});
    ];
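A possible way to exercise this version is sketched below. It assumes the working NFunction() definition has been loaded, that x, y, z and a are unbound in the session, and it uses a hypothetical three-argument function f3 purely for illustration. The comments describe the intended behavior, not captured output.

```yacas
f3(x, y, z) := x + y + z;          // hypothetical function to wrap

NFunction("Nf3", "f3", {x, y, z}); // defines Nf3 with the numeric-only rule

Nf3(1, 2, 3);   // all arguments are numbers: should evaluate f3(1,2,3), i.e. 6
Nf3(1, 2, a);   // a is not a number: should be returned as Nf3(1,2,a)
```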
Finally, there is a more concise (but less general) way of defining NFunction() for functions with known number of arguments, using the backquoting mechanism. The backquote operation will first substitute variables in an expression, without evaluating anything else, and then will evaluate the resulting expression a second time. The code for functions of just one variable may look like this:
    N1Function(new'name, old'name) := `( @new'name(x_IsNumber) <-- @old'name(x) );
    In> x:=y
    Out> y;
    In> `(@x:=2)
    Out> 2;
    In> x
    Out> y;
    In> y
    Out> 2;
This is useful in cases where within an expression one sub-expression is not evaluated. For instance, transformation rules can be built dynamically, before being declared. This is a particularly powerful feature that allows a programmer to write programs that write programs. The idea is borrowed from Lisp.
As the above example shows, there are similarities with the Macro... functions, that serve the same purpose for specific expressions. For example, for the above code, one could also have called MacroSet:
    In> MacroSet(x,3)
    Out> True;
    In> x
    Out> y;
    In> y
    Out> 3;
The difference is that MacroSet, and in general the Macro... functions, are faster than their back-quoted counterparts. This is because with back-quoting, first a new expression is built before it is evaluated. The advantages of back-quoting are readability and flexibility (the number of Macro... functions is limited, whereas back-quoting can be used anywhere).
When an @ operator is placed in front of a function call, the function name in that call is replaced by the value of the variable:
    In> plus:=Add
    Out> Add;
    In> `(@plus(1,2,3))
    Out> 6;
Application of pure functions is also possible (as of version 1.0.53) by using macro expansion:
    In> pure:={{a,b},a+b};
    Out> {{a,b},a+b};
    In> ` @pure(2,3);
    Out> 5;
Pure (nameless) functions are useful for declaring a temporary function, that has functionality depending on the current environment it is in, or as a way to call driver functions. In the case of drivers (interfaces to specific functionality), a variable can be bound to a function to be evaluated to perform a specific task. That way several drivers can be around, with one bound to the variables holding the functions that will be called.
When entering a block, a new stack frame is pushed for the local variables; it means that the code inside a block doesn't see the local variables of the caller either! You can tell the interpreter that a function should see local variables of the calling environment; to do this, declare
    UnFence(funcname, arity)
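The difference can be sketched with two hypothetical functions, PeekX and Caller. This assumes that Function() accepts an empty argument list and that x is not bound globally; the comments describe what the fencing rule above predicts.

```yacas
/* PeekX just returns whatever the atom x evaluates to in its own scope. */
Function("PeekX", {}) x;

/* Caller sets a local x and then calls PeekX. */
Function("Caller", {})
[
  Local(x);
  x := 10;
  PeekX();
];

Caller();             // PeekX is fenced: it cannot see Caller's x, so x comes back unevaluated

UnFence("PeekX", 0);  // let PeekX see the caller's local variables
Caller();             // PeekX should now find x bound to 10
```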