Programming your own script language

By: Christian Berger

Abstract: This article describes how to write a simple stack-based script language.


Concept:

First we have to think of a proper concept for such a language. It should be simple to use and to implement, as well as relatively fast. Stack-based languages seem to fit these criteria best.

Stack-based languages:

Unlike languages such as Pascal, BASIC or C, stack-based languages like Forth store their data on a stack instead of in variables. The stack provides fast and simple access to the data without having to go through a table of variable names. It also reduces memory consumption, because memory is reused automatically without any special memory management. In case the stack isn't flexible enough, some languages also let you use variables.

Some people might now ask what a stack is. It's quite easy: imagine a stack of cans in real life. You can only do 3 things: put something on top of the stack, take something off the top of the stack, and look at the top of the stack without taking it. Those actions are usually named push, pop and peek.
There are probably lots of implementations of TStack objects available which do everything a stack needs to do. However, those might introduce quite some overhead.
A simpler way of making a stack is like this:
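VAR stackitems:ARRAY[0..1023] OF variant;
    stackpointer:integer=-1; //-1 means the stack is empty

PROCEDURE push(v:variant);
BEGIN
  IF stackpointer>=High(stackitems) THEN exit; //stack full: the value is dropped
  Inc(stackpointer);
  stackitems[stackpointer]:=v;
END;

FUNCTION peek:variant;
BEGIN
  IF stackpointer<0 THEN peek:=0 //empty stack: return 0 instead of reading outside the array
  ELSE peek:=stackitems[stackpointer];
END;

FUNCTION pop:variant;
BEGIN
  pop:=peek;
  IF stackpointer>=0 THEN dec(stackpointer); //never go below the empty state
END;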

The stack should be able to hold all kinds of values: strings, integers, floats, maybe even objects. The variant seems to be the best type for that, but in case you only want to work with numbers, double, extended or maybe integer might suit you better.
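To try the three stack operations in isolation, here is a minimal sketch of the number-only version mentioned above. It assumes a fixed array of double and the convention that a stack pointer of -1 means "empty"; the program name, the MaxItems constant and the decision to halt on overflow or underflow are my own choices for this illustration.

```pascal
PROGRAM StackDemo;
{$APPTYPE CONSOLE}

CONST MaxItems=1024; //assumed capacity, same size as the array above

VAR stackitems:ARRAY[0..MaxItems-1] OF double;
    stackpointer:integer=-1; //-1 means the stack is empty

PROCEDURE push(v:double);
BEGIN
  IF stackpointer>=MaxItems-1 THEN
  BEGIN
    writeln('stack overflow');
    halt(1);
  END;
  Inc(stackpointer);
  stackitems[stackpointer]:=v;
END;

FUNCTION peek:double;
BEGIN
  IF stackpointer<0 THEN
  BEGIN
    writeln('stack underflow');
    halt(1);
  END;
  peek:=stackitems[stackpointer];
END;

FUNCTION pop:double;
BEGIN
  pop:=peek;          //read the top item (halts if the stack is empty)
  dec(stackpointer);  //then remove it
END;

BEGIN
  push(1);
  push(2.5);
  writeln(pop:0:1);   //2.5 (the item pushed last comes off first)
  writeln(peek:0:1);  //1.0 (peek does not remove the item)
  writeln(pop:0:1);   //1.0
END.
```

Halting on an empty stack makes mistakes in a script visible early; an interpreter meant for end users might prefer to report the error and carry on.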

Here's my suggested list of operators:

 +         : pops 2 values from the stack, adds them and pushes the result back
 -         : pops a value from the stack, then pops another and subtracts the first value from it; the result goes back onto the stack
 *         : like + but multiplies
 /         : like - but divides (optionally you could add div and mod)
 #         : pops a value from the stack and pushes it back twice
 %         : pops 2 values from the stack and pushes them back so that they change places
 >name     : pops a value from the stack and stores it in a variable named name
 <name     : gets the value of the variable named name and pushes it onto the stack
 .         : pops a value from the stack and prints it
 ,         : lets the user enter a value and pushes it onto the stack
Of course you may need more operators, but this is probably a good set for doing basic calculations.

Let's look at what we can already do:
Imagine you wanted to calculate (x*(1+y/100)*z)/(1-f/100)
In Pascal you'd need to write a complicated parser to evaluate such an expression at run time, but with a script language you could simply execute a little script like this:

<x                         (pushes x)
    1                      (pushes  1 onto the stack)
     <y 100 /              (leaves y/100)
    +                      (adds 1 and y/100)
<z                         (pushes z onto the stack)
    *                      (multiplies z and 1+y/100)
*                          (multiplies x and z*(1+y/100))
 1                         (pushes 1 onto the stack)
    <f 100 /               (leaves f/100)
    -                      (subtracts f/100 from 1)
/                          (divides x*z*(1+y/100) by (1-f/100) )

That leaves the result of x*z*(1+y/100)/(1-f/100), which is essentially the same as above, on the stack. With a simple >r it can be stored in another variable. All commands in this script can be split apart with the TStringList.CommaText property, which treats spaces and commas as separators. The interpreter could, for example, look like this:
PROCEDURE interpret(_program:string;VAR _variables:string);
VAR commands:Tstringlist; //"program" is a reserved word, so the list needs another name
    variables:Tstringlist;
    n:integer;
    command,s:string;
    x,y:variant;

  //pushes t as a number if it can be converted, otherwise as a string;
  //without this, + would join two string variants instead of adding them
  PROCEDURE pushvalue(t:string);
  VAR d:double;
      code:integer;
  BEGIN
    Val(t,d,code);
    IF code=0 THEN push(d) ELSE push(t);
  END;

BEGIN
  commands:=Tstringlist.Create;
  TRY
    commands.Commatext:=_program;
    variables:=Tstringlist.Create;
    TRY
      variables.Commatext:=_variables;
      FOR n:=0 TO commands.count-1 DO
      BEGIN
        command:=commands[n];
        IF command<>'' THEN //discard empty commands
        BEGIN
          //only 1 character commands possible right now;
          //a constant like -5 is therefore read as the - command
          CASE command[1] OF
            '+': BEGIN x:=pop; y:=pop; push(x+y); END;
            '-': BEGIN x:=pop; y:=pop; push(y-x); END;
            '*': BEGIN x:=pop; y:=pop; push(x*y); END;
            '/': BEGIN x:=pop; y:=pop; push(y/x); END;
            '#': BEGIN x:=pop; push(x); push(x); END;
            '%': BEGIN x:=pop; y:=pop; push(x); push(y); END;
            '>': BEGIN variables.values[copy(command,2,255)]:=pop; END;
            '<': BEGIN pushvalue(variables.values[copy(command,2,255)]); END;
            '.': BEGIN s:=pop; writeln(s); END;
            ',': BEGIN readln(s); pushvalue(s); END;
            ELSE pushvalue(command); //unknown command: treat it as a constant
          END;
        END;
      END;
      _variables:=variables.Commatext;
    FINALLY
      variables.free;
    END;
  FINALLY
    commands.free;
  END;
END;

This code is of course only a guide; I couldn't even test it since I currently don't have a computer capable of running Delphi. As you see, adding your own commands is easy: just alter the CASE statement. If the interpreter doesn't understand a command, it pushes it onto the stack, which is a simple way to use constants. In this example you can get your values back by looking at the _variables parameter, which contains all the variables in a <name>=<value>,<name... format.
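The interpreter relies on CommaText twice: once to split the script into commands and once to keep the variables as name=value pairs. Both uses can be tried on their own with a small sketch like the following. The program name and the sample values are made up for this illustration, and the $MODE line only matters if you compile with Free Pascal instead of Delphi.

```pascal
PROGRAM CommaTextDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
USES Classes;

VAR list:TStringList;
    n:integer;
BEGIN
  list:=TStringList.Create;
  TRY
    //1. splitting a script into single commands
    list.CommaText:='1 <y 100 / +';
    FOR n:=0 TO list.Count-1 DO
      writeln(n,': ',list[n]);   //one line each for 1, <y, 100, / and +

    //2. keeping variables in name=value form
    list.CommaText:='x=200,y=10';
    list.Values['r']:='220';     //adds a new entry named r
    writeln(list.Values['y']);   //looks up y and prints 10
    writeln(list.CommaText);     //the string handed back in _variables
  FINALLY
    list.Free;
  END;
END.
```

Because spaces and commas both act as separators, a value that contains either of them would have to be quoted, which is worth keeping in mind once scripts start handling strings.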



Possible extensions which are easy to add are:
string functions (pos, copy, insert...)
arithmetic functions (sin, cos, tan, sqr, sqrt...)
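
One possible way to add such arithmetic functions is a small lookup by name, sketched below. The helper applyfunction and its parameters are hypothetical and not part of the interpreter above; an interpreter could call something like it before falling back to treating an unknown command as a constant. Only functions from the System unit are used here, so tan, which lives in the Math unit, is left out.

```pascal
PROGRAM FunctionDemo;
{$APPTYPE CONSOLE}

//Hypothetical helper: applies the function called fname to x.
//Returns false if the name is unknown, so the caller can decide
//what to do with the command instead.
FUNCTION applyfunction(fname:string; x:double; VAR res:double):boolean;
BEGIN
  applyfunction:=true;
  IF fname='sin' THEN res:=sin(x)
  ELSE IF fname='cos' THEN res:=cos(x)
  ELSE IF fname='sqr' THEN res:=sqr(x)
  ELSE IF fname='sqrt' THEN res:=sqrt(x)
  ELSE applyfunction:=false;
END;

VAR r:double;
BEGIN
  IF applyfunction('sqrt',16,r) THEN writeln(r:0:2);  //4.00
  IF applyfunction('sqr',3,r) THEN writeln(r:0:2);    //9.00
  IF NOT applyfunction('foo',1,r) THEN writeln('unknown function');
END.
```

Inside the interpreter the argument would come from pop and the result would go back with push, just like the one-character operators.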

More difficult extensions I plan to discuss in further articles are:
Interaction with objects (create destroy...)
loops, conditions (not as easy as it looks at first)

Hope to see you next time
Christian Berger (Casandro)
