ReplicaNet and RNLobby  1
RNSimpleScriptCompiler

Introduction

RNSimpleScriptCompiler compiles script files and produces binary output. The binary output uses a virtual machine instruction set and is designed to be simple to execute without much internal state needed.

Using the RNSimpleScriptCompiler

The compiler is called RNSimpleScriptCompiler.exe and is in the RNSimpleScriptCompiler directory.

Command line parameters

The compiler accepts options followed by input file(s). If more than one input file is supplied then they are all linked into one output in the input order.
-h[L] Display help [with language]
-o <file> Place the output into "file". If no output file is supplied then the first input file is used as a root name and the extensions bin and bnd are appended.
-I <path> Include files from the supplied path. The compiler will automatically look for "*.lang" files in a "lang" directory near the executable and the source file being parsed.
-vI Output verbose information about include files searched.
-O1 Optimisation level 1. Optimises some common instruction sequences. This also strips unused procedures if the procedure is enclosed in labels beginning with "._startOfSection_" and "._endOfSection_" and there are no external references to the procedure.
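
As an illustration of the option ordering described above (options first, then input files in link order), the following Python sketch assembles a command line for the compiler. It only uses the options documented here. The function name, its parameters and the file names are hypothetical and are not part of the tool.

```python
def build_compiler_command(inputs, output=None, include_paths=(),
                           verbose_includes=False, optimise=False,
                           exe="RNSimpleScriptCompiler.exe"):
    """Return an argument list: options first, then the input files in link order."""
    if not inputs:
        raise ValueError("at least one input file is required")
    cmd = [exe]
    if output is not None:
        cmd += ["-o", output]          # -o <file>
    for path in include_paths:
        cmd += ["-I", path]            # -I <path>, one per include directory
    if verbose_includes:
        cmd.append("-vI")              # verbose include search information
    if optimise:
        cmd.append("-O1")              # optimisation level 1
    cmd += list(inputs)                # inputs are linked in the order given
    return cmd


# Hypothetical file names, used only for illustration.
print(build_compiler_command(["Main.txt", "Helpers.txt"],
                             output="Game", include_paths=["MyLang"],
                             optimise=True))
```

The returned list can be passed to subprocess.run() from a build script.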

The SimpleScript language and using it with development tools

There is a high level language similar to BASIC which is compiled into the virtual machine instruction set. Both the high level language and the machine instruction set are available for use.

The text file parsing includes directives such as:
#Include <filename>
#Define source destination
#pragma once

The high level language includes commands such as:
Command: Data value[,value][,value...]
Stores data in memory either using integer or floating point depending on the type of the data
Command: Decl <variable name>
Declares a variable name either as a global variable or a local stack based variable.
Command: DefProc <procedure name>[(<variable>[,<variable>][,<variable>...])]
Defines a procedure name with an optional list of input variable names used as parameters.
Command: EndIf
Command: EndProc
Command: Goto label
Command: If (value OP value)
Where: value can be a value or a variable. OP can be > >= < <= == !=
Command: Label label
Where: label can be prefixed with a '.' to turn off name decoration
Command: Return [value]
Where value can be a variable or a value

The virtual machine has 32 registers, a program counter and a stack. r0 is always a constant 0. The register name "sl" is special and corresponds to the last register r31.
Command: AsmAdd rd,rx,ry
Command: AsmAddr rd,label or AsmAddr rx,#<int>
Where label is the label name or <int> is an offset from the start of this instruction in memory.
Command: AsmBranch [EQ/NE/LT/GT/LE/GE] [PP] .label
Command: AsmCmp rd,rx
Command: AsmDiv rd,rx,ry
Divide always promotes the result to a float value.
Command: AsmExternal #int
Command: AsmGetFloat rd,#float
Command: AsmGetInt rd,#int
Command: AsmGetNData rd,"string"
Command: AsmLoadReg rd,rx
Where rx is used as the address of the register to load from in memory. Each register is 64 bits long.
Command: AsmMul rd,rx,ry
Command: AsmNOP
Command: AsmPop rd
The sl register will also be decremented by one. If the target register is r0 then this will cause the original pushing register to be updated.
Command: AsmPush rs
The sl register will also be incremented by one.
Command: AsmPushPC #<int>
Where <int> is a value to add to the pushed PC value based from the start of the instruction address
The sl register will also be incremented by one.
Command: AsmRet
The sl register will also be decremented by one.
Command: AsmStoreReg rd,rx
Where rx is used as the address in memory to store the register to. Each register is 64 bits long.
Command: AsmSub rd,rx,ry
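
The interaction between AsmPush, AsmPop and the sl register can be sketched with a small Python model. This is not the virtual machine's implementation. It assumes that each stack entry remembers which register pushed it and that writes to r0 are ignored. The class and method names are hypothetical.

```python
class TinyStackModel:
    """Toy model of AsmPush/AsmPop bookkeeping, based on the notes above."""
    SL = 31  # "sl" is the last register, r31

    def __init__(self):
        self.regs = [0] * 32
        self.stack = []  # assumed layout: (pushing register index, value)

    def set_reg(self, index, value):
        if index != 0:   # r0 is a constant 0 (assumption: writes are ignored)
            self.regs[index] = value

    def push(self, rs):
        """AsmPush rs: store the register and increment sl by one."""
        self.stack.append((rs, self.regs[rs]))
        self.regs[self.SL] += 1

    def pop(self, rd):
        """AsmPop rd: decrement sl by one; r0 restores the pushing register."""
        source, value = self.stack.pop()
        self.regs[self.SL] -= 1
        self.set_reg(source if rd == 0 else rd, value)


vm = TinyStackModel()
vm.set_reg(9, 111)
vm.set_reg(10, 222)
vm.push(9)
vm.push(10)
vm.set_reg(9, 0)             # overwrite the registers inside the "procedure"
vm.set_reg(10, 0)
while vm.regs[vm.SL] > 0:    # pop until the count in sl reaches zero
    vm.pop(0)
print(vm.regs[9], vm.regs[10], vm.regs[vm.SL])
```

Popping into r0 until sl reaches zero restores every register that was pushed, without the code needing to name them again.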



This code demonstrates a procedure (DefProc) with variables, loops (Goto) and conditional expressions (If/EndIf).

DefProc testDoFibonacci(start)
    Decl var1,var2,temp
    var1 = start
    var2 = start
Label loop
    temp = var1 + var2
    var1 = var2
    var2 = temp
    If temp > 1000
        Return var1
    EndIf
    Goto loop
EndProc

Compiling and examining the debug output will show the above code has been translated into code like:

Label ._proctestDoFibonacci
AsmPush r31
AsmAdd r31,r0,r0        // Same as move
AsmPush r9
AsmPush r10
AsmPush r11
AsmAdd r9,r1,r0     // Same as move
AsmAdd r10,r1,r0        // Same as move
Label .inFunc_testDoFibonacciloop
AsmAdd r11,r9,r10
AsmAdd r9,r10,r0        // Same as move
AsmAdd r10,r11,r0       // Same as move
AsmPush r12
AsmGetInt r12,#1000
AsmCmp r11,r12
AsmPop r12
AsmBranch LE inFunc_testDoFibonacciloop
Label .inFunc_testDoFibonaccimdif5l1e
AsmAdd r1,r9,r0     // Same as move
AsmBranch _proc_PopStackFrames
Label .inFunc_testDoFibonaccienif5l1e
AsmBranch inFunc_testDoFibonacciloop
Label ._outprocinFunc_testDoFibonacci
AsmBranch _proc_PopStackFrames

Label ._proc_PopStackFramesPop
AsmPop r0
Label ._proc_PopStackFrames
AsmCmp r31,r0
AsmBranch GT _proc_PopStackFramesPop
AsmPop r31
AsmRet

Note how the Asm prefixed instructions are being used. Note how the function prologue pushes r31 and then resets it to 0, so that r31 maintains a count of the registers pushed onto and popped from the stack during this function. The common function epilogue starts at _proc_PopStackFrames and pops the original registers for the number of values in r31. Also note how r0 is used with AsmAdd. Since r0 is always 0 this is the same as moving the other source register into the destination.
r1 is used as a return value for a procedure and the registers r1, r2, r3... in sequence are used for entry parameters.
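
To cross-check the register usage in the compiled testDoFibonacci listing, the following Python sketch mirrors its data flow on a 32 entry register list. It is only a model: the stack pushes and pops and the shared epilogue are left out, and the function name is hypothetical.

```python
def run_test_do_fibonacci(start):
    """Mirror the register traffic of the compiled testDoFibonacci listing.

    As with the script itself, a start value of 0 or less never terminates.
    """
    r = [0] * 32                 # r[0] stays 0, as guaranteed for r0
    r[1] = start                 # the first entry parameter arrives in r1

    r[9] = r[1] + r[0]           # AsmAdd r9,r1,r0    (move: var1 = start)
    r[10] = r[1] + r[0]          # AsmAdd r10,r1,r0   (move: var2 = start)
    while True:
        r[11] = r[9] + r[10]     # AsmAdd r11,r9,r10  (temp = var1 + var2)
        r[9] = r[10] + r[0]      # AsmAdd r9,r10,r0   (var1 = var2)
        r[10] = r[11] + r[0]     # AsmAdd r10,r11,r0  (var2 = temp)
        if r[11] <= 1000:        # AsmCmp r11,r12 then AsmBranch LE back to the loop
            continue
        r[1] = r[9] + r[0]       # AsmAdd r1,r9,r0    (return value goes in r1)
        return r[1]


print(run_test_do_fibonacci(1))  # 987
```

With a start value of 1 the loop stops when temp reaches 1597, at which point var1 holds 987 and that value is returned in r1.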



Example12 contains Example12Script.txt which demonstrates StatServer specific code.

A sister tool for the compiler is the debugger called RNSimpleScriptDebugger.exe and located in the same directory. This tool will display register and debug information when running a compiled script and can aid in debugging script problems.

Using GCC to compile C++/C for SimpleScript

The SimpleScript compiler can now accept ARM code in GCC assembler format. This means any language supported by GCC ARM builds can be used by SimpleScript.

The SimpleScript compiler will switch to GCC ARM assembler mode when it encounters ".cpu arm2" and will return to normal SimpleScript mode when it encounters ".ident" followed by some quoted text. Since SimpleScript register r0 is always zero, all GCC ARM registers are shifted up by one, so GCC ARM R0 becomes SimpleScript R1, GCC ARM R1 becomes SimpleScript R2 and so on. It is possible to mix SimpleScript and GCC ARM assembler code in the same file, or to include or pass in ARM ASM files. For example:

// Tests ARM ASM interop
AsmBranch RunTest

Label .GCCAValue                    // Setup a SimpleScript label so that GCC can access it.
Data 100

// Some GCC style ARM code
    .cpu arm2
    .section    .text.AnARMSection,"axG",%progbits,_ZN3BarD5Ev,comdat

RunTestARM:
    .fnstart

    mov r1 , #250
    ldr r2 , AValue
    add r0 , r0 , r1
    add r0 , r0 , r2
    bx lr

    .fnend

    .ident  "some text"

// Now back to SimpleScript code due to the above .ident
Label .RunTest
// Standard function prolog
AsmPush sl
AsmAdd sl , r0 , r0

AsmGetInt r1,#1                     // Effectively setup ARM register R0, remember ARM registers start at SimpleScript register R1 since R0 is always 0
AsmAddr r15 , _proc_PopStackFrames  // Effectively point ARM r14 (lr), which is really SimpleScript r15, at the common function epilogue
AsmBranch GCCRunTestARM


#include <FunctionEpilogue.lang>

If the above code is compiled and then executed with the RNSimpleScriptDebugger starting at address 0 then at exit SimpleScript R1 (GCC ARM r0) will be 351, as expected. The SimpleScript code at RunTest sets up the registers and then branches to the GCC ARM code at GCCRunTestARM. Note all GCC ARM labels are automatically prefixed by the compiler with "GCC" to make them obvious. If GCC ARM code needs to access a label outside of the GCC ARM code then prefix that label with "GCC" where it is defined. This is why the code uses a label GCCAValue. The GCC ARM code at RunTestARM is then executed: it moves 250 to GCC ARM r1, loads AValue (GCCAValue) into GCC ARM r2, then calculates r0 = r0 + r1 + r2. Lastly it uses "bx lr" to return, as normal ARM code does. Since LR has been set up to point at the function epilogue, the code will exit normally. Using the debugger the registers can be seen to be:

r 0 =            0         0.000000     r 1 =          351       351.000000
r 2 =          250       250.000000     r 3 =          100       100.000000

The debugger shows SimpleScript register numbering.
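
The renumbering and label prefixing rules can be summarised with a short Python sketch. The helper names are hypothetical, and only r0 to r14 and the lr alias are handled because those are the cases described here.

```python
import re


def arm_to_simplescript_register(name):
    """Map a GCC ARM register name to its SimpleScript register name."""
    name = name.strip().lower()
    if name == "lr":                     # lr is ARM r14
        name = "r14"
    match = re.fullmatch(r"r(\d+)", name)
    if not match or int(match.group(1)) > 14:
        raise ValueError("unsupported register: " + name)
    return "r%d" % (int(match.group(1)) + 1)   # shifted up by one


def gcc_label(name):
    """GCC ARM labels are prefixed with "GCC" by the compiler."""
    return "GCC" + name


# Replay the RunTestARM arithmetic using ARM register names.
arm = {"r0": 1}                          # AsmGetInt r1,#1 sets ARM r0
arm["r1"] = 250                          # mov r1 , #250
arm["r2"] = 100                          # ldr r2 , AValue (Data 100)
arm["r0"] = arm["r0"] + arm["r1"]        # add r0 , r0 , r1
arm["r0"] = arm["r0"] + arm["r2"]        # add r0 , r0 , r2

# Convert to the numbering the debugger displays.
simplescript = {arm_to_simplescript_register(k): v for k, v in arm.items()}
print(simplescript)
```

The converted values match the debugger output above: r1 is 351, r2 is 250 and r3 is 100.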
To use C++ the GNU Tools for ARM Embedded Processors ( https://launchpad.net/gcc-arm-embedded ) can be used. The GCC ARM compiler needs some extra parameters to generate suitable ARM code. These are: -O3 -S -mcpu=arm2 -fno-rtti -fno-exceptions
The -O3 option is optional; it just helps produce more optimal ARM code that results in better SimpleScript execution. The resulting ARM source files can then be compiled by the SimpleScript compiler. By including the files BootStatGCC, Stat and ARMGCCstdlib it is possible to compile ARM code for use with the StatServer system. BootStatGCC includes ARMInterop and this creates a very small and simple ARM environment with some ARM stack register space and also handles calling the GCC global data initialiser functions if needed. ARMGCCstdlib will create a small amount of heap memory and support some memory allocation. In effect the whole output binary file is the whole extent of any virtual machine memory used by SimpleScript. It is even possible to include STL code, but the relevant STL library files must first be compiled by ARM GCC to ASM files and included. For example std::list needs libstdc++-v3\src\c++98\list.cc compiled with the version that is compatible with the compiler.
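
As a sketch of the first step of this workflow, the following Python helper assembles a GCC command line using the parameters listed above. The executable name arm-none-eabi-g++ is an assumption about how the toolchain is installed, -o is the standard GCC output option, and the function name and file names are hypothetical.

```python
def build_gcc_arm_command(source, asm_output, optimise=True,
                          compiler="arm-none-eabi-g++"):
    """Return an argument list that compiles C++ to ARM assembler text."""
    cmd = [compiler]
    if optimise:
        cmd.append("-O3")                # optional, gives tighter ARM code
    cmd += ["-S",                        # stop after generating assembler
            "-mcpu=arm2",
            "-fno-rtti",
            "-fno-exceptions",
            source,
            "-o", asm_output]
    return cmd


# Hypothetical file names, used only for illustration.
print(" ".join(build_gcc_arm_command("MyStats.cpp", "MyStats.s")))
```

The assembler file produced this way is what would then be passed to the SimpleScript compiler as an input file.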