JewelScript 1.0 changes log

By Jewe | February 8, 2010

This post lists all the changes in JewelScript 1.0 compared to the previous public release, v0.9.2.4. The most recent change is at the top of this post.

Improvements, Changes, New Features

New built-in class “runtime”

I’ve added a new built-in class named “runtime”, which contains a number of useful global functions. For example, it lets you query the version number and the stack size, and translate type names to type-ids and vice versa.

There’s also a function to get the current value of the instruction counter, which may be useful for performance measurements (measure instructions per second). See this example script.

Basic support for exceptions

JewelScript now has basic support for exceptions. You can throw your own exception classes, however they have to implement the built-in exception interface. This post discusses this new feature in more detail.

Garbage collector added

There is now an integrated mark-and-sweep garbage collection mechanism in the runtime. Its use is entirely optional. If your application is going to use it, its native code must respond to the GC’s MARK message. All built-in classes and the binding-code generator have been updated to support the GC.

See this post for more information on the garbage collector.

Co-functions revised

The way the compiler handles co-functions and the syntax of co-functions has been improved for better adherence to JewelScript’s strong typing paradigm, and to be more consistent with the rest of the language. For more information about the changes, see this post.

JewelScript API Revised

The way references to script functions, methods and delegates are handed out to the user, as well as the way script functions can be called by the native application, has been unified and simplified. Take a look at this post for detailed information about what has changed.

New reference handling

One of the most frequently criticised things in JewelScript was its C++ style reference handling: all values were by default passed around as copies, no matter whether they were simple integral values or fat objects. If you wanted to pass an object by reference, you had to explicitly declare the left-hand variable as a reference using the ‘&’ modifier, just like in C++.

While this had some nice benefits, like for example being able to very easily copy all kinds of deeply nested object trees, being able to modify the caller’s variables passed to a function, or being able to directly assign new values to function results, there was the downside that you had to explicitly declare 95% of your variables as references in order to avoid unnecessary copying.

In an effort to remove this major inconvenience, and to make the language easier to understand and use for developers who have learned other modern languages, this has been redesigned.

The new version of the compiler distinguishes between value-types and reference-types, and will always and automatically pass a value-type as a copy, and a reference-type as a reference. Value-types are currently the int and float types; all other user-defined or built-in types are reference-types.

Person a = new Person("Bud");
Person b = a;                 // take a reference from Bud
Person c = new Person(a);     // clone Bud
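
For readers coming from C++, the three lines above can be modelled roughly as follows. This is only a sketch of the semantics, not JewelScript code and not how the runtime is implemented: the Person struct and the function name are made up for illustration, and std::shared_ptr merely stands in for a reference-type variable.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a script class; not part of JewelScript.
struct Person {
    std::string name;
    explicit Person(std::string n) : name(std::move(n)) {}
};

// Returns true if the handles behave the way the post describes.
bool reference_vs_clone_demo() {
    auto a = std::make_shared<Person>("Bud");  // Person a = new Person("Bud");
    auto b = a;                                 // Person b = a;  (same object)
    auto c = std::make_shared<Person>(*a);      // Person c = new Person(a);  (clone)

    b->name = "Buddy";  // visible through a, but not through the clone c

    int i = 1;
    int j = i;          // value-type: j is an independent copy
    j = 2;

    return a->name == "Buddy" && c->name == "Bud" && i == 1 && j == 2;
}
```

Changing the object through b also changes what a sees, while the clone c and the copied integer stay independent.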

True copy-constructors

In former versions of the runtime, the copy-constructor of a class was only invoked if the script developer “manually” coded the respective new-statement. If, for any reason, the runtime automatically copied a script or native class, the copy-constructor was not invoked.

This has been improved. The copy-constructor of a class is now always invoked if the class has defined one, even if the object is a script object that is being copied as the result of a native function call. If a script class defines no copy-constructor, the runtime will automatically create a copy of it. If a native class defines no copy-constructor, the object cannot be copied. In this case the compiler will issue an error at compile-time.

In addition, objects are no longer deep-copied by the runtime, neither in the built-in native types, nor when the runtime automatically copies script objects. The built-in native container classes array, list, and table now only copy references to the source object’s elements (except value-types) when copied. This has been done for better performance and to have a consistent, generalized behavior throughout the whole runtime.

If a script class is being automatically copied, its members will be references to the members of the source object (except value-types). To override this behavior, and create physical copies of the source’s members, a class can implement a copy-constructor.
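
The difference between the automatic copy and a copy-constructor that clones its members can be sketched in C++. Everything here (Item, ItemRef and the two function names) is hypothetical and only models the behavior described above; it is not the runtime’s actual code.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Item { int value; };               // hypothetical element type
using ItemRef = std::shared_ptr<Item>;    // stands in for a reference-type element

// Models the automatic copy: the new container refers to the same elements.
std::vector<ItemRef> shallow_copy(const std::vector<ItemRef>& src) {
    return src;  // copies the handles, not the Items they point to
}

// Models what a hand-written copy-constructor could do: clone every element.
std::vector<ItemRef> clone_elements(const std::vector<ItemRef>& src) {
    std::vector<ItemRef> out;
    out.reserve(src.size());
    for (const ItemRef& item : src)
        out.push_back(std::make_shared<Item>(*item));
    return out;
}
```

After a shallow copy, a change to an element of the source is visible through the copy as well. After cloning, the two containers no longer share anything.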

Changes to the native type interface

The redesign of the reference handling also affects / simplifies the native type interface. The messages “NTL_CopyObject” and “NTL_SetObject” are now obsolete. Copying, if the native type wishes to handle it, is now done “properly” by defining a copy-constructor method.

It is no longer possible to “set” an existing object to the values of a source object, so any code for handling the “NTL_SetObject” message can simply be removed.

Likewise, it is no longer possible for a native type to modify arguments passed to the native function on the stack, therefore the NTLSetArg…() methods have been removed. Native types that have used this method to return multiple values now need to return a small data-object, or modify a data-object passed to the function as an argument.

Integrated C++ binding code generator

The compiler now has an integrated binding code generator that outputs C++ binding code to easily use existing native functions and classes from within JewelScript. The binding code generator is discussed in detail here.

Predefined aliases ‘bool’ and ‘char’

The compiler now has the built-in type aliases ‘bool’ and ‘char’, which are both aliases for the type int. There is no real benefit in using these types instead of int in function declarations, except maybe making the code clearer to read and understand.

Improved built-in classes

The built-in classes string, array, list and table have been improved. Some new methods have been added, for example the enumerate() method that allows you to call a delegate for each element stored in an array, list or table. The string and array classes have a process() method that takes a delegate to process every character / element in the string / array and returns the result as a new instance.

Also, a deepCopy() method has been added to array, list and table, that returns a deep-copy of the respective object, which might come in handy in cases where you really want a completely independent* physical copy of the container.

The table has a new constructor that lets you initialize a table from a given array, making the table much easier to fill with elements. The array is interpreted as a sequence of key / value pairs. The key has to be a string, the value can be any type:

table t = { "key1", 30, "key2", 40, "key3", 50 };

(* It depends of course on the actual objects in the container and how their copy-constructors are implemented how deep and independent from the source container the copy will be.)

Improved (C++ conform) behavior of && and || operators

Finally, I have revised the parsing code for the && and || operators. They now behave like developers are used to from virtually all other C-style programming languages.

If the first operand’s result of an && expression is false, the second operand will not be evaluated.

If the first operand’s result of an || expression is true, the second operand will not be evaluated.

This means that this code will now work in JewelScript like it should:

Person a = null;
if( a != null && a.Name == "Bud" )
{
}
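
Since the operators are meant to match C++ here, the rule can be checked with a small C++ sketch that counts how often the right-hand operand actually runs. The counter and the function names are made up for this illustration.

```cpp
#include <cassert>

static int calls = 0;  // counts evaluations of the right-hand operand

bool right_operand() { ++calls; return true; }

// Evaluates three expressions and reports how often right_operand() ran.
int evaluations_after_short_circuit() {
    calls = 0;
    bool a = false && right_operand();  // left is false: right side skipped
    bool b = true  || right_operand();  // left is true: right side skipped
    bool c = true  && right_operand();  // left is true: right side evaluated
    (void)a; (void)b; (void)c;          // silence unused-variable warnings
    return calls;
}
```

Only the third expression evaluates its second operand, so the counter ends up at one.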

Reference re-initialization operator removed

Due to the redesign of the reference handling, the reference re-initialization operator is no longer required and has been removed. In former versions, you could not simply assign a new value to a reference variable after its initialization (because this was compiled into a “set”).

// Old version:
Person& a = new Person("Bud");
& a = null;

// New version:
Person a = new Person("Bud");
a = null;

Compiler options removed / changed

The compiler option “auto-copy-const” has been removed. Constants are now always implicitly copied if they are assigned to variables that are not declared const.

The compiler option “legacy-for-scoping” has been removed.

The compiler’s default warning level has been set to “3” in release builds to filter out unimportant (level 4) warnings.

The compiler’s default optimization level has been set to “3” (maximum) in release builds.

It is no longer necessary to specify “stack-locals=yes” to enable maximum optimization. Instead, that option is automatically enabled, if you enable maximum optimization.

A new runtime option “log-garbage” has been added. This option allows you to log detailed information on objects reclaimed by the garbage collector in debug and release builds of the runtime.

Operator new and basic types

The old version tolerated omitting the empty pair of parentheses when you instantiated a basic type with operator new:

int i = new int;
float f = new float;
string s = new string;
array a = new array;

This has been changed for consistency with all other types. The empty parentheses are now mandatory.

Automatic instantiation / default initialization

The old version only allowed regular (non-reference) variables to be automatically instantiated:

Person a;  // automatically construct 'a'
string s;  // construct empty string

If a variable was not initialized with a value, the language would try to create and assign an instance by calling the class’s default constructor. This, however, was only allowed with regular variables, not with reference variables.

Since in the new version all class-type variables are implicitly references, this has been changed. The default initialization now works with reference variables.

Typeless variables could not be automatically initialized in older versions, since the language had no knowledge which constructor to call. The new version will automatically initialize typeless variables with null.

Finally, the old version auto-constructed an instance regardless of whether the class’s default constructor was declared explicit. This has been fixed. Classes that declare the default constructor explicit can no longer be automatically instantiated.

Strict classes and interfaces

Up to now, the language always issued an error when you declared a function but never implemented it: during the linking process, any function that had been declared but not implemented was reported as an error. This was done to ensure that you didn’t by mistake declare a function differently from its definition.

However, there are cases where you probably want to make the implementation of a certain function optional, meaning script developers can implement certain functions, but don’t have to.

Therefore there is now a change in the behavior of the language: By default, if you declare a function, but never implement it, the compiler will notify you with a warning, and then create a default function implementation. For functions that don’t have a result, this default implementation just returns. For functions that return a value, the default implementation returns null.

This allows you to inherit from an interface and implement only the functions you need, while the compiler default implements the missing ones, so the native application can rely on a complete interface.

To return to the old behavior, and enforce at compile time that all functions in the interface or class are implemented, you can now declare an interface, a class, or even individual functions using the strict modifier keyword.

strict interface IFoo {
    method int Get();
    method Set(int);
}
class Foo : IFoo {
    method Foo() {
        m_Value = 0;
    }
    method int Get() {
        return m_Value;
    }
    method Set(int i) {
        m_Value = i;
    }
    int m_Value;
}

This class Foo must implement all the functions inherited from IFoo, otherwise the compiler will generate an error at compile time. Note that the class is not declared strict, so if we declare new functions for Foo, we don’t have to implement them and can rely on the linker to create a default implementation for us.

We could also make the interface not strict and declare the class strict instead. Then the compiler would also enforce that we implement any new functions declared in the class. Finally, we could declare both interface and class not strict, but declare individual functions in either the interface or the class strict, if we are certain that a default implementation would not be sufficient for this kind of function and want to enforce that script developers implement it.

Delegates – first class function support

Finally, first class function support made it into the language, something probably no language really can do without. First class functions are functions that can be treated like values in the language – they can be assigned to variables, passed as arguments to functions, and called indirectly through the variable holding a reference to the function.

I’ll admit that I shamelessly ripped from C# when adding this new feature, well to some extent at least.

The new version allows you to take a reference from a global function, global class member function, or – and that’s the cool thing – instance class member function, assign it to a variable and call it indirectly and type-safely through that variable.

In order to do so, the developer first needs to declare a delegate type. A delegate type is a type that simply describes the signature of a function or method. It specifies the result type, number of arguments and argument types.

delegate int Comparator(string, string);

This statement defines a new type to the language which can then be used to define a regular variable. You can define global variables, class member variables and local variables.

function int StringLessOrEqual(string a, string b)
{
    return b.length <= a.length;
}

function test(string a, string b, Comparator compare)
{
    if( compare(a, b) )
    {
        print("The comparator returned true");
    }
}

function main()
{
    test("Hello", "foo", StringLessOrEqual);
}

Note that a delegate variable does not distinguish between global functions and instance member functions. If the specified function’s result type and arguments match the delegate, the delegate variable will accept the function reference. So we can assign a global function and an instance member function to the very same delegate variable at different times during the runtime of a program.

Additionally, the delegate variable does not distinguish between native functions and those implemented in script. It is absolutely no problem to assign a function or method from a native class to the same delegate variable that has been used for a script function. This makes using delegate variables fun, uncomplicated and very flexible.

When assigning an instance member function to a delegate variable, the language will store both a reference to the specified function AND a reference to the instance of the object in the delegate variable.

Foo foo = new Foo();
EventHandler handler = foo.ReceiveEvent;
handler("open");    // calls foo.ReceiveEvent("open");
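
A rough C++ analogy, for readers who know std::function: the delegate behaves like a callable that captures the instance together with the member function, and the same variable can later hold a plain global function. The names below (Foo, GlobalHandler, bind_receive) are assumptions made for this sketch and say nothing about how the JewelScript runtime stores delegates internally.

```cpp
#include <cassert>
#include <functional>
#include <string>

struct Foo {  // hypothetical class, mirrors the snippet above
    std::string last;
    void ReceiveEvent(const std::string& e) { last = e; }
};

void GlobalHandler(const std::string&) {}  // hypothetical global function

using EventHandler = std::function<void(const std::string&)>;

// Captures the instance together with the member function,
// similar to what `foo.ReceiveEvent` yields in the script.
// The returned handler must not outlive foo.
EventHandler bind_receive(Foo& foo) {
    return [&foo](const std::string& e) { foo.ReceiveEvent(e); };
}
```

Calling the handler returned by bind_receive(foo) with "open" ends up in foo.ReceiveEvent("open"). Assigning GlobalHandler to the same variable afterwards is equally valid, because only the signature has to match.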

This opens up a lot of possibilities. For example, it enables you to create “hybrid” classes, classes that can contain references to member functions defined in another class. This lets you also add functions from a native type to your script class, for example. Just add a delegate variable for every function of a native class you need in your script class and you can call them directly from your script class, as if they were implemented there. For a detailed example, see this file.

Another interesting use of delegates is in combination with arrays. This lets you create delegate arrays, which you can use for random access to functions, or a function stack. For example you can create an array of callback functions if your object wants to notify other objects about an event:

delegate int ClickFn(int x, int y);
ClickFn[] onClick = { foo.OnClick, bar.OnClick, hello.OnClick };

In addition, operator += can be used to subscribe to an event by adding the event handler function to the array:

onClick += myObj.OnClick;

You can use the new array::enumerate function to call all the delegates in an array, or simply use a for-loop:

for(int i = 0; i < onClick.length; i++)
    onClick[i](x, y);

Delegate arrays are also a good alternative to large switch-statements, and actually faster than a switch:

const int kInit = 0;	// enumerate our states
const int kWelcome = 1;
const int kMain = 2;
const int kGameOver = 3;
const int kNextLevel = 4;

delegate GameState();
GameState[] State = { OnInit, OnWelcome, Main, GameOver, NextLevel };

int CurrentState = kInit;

function RenderFrame()
{
    State[CurrentState]();
}

Hybrid class support

If you have looked into the “hybrid” example script file I linked above, you will probably have thought “well that’s a lot of work, defining a delegate type, a delegate variable and the assignments in every class constructor for every function I want to reference in my script class.”

In fact, I thought that too. And since I thought having methods of other classes directly in your script class, especially if they are methods from a native type, would be a cool and useful feature, I decided to have the compiler do all this work automatically for you. The keyword I introduced for this is hybrid.

While declaring your class, you can add the clause hybrid typename to your class declaration. This will instruct the compiler to automatically “derive” all suitable functions from “typename”. Specifically it will automatically define the required delegate types and add the required member variables to your class for every suitable function in “typename”. Suitable functions are global class member functions and instance class member functions, but not constructors, convertors, accessors or cofunctions.

Likewise, you have to specify hybrid(expr) to every constructor that you define in your class. This will specify the instance of “typename” that the compiler should use to obtain instance member functions from. The compiler will use this to generate code in your constructor that initializes your delegate variables.

Here is the example from above implemented using the hybrid keyword.

Local functions / anonymous delegates

If your program uses callback functions and delegates a lot, having to define a delegate and a regular function or method in your class can become quite an effort, especially if you don’t really need the code of the function for other purposes. Sometimes, you just want to pass a piece of code as a delegate to a function, without having to think of a name for the function or method, because you know you won’t use it anywhere else.

For these cases, JewelScript allows you to pass an anonymous function or method as a literal to a function and assign it to a local, member or global variable. To make this as easy as possible, the language will derive the function’s signature from the left-hand type of the assignment:

delegate int OnClick(int x, int y);

OnClick clickHandler = function
{
    print("Click at " + x + ", " + y + "\n");
    return true;
};

In this little example we define a global variable ‘clickHandler’ that gets assigned a reference to an anonymous function. The language will determine the signature of the function to generate from the type on the left-hand side of the assignment, which is a delegate of type ‘OnClick’, which defines an integer result and two integer arguments ‘x’ and ‘y’.

Note that in this case it is mandatory to specify names for the arguments in the delegate declaration, whereas it is optional if you don’t use the delegate for anonymous functions.

Obviously, this method of determining the signature of the function by looking at the left-hand value implies that the left-hand value’s type must be a delegate. You cannot rely on automatic type conversion when assigning anonymous functions, and no typeless variables can be used on the left-hand side of the assignment.

Furthermore, note the semicolon at the end of the assignment statement. Unlike ‘regular’ named functions and methods, which are (block-) statements that don’t require a closing semicolon, an anonymous function is actually an expression and thus the assignment statement requires you to complete it with a semicolon, just like you would with any other assignment.

Obviously, anonymous functions are mostly useful to pass small pieces of code to functions that expect a delegate as an argument, code that is so small, that defining a new function for it isn’t really worth the effort.

delegate int Comparator(string a, string b);

function main()
{
    test("hello", "foo", function { return b.length <= a.length; } );
}

Last but not least, since anonymous functions are expressions and can be used in functions, they can be nested:

delegate VoidFunction();
function main()
{
    VoidFunction subFunc = function
    {
        VoidFunction subSubFunc = function
        {
            print("This is subSubFunc()\n");
        };
        print("This is subFunc()\n");
        subSubFunc();
    };
    subFunc();
}

Finally, look at this funny little construct:

VoidFunction[] States =
{
    function
    {
        print("This is an anonymous function.\n");
    },
    function
    {
        print("This is another anonymous function.\n");
    },
    function
    {
        print("This is the third anonymous function.\n");
    }
};

function main()
{
    for(int i = 0; i < States.length; i++)
        States[i]();
}

You can even do things like this:

((VoidFunction) function { print("Hello!\n"); })();

Can you guess what this does? I won’t tell you, because I really don’t recommend doing things like this in JewelScript. Heck, it almost looks as absurd as the things you can do in C! :-)

User-defined type aliases

In order to make the delegate feature work, I had to implement the ability to add alternative names to existing types, so making this available to the user as a feature was an obvious next step. The keyword alias allows you to define alias names for existing types.

alias int bool;
alias int char;

function bool IsEqual(int a, int b)
{
    return a == b;
}

function char GetChar(int i, string s)
{
    return s.CharAt(i);
}

An alias is not a new type. Both the alias and the originating type share the same type-id in the compiler and runtime, and they are always implicitly convertible to each other.

Aliases are not only “nice to have” though – there are cases where you actually may need them. One scenario is when you have defined a delegate type inside the name space of a class.

class Foo {
    delegate int FooProc(int);
}

By default, you cannot access types defined in classes from outside of the class. Not because they are “private” or “protected”, but simply because the language does not support declaring a variable like this:

Foo::FooProc myVariable;

The compiler will not recognize this as a valid variable declaration. You can still get what you want by defining an alias, though:

alias Foo::FooProc GlobalFooProc;

GlobalFooProc myVariable;

Simpler declaration of typed arrays

The compiler now accepts the term typename[] as syntactical sugar for typename array. This has been done to make developers who are used to other languages feel more at home with JewelScript.

function string[] Convert( string[] arr );

Switch statement supports strings

The switch statement can now also be used to compare a string variable against a set of string constants. Not a big innovation, but can come in handy.

function Command(string cmd)
{
    switch( cmd )
    {
        case "open":
            Open();
            break;
        case "read":
            Read();
            break;
    }
}

Accessor methods support all assignment operators

Until now, accessor methods were only invoked when using a normal assignment statement.

accessor Foo::value(int v)
{
    m_Value = v;
}

The above statement defines a handler method for the case where we assign an integer to a class member property we called “value”. The following expression will call this handler:

Foo myFoo = new Foo();
myFoo.value = 17;

However, in previous versions you could not use all assignment operators like this. For example to increase “value” by 10, you needed to use the following workaround:

myFoo.value = myFoo.value + 10;

With the changes made today, all assignment operators finally work as expected with accessor functions.

myFoo.value += 10;

This will call accessor int Foo::value(), add 10 to its result, and call accessor Foo::value(int).
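
The read-modify-write sequence can be pictured with a small C++ model. This is only an illustration of the order of calls described above: the Foo class, its read/write counters and add_assign_value() are invented for the sketch, and the getter/setter overloads merely play the role of the two accessors.

```cpp
#include <cassert>

class Foo {  // hypothetical class with a getter/setter pair for "value"
public:
    int value() const { ++reads; return m_Value; }  // accessor int Foo::value()
    void value(int v) { ++writes; m_Value = v; }     // accessor Foo::value(int)
    mutable int reads = 0;   // how often the getter was called
    int writes = 0;          // how often the setter was called
private:
    int m_Value = 0;
};

// What `myFoo.value += n` boils down to: read, add, write back.
void add_assign_value(Foo& foo, int n) {
    foo.value(foo.value() + n);
}
```

One compound assignment therefore costs exactly one getter call and one setter call.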

Automatic type conversion revised

In earlier versions, automatic type conversion was not fully supported by all operators. For example, the & (bitwise and) and | (bitwise or) operators did not do automatic type conversion at all. This means that if you passed an instance of a class to one of those operators, and the class defined a convertor method to type int, this conversion method never got called. Instead the operator just issued a compile-time error.

This has been vastly improved. All operators are now able to automatically convert one or both of their operands if they are not of a suitable type, provided a conversion to a suitable type exists.

class Foo {
    method Foo(int i)      { m_Val = i; }
    method int convertor() { return m_Val; }
    int m_Val;
}
function Foo GetFoo(int i) { return new Foo(i); }

// create 2 Foo instances, convert to int, and compare
if( GetFoo(32) > GetFoo(27) )
{
}
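
The closest C++ counterpart to a convertor method is a conversion operator, so the example above can be approximated as follows. This is a sketch under that assumption; compare_via_conversion() and the operand values are made up, and the & test is included because that operator is named explicitly in the text.

```cpp
#include <cassert>

class Foo {
public:
    explicit Foo(int i) : m_Val(i) {}
    operator int() const { return m_Val; }  // plays the role of `method int convertor()`
private:
    int m_Val;
};

Foo GetFoo(int i) { return Foo(i); }

// Both operands are converted to int before > and & are applied.
bool compare_via_conversion() {
    return (GetFoo(32) > GetFoo(27)) && ((GetFoo(6) & GetFoo(3)) == 2);
}
```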

Warning when assigning null to non-reference variables

Previous versions of the language tolerated assigning null to variables that are not references. To notify you of a probable mistake, the compiler will now warn you if it can detect that you are trying to assign null to a variable that is not a reference.

function SetFloat( float f ) { ... }

SetFloat( null );

Warning level of int/float conversions set to 4

The warning level for “implicit conversion of float to int” or “implicit conversion of int to float” respectively, has been set to 4, which is the lowest level at the moment. This lets you easily filter out those warnings by setting the warning level to 3 in the option statement.

I decided against the possibility to “mute” these warnings when using the cast operator, like it is possible in C or C++. The problem with casting away warnings is that once you did, there is no way of bringing back the warnings if you need to find a bug caused by implicit conversion and data truncation / data loss.

Using the warning level to filter out these warnings has the advantage that you can switch back to warning level 4 if you need to find a conversion bug. However, it’s not perfect: being able to switch off individual warnings would be even better, but would also cause developers more work, since they would have to specify all the warnings they don’t want to see.

Up / down cast of objects after operator “.”

It is now possible to cast an object to a super-class or sub-class in order to access member variables or methods. When casting an object from a super-class down to a sub-class, a rtchk instruction is generated that will ensure during runtime that the object in the variable really is of a compatible type.

interface IFoo { ... }    /* super-class */
class Foo : IFoo { ... }  /* sub-class */

function Test( IFoo& ifoo )
{
    // cast down to 'Foo' in order to access
    // a method not defined in interface
    ifoo.Foo::SomeMethod( 27 );
}
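
In C++ terms, the checked down-cast is comparable to a dynamic_cast. The sketch below is an analogy only: Bar and call_some_method() are hypothetical, and where this code returns false, the rtchk instruction mentioned above would instead report the mismatch at runtime (how exactly is not covered in this post).

```cpp
#include <cassert>

struct IFoo { virtual ~IFoo() = default; };  // super-class (interface)

struct Foo : IFoo {                            // sub-class
    int last = 0;
    void SomeMethod(int v) { last = v; }
};

struct Bar : IFoo {};                          // hypothetical unrelated implementation

// Calls SomeMethod() only if the object really is a Foo.
bool call_some_method(IFoo& ifoo, int v) {
    Foo* foo = dynamic_cast<Foo*>(&ifoo);      // runtime type check
    if (foo == nullptr) return false;          // not a Foo: refuse the call
    foo->SomeMethod(v);
    return true;
}
```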

New optimization algorithms added

New optimization algorithms have been added that simplify the code generated for the not operator when invoked by an if statement, as well as removing unnecessary branches in certain cases:

    if( !myVar )
    {
    }
    return;

    copy    (sp+0),r3   ;load variable
    unot    r3          ;perform not
    tsteq   r3,3        ;test if 0 and jump to next instruction
    ret                 ;done

There are two things that can be optimized in this snippet. 1) We can omit the unot instruction by using ‘tstne’ instead of ‘tsteq’. 2) Since the if block contains no code, the branch just jumps to the next instruction, so the entire test can be removed:

    copy    (sp+0),r3
    ret

A new optimization has been added that detects when an operation followed by a move can be simplified by swapping source and destination operands of the operation.

    jsr    189       ;call some function
    addl   r1, r3    ;add result to some temporary result
    move   r3, r1    ;make this our result
    ret              ;return

This can be simplified as follows:

    jsr    189       ;call some function
    addl   r3, r1    ;add some temporary result to result
    ret              ;return

Bugs Fixed

Along with adding these new features to the language, a number of bugs have been fixed. Some were introduced when adding the new features, others had been there for a long time and were only just discovered. I will only list those bugs that have probably been nagging script developers, or where fixing them might affect code from previous versions.

Local variables defined in switch / case

A serious bug has been fixed that caused the compiler to mess up the stack when you tried to define a local variable in a case tag of a switch statement. It is kind of amazing I never discovered this bug earlier, which is probably because I’m so used to programming in C, that I tend to wrap every case tag into a block statement, or declare all variables outside of the switch, which didn’t expose this bug.

Anyway, a nice side effect of fixing this bug is that it is now possible to define the same local variables in each case tag of a switch statement.

switch( e )
{
    case 1:
        int i = 7;
        int g = i + 2;
        print( g );
        break;
    case 2:
        int i = 9;
        int g = i + 3;
        print( g );
        break;
}

Checking for multiply defined accessor functions

JewelScript allows functions and methods to have the same name, but different argument and / or result types. Accessor methods however, “look” like normal member variables to the outside of the class, so their name should not be associated to more than one argument or result type.

Previous versions did not check if you defined multiple accessor methods with the same name, but different argument types. Here is an example:

accessor Foo::value(int v) {...}
accessor Foo::value(string s) {...}

Foo myFoo = new Foo();
myFoo.value = 17;		// invoke accessor no. 1
myFoo.value = "hello";		// invoke accessor no. 2 ???

You probably already see the problem. We refer to the same property name in those last two assignments, but with different types to the right of the assignment operator. Actually, this never worked in previous versions, and was never intended to. Unlike methods and functions, accessors with different types and the same name were never meant to be supported.

I’m sure some readers regret this decision and think it would have been cool to allow to define a single property name that accepts multiple different types, handled by different accessor methods. However, that feature would not be very useful and obscures the code to some extent.

Additionally, there would be the more serious problem of accidentally invoking the wrong accessor method when automatic type conversion is performed or typeless values / variables are used.
