
Garbage collection in JewelScript

By jewe | February 14, 2010

The headline says it all: I’ve finally decided to add a garbage collector to the runtime.

Personally, I never liked the idea of a language that “cleans up after the programmer”, and in regard to JewelScript, it was always my opinion that with a bit of discipline and the correct use of weak references, reference cycles – the only way to create memory leaks in JewelScript – could be avoided.

Just to recap, JewelScript uses reference counting to “know” when an object or value is no longer used and can safely be destroyed. However, the catch with reference counting is that you can create reference cycles. This happens when, for example, object A has a reference to object B, while object B also has a reference to object A.

In such a case, reference counting is actually useless and the two objects cannot be destroyed, unless one of them uses a weak reference, which is a reference that isn’t counted. So since JewelScript supports weak references, script programmers actually have a tool to avoid memory leaks.
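To make the cycle problem concrete, here is a small analogy in standard C++, not JewelScript code. `std::shared_ptr` plays the role of a counted reference and `std::weak_ptr` the role of a weak reference; the `Node` type and the two helper functions are made up for this illustration.

```cpp
#include <memory>

// Counts live Node objects so that a leak becomes observable.
static int g_liveNodes = 0;

struct Node {
    std::shared_ptr<Node> strong;  // counted reference
    std::weak_ptr<Node>   weak;    // uncounted (weak) reference
    Node()  { ++g_liveNodes; }
    ~Node() { --g_liveNodes; }
};

// A and B hold counted references to each other. Neither count can
// drop to zero, so both objects survive the end of the scope.
int leakedWithStrongCycle() {
    int before = g_liveNodes;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->strong = b;
        b->strong = a;
    }
    return g_liveNodes - before;  // number of objects left behind
}

// Same shape, but B's back-reference to A is weak. A's count reaches
// zero at the end of the scope, which in turn releases B.
int leakedWithWeakBackRef() {
    int before = g_liveNodes;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->strong = b;
        b->weak = a;
    }
    return g_liveNodes - before;
}
```

Calling `leakedWithStrongCycle()` leaves two objects alive, while `leakedWithWeakBackRef()` leaves none. The same reasoning applies to script objects, assuming the runtime's counted and weak references behave as described above.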

On the other hand, avoiding reference cycles by using weak references or any other method requires that the script programmer actually has some fundamental knowledge of the inner workings of the script language, and that’s exactly why I finally decided to add GC to JewelScript.

Whether or not script programmers are skilled and knowledgeable enough to avoid reference cycles very much depends on the nature of the application and on the scripting interface’s “target audience”.

For example, you could embed JewelScript in bookkeeping or word processing software to allow users to program little automated tasks or “macros”. In such an application scenario, you wouldn’t want to require that your users know all the finer details of weak references, reference counting, and reference cycles.

And the very idea of a “scripting language”, as opposed to a “real” programming language, is probably that it should hide technical issues like that from the user and make writing programs as easy and fail-safe as possible.

Garbage collection in JewelScript

Anyway, so now JewelScript has an integrated garbage collector. However, this feature comes at a price: in order to use GC, all native types and, to some degree, your application code need to support the garbage collector by responding to its MARK message.

JewelScript’s GC is a simple “mark and sweep” garbage collector. The basic working principle is this: When the GC is run, it runs over all objects still on the VM’s stack, in registers and in global memory, sending them the MARK message. This will mark these objects as “still used”. Then these objects will propagate the MARK message down to the objects they still use, and so on. In the end, all of the objects that are still in use somewhere in the application will be marked. The GC will then run over all objects on the heap and find those that are not marked. Those will be considered “leaked” and destroyed.
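The principle described above can be sketched in a few lines of C++. This is a toy model, not the JewelScript runtime: `Object`, `Heap`, `roots` and `demoCollectCycle` are hypothetical names, and the `roots` vector merely stands in for the VM's stack, registers and global memory.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a heap object; not a JewelScript type.
struct Object {
    bool marked = false;
    std::vector<Object*> refs;    // objects this object still uses
};

struct Heap {
    std::vector<Object*> objects; // every allocated object
    std::vector<Object*> roots;   // stand-in for stack, registers, globals

    Object* alloc() {
        objects.push_back(new Object());
        return objects.back();
    }

    // Mark phase: propagate the mark to everything an object uses.
    // The early return makes repeated marking harmless and stops cycles.
    static void mark(Object* obj) {
        if (obj == nullptr || obj->marked) return;
        obj->marked = true;
        for (Object* ref : obj->refs) mark(ref);
    }

    // Runs one collection and returns the number of destroyed objects.
    std::size_t collect() {
        for (Object* root : roots) mark(root);

        std::size_t destroyed = 0;
        std::vector<Object*> survivors;
        for (Object* obj : objects) {
            if (obj->marked) {
                obj->marked = false;  // reset for the next run
                survivors.push_back(obj);
            } else {
                delete obj;           // unmarked means unreachable
                ++destroyed;
            }
        }
        objects.swap(survivors);
        return destroyed;
    }

    ~Heap() { for (Object* obj : objects) delete obj; }
};

// Builds one reachable object plus an unreachable two-object cycle.
std::size_t demoCollectCycle() {
    Heap heap;
    Object* kept = heap.alloc();
    heap.roots.push_back(kept);

    Object* a = heap.alloc();
    Object* b = heap.alloc();
    a->refs.push_back(b);
    b->refs.push_back(a);

    return heap.collect();
}
```

In this model `demoCollectCycle()` destroys exactly the two objects of the cycle and keeps the rooted one, which is the case reference counting alone cannot handle.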

Obviously, this would end up really messy if even one native type or application class failed to propagate the MARK message to the objects it still uses. This would cause the destruction of objects that are still in use and sooner or later lead to a crash. So if the GC is going to be used, it is imperative that all code properly supports the MARK message.

Note that using the GC in JewelScript is optional: your application is not forced to use it, and the GC is never run automatically. It is up to the application to decide whether memory leaks can be expected from the user or not. It is also up to the application to decide when it is time to run the garbage collector.

If you don’t want to use the GC, you don’t have to change any code. Note, however, that if your application runs the GC anyway, it will fail during the marking process, because your native types won’t respond to the NTL_MarkHandles message. That’s because type procs are usually written to return an error when they receive an unknown message.

If you do want to support the GC mechanism, then all of your native types need to correctly respond to the NTL_MarkHandles message by calling the new NTLMarkHandle() API function for every JILHandle pointer that your native class uses. Obviously, simple native classes that do not store any JILHandle pointers don’t need to do anything except return a success status for that message.
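The difference between a type proc that supports marking and one that predates it can be sketched as follows. Everything here is a hypothetical stand-in rather than the real JewelScript API: `Handle` stands in for JILHandle, `MSG_MARK_HANDLES` for the NTL_MarkHandles message, `markHandle()` for NTLMarkHandle(), and the function signatures are invented for the illustration.

```cpp
#include <vector>

// All names below are hypothetical stand-ins, not the JewelScript API.
struct Handle { bool marked = false; };

enum Message { MSG_NEW, MSG_MARK_HANDLES, MSG_DESTROY };
enum Status  { STATUS_OK = 0, STATUS_UNKNOWN_MESSAGE = -1 };

// Stand-in for the API call that marks one handle as still in use.
void markHandle(Handle* handle) {
    if (handle != nullptr) handle->marked = true;
}

// A native class that stores handles obtained from the runtime.
struct EventSource {
    Handle* callback = nullptr;
    std::vector<Handle*> listeners;
};

// Type proc that supports the GC: marks every handle the object stores.
Status gcAwareTypeProc(EventSource* self, Message msg) {
    switch (msg) {
        case MSG_NEW:
        case MSG_DESTROY:
            return STATUS_OK;
        case MSG_MARK_HANDLES:
            markHandle(self->callback);
            for (Handle* h : self->listeners) markHandle(h);
            return STATUS_OK;
    }
    return STATUS_UNKNOWN_MESSAGE;
}

// Type proc written before the GC existed: it rejects any message it
// does not know, so the marking pass fails when it reaches this type.
Status legacyTypeProc(EventSource*, Message msg) {
    switch (msg) {
        case MSG_NEW:
        case MSG_DESTROY:
            return STATUS_OK;
        default:
            return STATUS_UNKNOWN_MESSAGE;
    }
}

// Sends the mark message to the GC-aware proc and checks the result.
bool demoGcAwareMarksAll() {
    Handle cb, l1, l2;
    EventSource src;
    src.callback = &cb;
    src.listeners = {&l1, &l2};
    Status s = gcAwareTypeProc(&src, MSG_MARK_HANDLES);
    return s == STATUS_OK && cb.marked && l1.marked && l2.marked;
}

// Sends the same message to the legacy proc, which reports an error.
bool demoLegacyRejectsMark() {
    Handle cb;
    EventSource src;
    src.callback = &cb;
    Status s = legacyTypeProc(&src, MSG_MARK_HANDLES);
    return s == STATUS_UNKNOWN_MESSAGE && !cb.marked;
}
```

A class without stored handles would only need the `MSG_MARK_HANDLES` case to return the success status. For the actual message constants and function signatures, consult the JewelScript headers.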

With that, all native types are ready for GC, but what about your application itself?

There are cases where one or more of your native C++ classes obtain JILHandle pointers from the runtime but don’t have a type proc, because they are not native types for JewelScript. This applies, for example, to classes that obtain function references from the runtime in order to call script functions, or that create script objects and store their references.

These are classes that have objects from the VM, but they cannot be objects in the virtual machine, because they don’t have a type proc. Therefore, such classes won’t get the NTL_MarkHandles message from the GC. As a result, the objects behind those JILHandle pointers would not get marked, would be regarded as unused, and would be destroyed by the GC.

To prevent this, I’ve added new application-side functionality to the API: a mechanism to register for GC events and to unregister from them. Basically, if a C++ class obtains and stores handles from the runtime, it now has to register for GC events in order to get notified about the MARK message.
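The register/unregister idea might look roughly like the sketch below. The actual registration functions are not shown in this post, so `Runtime`, `GcListener`, `registerGcListener`, `unregisterGcListener` and `ScriptCaller` are all invented for the illustration; only the pattern is the point.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-ins; the real API names are not given in the post.
struct Handle { bool marked = false; };

// Interface for application classes that want to hear about GC runs.
struct GcListener {
    virtual ~GcListener() = default;
    virtual void onMark() = 0;
};

struct Runtime {
    std::vector<GcListener*> listeners;

    void registerGcListener(GcListener* l) { listeners.push_back(l); }

    void unregisterGcListener(GcListener* l) {
        listeners.erase(std::remove(listeners.begin(), listeners.end(), l),
                        listeners.end());
    }

    // Mark phase as seen by the application: notify every listener.
    void runMarkPhase() {
        for (GcListener* l : listeners) l->onMark();
    }
};

// Application class that stores a script function reference but has
// no type proc, so it registers for GC events instead.
class ScriptCaller : public GcListener {
public:
    ScriptCaller(Runtime& rt, Handle* fn) : runtime_(rt), function_(fn) {
        runtime_.registerGcListener(this);
    }
    ~ScriptCaller() override { runtime_.unregisterGcListener(this); }

    ScriptCaller(const ScriptCaller&) = delete;
    ScriptCaller& operator=(const ScriptCaller&) = delete;

    // Marks the stored handle so the collector keeps the object alive.
    void onMark() override {
        if (function_ != nullptr) function_->marked = true;
    }

private:
    Runtime& runtime_;
    Handle*  function_;
};

// A registered holder gets the event and marks its handle.
bool demoRegisteredHandleIsMarked() {
    Runtime rt;
    Handle fn;
    ScriptCaller caller(rt, &fn);
    rt.runMarkPhase();
    return fn.marked;
}

// After the holder is gone, nothing marks the handle any more.
bool demoUnregisteredHandleStaysUnmarked() {
    Runtime rt;
    Handle fn;
    { ScriptCaller caller(rt, &fn); }  // destructor unregisters
    rt.runMarkPhase();
    return !fn.marked && rt.listeners.empty();
}
```

Tying registration to the constructor and destructor is one possible design choice: it keeps a class from forgetting to unregister, which would otherwise leave a dangling listener behind.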

While I was adapting all my projects to support the GC, I even came across some hybrid types of native C++ classes - classes that have objects from the VM, and can be objects in the VM, because they implement a type proc.

In such a case it is best to both handle the MARK message sent to the type proc and register for GC events, because often there is no guarantee that a reference to such a class is currently on the VM’s stack when the GC runs.

As a rule of thumb, you cannot mark an object too often. It’s harmless to try to mark the same object twice, but it is fatal to not mark an object at all.
