Introduction to Regions

C gives programmers complete control over how memory is managed. An expert programmer can exploit this to write very fast and/or space-efficient programs. However, bugs that creep into memory-management code can cause crashes and are notoriously hard to debug.

Languages like Java and ML use garbage collectors instead of leaving memory management in the hands of ordinary programmers. This makes memory management much safer, since the garbage collector is written by experts, and it is used, and, therefore, debugged, by every program. However, removing memory management from the control of the applications programmer can make for slower programs.

Safety is the main goal of Cyclone, so we provide a garbage collector. But, like C, we also want to give programmers as much control over memory management as possible, without sacrificing safety. Cyclone’s region system is a way to give programmers more explicit control over memory management.

In Cyclone, objects are placed into regions. A region is simply an area of memory that is allocated and deallocated all at once (but not for our two special regions; see below). So to deallocate an object, you deallocate its region, and when you deallocate a region, you deallocate all of the objects in the region. Regions are sometimes called “arenas” or “zones.”

Cyclone has four kinds of region:

Stack regions

As in C, local variables are allocated on the runtime stack; the stack grows when a block is entered, and it shrinks when the block exits. We call the area on the stack allocated for the local variables of a block the stack region of the block. A stack region has a fixed size–it is just large enough to hold the locals of the block, and no more objects can be placed into it. The region is deallocated when the block containing the declarations of the local variables finishes executing. With respect to regions, the parameters of a function are considered locals–when a function is called, its actual parameters are placed in the same stack region as the variables declared at the start of the function.

Lexical regions

Cyclone also has lexical regions, which are so named because, like stack regions, their lifetime is delimited by the surrounding scope. Unlike stack regions, however, you can can add new objects to a lexical region over time. You create a lexical region in Cyclone with a statement,

  region _ identifier_; _ statement_

This declares and allocates a new lexical region, named identifier, and executes statement. After statement finishes executing, the region is deallocated. Within statement, objects can be added to the region (see Region Allocation).

Typically, statement is a compound statement:

  { region _identifier_;
    _ statement_1
    ...
    _ statement__n_
  }

The heap region

Cyclone has a special region called the heap. There is only one heap, whose type is denoted `H, and it is never deallocated. New objects can be added to the heap at any time (the heap can grow). Cyclone uses a garbage collector to automatically remove objects from the heap when they are no longer needed. You can think of garbage collection as an optimization that tries to keep the size of the heap small. (Alternatively, you can avoid garbage collection all together by specifying the -nogc flag when building the executable.)

Dynamic regions

Stack and lexical regions obey a strictly last-in-first-out (LIFO) lifetime discipline. This is often convenient for storing temporary data, but sometimes, the lifetime of data cannot be statically determined. Such data can be allocated in a dynamic region. A dynamic region supports deallocation at (essentially) any program point. However, before the data in a dynamic region may be accessed, the dynamic region must be opened. The open operation fails by throwing an exception if the dynamic region has already been freed. Note that each data access within a dynamic region does not require a check. Rather, you can open a given dynamic region once, access the data many times with no additional cost, and then exit the scope of the open. Thus, dynamic regions amortize the cost of checking whether or not data are still live and localize failure points. We describe dynamic regions in detail in Section Dynamic Regions.

Cyclone forbids dereferencing dangling pointers. This restriction is part of the type system: it’s a compile-time error if a dangling pointer (a pointer into a deallocated region or to a deallocated object) might be dereferenced. There are no run-time checks of the form, “is this pointing into a live region?” As explained in Region Common Uses, each pointer type has a region and objects of the type may only point into that region.