Pointers with Restricted Aliasing

Introduction

The main benefit of the regions described thus far is also their drawback: to free data you must free an entire region. This implies that to amortize the cost of creating a region, one needs to allocate into it many times. Furthermore, the objects allocated in a region should be mostly in use until the region is freed, or else memory will be wasted in the region that is unused by the program.

In an attempt to alleviate each of these problems Cyclone provides a mechanism by which individual objects in a region may be freed prior to freeing the entire region. Pointers to objects that may be freed early must obey an aliasing discipline to prevent dangling pointers. A static analysis ensures that such objects can only ever be accessed through one pointer at any time. At the time it is freed, this pointer is invalidated, thus preventing all future accesses to the object.

Pointer types in Cyclone can be qualified by their aliasability. As of now, there are four alias qualifiers. Each of these qualifiers must appear as arguments to the @aqual() pointer qualifier mentioned in section Pointers. The four alias qualifiers are:

Aliasable Pointer types are by default qualified as ALIASABLE. These may be freely aliased, but can never be freed. For instance, int @@aqual(ALIASABLE) is an aliasable non-null pointer to an int.

Unique The UNIQUE qualifier on a pointer allows the object pointed to by that pointer to be deallocated individually, using the function rufree. For freeing objects to be safe, all accesses to such objects must be made through a single UNIQUE pointer. That is, only a single pointer may be used to access the object at any given time; this trivially guarantees that if the object is freed through its unique pointer, no other access to the object beyond that point is possible. Objects that become unreachable but are not freed manually will be freed by the garbage collector (assuming it’s not removed with -nogc). For instance, int ?@aqual(UNIQUE) is a unique fat pointer to an int.

Reference-counted The REFCNT qualifier reference-counted qualifier also permits freeing individual objects. Unlike the UNIQUE qualifier, multiple pointers to a single object are permitted, the number of which is tracked dynamically via a hidden reference count stored with the object. Additional pointers to an object are created explicitly via a call to alias_refptr, which increases the reference count. Individual pointers are removed via a call to rdrop_refptr; when the last pointer is removed (i.e., the reference count is 0), the object is freed. Like the unique region, objects that become unreachable will be freed by the garbage collector. For instance, int *@aqual(REFCNT) is a nullable pointer to a reference counted int.

Restricted All the alias qualifiers are arranged in a subtyping hierarchy. The RESTRICTED qualifier is a super-type of all the other qualifiers. A RESTRICTED pointer may not be freed nor can any aliases of the pointer be created. For instance, int *@aqual(RESTRICTED) is a restricted nullable pointer to an int. The subtyping hierarchy is as below:

               RESTRICTED
               /   |   \
              /    |    \
             /     |     \
     ALIASABLE   UNIQUE   REFCNT

Allocation and Freeing

Unique and reference-counted pointers can be allocated in any region. However, to support freeing individual objects from a region we use a special allocator (see Reap Allocator Implementation) that maintains additional metadata. We use the term “reap” (= region + heap) to refer to regions from which individual objects can be deleted. Any region can contain objects that can be individually freed. Thus all regions are in fact reaps.

The allocation mechanism for unique and reference-counted pointers follows a pattern similar to that used for normal region allocation described in the previous section. The allocation functions rely on handles for alias qualifiers which are of type aqual_t<`q::Q>. Here `q is a type variable of alias qualifier kind and may be instantiated with the alias qualifiers above. The Cyclone libraries provide the following global variables:

 aqual_t<ALIASABLE> Core::aliasable_qual
 aqual_t<UNIQUE> Core::unique_qual
 aqual_t<REFCNT> Core::refcnt_qual

which are handles for the aliasable, unique and reference-counted qualifiers respectively. Note that there is no handle for the restricted qualifier.

The following expressions are used for allocation:

The qnew functions allocate in the `H region, i.e., the heap. They expect an alias qualifier handle. For instance, to allocate a unique integer in the heap, one might use

  int @@aqual(UNIQUE) a = qnew(Core::unique_qual) 0;

To allocate an alias-free object in a region other than the heap the rnew functions can be used. For instance,

  {
    region r;
    int @@aqual(UNIQUE) `r a = rnew(r, Core::unique_qual) 0;
  }

If aq-identifier is not specified it defaults to the ALIASABLE handle.

Similar to qnew these functions allocate in the heap.

Similar to rnew these functions can be used to allocate alias restricted pointers in the specified region. If aq-identifier is omitted, it defaults to ALIASABLE.

Since there is no handle for the restricted qualifier, RESTRICTED qualified pointers cannot be allocated. As with normal region allocation, we use very simple type inference to ease the burden of writing qualifiers on types. For instance,

  int *a = rnew(r, Core::unique_qual) 0;

suffices to allocate a unique integer in the region r.

The Cyclone library provides the following functions to free alias-free pointers.

Unique Pointers

To ease programming with unique pointers and allow reuse of library code, unique pointers can be aliased temporarily within a designated lexical scope using a special alias pattern. If this kind of aliasing is not sufficient, users can choose to allocate reference-counted objects; this idea is explained in the next subsection. We also define syntax a :=: b to allow two unique pointers a and b to be atomically swapped. Careful use of the swap operator allows us to store unique pointers in objects that are not themselves uniquely pointed to. We also introduce bounded polymorphism over alias qualifiers and add an additional kind to specify type variables over these qualifiers. In practice, all of these mechanisms are necessary for writing useful and reusable code.

Simple Unique Pointers

Having a unique pointer ensures the object pointed to is not reachable by any other means. When pointers are first allocated, e.g., using malloc, they are unique. Such pointers are allowed to be read through (that is, dereferenced or indexed) but not copied, as the following example shows:

  char @fat @aqual(UNIQUE) buf = malloc(3*sizeof(char));
  buf[0] = 'h';
  buf[1] = 'i';
  buf[2] = '\0';
  ...
  ufree(buf);

This piece of code allocates a unique buffer, and then indexes that buffer three times to initialize it. Because the process of storing to the buffer does not copy its unique pointer, it can be safely freed.

When a unique pointer is copied, e.g., when passed as a parameter to a function or stored in a datastructure, we say it has been consumed. We ensure that consumed pointers are not read through or copied via a dataflow analysis. When a consumed pointer is assigned to, very often it can be unconsumed, making it accessible again. Here is a simple example that initializes a datastructure with unique pointers:

 1  struct pair { int *@aqual(UNIQUE) x; int *@aqual(UNIQUE) y; } p;
 2  int *@aqual(UNIQUE) x = new 1;  // initializes x
 3  p.x = x;            // consumes x
 4  x = new 2;          // unconsumes x
 5  p.y = x;            // consumes x

If an attempt was made to read through or copy x between lines 3 and 4 or after line 5, the flow analysis would reject the code, as in

  int *@aqual(UNIQUE) x = new 1;  // initializes x
  p.x = x;            // consumes x
  p.y = x;            // rejected! x has been consumed already

When a multi-word data structure is assigned to another one, all of the unique pointers it contains are consumed. For example:

 1  struct pair { int *@aqual(UNIQUE) x; int *@aqual(UNIQUE) y; } p, q;
 2  p.x = new 1; p.y = new 2;
 3  q = p;              // consumes p.x and p.y

By default, when a unique pointer is passed to a function, we expect that the function will not free the pointer, store it in a data structure, or otherwise make it unavailable to the caller. If a function violates any of thse assumptions its type must be augmented with the attribute consume, which indicates that a particular argument is no longer available to the caller after the call. For example:

void foo(int *@aqual(UNIQUE) x) __attribute__((consume(1))) {
  ufree(x);
}
void bar() {
  int *@aqual(UNIQUE) x = malloc(sizeof(int));
  foo(x);
  *x;//<--- this dereference is not allowed
}

Here, the consume(1) attribute in the definition of foo indicates that the first argument is consumed within the function body.

Note that if you fail to free a unique pointer, it will eventually be garbage collected.

Unique pointers have some restrictions. In particular:

Nested Unique Pointers

Directly reading a unique pointer is only allowed along a unique path. A unique path u has the form

u ::= xu.m ∣ u->m ∣ *u

where x is a local variable, and u is a unique pointer. To appreciate the unique-path restriction, consider this incorrect code:

int f(int *@aqual(UNIQUE) *`r x) {
  int *@aqual(UNIQUE) *`r y = x;  //x and y are aliases
  int *@aqual(UNIQUE) z = *y;
  ufree(z);
  return **x;  //accesses deallocated storage!
}

Here, x is a pointer into a conventional region `r and thus its value can be freely copied to y. We then extract a unique pointer pointed to by y and free it. Then we attempt to access the deallocated storage through x.

If a unique pointer is not accessible via a unique path, it must be swapped out atomically to be used; in Cyclone this is expressed with syntax :=:. In particular, the code a :=: b will swap the contents of a and b. We can use this to swap out a nested unique pointer, and replace it with a different one; we will often swap in NULL, since this is a unique pointer that is always unconsumed. For example, in the code below, we define a queue type for queues that contain unique pointers, and a function take for removing the first element from the queue.

struct Queue<`a,`r> { 
  list_t<`a *@aqual(UNIQUE),`r> front; 
  list_t<`a *@aqual(UNIQUE),`r> rear; 
};
typedef struct Queue<`a,`r> *`r queue_t<`a,`r>;

`a *@aqual(UNIQUE) take(queue_t<`a> q) {
  if (q->front == NULL)
    throw &Empty_val;  // exception: def not shown
  else {
    let elem = NULL;
    elem :=: q->front->hd;
    q->front = q->front->tl;
    if (q->front == NULL) q->rear = NULL;
    return elem;
  }
}

Here, in order to extract the element stored in the queue (the hd portion of the underlying list), we need to use swap, because q->front is a non-unique pointer, and therefore q->front->hd is not a unique path.

Note that this code is not as polymorphic as it could be. In particular, the above queue definition requires its elements to be nullable unique pointers, when they could just as easily be non-unique pointers, or even reference-counted pointers (illustrated later), and the code for take would still work. This problem can be addressed, and its solution is described in Qualifier Polymorphism.

Pattern Matching on Unique Pointers

As described in Pattern Matching, Cyclone supports pattern matching on structured data with let declarations and switch statements. Unique pointers, or structures containing unique pointers, can be matched against, while still ensuring that only one legal pointer to a unique object exists at any given time.

In the simplest case, when a unique pointer to a structure is matched against, the matching operation is treated just like a dereference. Therefore, the pointer itself is not consumed. For example:

struct pair { int x; int y; };
void foo() {
  struct pair @@aqual(UNIQUE) p = new pair(1,2);
  let &pair{.x=x, .y=y} = p;
  ufree(p);
}

Here, we match against the unique pointer p’s two fields x and y. Because we don’t make a copy of p, but rather only of its fields, p is not consumed. Therefore, p can be safely freed.

Because each of the fields matched against is assigned to the pattern variables, unique paths through the original pointer are consumed by virtue of being assigned. At the conclusion of the scope of the pattern, we can unconsume any location whose pattern variable has not been consumed or assigned to, as long as the parent pointer has not been consumed or assigned to. Here’s an example:

struct Foo { int *@aqual(UNIQUE) x; int *@aqual(UNIQUE) y; };
void foo(struct Foo *@aqual(UNIQUE) p) {
  { let &Foo{.x=x, .y=y} = p; // consumes p->x and p->y
    ufree(x);                 // consumes x
  }                           // p->y is unconsumed
  ufree(p->y);                // p->y consumed
  ufree(p);                   // p consumed
}

The initial match against p consumes p->x and p->y, whose contents are copied to x and y, respectively. At the conclusion of the block, p->y is unconsumed because it did not change, whereas p->x is not, because x was freed within the block.

Note that the following code is illegal:

void foo(struct Foo *`H p) {
 let &Foo{.x=x, .y=y} = p; // non-unique path!
 ...
}

To see why, notice that this is equivalent to

void foo(struct Foo *`H p) {
 let x = p->x;
 let y = p->y;
 ...
}

This code is illegal because neither p->x nor p->y is a unique path. We also do not allow * patterns to create aliases of the original unique pointer, for the same reason we forbid &e when e is a unique pointer. Unfortunately, this means we don’t provide a way to assign to matched-against fields. However, in the case of the matched-against struct, we can just do this with regular paths. In the above example pattern block, we could do p->y = new 1 or something like that (even within the scope of the pattern).

Matching against tagged unions is essentially like matching against structures, as just described. Since we do not allow unique pointers to be stored within datatypes, there is no change to how datatypes are matched.

Reference-counted Pointers

Cyclone also supports reference-counted pointers, which are treated quite similarly to unique pointers. Reference-counted objects may be allocated in any region. We define the constant Core::refcnt_qual, having type aqual_t<REFCNT>, for creating reference-counted pointers. The caveat here is that when you allocate something in this region, an extra word will be prepended to your data, which contains the reference count, initialized to 1.

As with unique pointers, no pointer arithmetic is allowed, for similar reasons: it can occlude where the “head” of the object is, and thus make it impossible to find the hidden reference count. The reference count can be accessed via the routine Core::refptr_count:

  int refptr_count(`a::TA ?@aqual(REFCNT) ptr);

The constant NULL is allowed to have type `a::A ?@aqual(REFCNT), and its reference count is always 0. Like unique pointers, implicit aliasing is not allowed. Aliases are created explicitly using the routine Core::alias_refptr:

  `a ?@aqual(REFCNT) alias_refptr(`a::TA ?@aqual(REFCNT) ptr);

This routine returns a copy of its argument, which is itself not consumed. Furthermore, the reference count will be incremented by one. Reference counts are reduced explicitly by the routine drop_refptr:

  void drop_refptr(`a::TA ?@aqual(REFCNT) ptr) 
                         __attribute((consume(1)));

In the case that the provided object’s reference count is 1 (and is thus dropped to zero), the provided pointer is freed. The flow analysis will consume the passed pointer (as is always the case for function arguments), so you won’t be able to use it afterwards. Just like unique pointers, you can “forget” reference-counted pointers without decrementing the count; this just means you’ll never be able to free the pointer explicitly, but the GC will get it once it becomes unreachable.

Just like unique pointers, reference-counted pointers can be stored in normal, aliasable datastructures, and accessed using swap (e.g. x :=: y). Because NULL is a `a::TA ?@aqual(REFCNT) pointer, we can always cheaply construct a pointer to swap in.

A good example of the use of unique pointers and reference-counted pointers is in the Cyclone distribution’s tests directory—the file streambuff.cyc. This is an implementation of a packet manipulation library with a representation for packets (called streambuff_t’s) that is similar to Linux’s skbuff_t’s. It uses a combination of unique header structures and reference-counted data structures.

Qualifier Polymorphism

To allow the writing of reusable code we support both subtyping and bounded polymorphism over alias qualifiers. Type variables that range over the set of alias qualifiers are of kind Q are used in addition to the other kinds.

The list data structure in the Cyclone libraries illustrates many features of qualifier polymorphism. It has the following declaration:

  struct List<`a::B,`r::R,`q::Q>{ 
    : RESTRICTED >= aquals(`a), RESTRICTED >= `q
    `a hd; 
    struct List<`a,`r,`q> *@aqual(`q) `r tl;
  };
  typedef struct List<`a,`r,`q> *@aqual(`q) `r list_t<`a,`r,`q>;

Here, the structure is parameterized by three type variables. The first, of boxed-kind, admits instantiation by pointer types. Since pointer types may be qualified according to their aliasability, we require a bound on this aliasability. For this purpose, we use the construct aquals(`a). Similar to the regions(`a) construct, aquals(`a) evaluates to the top-level alias qualifier of the type that instantiates the variable `a. For instance, aquals(int *@aqual(ALIASABLE) *@aqual(UNIQUE)) = UNIQUE. The bound in the list declaration RESTRICTED >= aquals(`a) states that `a can be instantiated with a boxed kind with an aliasability that is a subtype of RESTRICTED. Since RESTRICTED is at the top of the subtyping hierarchy, this is the most general bound on the aliasability of the type.

The second type variable is of region kind. These types may not be qualified by aliasability and thus do not appear in the bounds at all.

Finally, we have the type variable `q::Q, of alias qualifier kind. Thus it can be instantiated with any type that reduces to one of the alias qualifiers. That is, one of ALIASABLE, UNIQUE, REFCNT, RESTRICTED or aquals(`a). The bound on `q also is the most general bound.

A list that uses unique pointers on the “spine” with reference counted elements might be constructed as follows:

  int *rc_int = qnew(Core::refcnt_qual) 0;
  int *rc_int2 = qnew(Core::refcnt_qual) 1;
  list_t<int*@aqual(REFCNT),`H,UNIQUE> l =
      rnew(Core::heap_region, Core::unique_qual) 
          List{rc_int, 
        rnew(Core::heap_region, Core::unique_qual) 
           List{rc_int2, NULL}};

We can also quantify over alias qualifiers in function types. For instance, a function that copies a list can be defined as follows.

list_t<`a,`r,`q> rqcopy(region_t<`r> r,aqual_t<`q> q,
                        list_t<`a,`r2,`p> l
                        : RESTRICTED >= `q,
                          ALIASABLE >= aquals(`a),
                          RESTRICTED >= `p) {
  if(l == NULL)
    return NULL;
  _ tl = NULL;
  tl :=: l->tl;
  list_t<`a,`r,`q> result = rnew(r,q) List{l->hd, rqcopy(r,q,tl)};
  l->tl :=: tl;
  return result;
}

This function copies a list allocated in a region r2 into a region r. As previously, the function is polymorphic in both the source and destination regions. Furthermore, the aliasability of the new list is specified by the handle q which has type aqual_t<`q>. As previously, since `q is of alias qualifier kind, a bound can be specified for it — this appears after the argument list along with any effects for this function. Note that the list that is being copied may also have a spine that is RESTRICTED as specified by the RESTRICTED >= `p bound. This bound requires that we use the swap operator (:=:) to ensure that no aliases are manufactured. Finally, since we are copying the list, the elements of the list itself must be aliasable — this is specified by the ALIASABLE >= aquals(`a) bound.

Aliasing Unique Pointers

Programmers often write code that aliases values temporarily, e.g. by storing them in loop iterator variables or by passing them to functions. Such reasonable uses would be severely hampered by “no alias” restrictions on unique pointers. To address this problem, we introduce a special alias pattern variable that permits temporary aliasing of a unique pointer. Here is a simple example:

  char *@fat@aqual(UNIQUE) dst, *@fat@aqual(UNIQUE) src = ...
  { let alias <`r>char *@fat`r x = src; // consumes src
    memcpy(dst,x,numelts(x)); }
  // src unconsumed
  ...
  ufree(src);

In general, an alias pattern has form alias <`r>_t_ x, where `r is a fresh region variable, and t is the type of x, which may mention `r. The alias pattern introduces a region `r, copies src to x which is treated as having the designated type char *@fat`r. Because `r is non-unique, x can be freely aliased. As such, we can pass x to the memcpy function. The matching operation consumes src during the block, and unconsumes it upon exit, so that src can be ultimately freed.

Alias pattern variables are similar to regular pattern variables. Like regular pattern variables, the matched-against expression (i.e. src in the above example) must be a unique path, and is consumed as a result of the match. As well, this expression can be unconsumed at the conclusion of the surrounding block as long as it hasn’t been overwritten. However, in the case of regular pattern variables, unconsumption also requires that the pattern variable itself (i.e. x in the above example) hasn’t changed within the block, while this requirement is unnecessary for alias patterns.

Intuitively, alias pattern variables are sound because we cast a unique pointer to instead point into a fresh region, for which there is no possibility of either creating new values or storing existing values into escaping data structures. As such we cannot create aliases that persist beyond the surrounding scope. However, we must take care when aliasing data having recursive type. For example, the following code is unsound:

  void foo(list_t<`a, `r1, UNIQUE> l) {
    let alias <`r> x = (list_t<`a, `r1, UNIQUE>)l;
    x->tl = x; // unsound: creates alias!
  }

In this case, the alias effectively created many values in the fresh region `r: one for each element of the list. This allows storing an alias in an element reachable from the original expression l, so that when the block is exited, this alias escapes.

For improved programmer convenience, the Cyclone typechecker optimistically inserts alias blocks around function-call arguments that are unique pointers when the formal-parameter type is polymorphic in the pointer’s region. If this modified call does not type-check, we remove the inserted alias. For example, the alias pattern in the foo function above could be inferred, so we could instead write:

  int foo() {
    list_t<int,`H,UNIQUE> l = new List(1,new List(2,NULL));
    return length(l);
  }

Right now, alias inference in Cyclone is fairly primitive, but could be extended to more contexts. We plan to improve this feature in future releases.

Dynamic Regions

Dynamic regions combine reference-counted or unique pointers and lexical regions together to essentially create reference-counted or unique regions; that is, the region is completely first class, and can be created or freed at conceptually any program point. This is done by representing a dynamic region as a unique (or reference-counted) pointer to an abstract struct DynamicRegion (which internally just contains the handle to a lexical region). The unique (or ref-counted) pointer is called the key. The key serves as a run-time capability that grants access to the region. At run-time, a key can be presented to a special open primitive, described later, that grants lexical access to the region.

The operation new_ukey() creates a fresh dynamic region and returns a unique key for the region; new_rckey() creates a fresh dynamic region and returns a ref-counted key for the region. The operations free_ukey() and free_rckey() are used to destroy unique and ref-counted keys respectively. The free_ukey() operation reclaims the key’s region, as well as the storage for the key. The free_rckey() operation decrements the reference count, and if it’s zero, reclaims the key’s region as well as the storage for the key. Because ref-counted keys are pointers, you can use alias_refptr to make a copy of a ref-counted key. (Obviously, you can’t make a copy of a unique key.) By the same token, you can pass a ref-counted key to drop_refptr (and you can pass a unique key to ufree), but doing so won’t actually deallocate the region, but rather only the key. To obtain a dynamic reap (i.e. a dynamic region that supports the deletion of individual objects) use new_reap_ukey and new_reap_rckey instead.

Given a key k, a user can access the contents of its region by temporarily “opening the region” within a lexical scope. This is done with the syntax region r = open k. That is, within the remainder of the current scope, the region handle r can be used to access k’s region. The key k is temporarily consumed throughout the scope, and then unconsumed at its conclusion. This prevents you from opening up the dynamic region, and then freeing it while it’s still in use. Note that open is very similar to alias in this way.

Here is a small example of the use of dynamic regions.

int main() {
  // Create a new dynamic region
  let NewDynamicRegion{<`r> key} = new_ukey();

  // At this point, we refer to the region `r to
  // specify types, but we cannot actually access
  // `r (i.e. it's not in our "static capability,"
  // a concept explained later)

  list_t<int,`r> x = NULL;

  // We can access x by opening the region, which
  // temporarily consumes the key
  { region h = open(key);
    x = rnew(h) List(3,x);
  }

  // Now we can access the key again, but not x.
  // So we have to open the region to increment
  // its contents
  { region h = open(key);
    int i = x->hd + 1;
    x = rnew (h) List(i,x);
  }

  // Finally, destroy the key and the region
  free_ukey(key);
}

First, we allocate a new unique key and open it up, to reveal the name of the key’s region (`r), and the key itself. Because `r is now in scope, we can declare a variable x that refers to it. However, because the key key must be opened before `r becomes accessible, we cannot actually do anything with x yet (like dereference it).

Next, we open up the region using key, assigning its handle to the vairable h. Now, key is inaccessible (consumed) in the surrounding block, which prevents us from doing anything that might cause it to be freed while it’s in use. We can use h to allocate into `r, so we allocate a list element and store it in x.

At the conclusion of the block, the region `r becomes inaccessible again, so once again we cannot dereference x. However, key can now be accessed again, so we can open it again in the following block, to add a new list cell to x. At the conclusion of this block, key is unconsumed once again, so we legally call free_ukey. This frees the key and the region `r.

You can “share” a dynamic region key by placing it in some shared data structure, like a global variable. Of course, you’ll then have to swap with NULL to get it in and out of the shared data structure, as the following code demonstrates:

struct MyContainer { <`r>
  uregion_key_t<`r> key;
  list_t<int,`r> data;
} *\U `H global = NULL;

int main() {
  // allocate a dynamic region, and create a list
  let NewDynamicRegion{<`r> key} = new_ukey();
  list_t<int,`r> x = NULL;
  { region h = open(key);
    x = rnew(h) List(3,x);
  }

  // Stick the key and list in a global data
  // structure. We've now lost direct access to
  // the key and x.
  global = new MyContainer{key,x};

  // But we can regain it by swapping for the
  // container.
  struct MyContainer *@aqual(UNIQUE) p = NULL;
  global :=: p;

  // Now we can use it as above
  let MyContainer{<`r2> key2, data2} = *p;
  list_t<int,`r2> d = data2;
  { region h = open(key2);
    int i = d->hd + 1;
    d = rnew (h) List(i,d);
  }
}

Here, we define a global variable having type MyContainer, which consists of a key and some data into that key’s region. The main function allocates a unique as before, and allocates some data into its region. Then we create a container for that key and data, and store it into the global variable; this consumes key, making it inaccessible, and effectively preventing access of x as well.

But we can then get the container back out of the global variable by swapping its contents with NULL. Then we can open up the container, and use the key and data as before. This way, a single dynamic region can be used by many different functions in the program. They simply swap out the global when they need it, operate on it, and then swap in the result.

One problem with using this technique with unique keys arises when you need to open the same region multiple times. The problem, of course, is that if you swap in NULL, then whoever tries to swap it out will fail. In other words, you can’t really do recursive opens with UNIQUE keys. However, you can do this with REFCNT keys! Swap out the key, make a copy of it, swap it back in, and use the copy for the open (making sure to destroy the copy after the open).

One disadvantage of dynamic regions, which is inherited from unique and reference-counted pointers, is that if you put a key in some shared storage in a region r, then it is not the case that whenr is deallocated that the key will be destroyed automatically. It’s up to you to do the right thing or let the GC eventually collect it. In the long run, the right thing to do is add a finalizer interface for regions so that we can register a routine to deallocate a dynamic region whenever we put it in a shared data structure. The same goes for any unique pointer – we ought to have a way to register a finalizer. This is on our To-do list.

Defaults and Shorthands

As described so far, the notation for alias qualifiers is extremely verbose. The default qualifier annotations and bounds are intended to capture the most common cases and reduce the burden on the programmer. Where the defaults do not suffice, shorthand versions allow explicit types to specified in a more compact manner.

The shorthand notation is as follows:

Alias Qualifier Specifier The strings \A, \U, \RC, \T can be used as substitutes for @aqual(ALIASABLE), @aqual(UNIQUE), @aqual(REFCNT), and @aqual(RESTRICTED) respectively. For example

    void cons(int *\T a, int *\U b)

is equivalent to

    void cmp(int *@aqual(RESTRICTED) a, int *@aqual(UNIQUE) b)

Type Variable Bound The bound on a type variable, as shown previously, generally appears together with the effects in a function type, or with the outlives relations for an aggregate type. However, when the strings \A, \U, \RC, \T succeed a type variable they are interpreted as a bound on the type variable. This bound is only legal within an struct declaration or a function type, i.e., bounds may not appear within a local variable declaration, typedefs etc. If the type variable is of alias qualifier kind (kind Q) then this is interpreted as a qualifier bound. If the variable is of boxed, memory or abstract kind (::B, ::M, or ::A) then it is interpreted as an aquals bound. For instance, the function rqcopy above can be written as

  list_t<`a,`r,`q> rqcopy(region_t<`r> r,aqual_t<`q\T> q,
                          list_t<`a\A,`r2,`p\T> l);

The same convention applies to aggregate types. For instance, we could define List as follows

  struct List<`a::B\T,`r::R,`q::Q\T>{ 
    `a hd; 
    struct List<`a,`r,`q> *@aqual(`q) `r tl;
  };

Note that if a type variable is used to specify the alias qualifier on a pointer, then the @aqual(.) notation must be used. That is, int *@aqual(`q\T) cannot be substituted with int *`q\T.

Heap Pointers For unique and reference counted pointers that reside in the heap we overload the region notation to provide a further shorthand. This is also for backward compatibility with previous versions of Cyclone where unique and reference counted pointers always resided in a separate region. Thus the type T *@aqual(\U) `H can be written more compactly as T *`U; the type T *@aqual(\RC) `H can be written as T *`RC. The default qualifier bounds are as follows:

Function Types The aliasability of all parameters and return types in a function type is ALIASABLE by default. If a formal parameter is declared as T *@aqual(\T) then all subtypes of T *@aqual(RESTRICTED) may be passed as an argument. For parametricity T *@aqual(`q\T) should be used instead.

Struct Types For pointers within a struct the default aliasability is ALIASABLE. For qualifier variables `q::Q the default bound is RESTRICTED. The aquals bounds for type variables is RESTRICTED by default. Thus the following declarations are equivalent

    struct PtrList<`a::A, `q> {
      `a *hd; 
      struct PtrList<`a,`q> *@aqual(`q) tl;
    };

    struct PtrList<`a::A\T,`q\T> { 
      `a *@aqual(\A) hd; 
      struct PtrList<`a,`q> *@aqual(`q) tl;
    };

Type Instantiations When a type variable is instantiated with a pointer type we have to decide the aliasability of pointer. When instantiated in a function type or an aggregate type the aliasability is by default ALIASABLE. For instance, the following are equivalent:

    void int_list(list_t<int*> l);
    void int_list(list_t<int *@aqual(\A)> l);

    struct Wrapper {
      list_t<int*> l;
    }
    struct Wrapper {
      list_t<int*@aqual(ALIASABLE)> l;
    }

In variable declarations the default is RESTRICTED to allow for more aggressive unification. For instance

    void int_list(list_t<int*@aqual(\T)> l) {
      list_t<int*> cp = l;
    }

In the local variable declaration the bound defaults to RESTRICTED so that unification with the formal succeeds.

Subtyping for Effect Qualifier and Reaps

We mentioned in Allocation and Freeing that there was more to the type of Core::rufree. The full type of the function is given below.

 void rufree(region_t<`r> h, `a::A\T *\U `r ptr 
             : single(`r))  __attribute__((noliveunique(2)));

This type states that rufree expects a handle into region `r as its first argument. The `r effect qualifier on the second argument is meant to indicate that ptr is a pointer into the region `r, the handle of which is the first argument. The \U qualifier indicates that ptr must be a unique pointer into that region. The type of the element that is pointed to by ptr can be anything at all (`a::A), and may itself contain unique or reference-counted pointers, as indicated by the qualifier bound \T.

The attribute noliveunique that appears in the type is a variant of the consume attribute shown earlier. This attribute states that after the call to the rufree, the pointer passed as the second parameter is consumed. Furthermore, if the element pointed to by ptr contains a live unique pointer, then Cyclone issues a warning. The reason is that since ptr is the only pointer to the underlying element, if this element contains a unique or reference-counted pointer, then the last reference to those pointers are lost leading to a memory leak.

The last component of the type of rufree is the single(`r) constraint. The effect qualifier annotation of `r on both the handle as well as the pointer is meant to force the pointer to point into the region of the provided handle. However, in the presence of subtyping over effect qualifiers (as described in The Truth About Effects, Capabilities and Effect Subset Constraints) the annotations on the parameters alone don’t suffice to enforce this invariant. This is illustrated by the example below.

void bad_rufree(region_t<`r>, `a *\U `r); 
void foo() {
  region r;
  int *`r ptr = rnew(r) 1;
  region_t<`r+`H> a = Core::heap_region;
  bad_rufree(a, ptr);
}

Subtyping of effect qualifiers means that it is permissble to treat int *`r as int*`r+`H and so the call to bad_rufree typechecks even though pointer ptr does not point into the region for which a is the handle.

To avoid this behavior, the single(`r) constraint is used to limit subtyping on the effect qualifier variable `r. In particular, an effect qualifier variable that appears in a single(`r) constraint is guaranteed to always be instantiated with a single region name, and not a set of region names.

Reap Allocator Implementation

To support deallocation of objects within a region the allocator must maintain some meta-data associated with each object. The current implementation uses a version of the bget allocator adapted for use with reaps. When deallocation is rare, bget behaves much like a simple pointer bumping allocator and we expect performance to be competitive. However, the metadata does consume two additional header words for each object allocated.

For normal regions (i.e. those declared as region r, or those created using new_ukey, or new_rckey) the simple pointer-bumping allocator is used. Thus, if your application does not use reaps at all, then the overhead due to bget therefore does not apply.