Records in Scheme have a contentious history. Scheme didn’t have a standardized system for creating record types until 1999. Another, incompatible record system was standardized in 2007, which was very controversial1. R7RS decided to adopt SRFI-9 instead of R6RS’s system, and ever since people have been trying to combine the two in a way that makes everyone equally happy.2
Record types in Scheme serve two purposes:
- They make a type disjoint from all other types.
- They allow for values to be stored and referenced by name in an object efficiently.
All of the different proposals accomplish these goals3, and differ in how much control the programmer has in creating and dealing with records.
I’m not going to compare the multiple different proposals, mostly because I only use the standard R7RS system, and also because other people with more experience than me have argued about this for decades.
Instead, I want to discuss an alternative way of designing records that is inspired by C structs. This design would
- Allow records to be accessed by name (a symbol) so that records with a common field do not need dedicated procedures to extract the value stored in that field.
- Throw away the idea of inheritance entirely in favor of embedding. This allows for composite records like R6RS’s composite conditions. Acccessing a field in an embedded record does not require any more indirection than accessing a regular field. Passing an embedded record as a parameter can be as simple as pointer arithmetic.
- Give fine-grained control over opaque records. Accessing the fields of a record and embedding a record type into other record types requires a record type descriptor (rtd). Opaque records do not carry their record type descriptor, but can be manipulated if their rtd is supplied. This means opaque-ness can be controlled using scope. This also allows for implementing “sealed” records, although the concept is not needed because subtyping is not allowed.
- Support non-generative records. Two non-generative records are of the
same type if
- They both have the same number of fields,
- The
n
th field in the first type has the same name as then
th field in the second type, - If the
n
th field in the first type is an embedded record, then
th field in the second type is an embedded record with the same type.
The Problem of Field Names
Let’s write some code to handle an online shop. The two principal objects we deal with are customers and items. First we create our customer record:
(define-record-type <customer>
(make-customer name current-cart order-history)
customer?
(name get-name)
(cart get-current-cart set-current-cart!)
(order-history get-order-history set-order-history!))
The getters and setters for each field of the record are pretty verbose. We run into an issue when we try to make our item record:
(define-record-type <item>
(make-item name price quantity)
item?
(name get-name)
(price get-price set-price!)
(quantity get-quantity set-quantity!))
We now have a name clash between get-name
for items and get-name
for
customers. The idiomatic way to fix this is to prefix the getters and
setters with the name of the record. Let’s try again:
(define-record-type <customer>
(make-customer name current-cart order-history)
customer?
(name customer-name)
(cart customer-current-cart set-customer-current-cart!)
(order-history customer-order-history set-customer-order-history!))
(define-record-type <item>
(make-item name price quantity)
item?
(name item-name)
(price item-price set-item-price!)
(quantity item-quantity set-item-quantity!))
Now we have a mouthful on our hands (set-customer-order-history!
).
Of course we might be able to get away with shortening these names
if we know there will never be a conflict, but the idiomatic way is
the best way to ensure no conflict will ever occur.
The reason for this verbosity is because field getters and setters are
all procedures. Using a procedure to abstract access is, of course, the
Scheme Way. If I want to refactor the code so that the current cart
is in an SQL database cached by redis valkey, then a change to
customer-current-cart
(and the <customer>
record) is all that’s
needed.
In most cases, however, these getters and setters will be for record types. A get in this case is a typecheck followed by a dereference. An optimizing Scheme compiler should be able to merge typechecks and inline dereferences, which it cannot do if the procedure can truly be anything.
I think that in the case of records, it is better to have special syntax
for getting and setting fields, just like in other languages. For instance,
to get the name
field of a <customer>
or <item>
, you would write
(get customer name)
(get item name)
This would emphasize that the object is a record, and that the operation is almost side-effect-free (except for exceptions). If the type of the object is already known, then this reduces to a dereference. If the type is not known, then the operation turns into a dereference, a table lookup, and a dereference. Most records only have a few fields so the table lookup will consist of a few comparisons. All records have static fields so those comparisons can be compiled ahead of time, and records with a large number of fields can employ a static perfect hash function.
Inheritance and Compound Records: The Example of C Structs
With that construct in hand, let’s look to C structs. Structs are continguous blocks of memory declared ahead of time4:
struct customer {
char *name;
struct item **current_cart;
struct order **order_history;
};
Let’s say I have a child account that has a parent account that can
control it. The child account has a separate order history and cart.
How would I make a struct that extends struct customer
? In standard
C I would put a struct (as opposed to a struct pointer) at the start
of the struct:
struct child_customer {
struct customer customer;
struct customer *parent;
};
All struct child_customer *
are now castable to struct customer *
.
In some dialects, like Plan 9 C, an anonymous tagged
struct can be placed into a struct description, which causes all
the fields of a named structure to be placed in the new struct:
struct child_customer {
struct customer;
struct customer *parent;
}
-
“Over 30% of those who voted against ratification [of R6RS] mentioned the record systems as one of their reasons.” (SRFI-99) ↩︎
-
SRFI 256, 240, 237, 222, 150, 136, 131, 99, 76, 57, and 9 all concern records. There is even arguments about Scheme records in mailing list archives prior to the SRFI system. ↩︎
-
With the notable mention of SRFI-137 as a low-level type-creation and subtype-creation system. ↩︎
-
current_cart
andorder_history
are NULL terminated arrays of pointers for simplicity. In a real program I would use an array of structs with an explicitsize_t
length. ↩︎