In which I attempt to TDD a syntactically sweet Ruby enum with limited success.
Update Nov 2 2007: See my post of moments ago for a gem that implements something like the enum implemented in this post.
The other day I was thinking about the syntactic sugar you can add to Ruby yourself, and my mind wandered to the type-safe enumeration pattern. One of the bits of syntactic sugar Java 5 introduced was to convert the type-safe enum from a best practice that required some pretty verbose code to a first-class language feature. In Java, you've got to wait for things like that, but in Ruby, we can do a lot on our own. I decided to see how clean a syntax I could come up with that would give me the same functionality in Ruby.
[Of course it can't be quite the same: Ruby's not statically typed, and type-checking is one of the main touted benefits of typed enums. Still, the enum does a number of handy tricks that I'd like to have in my Ruby projects.]
Pre-Java5, a type-safe enum looked something like this:
// Java < 5
public class Suit extends TypeSafeEnum { // You have to write that superclass yourself.
    public static final Suit SPADES = new Suit("SPADES");
    public static final Suit CLUBS = new Suit("CLUBS");
    public static final Suit DIAMONDS = new Suit("DIAMONDS");
    public static final Suit HEARTS = new Suit("HEARTS");
    private Suit(String value) { super(value); } // assumes TypeSafeEnum has a (String) constructor
}
That TypeSafeEnum class should provide a mechanism to look up constants by their value, a reasonable toString() method, and some ugly stuff to trick out serialization so that you can do things like suit==SPADES and get the expected results. (Note you could sometimes get bit by your class getting loaded by multiple classloaders, breaking ==. I won't get into the details because we use Ruby now and don't have to worry about it. Phew!)
Java 5 cleaned that up by allowing you to just write this:
// Java >= 5
public enum Suit { SPADES, CLUBS, DIAMONDS, HEARTS }
It's way better. In fact, it's got so little in the way of distracting keywords and punctuation that it barely feels like Java! I can't let Ruby be outdone. Surely we can be just as terse and clear and still get the features.
Here's what I'd see as the ideal Ruby syntax for declaring an enumeration of values:
enum Status
  NOT_STARTED
  IN_PROGRESS
  COMPLETE
end
I can't see a quick way to make that valid Ruby though, so let's start with the following and hope we don't have to tweak it unless it's in the direction of the even loftier goal above:
class Status < EnumeratedValue
  NOT_STARTED
  IN_PROGRESS
  COMPLETE
end
So what do we want this thing to do? Here's a first test:
def test_constant_values_are_instances_of_enumerated_value_type
  assert_equal Status, Status::NOT_STARTED.class
end
Before we can start working on the goal of the test, we have to get past some NameErrors. First we create EnumeratedValue, then we override Module.const_missing.
class EnumeratedValue
  def self.const_missing sym
  end
end
With an empty implementation (implicitly returning nil) we get a useful failure: NOT_STARTED is a NilClass instead of a Status. So we want that const_missing to create us a Status and stick it in a constant. Easy enough.
class EnumeratedValue
  def self.const_missing sym
    const_set sym, self.new
  end
end
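To make the mechanics concrete, here is a small self-contained sketch of this step. The classes are the ones from the post; the `first` and `second` variables are mine, added only to show that the constant is created once and then reused.

```ruby
# Base class from the post: any constant that is missing on a subclass is
# created on the spot as a fresh instance of that subclass.
class EnumeratedValue
  def self.const_missing(sym)
    # self is the class on which the lookup failed (Status below)
    const_set(sym, new)
  end
end

class Status < EnumeratedValue
  # Each bare reference fails lookup once, which triggers const_missing
  NOT_STARTED
  IN_PROGRESS
  COMPLETE
end

first  = Status::NOT_STARTED
second = Status::NOT_STARTED   # already defined, so const_missing is not called again
puts first.class               # prints "Status"
puts first.equal?(second)      # prints "true"
```

Run with warnings enabled, Ruby will likely grumble about a constant used in void context; the references are still evaluated.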
Next up, let's get to_s reasonable.
def test_to_s_gives_fully_qualified_constant_name
  assert_equal 'Status::NOT_STARTED', Status::NOT_STARTED.to_s
end
That passes once we do
class EnumeratedValue
  def self.const_missing sym
    const_set sym, self.new(sym)
  end

  def initialize name
    @name = name
  end

  def to_s
    "#{self.class.name}::#{@name}"
  end
end
What next? Comparison operators would be nice, as would making each EnumeratedValue class Enumerable over its values. But before I start implementing those features, I realize there's something pretty uncool I haven't dealt with yet. This test fails.
def test_misspelled_constant_name_raises_NameError
  assert_raises(NameError) { Status::NOTS_TARTED }
end
Oops. With the uber-open const_missing approach, using a constant that isn't there creates it just in time for you to get behavior you weren't expecting. One of the best reasons to use constants instead of symbols is that you catch misspellings with a big loud NameError instead of some strange test failure. Our current implementation loses this checking. What are we going to do?
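To see the problem outside of a test, here is a short sketch using the implementation as it stands at this point. The `typo` variable is just for illustration.

```ruby
class EnumeratedValue
  def self.const_missing(sym)
    const_set(sym, new(sym))
  end

  def initialize(name)
    @name = name
  end

  def to_s
    "#{self.class.name}::#{@name}"
  end
end

class Status < EnumeratedValue
  NOT_STARTED
  IN_PROGRESS
  COMPLETE
end

typo = Status::NOTS_TARTED          # no NameError: a fourth value is silently created
puts typo                           # prints "Status::NOTS_TARTED"
puts Status.constants(false).size   # prints 4
```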
I'd be happy to flip a bit once I'm finished defining my enumeration of values and check for that in const_missing, but I don't want to clutter my enum with that.
class Status < EnumeratedValue
  NOT_STARTED
  IN_PROGRESS
  COMPLETE
  done! # Poo.
end
I thought for a moment I could use Module.nesting to save my bacon, setting it up so you'd get lazy constant creation inside the enumerated value class but not from outside. Unfortunately I need the nesting at the point the missing constant was encountered, not the nesting of the const_missing method. (What is that method for anyway?)
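A quick sketch of why Module.nesting doesn't help here. The class names NestingProbe and Inside are hypothetical, made up for this illustration; the trick is to store whatever Module.nesting reports as the value of the missing constant so we can inspect it afterwards.

```ruby
class NestingProbe
  def self.const_missing(sym)
    # Module.nesting is lexical: it describes where this method body was
    # written, not where the missing constant was referenced.
    const_set(sym, Module.nesting)
  end
end

class Inside < NestingProbe
  FROM_INSIDE          # referenced inside the class body
end

Inside::FROM_OUTSIDE   # referenced from the top level

p Inside::FROM_INSIDE  # prints [NestingProbe]
p Inside::FROM_OUTSIDE # prints [NestingProbe]
```

Both references report the same nesting, so const_missing can't use it to tell an intentional declaration from a stray reference.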
There's no hook method that will kick in when the class definition ends, so I turn to the Ruby feature I always think of when I need to set up some state for some code to run then clean that state up: block methods. With a minor change in the enum declaration, we should have the hook we need to make sure constants are only declared intentionally.
I change Status to this:
class Status < EnumeratedValue
  values do
    NOT_STARTED
    IN_PROGRESS
    COMPLETE
  end
end
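The post doesn't show the code behind values, so what follows is only my guess at a minimal version: a class-level flag (the @defining name is an assumption) that is switched on while the block runs, with const_missing deferring to Ruby's default behavior whenever the flag is off.

```ruby
class EnumeratedValue
  # Hypothetical implementation: the post does not show how `values` works.
  def self.values
    @defining = true     # open the window in which new constants are allowed
    yield
  ensure
    @defining = false    # close it again, even if the block raised
  end

  def self.const_missing(sym)
    # Outside a values block, fall back to Ruby's default (a NameError)
    return super unless @defining
    const_set(sym, new(sym))
  end

  def initialize(name)
    @name = name
  end

  def to_s
    "#{self.class.name}::#{@name}"
  end
end

class Status < EnumeratedValue
  values do
    NOT_STARTED
    IN_PROGRESS
    COMPLETE
  end
end

puts Status::IN_PROGRESS   # prints "Status::IN_PROGRESS"

begin
  Status::NOTS_TARTED
rescue NameError => e
  puts e.class             # prints "NameError"
end
```

The block works here because it is written inside the body of Status, so the bare constant names are looked up in Status and reach its const_missing.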
Tests pass! Exciting. But that declaration is pretty wordy, wordier even than the one I marked as "Poo" above. Can't we collapse the whole thing into just one method call, passing a block full of constants to be defined?
I change Status to this:
enum :Status do
  NOT_STARTED
  IN_PROGRESS
  COMPLETE
end
and create this method at the top level:
def enum sym, &block
  type = Class.new(EnumeratedValue)
  Object.const_set sym, type
  type.class_eval(&block)
end
We run our tests and ... failure! The block passed to the enum method is a closure bound to the top level where it occurs in the code. The class_eval method changes the value of self but not the binding, so Ruby is looking for the constants in Object, not in Status.
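The failure is easy to reproduce in isolation. This sketch uses the simplest EnumeratedValue from earlier in the post; the Probe name at the end is hypothetical and only there to show that self really does change inside the block.

```ruby
class EnumeratedValue
  def self.const_missing(sym)
    const_set(sym, new)
  end
end

def enum(sym, &block)
  type = Class.new(EnumeratedValue)
  Object.const_set(sym, type)
  type.class_eval(&block)
end

begin
  enum :Status do
    NOT_STARTED   # looked up from the top level, so Status.const_missing never runs
    IN_PROGRESS
    COMPLETE
  end
rescue NameError => e
  puts e.class                       # prints "NameError"
end

puts Status.constants(false).empty?  # prints "true": the class exists but has no values
puts enum(:Probe) { self }           # prints "Probe": self was rebound, constant lookup was not
```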
Can we get around this? Kernel.eval allows us to provide an arbitrary binding, but will only accept a string, not a block or Proc. With that restriction, it appears we can't be as brief and clean as we'd like to be. Either we go with the double-nested solution above, or we settle for declaring with symbols or strings, like this:
enum :Status, %w{ NOT_STARTED IN_PROGRESS COMPLETE }
where enum is defined as something like:
def enum type_name, values
  eval <<-END
    class #{type_name} < EnumeratedValue
      values do
        #{values.join(';')}
      end
    end
  END
end
It's pretty good, though I'm sad to have come so close to our ideal and been unable to make it work without compromise. If you see something I've missed, let me know.
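One possibility, offered as a sketch rather than as what the post settled on: once the values arrive as strings, the same declaration can be supported without eval or const_missing by calling const_set directly. This is a different technique from the eval version above, and it keeps only the initialize and to_s parts of EnumeratedValue.

```ruby
class EnumeratedValue
  def initialize(name)
    @name = name
  end

  def to_s
    "#{self.class.name}::#{@name}"
  end
end

# Hypothetical eval-free variant of the post's enum method.
def enum(type_name, names)
  type = Class.new(EnumeratedValue)
  Object.const_set(type_name, type)   # assigning to a constant also names the class
  names.each { |name| type.const_set(name, type.new(name)) }
  type
end

enum :Status, %w{ NOT_STARTED IN_PROGRESS COMPLETE }

puts Status::COMPLETE      # prints "Status::COMPLETE"

begin
  Status::NOTS_TARTED
rescue NameError => e
  puts e.class             # prints "NameError"
end
```

Because nothing overrides const_missing here, a misspelled constant gets Ruby's ordinary NameError for free.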
Are we defeated? Java still has the cleaner enum declaration. But it took them years and a new compiler. We were able to come up with something just as terse, even if it wasn't quite as clean as we'd dreamed it might be, in a matter of minutes. On the way we learned a little about Ruby's eval methods, constant lookup, const_missing, and ourselves.
Let's call it a draw and be thankful we get to work with Ruby.