Hello everybody! It’s been, what, two years since I last blogged? Not my best performance, I’m sorry to say. So to all 3 of my pageviews, which are probably bots, I apologize for such a long delay in updating my blog. I’ve got to say I’ve been pretty inspired by the great Julia Evans (who I hope we can someday get back to working on Rust stuff). She’s an epic blogger, and I hope I can get somewhere near that speed.
Anyway, on to the post. My main on-again-off-again project this past year has been working on Rust’s generic serialize library. If you haven’t played with it yet, it’s really nifty. It’s a generic framework that allows a generic Encoder to serialize a generic Encodable, and the inverse with Decoder and Decodable. This allows you to write just one Encodable impl that can transparently work with our json library, msgpack, toml, and so on. It’s simple to use too in most cases, as you can use #[deriving(Encodable, Decodable)] to automatically create an implementation for your type. Here’s an example:
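As a rough sketch of the "one impl, many formats" idea in current Rust syntax: the Encoder and Encodable traits below are simplified stand-ins I made up for illustration, not the real serialize API, and the impl is written by hand where the deriving attribute would generate it. The two toy encoders and the helper functions are hypothetical too, and they skip string escaping.

```rust
// Stand-in traits for illustration only; not the real serialize API.
trait Encoder {
    fn emit_str(&mut self, key: &str, value: &str);
    fn emit_uint(&mut self, key: &str, value: u64);
}

trait Encodable {
    fn encode<E: Encoder>(&self, e: &mut E);
}

struct Employee {
    name: String,
    age: u64,
}

// Written once. Nothing in here knows which output format is in use.
impl Encodable for Employee {
    fn encode<E: Encoder>(&self, e: &mut E) {
        e.emit_str("name", &self.name);
        e.emit_uint("age", self.age);
    }
}

// Toy format 1: a JSON-like object (no string escaping).
#[derive(Default)]
struct JsonEncoder {
    fields: Vec<String>,
}

impl Encoder for JsonEncoder {
    fn emit_str(&mut self, key: &str, value: &str) {
        self.fields.push(format!("\"{}\":\"{}\"", key, value));
    }
    fn emit_uint(&mut self, key: &str, value: u64) {
        self.fields.push(format!("\"{}\":{}", key, value));
    }
}

// Toy format 2: one "key = value" line per field.
#[derive(Default)]
struct LineEncoder {
    out: String,
}

impl Encoder for LineEncoder {
    fn emit_str(&mut self, key: &str, value: &str) {
        self.out.push_str(&format!("{} = \"{}\"\n", key, value));
    }
    fn emit_uint(&mut self, key: &str, value: u64) {
        self.out.push_str(&format!("{} = {}\n", key, value));
    }
}

fn to_json<T: Encodable>(value: &T) -> String {
    let mut e = JsonEncoder::default();
    value.encode(&mut e);
    format!("{{{}}}", e.fields.join(","))
}

fn to_lines<T: Encodable>(value: &T) -> String {
    let mut e = LineEncoder::default();
    value.encode(&mut e);
    e.out
}

fn main() {
    let alice = Employee { name: "Alice".to_string(), age: 30 };
    println!("{}", to_json(&alice));
    print!("{}", to_lines(&alice));
}
```

The same Employee impl drives both encoders, which is the property that makes the framework generic.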
There are some downsides to serialize though. Manually implementing the traits can be a bit of a pain. Here’s the example from before:
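A minimal sketch of what a hand-written impl looks like, again in current Rust syntax and against stand-in traits of my own rather than the actual serialize signatures. The method names (emit_struct, emit_struct_field) are modeled on the closure-passing style, and the toy JsonEncoder and to_json helper are assumptions for illustration.

```rust
// Stand-in traits; the real serialize signatures differ.
trait Encoder {
    type Error;
    fn emit_u64(&mut self, v: u64) -> Result<(), Self::Error>;
    fn emit_str(&mut self, v: &str) -> Result<(), Self::Error>;
    fn emit_struct<F>(&mut self, name: &str, len: usize, f: F) -> Result<(), Self::Error>
    where
        F: FnOnce(&mut Self) -> Result<(), Self::Error>;
    fn emit_struct_field<F>(&mut self, name: &str, idx: usize, f: F) -> Result<(), Self::Error>
    where
        F: FnOnce(&mut Self) -> Result<(), Self::Error>;
}

trait Encodable {
    fn encode<E: Encoder>(&self, e: &mut E) -> Result<(), E::Error>;
}

impl Encodable for u64 {
    fn encode<E: Encoder>(&self, e: &mut E) -> Result<(), E::Error> {
        e.emit_u64(*self)
    }
}

impl Encodable for String {
    fn encode<E: Encoder>(&self, e: &mut E) -> Result<(), E::Error> {
        e.emit_str(self)
    }
}

struct Employee {
    name: String,
    age: u64,
}

impl Encodable for Employee {
    fn encode<E: Encoder>(&self, e: &mut E) -> Result<(), E::Error> {
        // Each level hands the encoder a closure, and the encoder calls back
        // into it. Skipping a level breaks the handshake.
        e.emit_struct("Employee", 2, |e| {
            e.emit_struct_field("name", 0, |e| self.name.encode(e))?;
            e.emit_struct_field("age", 1, |e| self.age.encode(e))?;
            Ok(())
        })
    }
}

// Toy JSON-like encoder (no string escaping).
struct JsonEncoder {
    out: String,
}

impl Encoder for JsonEncoder {
    type Error = String;

    fn emit_u64(&mut self, v: u64) -> Result<(), Self::Error> {
        self.out.push_str(&v.to_string());
        Ok(())
    }
    fn emit_str(&mut self, v: &str) -> Result<(), Self::Error> {
        self.out.push_str(&format!("\"{}\"", v));
        Ok(())
    }
    fn emit_struct<F>(&mut self, _name: &str, _len: usize, f: F) -> Result<(), Self::Error>
    where
        F: FnOnce(&mut Self) -> Result<(), Self::Error>,
    {
        self.out.push('{');
        f(self)?;
        self.out.push('}');
        Ok(())
    }
    fn emit_struct_field<F>(&mut self, name: &str, idx: usize, f: F) -> Result<(), Self::Error>
    where
        F: FnOnce(&mut Self) -> Result<(), Self::Error>,
    {
        if idx > 0 {
            self.out.push(',');
        }
        self.out.push_str(&format!("\"{}\":", name));
        f(self)
    }
}

fn to_json<T: Encodable>(value: &T) -> Result<String, String> {
    let mut enc = JsonEncoder { out: String::new() };
    value.encode(&mut enc)?;
    Ok(enc.out)
}

fn main() {
    let alice = Employee { name: "Alice".to_string(), age: 30 };
    println!("{}", to_json(&alice).unwrap());
}
```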
As you can see, serializing compound structures requires these recursive closure calls in order to perform the handshake between the Encoder and the Encodable. A couple of people have been bitten in the past when they didn’t implement this pattern, which results in some confusing bugs. Furthermore, LLVM isn’t great at inlining these recursive calls, so serialize impls tend to not perform well.
That’s not the worst of it though. The real problem is that there are types that can implement Encodable, but for which there’s no way to write a Decodable implementation. They’re pretty common too. For example, the serialize::json::Json type:
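To give a feel for the shape of such a type, here’s a sketch of a Json-like enum in current Rust syntax. The variant names and the write_json helper are my assumptions for illustration, not a copy of the serialize::json definition. Going from a value to text is the easy direction, because a match on the value tells us which variant we hold.

```rust
use std::collections::BTreeMap;

// Illustrative only: a value that can hold anything a JSON document can.
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Null,
    Boolean(bool),
    U64(u64),
    String(String),
    List(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

// Encoding: the value already knows its own variant (no string escaping here).
fn write_json(value: &Json, out: &mut String) {
    match value {
        Json::Null => out.push_str("null"),
        Json::Boolean(b) => out.push_str(if *b { "true" } else { "false" }),
        Json::U64(n) => out.push_str(&n.to_string()),
        Json::String(s) => {
            out.push('"');
            out.push_str(s);
            out.push('"');
        }
        Json::List(items) => {
            out.push('[');
            for (i, item) in items.iter().enumerate() {
                if i > 0 {
                    out.push(',');
                }
                write_json(item, out);
            }
            out.push(']');
        }
        Json::Object(fields) => {
            out.push('{');
            for (i, (key, val)) in fields.iter().enumerate() {
                if i > 0 {
                    out.push(',');
                }
                out.push('"');
                out.push_str(key);
                out.push_str("\":");
                write_json(val, out);
            }
            out.push('}');
        }
    }
}

fn main() {
    let mut fields = BTreeMap::new();
    fields.insert("name".to_string(), Json::String("Alice".to_string()));
    fields.insert(
        "skills".to_string(),
        Json::List(vec![Json::U64(1), Json::Boolean(true), Json::Null]),
    );
    let mut out = String::new();
    write_json(&Json::Object(fields), &mut out);
    println!("{}", out);
}
```

The reverse direction has no value to match on yet, only input, so something has to inspect the input before a variant can be chosen.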
The Json value can represent any value that’s in a JSON string. Implied in this is the notion that the Decodable has to look ahead to see what the next value is so it can decide which Json variant to construct. Unfortunately our current Decoder infrastructure doesn’t support lookahead. The way the Decoder/Decodable handshake works is essentially:
- Decodable asks for a struct named "Employee".
- Decodable asks for a field named "name".
- Decodable asks for a value of type String.
- Decodable asks for a field named "age".
- Decodable asks for a value of type uint.
- …
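The handshake above can be sketched as a decoder whose every method says what the caller expects next and fails if the input disagrees. This is a toy model in current Rust syntax: the Token type, the Decoder methods, and decode_employee are all hypothetical names for illustration, not the real serialize API. Note that there is deliberately no peek method.

```rust
// Hypothetical token stream standing in for parsed input.
#[derive(Debug, PartialEq)]
enum Token {
    StructStart(String),
    Field(String),
    Str(String),
    Uint(u64),
    StructEnd,
}

struct Decoder {
    tokens: std::vec::IntoIter<Token>,
}

impl Decoder {
    fn new(tokens: Vec<Token>) -> Decoder {
        Decoder { tokens: tokens.into_iter() }
    }

    // Consume the next token and fail unless it is exactly what was asked for.
    fn expect(&mut self, want: Token) -> Result<(), String> {
        let got = self.tokens.next();
        if got.as_ref() == Some(&want) {
            Ok(())
        } else {
            Err(format!("expected {:?}, found {:?}", want, got))
        }
    }

    fn read_struct(&mut self, name: &str) -> Result<(), String> {
        self.expect(Token::StructStart(name.to_string()))
    }

    fn read_field(&mut self, name: &str) -> Result<(), String> {
        self.expect(Token::Field(name.to_string()))
    }

    fn read_str(&mut self) -> Result<String, String> {
        match self.tokens.next() {
            Some(Token::Str(s)) => Ok(s),
            other => Err(format!("expected a string, found {:?}", other)),
        }
    }

    fn read_uint(&mut self) -> Result<u64, String> {
        match self.tokens.next() {
            Some(Token::Uint(n)) => Ok(n),
            other => Err(format!("expected a uint, found {:?}", other)),
        }
    }
}

#[derive(Debug, PartialEq)]
struct Employee {
    name: String,
    age: u64,
}

// The Decodable side drives everything: it asks, in a fixed order.
fn decode_employee(d: &mut Decoder) -> Result<Employee, String> {
    d.read_struct("Employee")?;
    d.read_field("name")?;
    let name = d.read_str()?;
    d.read_field("age")?;
    let age = d.read_uint()?;
    d.expect(Token::StructEnd)?;
    Ok(Employee { name, age })
}

fn main() {
    let tokens = vec![
        Token::StructStart("Employee".to_string()),
        Token::Field("name".to_string()),
        Token::Str("Alice".to_string()),
        Token::Field("age".to_string()),
        Token::Uint(30),
        Token::StructEnd,
    ];
    println!("{:?}", decode_employee(&mut Decoder::new(tokens)));
}
```

Since the caller can only state expectations, a type like Json, which needs to learn what is coming before it can pick a variant, has nothing to call.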
Any deviation from this pattern results in an error. There isn’t a way for the Decodable to ask what the type of the next value is, which is why we serialize generic enums by explicitly tagging the variant, as in:
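Here’s a small sketch of the tagging idea in current Rust syntax. The Animal enum, the Tagged and Value types, and the "variant"/"fields" layout are assumptions chosen for illustration, so treat the exact output shape as an example rather than a spec. The point is that the variant name is written first, so the decoding side can read the tag and then knows exactly what to ask for, without any lookahead.

```rust
// Hypothetical enum to serialize.
#[derive(Debug, PartialEq)]
enum Animal {
    Dog,
    Frog(String, u64),
}

// Minimal stand-ins for the serialized form.
#[derive(Debug, PartialEq)]
enum Value {
    Str(String),
    Uint(u64),
}

#[derive(Debug, PartialEq)]
struct Tagged {
    variant: String,
    fields: Vec<Value>,
}

// Encoding writes the variant name explicitly, followed by the fields.
fn encode(animal: &Animal) -> Tagged {
    match animal {
        Animal::Dog => Tagged { variant: "Dog".to_string(), fields: vec![] },
        Animal::Frog(name, age) => Tagged {
            variant: "Frog".to_string(),
            fields: vec![Value::Str(name.clone()), Value::Uint(*age)],
        },
    }
}

// Decoding reads the tag first, which tells it which fields to expect.
fn decode(tagged: &Tagged) -> Result<Animal, String> {
    match (tagged.variant.as_str(), tagged.fields.as_slice()) {
        ("Dog", []) => Ok(Animal::Dog),
        ("Frog", [Value::Str(name), Value::Uint(age)]) => {
            Ok(Animal::Frog(name.clone(), *age))
        }
        (other, _) => Err(format!("unknown variant or wrong fields: {}", other)),
    }
}

// Render the tagged form as JSON-like text (no string escaping).
fn to_json(tagged: &Tagged) -> String {
    let fields: Vec<String> = tagged
        .fields
        .iter()
        .map(|f| match f {
            Value::Str(s) => format!("\"{}\"", s),
            Value::Uint(n) => n.to_string(),
        })
        .collect();
    format!(
        "{{\"variant\":\"{}\",\"fields\":[{}]}}",
        tagged.variant,
        fields.join(",")
    )
}

fn main() {
    let frog = Animal::Frog("Henry".to_string(), 349);
    let tagged = encode(&frog);
    println!("{}", to_json(&tagged));
    println!("{:?}", decode(&tagged));
}
```

The cost is that the serialized form has to carry the tag, so it can’t match an arbitrary external format.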
That’s probably good enough for now. In my next post I’ll go into my approach to fixing this in serde.