2015-06-08 12:00:00

My initial experience with Rust

First, a digression about superhero movies

I am apparently incapable of hating any movie about a comic book superhero.

I can usually distinguish the extremes. Yes, I can tell that "The Dark Knight" was much better than "Elektra". My problem is that I tend to think that the worst movies in this genre are still pretty good.

And I have the same sort of unreasonable affection toward programming languages. I have always been fascinated by languages, compilers, and interpreters. My opinions about such things skew toward the positive simply because I find them so interesting.

I do still have preferences. For example, I tend to like strongly typed languages more. In fact, I think it is roughly true that the stricter a compiler is, the more I like it. But I can easily find things to like in languages that I mostly dislike.

I've spent more of my career writing C than any other language. But in use cases where I need something like C, I am increasingly eager for something more modern.

I started learning Rust with two questions:

How successful might Rust become as a viable replacement for C?
If I enjoy functional programming, how much of that enjoyment can I retain while coding in Rust?

The context

My exploration of Rust has taken place in one of my side projects: https://github.com/ericsink/LSM

LSM is a key-value database with a log-structured merge tree design. It is conceptually similar to Google LevelDB. I first wrote it in C#. Then I rewrote/ported it to F#. Now I have ported it to Rust. (The Rust port is not yet mentioned in the README for that repo, but it's in the top-level directory called 'rs'.)

For the purpose of learning F# and Rust, my initial experience was the same. The first thing I did in each of these languages was to port LSM. In other words, the F# and Rust ports of LSM are on equal footing. Both of them were written by someone who was a newbie in the language.

Anyway, although Rust and F# are very different languages, I have used F# as a reference point for my learning of Rust, so this blog entry walks that path as well.

This is not to say that I think Rust and F# would typically be used for the same kinds of things. I can give you directions from Denver to Chicago without asserting they are similar. Nonetheless, given that Rust is mostly intended to be a modern replacement for C, it has a surprising number of things in common with F#.

The big comparison table

	F#	Rust
Machine model	Managed, .NET CLR	Native, LLVM
Runtime	CLR	None
Style	Multi-paradigm, functional-first	Multi-paradigm, imperative-first
Syntax family	ML-ish	C-ish
Blocks	Significant whitespace	Curly braces
Exception handling	Yes	No
Strings	.NET (UTF-16)	UTF-8
Free allocated memory	Automatic, garbage collector	Automatic, static analysis
Type inference	Yes, but not from method calls	Yes, but only within functions
Functional immutable collections	Yes	No
Currying	Yes	No
Partial application	Yes	No
Compiler strictness	Extremely strict	Even stricter
Tuples	Yes	Yes
Discriminated unions	type Blob = | Stream of Stream | Array of byte[] | Tombstone	enum Blob { Stream(Box<Read>), Array(Box<[u8]>), Tombstone, }
Mutability	To be avoided	Safe to use
Lambda expressions	let f = (fun acc item -> acc + item)	let f = |acc, &item| acc + item;
Higher-order functions	`List.fold f 0 a`	`a.iter().fold(0, f)`
Integer overflow checking	No (opt in with `open Checked`)	Yes
Let bindings	let x = 1 let mutable y = 2	let x = 1; let mut y = 2;
if statements are expressions	Yes	Yes
Unit type	`()`	`()`
Pattern matching	match cur with | Some csr -> csr.IsValid() | None -> false	match cur { Some(csr) => csr.IsValid(), None => false }
Primary collection type	Linked list	Vector
Naming types	CamelCase	CamelCase
Naming functions, etc	camelCase	snake_case
Warnings about naming conventions	No	Yes
Type for integer literals	Suffix (`0uy`)	Inference (`0`) or suffix (`0u8`)
Project file	foo.fsproj (msbuild)	Cargo.toml
Testing framework	xUnit, NUnit, etc.	Built into Cargo
Debug prints	`printf "%A" foo`	`println!("{:?}", foo);`
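
To see a few rows of the Rust column working together, here is a small sketch. It is not code from LSM: the Blob enum is a cut-down, hypothetical version of the one in the table (no Stream case), and blob_len and sum are names I made up for illustration.

```rust
// Cut-down, hypothetical version of the Blob enum from the table.
#[derive(Debug)]
enum Blob {
    Array(Box<[u8]>),
    Tombstone,
}

// Pattern matching: the compiler insists that every variant is handled.
fn blob_len(b: &Blob) -> Option<usize> {
    match b {
        Blob::Array(bytes) => Some(bytes.len()),
        Blob::Tombstone => None,
    }
}

// Closure plus higher-order function, as in the fold row of the table.
fn sum(a: &[i32]) -> i32 {
    a.iter().fold(0, |acc, &item| acc + item)
}

fn main() {
    let blob = Blob::Array(vec![1u8, 2, 3].into_boxed_slice());
    // Debug prints use the {:?} format, as in the last row of the table.
    println!("{:?} has length {:?}", blob, blob_len(&blob));
    println!("{:?} has length {:?}", Blob::Tombstone, blob_len(&Blob::Tombstone));
    println!("sum = {}", sum(&[1, 2, 3]));
}
```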

Memory safety

I have written a lot of C code over the years. More than once while in the middle of a project, I have stopped to explore ways of getting the compiler to catch my memory leaks. I tried the Clang static analyzer and Frama-C and Splint and others. It just seemed like there should be a way, even if I had to annotate function signatures with information about who owns a pointer.

So perhaps you can imagine my joy when I first read about Rust.

Even more cool, Rust has taken this set of ideas so much further than the simple feature I tried to envision. Rust doesn't just detect leaks, it also:

frees everything for you, like a garbage collector, but it's not.
prevents access to something that has been freed.
prevents modifying an iterator while it is being used.
prevents memory corruption bugs, as long as you stay out of unsafe blocks.
automatically disposes other kinds of resources, not just allocated memory.
prevents two threads from having unsynchronized access to the same mutable data.

That last bullet is worth repeating: With Rust, you never stare at your code trying to figure out if two threads might trample the same data. If it compiles, then it's free of data races. (Deadlocks and higher-level race conditions are still yours to avoid.)
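
Two of the items in that list are easy to watch in action. The following is a minimal sketch, not code from LSM: Guard, drop_order, and parallel_count are hypothetical names. The first half shows cleanup running automatically at the end of a scope, and the second shows the kind of shared counter the compiler will only accept when it is wrapped in something like Arc and Mutex.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical resource type that records its own cleanup so we can observe it.
struct Guard {
    name: &'static str,
    log: Arc<Mutex<Vec<&'static str>>>,
}

impl Drop for Guard {
    // Runs automatically when a Guard goes out of scope.
    fn drop(&mut self) {
        self.log.lock().unwrap().push(self.name);
    }
}

// Creates two guards in nested scopes and returns the order in which they were dropped.
fn drop_order() -> Vec<&'static str> {
    let log = Arc::new(Mutex::new(Vec::new()));
    {
        let _outer = Guard { name: "outer", log: log.clone() };
        {
            let _inner = Guard { name: "inner", log: log.clone() };
        } // _inner is dropped here
    } // _outer is dropped here
    let result = log.lock().unwrap().clone();
    result
}

// Several threads increment one counter. Sharing a plain `&mut u64`
// between the threads instead would be rejected by the compiler.
fn parallel_count(threads: usize, per_thread: u64) -> u64 {
    let counter = Arc::new(Mutex::new(0u64));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("drop order: {:?}", drop_order());
    println!("count: {}", parallel_count(4, 1000));
}
```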

Safety is Rust's killer feature, and it is very compelling.

Mutability

If you come to Rust hoping to find a great functional language, you will be disappointed. Rust does have a bunch of functional elements, but it is not really a functional language. It's not even a functional-first hybrid. Nonetheless, Rust has enough cool functional stuff available that it has been described as "ML in C++ clothing".

I did my Rust port of LSM as a line-by-line translation from the F# version. This was not a particularly good approach.

Functional programming is all about avoiding mutable things, typically by using recursion, monads, computation expressions, and immutable collections.
In Rust, mutability should not be avoided, because it's safe. If you are trying to use mutability in a way that would not be safe, your code will not compile.

So if you're porting code from a more functional language, you can end up with code that isn't very Rusty.

If you are a functional programming fan, you might be skeptical of Rust and its claims. Try to think of it like this: Rust agrees that mutability is a problem -- it is simply offering a different solution to that problem.
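
As a rough illustration of that different solution, here is the same operation written both ways. This is a sketch with made-up names (double_all, doubled), not code from the port. The commented-out lines in main show the sort of mutation the compiler refuses.

```rust
// In-place mutation through an exclusive (&mut) borrow.
fn double_all(values: &mut [i32]) {
    for v in values.iter_mut() {
        *v *= 2;
    }
}

// The same result in a functional style, building a new Vec.
fn doubled(values: &[i32]) -> Vec<i32> {
    values.iter().map(|v| v * 2).collect()
}

fn main() {
    let mut data = vec![1, 2, 3];
    double_all(&mut data);
    println!("{:?}", data);
    println!("{:?}", doubled(&data));

    // This would be rejected at compile time: `first` is a shared borrow
    // that is still in use when push() needs to borrow `data` mutably.
    // let first = &data[0];
    // data.push(4);
    // println!("{}", first);
}
```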

Learning curve

I don't know if Rust is the most difficult-to-learn programming language I have seen, but it is running in that race.

Anybody remember back when Joel Spolsky used to talk about how difficult it is for some programmers to understand pointers? Rust is a whole new level above that. Compared to Rust, regular pointers are simplistic.

With Rust, we don't just have pointers. We also have ownership, borrows, and lifetimes.

As you learn Rust, you will reach a point where you think you are starting to understand things. And then you try to return a reference from a function, or store a reference in a struct. Suddenly you have lifetime<'a> annotations<'a> all<'a> over<'a> the<'a> place<'a>.

And why did you put them there? Because you understood something? Heck no. You started sprinkling explicit lifetimes throughout your code because the compiler error messages told you to.
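
For anyone who has not hit this yet, here is a small sketch of the two situations I mentioned: returning a reference from a function, and storing a reference in a struct. The names longest and ByteCursor are hypothetical and do not come from LSM.

```rust
// Returning a reference: 'a says the result lives no longer than either input.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

// Storing a reference in a struct: a ByteCursor cannot outlive the slice it borrows.
struct ByteCursor<'a> {
    data: &'a [u8],
    pos: usize,
}

impl<'a> ByteCursor<'a> {
    fn new(data: &'a [u8]) -> ByteCursor<'a> {
        ByteCursor { data, pos: 0 }
    }

    // Returns the next byte, or None once the slice is exhausted.
    fn next_byte(&mut self) -> Option<u8> {
        let b = self.data.get(self.pos).copied();
        if b.is_some() {
            self.pos += 1;
        }
        b
    }
}

fn main() {
    println!("{}", longest("ab", "abc"));
    let bytes = [7u8, 9];
    let mut cur = ByteCursor::new(&bytes);
    while let Some(b) = cur.next_byte() {
        println!("{}", b);
    }
}
```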

I'm not saying that Rust isn't worth the pain. I personally think Rust is rather brilliant.

But a little expectation setting is appropriate here. Some programming languages are built for the purpose of making programming easier. (It is a valid goal to want to make software development accessible to a wider group of people.) Rust is not one of those languages.

That said, the Rust team has invested significant effort in excellent documentation (see The Book). And those compiler error messages really are good.

Finally, let me observe that while some things are hard to learn because they are poorly designed, Rust is not one of those things. The deeper I get into this, the more impressed I am. And so far, every single time I thought the compiler was wrong, I was mistaken.

I have found it helpful to try to make every battle with the borrow checker into a learning experience. I do not merely want to end up with the compiler accepting my code. I want to understand more than I did when I started.

Error handling

Rust does not have exceptions for error handling. Instead, error handling is done through the return values of functions.

But Rust actually makes this far less tedious than it might sound. By convention (and throughout the Rust standard library), error handling is done by returning a generic enum type called Result<T,E>. This type can encapsulate either the successful result of the function or an error condition.

On top of this, Rust has a clever macro called try!. Because of this macro, if you read some Rust code, you might think it has exception handling.

// This code was ported from F# which assumes that any Stream
// that supports Seek also can give you its Length.  That method
// isn't part of the Seek trait, but this implementation should
// suffice.
fn seek_len<R>(fs: &mut R) -> io::Result<u64> where R : Seek {
    // remember where we started (like Tell)
    let pos = try!(fs.seek(SeekFrom::Current(0)));

    // seek to the end
    let len = try!(fs.seek(SeekFrom::End(0)));

    // restore to where we were
    let _ = try!(fs.seek(SeekFrom::Start(pos)));

    Ok(len)
}

This function returns std::io::Result<u64>. When it calls the seek() method of the Seek implementation it is given, it uses the try! macro, which will cause an early return of the function if it fails.
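
In later versions of Rust, the ? operator took over the job of try!. The following sketch is the same function written with ?, plus a small main that exercises it. The use of std::io::Cursor as an in-memory stand-in for a file is my assumption for the sake of a self-contained example.

```rust
use std::io::{self, Cursor, Seek, SeekFrom};

// Same logic as seek_len above, with `?` doing the early return on error.
fn seek_len<R: Seek>(fs: &mut R) -> io::Result<u64> {
    // remember where we started (like Tell)
    let pos = fs.seek(SeekFrom::Current(0))?;

    // seek to the end
    let len = fs.seek(SeekFrom::End(0))?;

    // restore to where we were
    fs.seek(SeekFrom::Start(pos))?;

    Ok(len)
}

fn main() {
    // An in-memory buffer stands in for a file here.
    let mut cursor = Cursor::new(vec![0u8; 10]);
    cursor.seek(SeekFrom::Start(3)).unwrap();
    let len = seek_len(&mut cursor).unwrap();
    println!("len = {}, position = {}", len, cursor.position());
}
```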

In practice, I like Rust's Result type very much.

The From and Error traits make it easy to combine different kinds of Result/Error values.
The distinction between errors and panics seems very clean.
I like having the compiler help me be certain that I am propagating errors everywhere I should be. (I dislike scanning library documentation to figure out if I called something that throws an exception I need to handle.)
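
Here is a minimal sketch of how the From trait lets two different error types flow through one function. ConfigError and read_number are hypothetical names, not part of LSM or the standard library, and I have used the ? operator from later Rust versions rather than try!.

```rust
use std::fmt;
use std::io::{self, Read};
use std::num::ParseIntError;

// Hypothetical error type that wraps the two kinds of failure we can hit.
#[derive(Debug)]
enum ConfigError {
    Io(io::Error),
    Parse(ParseIntError),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::Io(e) => write!(f, "I/O error: {}", e),
            ConfigError::Parse(e) => write!(f, "parse error: {}", e),
        }
    }
}

impl std::error::Error for ConfigError {}

impl From<io::Error> for ConfigError {
    fn from(e: io::Error) -> Self {
        ConfigError::Io(e)
    }
}

impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::Parse(e)
    }
}

// Reads all of `src` and parses it as a decimal integer.
fn read_number<R: Read>(mut src: R) -> Result<i64, ConfigError> {
    let mut text = String::new();
    src.read_to_string(&mut text)?; // io::Error becomes ConfigError via From
    let n = text.trim().parse::<i64>()?; // ParseIntError becomes ConfigError via From
    Ok(n)
}

fn main() {
    println!("{:?}", read_number("42\n".as_bytes()));
    match read_number("abc".as_bytes()) {
        Ok(n) => println!("{}", n),
        Err(e) => println!("failed: {}", e),
    }
}
```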

Nonetheless, when doing a line-by-line port of F# to Rust, this was probably the most tedious issue. Lots of functions that returned () in F# changed to return Result in Rust.

Type inference

Rust does type inference within functions, but it cannot or will not infer the types of function arguments or function return values.

Very often I miss having the more complete form of type inference one gets in F#. But I do remind myself of certain things:

The Rust type system is far more complicated than that of F#. Am I holding a Foo? Or do I have a &Foo (a reference to a Foo)? Am I trying to transfer ownership of this value or not? Being a bit more explicit can be helpful.
F# type inference has its weaknesses as well. Most notably, inference doesn't work at all with method calls. This gives the object-oriented features of F# a very odd "feel", as if they don't belong in the language, but it would be unthinkable for a CLR language not to have them.
Rust has type inference for integer literals but F# does not.
The type inference capabilities of Rust may get smarter in the future.
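
A small sketch of where the line falls (describe is a made-up function): the signature has to be written out in full, but nothing inside the body needs a type annotation, including the integer literal.

```rust
// The argument and return types must be spelled out...
fn describe(values: &[u8]) -> (usize, u32) {
    // ...but inside the body, the compiler works out the rest.
    let count = values.len(); // inferred as usize
    let mut total = 0; // integer literal, inferred as u32 from how it is used
    for &v in values {
        total += u32::from(v);
    }
    (count, total)
}

fn main() {
    println!("{:?}", describe(&[1, 2, 3]));
}
```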

Iterators

Rust iterators are basically like F# seq<'T> (which is an alias for .NET IEnumerable<T>). They are really powerful and provide support for functional idioms like List.map. For example:

fn to_hex_string(ba: &[u8]) -> String {
    let strs: Vec<String> = ba.iter()
        .map(|b| format!("{:02X}", b))
        .collect();
    strs.connect("")
}

This function takes a slice (a part of an array) of bytes (u8) and returns its representation as a hex string.
Vec<String> is a growable array of strings.
.iter() means something different than it does in F#. Here, it is the function that returns an iterator for a slice
.map() is pretty similar to F#. The argument above is Rust's syntax for a closure.
.collect() also means something different than it does in F#. Here, it consumes the iterator and puts all the mapped results into the Vec we asked for.
.connect("") is basically a join of all the resulting strings.

However, there are a few caveats.

In Rust, you have a lot more flexibility about whether you are dealing with "a Foo" or "a reference to a Foo", and most of the time, it's the latter. Overall, this is just more work than it is in F#, and using iterators feels like it magnifies that effect.
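
To make the "Foo" versus "reference to a Foo" point concrete, here is a sketch with three made-up functions. Which iterator method you call decides whether the closure sees a value or a reference, and sometimes a reference to a reference.

```rust
// .iter() borrows: the closure sees &String, and the caller keeps the strings.
fn total_len(names: &[String]) -> usize {
    names.iter().map(|s| s.len()).sum()
}

// .into_iter() takes ownership: the closure sees String and can consume it.
fn exclaim(names: Vec<String>) -> Vec<String> {
    names.into_iter().map(|s| s + "!").collect()
}

// filter() on a borrowing iterator hands the predicate a &&i32,
// and copied() turns the surviving &i32 items into plain i32 values.
fn evens(values: &[i32]) -> Vec<i32> {
    values.iter().filter(|v| **v % 2 == 0).copied().collect()
}

fn main() {
    let names = vec!["ab".to_string(), "cde".to_string()];
    println!("{}", total_len(&names));
    println!("{:?}", exclaim(names));
    println!("{:?}", evens(&[1, 2, 3, 4]));
}
```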

Performance

I haven't done the sort of careful benchmarking that is necessary to say a lot about performance, so I will say only a little.

I typically use one specific test for measuring performance changes. It writes 10 LSM segments and then merges them all into one, resulting in a data file.
On that test, the Rust version is VERY roughly 5 times faster than the F# version.
The Rust and F# versions end up producing exactly the same output file.
The test is not all that fair to F#. Writing an LSM database in F# was always kind of a square-peg/round-hole endeavor.
With Rust, the difference in compiling with or without the optimizer can be huge. For example, that test runs 15 times faster with compiler optimizations than it does without.
With Rust, the LLVM optimizer can't really do its job very well if it can't do function inlining, and it can't inline non-generic functions across crates unless you use explicit #[inline] attributes or turn on LTO.
In F#, there often seems to be a negative correlation between "idiomatic-ness" and "performance". In other words, the more functional and idiomatic your code, the slower it will run.
F# could get a lot faster if it could take better advantage of the ability of the CLR to do value types. For example, in F#, option and tuple always cause heap allocations.
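
For reference, the explicit inline attribute mentioned in that list looks like the sketch below. The function read_u32_be is a hypothetical helper, not something from LSM. The attribute is a hint rather than a guarantee, and LTO is a build setting rather than something you write in the source.

```rust
// In a library crate, #[inline] makes this function's body available
// for inlining into callers that live in other crates.
#[inline]
pub fn read_u32_be(buf: &[u8], offset: usize) -> Option<u32> {
    // Returns None instead of panicking if the four bytes are not all there.
    let end = offset.checked_add(4)?;
    let bytes = buf.get(offset..end)?;
    Some(u32::from_be_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
}

fn main() {
    println!("{:?}", read_u32_be(&[0, 0, 1, 2], 0));
    println!("{:?}", read_u32_be(&[0, 0, 1], 0));
}
```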

Integer overflow

Integer overflow checking is one of my favorite features of Rust.

In languages or environments without overflow checking, unsigned types are very difficult to use safely, so people generally use signed integers everywhere, even in cases where a negative value makes no sense. Rust doesn't suffer from this silliness.

For example, the following code will panic:

let x: u8 = 255;
let y = x + 2;
println!("{}", y);

That said, I haven't quite figured out how to get overflow checking to happen on casts. I want the following code (or something very much like it) to panic:

let x: u64 = 257;
let y = x as u8;
println!("{}", y);
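
Later versions of Rust (1.34 and up) added the TryFrom trait, which gets close to this. The sketch below is not something I could have written against the Rust of this post, and narrow and truncate are made-up names. A checked conversion returns a Result, so you can choose to panic or to handle the error.

```rust
use std::convert::TryFrom;

// What `as` does: silently keeps only the low 8 bits.
fn truncate(x: u64) -> u8 {
    x as u8
}

// A checked conversion: panics if the value does not fit in a u8.
fn narrow(x: u64) -> u8 {
    u8::try_from(x).expect("value does not fit in u8")
}

fn main() {
    println!("{}", truncate(257));
    println!("{}", narrow(200));
    println!("{:?}", u8::try_from(257u64).is_err());
}
```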

Note that, by default, Rust turns off integer overflow checking in release builds, for performance reasons.
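
If you want the same overflow behavior in every build, the integer types have explicit methods for it. This is a minimal sketch using checked_add, wrapping_add, and saturating_add from the standard library. The wrapper function names are mine.

```rust
// Returns None on overflow, in debug and release builds alike.
fn add_checked(x: u8, y: u8) -> Option<u8> {
    x.checked_add(y)
}

// Wraps around on overflow, like unsigned arithmetic in C.
fn add_wrapping(x: u8, y: u8) -> u8 {
    x.wrapping_add(y)
}

// Clamps to the largest representable value on overflow.
fn add_saturating(x: u8, y: u8) -> u8 {
    x.saturating_add(y)
}

fn main() {
    println!("{:?}", add_checked(255, 2));
    println!("{}", add_wrapping(255, 2));
    println!("{}", add_saturating(255, 2));
}
```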

Miscellany

F# is still probably the most productive and pleasant language I have ever used. But Rust is far better than C in this regard.
IMO, the Read, Write, and Seek traits are a much better design than .NET's Stream, which tries to encapsulate all three concepts.
'cargo test' is a nice, easy-to-use testing framework that is built into Cargo. I like it.
crates.io is like NuGet for Rust, and it's integrated with Cargo.
If 'cargo bench' wants to always report timings in nanoseconds, I wish it would put in a separator every three digits.
I actually like the fact that Rust is taking a stance on things like function_names_in_snake_case and TypeNamesInCamelCase, even to the point of issuing compiler warnings for names that do not match the conventions. I don't agree 100% with their style choices, and that's my point. Being opinionated might help avoid a useless discussion about something that never really matters very much anyway.
I miss printf-style format strings.
I'm not entirely sure I like the automatic dereferencing feature. I kinda wish the compiler wouldn't help me in this manner until I know what I'm doing.
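
For anyone who has not seen it, this is roughly what a test for 'cargo test' looks like. It is a sketch, not a test from LSM: the function is a shortened variant of to_hex_string from earlier, and the module name hex_tests is arbitrary.

```rust
// Shortened variant of the to_hex_string function shown earlier.
pub fn to_hex_string(ba: &[u8]) -> String {
    ba.iter().map(|b| format!("{:02X}", b)).collect()
}

// Only compiled when running `cargo test`.
#[cfg(test)]
mod hex_tests {
    use super::*;

    #[test]
    fn formats_bytes_as_uppercase_hex() {
        assert_eq!(to_hex_string(&[0x00, 0xAB, 0x10]), "00AB10");
    }

    #[test]
    fn empty_slice_gives_empty_string() {
        assert_eq!(to_hex_string(&[]), "");
    }
}

fn main() {
    println!("{}", to_hex_string(&[0x00, 0xAB, 0x10]));
}
```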

Bottom line

I am seriously impressed with Rust. Then again, I thought that Eric Bana's Hulk movie was pretty good, so you might want to just ignore everything I say.

In terms of maturity and ubiquity, C has no equal. Still, I believe Rust has the potential to become a compelling replacement for C in many situations.

I look forward to using Rust more.