{
  "$type": "site.standard.document",
  "canonicalUrl": "https://deterministic.space/secret-life-of-cows.html",
  "path": "/secret-life-of-cows.html",
  "publishedAt": "2018-06-02T00:00:00.000Z",
  "site": "at://did:plc:x67qh7v3fd7znbdhauc45ng3/site.standard.publication/3mjcd2t6afe25",
  "textContent": "A lot of people at RustFest Paris mentioned Cows\n-- which may be surprising if you've never seen [std::borrow::Cow][std::borrow::Cow]!\n\nCow in this context stands for \"Clone on Write\" and\nis a type that allows you to reuse data if it is not modified.\nSomehow, these bovine super powers of Rust's standard library\n[appear][kevins-tweet] to be a well-kept secret\neven though they are [not new][llogiqs-post].\nThis post will dig into this very useful pointer type by\nexplaining why in systems programming languages you need such fine control,\nexplaining Cows in detail,\nand comparing them to other ways of organizing your data.\n\n[std::borrow::Cow]: https://doc.rust-lang.org/1.26.1/std/borrow/enum.Cow.html\n[kevins-tweet]: https://twitter.com/KevinHoffman/status/1001075501358776322\n[llogiqs-post]: https://llogiq.github.io/2015/07/10/cow-redux.html\n\nOrganizing Data\n\nThis is what it all comes down to:\nPeople want to have a good, precise way to organize their data.\nAnd they want their programming language to support them.\nThat's why a lot of newer languages include a bunch of data structures\noptimized for different use cases,\nand that is also why software developers are dealing with API documentation so often.\nTo ensure that your code has the performance characteristics you expect,\nit is essential to know which piece of data is represented in which way.\n\nIn systems programming languages,\nthis is in some regards even more important.\nYou want to know:\n\n1. _exactly_ where your data lives,\n2. that it is efficiently stored,\n3. that it is removed as soon as you stop using it,\n4. 
and that you don't copy it around needlessly.\n\nEnsuring all these properties is a great way to write fast programs.\nLet's look at how we can do this in Rust.\n\nWhere Does Our Data Live\n\nIt is quite explicit where your data lives.\nBy default, primitive types and structs containing primitive types are allocated on the stack,\nwithout any dynamic memory allocation.\nIf you want to store data of a size only known at runtime\n(say the text content of a file),\nyou need to use a type that dynamically allocates memory (on the heap),\nfor example [String], or [Vec].\nYou can explicitly allocate a data type on the heap by wrapping it in a [Box].\n\n(If you're unfamiliar with the notion of \"Stack and Heap\",\nyou can find a good explanation in\n[this chapter][rust-book-ownership]\nof the official Rust book.)\n\n[rust-book-ownership]: https://doc.rust-lang.org/1.26.1/book/second-edition/ch04-01-what-is-ownership.html\n[String]: https://doc.rust-lang.org/1.26.1/std/string/struct.String.html\n[Vec]: https://doc.rust-lang.org/1.26.1/std/vec/struct.Vec.html\n[Box]: https://doc.rust-lang.org/1.26.1/std/boxed/struct.Box.html\n\nNote: Creating a new (not-empty) String means allocating memory,\nwhich is a somewhat costly operation.\nA language like Rust gives you quite a few options to\nskip some allocations,\nand doing so can speed up performance-critical parts of your code significantly.\n(Spoiler: Cow is one of these options.)\n\nStructuring Data\n\nIf you know what you will do with your data,\nyou can probably figure out how to best store it.\nIf you for example always iterate through a known list of values, an array (or a [Vec]) is the way to go.\nIf you need to look up values by known keys, and don't care about the order they are stored in, a [hash map] sounds good.\nIf you need a stack to put data onto from different threads, you can use [crossbeam-deque].\nThis is just to give you a few examples -- there are books on this topic and you should read them.\nA Cow doesn't 
really help you here per se, but you can use it _inside_ your data structures.\n\n[hash map]: https://doc.rust-lang.org/1.26.1/std/collections/struct.HashMap.html\n[crossbeam-deque]: https://crates.io/crates/crossbeam-deque\n\nDropping Data\n\nLuckily, in Rust it is easy to\nmake sure our data gets removed from memory\nas soon as possible\n(so we don't use up too much memory and slow down the system).\nRust uses the ownership model of automatically [dropping][rust-book-memory-and-allocation] resources when they go out of scope,\nso it doesn't need to periodically run a garbage collector to free memory.\nYou can still waste memory, of course, by allocating too much of it manually,\nor by building reference cycles and leaking it.\n\n[rust-book-memory-and-allocation]: https://doc.rust-lang.org/1.26.1/book/second-edition/ch04-01-what-is-ownership.html#memory-and-allocation\n\nNo Needless Copying\n\nOne important step towards being a responsible citizen in regard to memory usage is to not copy data more than necessary.\nIf you for example have a function that removes whitespace at the beginning of a string,\nyou could create a new string that just contains the characters after the leading whitespace.\n(Remember: A new string means a new memory allocation.)\nOr, you could return a _slice_ of the original string that starts after the leading whitespace.\nThe second option requires that we keep the original data around,\nbecause our new slice is just referencing it internally.\nThis means that instead of copying however many bytes your string contains,\nwe just write two numbers:\nA pointer to the point in the original string after the leading whitespace,\nand the length of the remaining string that we care about.\n(Carrying the length with us is a convention in Rust.)\n\nBut what about a more complicated function?\nLet's imagine we want to replace some characters in a string.\nDo we always need to copy it over with the characters swapped out?\nOr can we be clever and 
return some pointer to the original string if there was no replacement needed?\nIndeed, in Rust we can! This is what Cow is all about.\n\nWhat is a Cow Anyway\n\nIn Rust, the abbreviation \"Cow\" stands for \"clone on write\"[^clone].\nIt is an enum with two states: Borrowed and Owned.\nThis means you can use it to abstract over\nwhether you own the data or just have a reference to it.\nThis is especially useful when you want to _return_ a type\nfrom a function that may or may not need to allocate.\n\n[^clone]: Yes, that's right: _Clone_ on write, not _copy_ on write. That's because in Rust, the Copy trait is guaranteed to be a simple memcpy operation, while Clone can also do custom logic (like recursively cloning a HashMap<String, String>).\n\nA std Example\n\nLet's look at an example.\nSay you have a [Path] and want to convert it to a string.\nSadly, not every filesystem path is valid UTF-8\n(Rust strings are guaranteed to be UTF-8 encoded).\nRust has a handy function to get a string regardless:\n[Path::to_string_lossy].\nWhen the path is valid UTF-8 already,\nit will return a reference to the original data,\notherwise it will create a new string\nwhere invalid characters are replaced with the � character.\n\n[Path]: https://doc.rust-lang.org/1.26.1/std/path/struct.Path.html\n[Path::to_string_lossy]: https://doc.rust-lang.org/1.26.1/std/path/struct.Path.html#method.to_string_lossy\n\nA Beefy Definition\n\nWith that in mind, let's look at [the actual definition of Cow][std::borrow::Cow]:\n\nAs you can see, it takes some convincing to have Rust accept this type\nin a way we can work with it.\nLet's go through it one by one.\n\n- 'a is the [lifetime][rust-book-lifetime] that we need our data to be valid for.\n  For the Owned case it's not very interesting\n  (the Cow owns the data -- it's valid until the Cow goes out of scope),\n  but in case the Cow contains Borrowed data,\n  this lifetime is a restriction set by the data we refer to.\n  We cannot have a Cow that refers 
to already freed memory,\n  and rustc will let us know when that is possible by mentioning that the Cow outlives its 'a.\n- [ToOwned] is a trait that defines a method to convert borrowed data into owned data\n  (by cloning it and giving us ownership of the new allocation, most likely).\n  The type we receive from this method is an [associated type][rust-book-advanced-traits] on the trait,\n  and its name is Owned (yep, the same name as the Cow variant, sorry).\n  This allows us to refer to it in Owned(<B as ToOwned>::Owned).\n\n  To make this a bit more concrete, let's assume we have a Cow that's storing a &str (in the Borrowed case).\n  The ToOwned implementation of str has type Owned = String, so <str as ToOwned>::Owned == String.\n- [?Sized][rust-book-sized] is a funny one.\n  By default, Rust expects all types to be of a known size,\n  which it expresses by having an implicit constraint on the [Sized marker trait].\n  You can explicitly opt out of this by adding a \"constraint\" on ?Sized.\n\n  The thing is: Not all possible types have a known size.\n  For example, [u8] is an array of bytes somewhere in memory, but we don't know its length.\n  In your application code you won't see a type like this directly,\n  you'll see it behind _references_ instead.\n  And note: In Rust, the reference itself can contain the length.\n  (See what I wrote above about slices!)\n\n  But how does that relate to Cow again?\n  You see, the B in Cow's definition is behind a reference:\n  Once directly visible in the Borrowed variant,\n  and a second time hidden in the [ToOwned::Owned] (which has to implement [Borrow<Self>]).\n  Since a Cow should be able to contain a &[u8],\n  its definition needs to work for &'a B where B = [u8].\n  That in turn means we need to say:\n  \"we don't require this to be Sized, we know it's behind a reference anyway\"\n  -- which is exactly what the ?Sized syntax does.\n\n[rust-book-lifetime]: 
https://doc.rust-lang.org/1.26.1/book/second-edition/ch10-03-lifetime-syntax.html\n[Sized marker trait]: https://doc.rust-lang.org/1.41.0/std/marker/trait.Sized.html\n[ToOwned]: https://doc.rust-lang.org/1.26.1/std/borrow/trait.ToOwned.html\n[ToOwned::Owned]: https://doc.rust-lang.org/1.41.0/std/borrow/trait.ToOwned.html#associatedtype.Owned\n[Borrow<Self>]: https://doc.rust-lang.org/1.41.0/std/borrow/trait.Borrow.html\n[rust-book-advanced-traits]: https://doc.rust-lang.org/1.26.1/book/second-edition/ch19-03-advanced-traits.html\n[rust-book-sized]: https://doc.rust-lang.org/1.26.1/book/second-edition/ch19-04-advanced-types.html#dynamically-sized-types",
  "title": "The Secret Life of Cows",
  "updatedAt": "2018-06-02T00:00:00.000Z"
}