The Slice Type
Slices let you reference a contiguous sequence of elements in a collection rather than the whole collection. A slice is a kind of reference, so it does not have ownership.
Here's a small programming problem: write a function that takes a string of words separated by spaces and returns the first word it finds in that string. If the function doesn't find a space in the string, the whole string must be one word, so the entire string should be returned.
Let's work through how we'd write the signature of this function without using slices, to understand the problem that slices will solve:
fn first_word(s: &String) -> ?
The first_word function has a &String as a parameter. We don't want ownership, so this is fine. But what should we return? We don't really have a way to talk about part of a string. However, we could return the index of the end of the word, indicated by a space. Let's try that, as shown in Listing 4-7.
Filename: src/main.rs
fn first_word(s: &String) -> usize {
    let bytes = s.as_bytes(); // [1]
    for (i, &item) in bytes.iter().enumerate() { // [2], [3]
        if item == b' ' { // [4]
            return i;
        }
    }
    s.len() // [5]
}
Listing 4-7: The first_word function that returns a byte index value into the String parameter
Because we need to go through the String element by element and check whether a value is a space, we'll convert our String to an array of bytes using the as_bytes method [1].
Next, we create an iterator over the array of bytes using the iter method [3]. We'll discuss iterators in more detail in Chapter 13. For now, know that iter is a method that returns each element in a collection and that enumerate wraps the result of iter and returns each element as part of a tuple instead. The first element of the tuple returned from enumerate is the index, and the second element is a reference to the element. This is a bit more convenient than calculating the index ourselves.
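To make the shape of those tuples concrete, here is a minimal sketch that collects what iter().enumerate() yields for a short string. The helper name indexed_bytes is hypothetical and is not part of the listings in this chapter.

```rust
// Hypothetical helper: collect the (index, reference) tuples that
// `iter().enumerate()` yields for the bytes of a String.
fn indexed_bytes(s: &String) -> Vec<(usize, &u8)> {
    s.as_bytes().iter().enumerate().collect()
}

fn main() {
    let s = String::from("a b");
    for (i, item) in indexed_bytes(&s) {
        // `i` is the position; `item` is a reference to the byte there.
        println!("index {}: byte {}", i, item);
    }
}
```

Each tuple pairs a usize index with a &u8, so the byte itself is only reached through a reference.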
Because the enumerate method returns a tuple, we can use patterns to destructure that tuple. We'll be discussing patterns more in Chapter 6. In the for loop, we specify a pattern that has i for the index in the tuple and &item for the single byte in the tuple [2]. Because we get a reference to the element from .iter().enumerate(), we use & in the pattern.
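As an illustration of what the & in the pattern buys us, the sketch below counts spaces in two ways: once by destructuring the reference in the pattern, and once by keeping the reference and dereferencing it by hand. Both function names are hypothetical.

```rust
// Destructure the reference in the pattern, as Listing 4-7 does.
fn count_spaces_pattern(s: &String) -> usize {
    let mut count = 0;
    for (_i, &item) in s.as_bytes().iter().enumerate() {
        // `item` is a plain `u8` here because `&item` matched the `&u8`.
        if item == b' ' {
            count += 1;
        }
    }
    count
}

// Keep the reference and dereference it explicitly instead.
fn count_spaces_deref(s: &String) -> usize {
    let mut count = 0;
    for (_i, item) in s.as_bytes().iter().enumerate() {
        // `item` is a `&u8` here, so `*item` is needed to read the byte.
        if *item == b' ' {
            count += 1;
        }
    }
    count
}

fn main() {
    let s = String::from("one two three");
    println!("{}", count_spaces_pattern(&s));
    println!("{}", count_spaces_deref(&s));
}
```

The two loops behave identically; the pattern form simply moves the dereference out of the loop body.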
Inside the for loop, we search for the byte that represents the space by using the byte literal syntax [4]. If we find a space, we return the position. Otherwise, we return the length of the string by using s.len() [5].
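A quick way to check both return paths is to call the function on a few inputs. The sketch below repeats first_word from Listing 4-7 (without the callout markers) so that it runs on its own; the sample strings are just illustrative.

```rust
fn first_word(s: &String) -> usize {
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        // b' ' is a byte literal: the u8 value of the ASCII space.
        if item == b' ' {
            return i;
        }
    }
    s.len()
}

fn main() {
    // A space is found at index 5, so 5 is returned.
    println!("{}", first_word(&String::from("hello world")));
    // No space is found, so the length (also 5) is returned.
    println!("{}", first_word(&String::from("hello")));
}
```

Note that both calls produce 5: the number alone does not say whether a space was found or the end of the string was reached.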
We now have a way to find out the index of the end of the first word in the string, but there's a problem. We're returning a usize on its own, but it's only a meaningful number in the context of the &String. In other words, because it's a separate value from the String, there's no guarantee that it will still be valid in the future. Consider the program in Listing 4-8 that uses the first_word function from Listing 4-7.
Filename: src/main.rs
fn main() {
    let mut s = String::from("hello world");
    let word = first_word(&s); // word will get the value 5
    s.clear(); // this empties the String, making it equal to ""
    // word still has the value 5 here, but there's no more string that
    // we could meaningfully use the value 5 with. word is now totally invalid!
}
Listing 4-8: Storing the result from calling the first_word function and then changing the String contents
This program compiles without any errors and would also do so if we used word after calling s.clear(). Because word isn't connected to the state of s at all, word still contains the value 5. We could use that value 5 with the variable s to try to extract the first word out, but this would be a bug because the contents of s have changed since we saved 5 in word.
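To see the stale index in action, here is a small sketch that uses word after the call to s.clear(). The helper index_in_bounds is hypothetical and exists only to make the mismatch observable; first_word is repeated from Listing 4-7 so the example is self-contained.

```rust
fn first_word(s: &String) -> usize {
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return i;
        }
    }
    s.len()
}

// Hypothetical helper: does `index` still fall within the current contents of `s`?
fn index_in_bounds(s: &String, index: usize) -> bool {
    index <= s.len()
}

fn main() {
    let mut s = String::from("hello world");
    let word = first_word(&s);
    println!("before clear: in bounds = {}", index_in_bounds(&s, word));

    s.clear();

    // The compiler accepts this use of `word`, but the index no longer
    // describes anything in `s`, whose length is now 0.
    println!("after clear: in bounds = {}", index_in_bounds(&s, word));
}
```

Nothing in the types links word to s, so the compiler has no way to notice that the index has gone stale.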
Having to worry about the index in word getting out of sync with the data in s is tedious and error prone! Managing these indices is even more brittle if we write a second_word function. Its signature would have to look like this:
fn second_word(s: &String) -> (usize, usize) {
Now we're tracking a starting and an ending index, and we have even more values that were calculated from data in a particular state but aren't tied to that state at all. We have three unrelated variables floating around that need to be kept in sync.
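For illustration only, here is one way such a second_word function might be written. The body is a sketch, not a listing from this chapter: it assumes words are separated by single spaces and returns an empty range at the end of the string when there is no second word.

```rust
// Sketch: return the start and end byte indices of the second word.
// Assumes words are separated by exactly one space.
fn second_word(s: &String) -> (usize, usize) {
    let bytes = s.as_bytes();
    let mut start = s.len();
    let mut found_first_space = false;

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            if found_first_space {
                // Second space: the second word ends here.
                return (start, i);
            }
            // First space: the second word starts right after it.
            found_first_space = true;
            start = i + 1;
        }
    }

    // No second space: the second word, if any, runs to the end.
    (start, s.len())
}

fn main() {
    let mut s = String::from("one two three");
    let (start, end) = second_word(&s);
    println!("start = {}, end = {}", start, end);

    s.clear();
    // `start` and `end` keep their old values, but `s` is now empty.
    println!("start = {}, end = {}, len = {}", start, end, s.len());
}
```

After s.clear(), both indices still hold their old values even though the string they were computed from is gone.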
Luckily, Rust has a solution to this problem: string slices.