Strings in Rust
Working with strings in Rust is quite different from JavaScript. While JavaScript has a single string type with built-in Unicode support, Rust has multiple string types with explicit UTF-8 encoding. This guide will help you understand the differences and effectively work with strings in Rust.
String Types in Rust
Section titled “String Types in Rust”Rust has two main string types:
String
: A growable, heap-allocated UTF-8 encoded string (similar to JavaScript’sString
)&str
: A string slice, which is a reference to a UTF-8 encoded string (often stored elsewhere)
Think of String
as owning its data (like a vec![u8]
with UTF-8 constraints), and &str
as borrowing string data.
Creating Strings
Section titled “Creating Strings”Let’s compare how we create strings in JavaScript versus Rust:
// JavaScript string creation
let greeting = "Hello, world!";
let name = 'Alice';
let message = `Hello, ${name}!`;
// Creating from other values
let numAsString = String(42);
let boolAsString = String(true);
In Rust:
// String literals (these are &str)
let greeting = "Hello, world!";
// Creating a String
let owned_greeting = String::from("Hello, world!");
let also_owned = "Hello, world!".to_string();
let empty = String::new();
// String interpolation equivalent
let name = "Alice";
let message = format!("Hello, {}!", name);
// Creating from other types
let num_as_string = 42.to_string();
let bool_as_string = true.to_string();
Key differences:
- Rust distinguishes between string literals (
&str
) and owned strings (String
) - No template literal syntax in Rust; use
format!
macro instead - Single quotes in Rust are for
char
types, not strings - To convert values to strings, use the
.to_string()
method orformat!
macro
String vs &str: When to Use Each
Section titled “String vs &str: When to Use Each”Here’s a simple rule of thumb:
- Use
&str
for function parameters when you only need to read the string - Use
String
when you need to own or modify the string
// Takes a string slice - function doesn't need ownership
fn print_info(info: &str) {
println!("Info: {}", info);
}
// Returns an owned String - function creates a new string
fn generate_greeting(name: &str) -> String {
format!("Hello, {}!", name)
}
fn main() {
let name = "Alice";
// Both String and &str can be passed where &str is needed
print_info(name);
let owned_name = String::from("Bob");
print_info(&owned_name); // &String coerces to &str
// Generate and own a new String
let greeting = generate_greeting(name);
println!("{}", greeting);
}
String Concatenation and Building
Section titled “String Concatenation and Building”JavaScript makes string concatenation easy:
// JavaScript concatenation
let greeting = "Hello, ";
let name = "world";
let message = greeting + name + "!";
// Or with +=
let fullMessage = "Hello";
fullMessage += ", ";
fullMessage += "world!";
Rust offers several approaches:
// Using + operator (consumes the first string)
let greeting = String::from("Hello, ");
let name = "world";
let message = greeting + name + "!";
// Note: greeting is moved/consumed here and can't be used again
// Using format! macro (doesn't consume any strings)
let greeting = String::from("Hello, ");
let name = "world";
let message = format!("{}{}{}", greeting, name, "!");
// greeting and name are still valid here
// Using += via push_str() (modifies the original)
let mut full_message = String::from("Hello");
full_message.push_str(", ");
full_message.push_str("world!");
// Adding a single character with push()
let mut hi = String::from("Hi");
hi.push('!');
Key differences:
- The
+
operator in Rust consumes the first string - The second argument for
+
must be a string slice (&str
), not an ownedString
- For multiple concatenations,
format!
is often clearer and more efficient - To mutate a string, you need to declare it as
mut
String Length and Size
Section titled “String Length and Size”JavaScript’s length
property gives the number of UTF-16 code units:
let text = "Hello";
console.log(text.length); // 5
let emoji = "👋";
console.log(emoji.length); // 2 (because emoji uses multiple UTF-16 code units)
Rust provides different methods for different concepts of “length”:
let text = "Hello";
println!("{}", text.len()); // 5 bytes
let emoji = "👋";
println!("{}", emoji.len()); // 4 bytes
println!("{}", emoji.chars().count()); // 1 Unicode scalar value
Key differences:
- Rust’s
.len()
gives the size in bytes, not characters - Use
.chars().count()
to count Unicode scalar values - Rust strings are always UTF-8 encoded
Accessing Characters
Section titled “Accessing Characters”JavaScript allows indexing strings to get characters:
let greeting = "Hello";
let firstChar = greeting[0]; // "H"
// Caution with Unicode!
let emoji = "👋";
console.log(emoji[0]); // Gives a surrogate half, not the actual character
// Better for Unicode:
let firstEmoji = Array.from(emoji)[0]; // "👋"
Rust doesn’t allow direct indexing of strings, for good reason:
let greeting = "Hello";
// let first_char = greeting[0]; // This doesn't compile in Rust!
// Instead, use one of these approaches:
let first_byte = greeting.as_bytes()[0]; // 72 (ASCII for 'H')
// For the first character:
if let Some(first_char) = greeting.chars().next() {
println!("First char: {}", first_char); // "H"
}
// To get a character at a specific position (inefficient for large indices):
let third_char = greeting.chars().nth(2).unwrap(); // 'l'
Key differences:
- Rust doesn’t allow direct string indexing because UTF-8 characters can span multiple bytes
- Access individual bytes with
.as_bytes()[i]
- Access Unicode characters with
.chars()
iterator
Slicing Strings
Section titled “Slicing Strings”JavaScript uses the slice
method:
let greeting = "Hello, world!";
let hello = greeting.slice(0, 5); // "Hello"
Rust uses range syntax for slices, but requires care with UTF-8:
let greeting = "Hello, world!";
let hello = &greeting[0..5]; // "Hello"
// Be careful! This must be at valid UTF-8 character boundaries
// let will_panic = &"👋 hello"[0..2]; // PANICS: index 2 is not a character boundary
Key differences:
- Rust slices must be at valid UTF-8 character boundaries
- Indexing a non-boundary results in a runtime panic, not a silent error
- For safety with Unicode, consider using character-based methods
String Iteration
Section titled “String Iteration”JavaScript iteration methods:
let greeting = "Hello";
// By code units (caution with Unicode!)
for (let i = 0; i < greeting.length; i++) {
console.log(greeting[i]);
}
// By Unicode code points (better for international text)
for (const char of greeting) {
console.log(char);
}
Rust iteration methods:
let greeting = "Hello";
// By bytes
for b in greeting.bytes() {
println!("{}", b); // Prints byte values (72, 101, 108, 108, 111)
}
// By Unicode scalar values (chars)
for c in greeting.chars() {
println!("{}", c); // Prints characters (H, e, l, l, o)
}
// With indices (byte positions, not character positions)
for (i, c) in greeting.char_indices() {
println!("Character '{}' at byte position {}", c, i);
}
String Conversion and Casting
Section titled “String Conversion and Casting”JavaScript automatic conversions:
let num = 42;
let boolVal = true;
let str1 = "Value: " + num; // Converts num to string
let str2 = "Is true: " + boolVal; // Converts bool to string
Rust explicit conversions:
let num = 42;
let bool_val = true;
// Convert to String (owned)
let str1 = format!("Value: {}", num);
let str2 = format!("Is true: {}", bool_val);
// Convert from String to other types
let num_str = "42";
let parsed_num: i32 = num_str.parse().expect("Not a number!");
// More robust error handling
match "42a".parse::<i32>() {
Ok(n) => println!("Successfully parsed: {}", n),
Err(e) => println!("Failed to parse: {}", e),
}
UTF-8 and Unicode Handling
Section titled “UTF-8 and Unicode Handling”JavaScript treats strings as sequences of UTF-16 code units:
let face = "😊";
console.log(face.length); // 2 (UTF-16 surrogate pair)
Rust treats strings as UTF-8 encoded bytes:
let face = "😊";
println!("Bytes: {}", face.len()); // 4 bytes
println!("Characters: {}", face.chars().count()); // 1 character
// Common operations with Unicode
let text = "héllo";
println!("Uppercase: {}", text.to_uppercase());
println!("Lowercase: {}", text.to_lowercase());
// Normalizing Unicode (requires the unicode-normalization crate)
// use unicode_normalization::UnicodeNormalization;
// let normalized = text.nfc().collect::<String>();
String Searching and Replacement
Section titled “String Searching and Replacement”JavaScript offers various search methods:
let sentence = "The quick brown fox";
console.log(sentence.includes("quick")); // true
console.log(sentence.startsWith("The")); // true
console.log(sentence.endsWith("fox")); // true
console.log(sentence.indexOf("brown")); // 10
// Replacement
let replaced = sentence.replace("quick", "swift");
Rust has similar methods:
let sentence = "The quick brown fox";
println!("{}", sentence.contains("quick")); // true
println!("{}", sentence.starts_with("The")); // true
println!("{}", sentence.ends_with("fox")); // true
println!("{}", sentence.find("brown").unwrap_or(0)); // 10
// Replacement (creates a new string)
let replaced = sentence.replace("quick", "swift");
Splitting and Joining Strings
Section titled “Splitting and Joining Strings”JavaScript:
// Split
let comma_str = "apple,banana,orange";
let fruits = comma_str.split(","); // ["apple", "banana", "orange"]
// Join
let joined = fruits.join("-"); // "apple-banana-orange"
Rust:
// Split
let comma_str = "apple,banana,orange";
let fruits: Vec<&str> = comma_str.split(',').collect();
println!("{:?}", fruits); // ["apple", "banana", "orange"]
// Join
let joined = fruits.join("-"); // "apple-banana-orange"
String Ownership and Manipulation
Section titled “String Ownership and Manipulation”JavaScript strings are immutable, but this is handled behind the scenes:
let greeting = "Hello";
greeting = greeting + ", world!"; // Creates a new string behind the scenes
Rust makes ownership explicit:
let greeting = String::from("Hello");
// This consumes greeting and creates a new String
let new_greeting = greeting + ", world!";
// greeting is no longer valid here!
// To keep the original, we could clone it:
let greeting = String::from("Hello");
let new_greeting = greeting.clone() + ", world!";
// Both greeting and new_greeting are valid
Strings and Functions
Section titled “Strings and Functions”JavaScript string methods create new strings:
let original = "Hello, world!";
let upper = original.toUpperCase();
let trimmed = original.trim();
// original is unchanged
Rust string methods also create new strings:
let original = "Hello, world!";
let upper = original.to_uppercase();
let trimmed = original.trim();
// original is unchanged
However, Rust’s ownership system makes function interactions different:
fn process(s: String) {
// Takes ownership of the string
println!("Processing: {}", s);
} // s is dropped here
fn borrow_process(s: &str) {
// Borrows the string
println!("Processing: {}", s);
} // just the reference is dropped, not the string
fn main() {
let owned = String::from("hello");
// process(owned); // This would consume owned
// println!("{}", owned); // Error! owned was moved
// Instead, use references:
borrow_process(&owned); // Borrow, don't take ownership
println!("{}", owned); // Still valid
}
String Performance Considerations
Section titled “String Performance Considerations”For performance-critical code, be aware of these considerations:
// Pre-allocate capacity for efficiency
let mut s = String::with_capacity(100);
for i in 0..100 {
s.push_str(&i.to_string());
}
// Avoid excessive cloning
let base = String::from("Hello");
// Bad: clones on every iteration
for i in 0..1000 {
let message = base.clone() + &i.to_string();
// Use message
}
// Better: reuse a single String
let base = "Hello";
let mut message = String::with_capacity(100);
for i in 0..1000 {
message.clear();
message.push_str(base);
message.push_str(&i.to_string());
// Use message
}
Advanced String Patterns
Section titled “Advanced String Patterns”String Interning
Section titled “String Interning”JavaScript strings are automatically interned by the engine:
let a = "hello";
let b = "hello";
console.log(a === b); // true, they reference the same memory
In Rust, string literals are interned, but String
instances are not:
let a = "hello"; // &str
let b = "hello"; // &str
println!("{}", std::ptr::eq(a, b)); // true, they reference the same memory
let owned_a = String::from("hello");
let owned_b = String::from("hello");
println!("{}", std::ptr::eq(&owned_a, &owned_b)); // false, different allocations
For applications needing string interning, consider crates like string-cache
.
Raw Strings
Section titled “Raw Strings”JavaScript:
let regex = "\\d+"; // Need to escape backslashes
let path = "C:\\Program Files\\App"; // Need to escape backslashes
Rust:
let regex = r"\d+"; // No need to escape backslashes
let path = r"C:\Program Files\App"; // No need to escape backslashes
// For strings with quotes:
let json = r#"{"name": "John", "age": 30}"#;
// For strings that contain #"
let complex = r##"A string with #"quotes"# inside"##;
Common String Operations Cheat Sheet
Section titled “Common String Operations Cheat Sheet”Operation | JavaScript | Rust |
---|---|---|
Create string | let s = "hello" | let s = "hello" (for &str ) let s = String::from("hello") (for String ) |
Concatenate | s1 + s2 | s1 + &s2 (consumes s1) format!("{}{}", s1, s2) |
Get length | s.length | s.len() (bytes) s.chars().count() (characters) |
Uppercase/Lowercase | s.toUpperCase() | s.to_uppercase() |
Trim whitespace | s.trim() | s.trim() |
Contains substring | s.includes("sub") | s.contains("sub") |
Replace | s.replace("old", "new") | s.replace("old", "new") |
Split | s.split(",") | s.split(",") |
Join | array.join(",") | vec.join(",") |
Summary
Section titled “Summary”Strings in Rust are more complex than in JavaScript, but this complexity provides greater control and safety:
- Rust distinguishes between
String
(owned) and&str
(borrowed) types - Rust strings are always valid UTF-8, with explicit handling for Unicode
- Indexing and slicing have safety checks to prevent invalid UTF-8
- The ownership system affects how you pass strings to functions
- Explicit conversion methods replace JavaScript’s implicit conversions
These differences require some adjustment for JavaScript developers, but they prevent entire classes of string-related bugs and make international text handling more reliable.
Next Steps
Section titled “Next Steps”Now that you’ve learned about Rust’s string types, let’s continue our exploration of Rust collections with Hash Maps. Hash maps in Rust are similar to JavaScript objects and Map collections, and we’ll explore their unique features in the next section.