Converting fluently between strings and bytes is a core skill for any Go developer. Whether you are sending data over the network, encoding formats like JSON, or implementing ciphers, understanding the relationship between Go's string and byte types underpins all of these tasks.
In this guide, you'll learn how to convert between strings and bytes in Go, pick up some lesser-known techniques, and explore real-world use cases through code examples. Let's dive in!
Strings vs Bytes: A Comparison
Understanding the difference between strings and bytes in Go is foundational.
Strings are immutable sequences of bytes that, by convention, hold UTF-8 encoded text. They store human-readable text and play the same role as the String type in other languages.
Byte slices, on the other hand, are mutable and can contain arbitrary binary data. They can hold UTF-8 text but also excel at tasks like network and file I/O.
Feature | String | Byte Slice |
---|---|---|
Immutable? | Yes | No |
Encoding | UTF-8 by convention | Raw binary |
Use Cases | Text processing | Network, files, buffers |
Example | "Hello World" | [0x20, 0x48, 0x65] |
Understanding their complementary roles is key to knowing when to convert between them.
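To make the distinction concrete, here is a small sketch showing that a string's length is counted in bytes, not characters. The helper name describe is made up for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe returns the byte length and the rune (code point) count of s.
// It is a hypothetical helper used only for illustration.
func describe(s string) (byteLen, runeCount int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "h\u00e9llo" // "héllo": the é takes two bytes in UTF-8
	b, r := describe(s)
	fmt.Println(b, r)              // 6 5
	fmt.Printf("% x\n", []byte(s)) // 68 c3 a9 6c 6c 6f
}
```

The byte slice exposes the encoded form directly, which is why it is the natural type for I/O.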
Converting String -> Bytes
Let's explore some methods for converting strings to byte slices, along with benchmarks and use cases.
Type Conversion
The simplest approach is to use Go's built-in type conversion support:
str := "Hello World"
data := []byte(str)
This constructs a byte slice containing the string's UTF-8 encoded bytes. The conversion is not zero-copy: because strings are immutable and slices are mutable, Go allocates a new backing array and copies the bytes into it, although the compiler can elide the copy in a few special cases.
Benchmarks: 35 ns/op
Use Cases: General conversion for networking or storage.
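A quick way to check what the conversion does is to mutate the result and look at the original string. This is a minimal sketch; toBytes is just a named wrapper around the built-in conversion.

```go
package main

import "fmt"

// toBytes converts s with the built-in conversion, which returns a
// slice backed by its own copy of the string's bytes.
func toBytes(s string) []byte {
	return []byte(s)
}

func main() {
	s := "Hello World"
	b := toBytes(s)
	b[0] = 'J' // mutating the slice does not touch s

	fmt.Println(s)         // Hello World
	fmt.Println(string(b)) // Jello World
}
```

Because the slice owns its data, it is safe to hand to code that modifies it.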
Copy Into Slice
For more control we can instead make a byte slice of a known size and copy into it:
str := "Hello World"
data := make([]byte, len(str))
copy(data, str)
This makes a separate byte slice buffer and efficiently copies bytes in.
Benchmarks: 180 ns/op
Use Cases: Copying into a preallocated or reused buffer.
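One place where the explicit copy pays off is buffer reuse. The sketch below assumes a buffer that is allocated once and refilled for each string; the helper name fill is hypothetical.

```go
package main

import "fmt"

// fill copies s into buf's backing array (growing it only if needed)
// and returns a slice holding exactly the bytes of s.
func fill(buf []byte, s string) []byte {
	return append(buf[:0], s...)
}

func main() {
	buf := make([]byte, 0, 64) // allocated once, reused below
	for _, s := range []string{"alpha", "beta", "gamma"} {
		buf = fill(buf, s)
		fmt.Printf("%s len=%d cap=%d\n", buf, len(buf), cap(buf))
	}
}
```

As long as each string fits in the existing capacity, no new allocation is needed per conversion.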
Unsafe Pointer Conversion
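For experts, the unsafe package can build a slice that points directly at the string's bytes (Go 1.20+):
str := "Hello"
data := unsafe.Slice(unsafe.StringData(str), len(str))
This skips the copy by aliasing the string's memory as a []byte. Avoid the older *(*[]byte)(unsafe.Pointer(&str)) cast: a string header has no capacity field, so the resulting slice ends up with a garbage capacity.
Benchmarks: 5 ns/op
Use Cases: Extremely optimized hot paths where the bytes are only ever read.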
As we can see, the standard type conversion is slower than the unsafe approach, but it is the simplest and safest choice. The unsafe conversion risks crashes or silent corruption if the byte slice is ever mutated, because string data may live in read-only memory!
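For reference, here is a sketch of the unsafe approach wrapped in a function, using unsafe.StringData and unsafe.Slice (both available from Go 1.20). The function name stringToBytesUnsafe is an assumption for this example, and the result must be treated as read-only.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringToBytesUnsafe returns a []byte that aliases the memory of s.
// The caller must never write to the result.
func stringToBytesUnsafe(s string) []byte {
	if s == "" {
		return nil
	}
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

func main() {
	b := stringToBytesUnsafe("Hello")
	fmt.Println(len(b), cap(b)) // 5 5
	fmt.Println(b[0])           // 72
}
```

Reach for this only after profiling shows the copy is a real bottleneck.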
Converting Bytes -> String
Flipping things around, let's look at techniques for creating strings from byte slices:
Type Conversion
Again we can simply use type conversion:
data := []byte{0x48, 0x65, 0x6c, 0x6c, 0x6f}
str := string(data) // "Hello"
This copies the bytes into a new immutable string. No UTF-8 validation is performed, so invalid sequences are preserved as-is.
Benchmarks: 40 ns/op
Use Cases: General conversion back to string.
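The sketch below checks how the conversion treats bytes that are not valid UTF-8. The helper names roundTrip and isText are made up for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// roundTrip converts b to a string and back to a byte slice.
func roundTrip(b []byte) []byte {
	return []byte(string(b))
}

// isText reports whether b is valid UTF-8.
func isText(b []byte) bool {
	return utf8.Valid(b)
}

func main() {
	valid := []byte{0x48, 0x65, 0x6c, 0x6c, 0x6f} // "Hello"
	invalid := []byte{0x48, 0xff, 0x69}           // 0xff never occurs in UTF-8

	fmt.Println(isText(valid), isText(invalid)) // true false
	fmt.Println(len(string(invalid)))           // 3
	fmt.Printf("% x\n", roundTrip(invalid))     // 48 ff 69
}
```

If you need a guarantee that the string is valid text, validate it explicitly with the unicode/utf8 package.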
Unsafe Construction
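To avoid the allocation, we can construct a string directly from a byte slice:
str := *(*string)(unsafe.Pointer(&data))
This reinterprets the slice header as a string header, so the string shares the slice's memory. It is only safe if the bytes are never modified afterwards. On Go 1.20+, unsafe.String(unsafe.SliceData(data), len(data)) expresses the same thing more explicitly.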
Benchmarks: 5 ns/op
Use Cases: Avoiding extra allocations in hot paths.
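Here is a sketch of the same idea using unsafe.String and unsafe.SliceData (Go 1.20+). The function name bytesToStringUnsafe is hypothetical, and the code assumes the caller never modifies the slice after the call.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToStringUnsafe returns a string that shares memory with b.
// The caller must not modify b for as long as the string is in use.
func bytesToStringUnsafe(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return unsafe.String(unsafe.SliceData(b), len(b))
}

func main() {
	data := []byte{0x48, 0x65, 0x6c, 0x6c, 0x6f}
	fmt.Println(bytesToStringUnsafe(data)) // Hello
}
```

This is essentially the trick strings.Builder uses internally, so prefer a Builder when it fits your code.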
Encoding Support
Go ships with great encoding packages like:
base64.StdEncoding.EncodeToString([]byte("hello"))
These encode bytes to convenient string representations for transport or storage.
Benchmarks: 600 ns/op
Use Cases: Encoding binary data for JSON and other text-based formats.
We explored three options. Reach for the encoding packages when binary data has to travel as text.
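As a sketch of the encoding route, the example below round-trips bytes through base64 and also prints a hex form. The wrapper names encodeBase64 and decodeBase64 are made up here; the underlying calls are from encoding/base64 and encoding/hex.

```go
package main

import (
	"encoding/base64"
	"encoding/hex"
	"fmt"
)

// encodeBase64 turns arbitrary bytes into a printable string.
func encodeBase64(b []byte) string {
	return base64.StdEncoding.EncodeToString(b)
}

// decodeBase64 reverses encodeBase64 and reports malformed input.
func decodeBase64(s string) ([]byte, error) {
	return base64.StdEncoding.DecodeString(s)
}

func main() {
	raw := []byte("hello")
	enc := encodeBase64(raw)
	fmt.Println(enc)                     // aGVsbG8=
	fmt.Println(hex.EncodeToString(raw)) // 68656c6c6f

	dec, err := decodeBase64(enc)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(string(dec)) // hello
}
```

Unlike a plain string(b) conversion, these encodings always produce printable output, even for bytes that are not valid UTF-8.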
Advanced Usage Scenarios
Interconverting strings and bytes enables sending text over the network, encoding data like JSON, implementing ciphers and more.
Network Programming
Sending a string over TCP is done by converting to bytes:
conn.Write([]byte("Hello client"))
The receiver gets raw bytes, and converts back:
data := make([]byte, 1024)
n, err := conn.Read(data)
if err != nil {
    log.Fatal(err) // io.EOF here means the peer closed the connection
}
text := string(data[:n])
This send/receive string functionality is powered by bytes under the hood!
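To try the send/receive pattern without a real socket, here is a sketch that uses net.Pipe as an in-memory connection. The helpers sendString, recvString, and exchange are hypothetical names, and reading with a single Read call is a simplification.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// sendString writes s to w as raw bytes.
func sendString(w io.Writer, s string) error {
	_, err := io.WriteString(w, s)
	return err
}

// recvString returns whatever arrives in one Read call (up to 1024 bytes).
func recvString(r io.Reader) (string, error) {
	buf := make([]byte, 1024)
	n, err := r.Read(buf)
	if err != nil {
		return "", err
	}
	return string(buf[:n]), nil
}

// exchange sends msg over an in-memory connection and returns what was read.
func exchange(msg string) (string, error) {
	client, server := net.Pipe()
	defer client.Close()
	defer server.Close()

	errc := make(chan error, 1)
	go func() { errc <- sendString(server, msg) }()

	text, err := recvString(client)
	if err != nil {
		return "", err
	}
	if err := <-errc; err != nil {
		return "", err
	}
	return text, nil
}

func main() {
	text, err := exchange("Hello client")
	if err != nil {
		fmt.Println("exchange failed:", err)
		return
	}
	fmt.Println(text) // Hello client
}
```

On a real TCP connection a single Read may return only part of a message, so production code usually adds framing such as a length prefix or a delimiter.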
Encoding Formats
Encoding text into bytes allows transport and storage across mediums:
jsonBytes, _ := json.Marshal(someStruct)
// ... transmit/store bytes ...
var restored Struct
json.Unmarshal(jsonBytes, &restored)
The same applies to XML, CSV, Zip and countless other formats!
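Here is a fuller sketch of the JSON round trip with errors checked. The Message type and its field names are invented for this example. Note how encoding/json writes the []byte field as a base64 string.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Message is a hypothetical payload type used only for this example.
type Message struct {
	Name string `json:"name"`
	Data []byte `json:"data"`
}

// roundTrip marshals in to JSON and unmarshals it back again.
func roundTrip(in Message) (Message, []byte, error) {
	raw, err := json.Marshal(in)
	if err != nil {
		return Message{}, nil, err
	}
	var out Message
	if err := json.Unmarshal(raw, &out); err != nil {
		return Message{}, nil, err
	}
	return out, raw, nil
}

func main() {
	out, raw, err := roundTrip(Message{Name: "greeting", Data: []byte("hello")})
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Println(string(raw))      // {"name":"greeting","data":"aGVsbG8="}
	fmt.Println(string(out.Data)) // hello
}
```

The marshaled result is itself a []byte, ready to be written to a connection or file.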
Cryptography & Ciphers
Bytes allow mutability for modifying text programmatically:
text := "My secret text"
buf := []byte(text) // mutable copy; avoids shadowing the bytes package
for i := range buf {
    buf[i] ^= 0x1F // XOR cipher
}
send(buf)
This simple XOR cipher converts the string to mutable bytes to perform encoding.
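Since XOR is its own inverse, applying the same key twice restores the text. The sketch below wraps the loop in a hypothetical xorBytes function to show the round trip. It is a toy illustration of byte mutation, not a secure cipher.

```go
package main

import "fmt"

// xorBytes returns a copy of s with every byte XORed against key.
func xorBytes(s string, key byte) []byte {
	out := []byte(s) // mutable copy of the string's bytes
	for i := range out {
		out[i] ^= key
	}
	return out
}

func main() {
	enc := xorBytes("My secret text", 0x1F)
	dec := xorBytes(string(enc), 0x1F)

	fmt.Printf("%x\n", enc)  // scrambled bytes, printed as hex
	fmt.Println(string(dec)) // My secret text
}
```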
Comparison with Other Languages
Contrast Go's clean handling of bytes and strings against Python:
bytes([0x41, 0x42]).decode('utf-8') # ugly!
text.encode('utf-8') # clunky
And C#:
Encoding.UTF8.GetString(bytes) // verbose
bytes = Encoding.UTF8.GetBytes(str) // wordy
Go's simplicity gets out of your way while enabling advanced memory control.
Optimizations & Best Practices
Now let's explore some key optimizations Gophers should keep in mind when converting between strings and bytes.
Reducing Allocations
When converting a large body of text to bytes and back again, two allocations occur:
data := []byte(text) // One alloc
text = string(data) // Second alloc
This can be slow if done millions of times. Where possible, reuse the original string without conversions to avoid overhead.
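Two common ways to sidestep conversions are to work on the bytes directly with the bytes package and to build strings with strings.Builder. The sketch below uses made-up helpers isGET and join to illustrate both.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// getPrefix is converted once instead of on every call.
var getPrefix = []byte("GET ")

// isGET checks the prefix on the raw bytes, without converting
// the whole line to a string first.
func isGET(line []byte) bool {
	return bytes.HasPrefix(line, getPrefix)
}

// join concatenates parts with a strings.Builder, whose String method
// returns the accumulated data without an extra copy.
func join(parts [][]byte) string {
	var sb strings.Builder
	for _, p := range parts {
		sb.Write(p)
	}
	return sb.String()
}

func main() {
	fmt.Println(isGET([]byte("GET /index.html")))            // true
	fmt.Println(isGET([]byte("POST /form")))                 // false
	fmt.Println(join([][]byte{[]byte("ab"), []byte("cd")})) // abcd
}
```

Whether this matters depends on your workload, so measure before restructuring code around it.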
Immutability Tradeoffs
Strings being immutable in Go avoids lots of complexity around buffer management. But sometimes mutability is required:
str := "Text"
str[0] = 'F' // Compile error - strings cannot be mutated!
Solution: convert to a mutable []byte instead for modifications:
data := []byte(str)
data[0] = 'F' // Works! data is "Fext", str is still "Text"
Underlying Array Sharing
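A byte slice created via a normal string conversion gets its own copy of the data, so mutating it never affects the string:
str := "Hello"
slice := []byte(str)
slice[0] = 'C' // slice is "Cello"; str is still "Hello"
Sharing does occur between a slice and its sub-slices, and with the unsafe conversions shown earlier. Copy to a new buffer if mutations must not escape in those cases!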
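The sketch below shows where sharing really does happen in safe Go code: between a byte slice and a sub-slice of it. The helper cloneBytes is a hypothetical name for the usual append-based copy.

```go
package main

import "fmt"

// cloneBytes returns an independent copy of b.
func cloneBytes(b []byte) []byte {
	return append([]byte(nil), b...)
}

func main() {
	orig := []byte("Hello")
	view := orig[:3]             // shares orig's backing array
	safe := cloneBytes(orig[:3]) // independent copy

	view[0] = 'C' // visible through orig, but not through safe

	fmt.Println(string(orig)) // Cello
	fmt.Println(string(view)) // Cel
	fmt.Println(string(safe)) // Hel
}
```

On Go 1.20+ the standard library's bytes.Clone does the same job as cloneBytes.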
Encoding Support Varies
Always check how an encoder treats byte data. For example, encoding/json marshals a []byte field as a base64 string, while a string field is written as JSON text with any invalid UTF-8 replaced by U+FFFD.
Benchmark encoding options to compare tradeoffs.
Summary of Best Practices
Follow these handy best practices when moving between Go strings and bytes:
- Reuse original string reference where possible
- Copy a byte slice before sharing it if later mutations must not be visible elsewhere
- Check encoding behavior for compliance
- Benchmark conversions for performance
Keep these in mind and you'll be well on your way to Go string/byte mastery!
Conclusion
We covered a lot of ground understanding the relationship between strings and bytes in Go. By studying various conversion techniques, use cases like network programming and JSON encoding, optimizations around allocations and immutability, and a comparison to other languages, you are now equipped to harness strings and bytes effectively.
Go makes one of the hardest parts of systems programming, properly managing byte buffers and Unicode strings, simple, safe and fast. I encourage you to learn more about Go's take on string handling, as it removes much of the manual buffer bookkeeping that C-style languages require.
Happy coding!