Byte size of a Python string
The byte size of a string is the number of bytes required to encode it in a given encoding, such as UTF-8. This matters when a string contains accented letters, non-Latin characters, or emojis, since each of these may take more than one byte to encode.
While the len() function returns the number of characters (Unicode code points) in a string, it doesn't account for the number of bytes required to encode them. To get the byte size of a string, use the str.encode() method to encode it and then take the length of the resulting bytes object.
def byte_size(s):
    # Encode to UTF-8 and count the resulting bytes
    return len(s.encode('utf-8'))

byte_size('š') # 2
byte_size('😀') # 4
byte_size('Hello World') # 11
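Because the result depends on the encoding chosen, it can help to compare a few side by side. The following is a minimal sketch, not part of the original snippet: byte_size_in is a hypothetical variant of byte_size that takes the encoding as a parameter, and the sample string is chosen only for illustration.

```python
def byte_size_in(s, encoding="utf-8"):
    # Hypothetical helper: same idea as byte_size, but the encoding
    # is passed in instead of being fixed to UTF-8.
    return len(s.encode(encoding))

text = "h\u00e9llo"  # "héllo", with a precomposed 'é' (U+00E9)

print(len(text))                        # 5 code points
print(byte_size_in(text))               # 6 bytes in UTF-8 ('é' takes 2)
print(byte_size_in(text, "utf-16-le"))  # 10 bytes (2 per character here)
print(byte_size_in(text, "latin-1"))    # 5 bytes (1 per character)
```

Note that str.encode() raises UnicodeEncodeError when the target encoding cannot represent a character, so an encoding such as latin-1 only works for strings within its range.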