PHP apparently lacks a dedicated function to return the byte length of a string, such as you might need for a Content-Length header. Left alone, strlen() counts bytes, and if your PHP internal character encoding is a single-byte charset (such as latin1 / “ISO-8859-1”) the byte count is also the character count, so strlen() is all you need. This is the default (in the US anyway), but it will of course break should you ever change the internal encoding to a multibyte charset like UTF-8 and use the mbstring.func_overload feature to transparently replace the non-multibyte functions with their multibyte equivalents.
In that case, strlen() becomes mb_strlen(). mb_strlen() is strlen() with support for different charsets, which may use more than one byte for a single character. By default, it uses your internal encoding, but it accepts an explicit character encoding as its second parameter. Since latin1 uses exactly one byte per character, mb_strlen($string, 'ISO-8859-1') will do what we want and count the bytes in $string regardless of your current internal encoding or the string’s apparent encoding.
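To make the difference concrete, here is a small sketch comparing the two counts for the same string. It assumes the mbstring extension is loaded; the sample string and variable names are only for illustration.

```php
<?php
// "é" occupies two bytes in UTF-8. The escape sequence keeps this example
// independent of the encoding the source file happens to be saved in.
$string = "caf\xC3\xA9";

// Decoded as UTF-8, the string has four characters.
$chars = mb_strlen($string, 'UTF-8');

// Decoded as latin1, every byte is one character, so this is the byte length.
$bytes = mb_strlen($string, 'ISO-8859-1');

echo "characters: $chars, bytes: $bytes\n"; // characters: 4, bytes: 5
```

The second call never looks at what the bytes mean, which is exactly why it works as a byte counter.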
Let’s wrap this in a function for clarity, and in case we need to change our counting method because of a change in PHP or some such helpfulness:
function bytelength($string) {
    // Returns: (int) number of bytes in string
    // Use latin1 as an alias for "1 byte per character"
    return mb_strlen($string, 'ISO-8859-1');
}
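As a rough usage sketch, this is how the function might feed a Content-Length header. The content_length_header() helper and the sample body are hypothetical, added here only for illustration, and the mbstring extension is again assumed to be available.

```php
<?php
function bytelength($string) {
    // Returns: (int) number of bytes in string
    return mb_strlen($string, 'ISO-8859-1');
}

// Hypothetical helper (not part of the original function): builds the header line.
function content_length_header($body) {
    return 'Content-Length: ' . bytelength($body);
}

// "résumé" encoded as UTF-8: 6 characters, 8 bytes.
$body = "r\xC3\xA9sum\xC3\xA9";

// Headers must be sent before any output.
header(content_length_header($body)); // sends "Content-Length: 8"
echo $body;
```

Keeping the count behind bytelength() means the header code does not have to change if the counting method ever does.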
