PHP's strlen in JavaScript

Here’s what our current JavaScript equivalent to PHP's strlen looks like.

module.exports = function strlen (string) {
  //  discuss at: https://locutus.io/php/strlen/
  // original by: Kevin van Zonneveld (https://kvz.io)
  // improved by: Sakimori
  // improved by: Kevin van Zonneveld (https://kvz.io)
  //    input by: Kirk Strobeck
  // bugfixed by: Onno Marsman (https://twitter.com/onnomarsman)
  //  revised by: Brett Zamir (https://brett-zamir.me)
  //      note 1: May look like overkill, but in order to be truly faithful to handling all Unicode
  //      note 1: characters and to this function in PHP which does not count the number of bytes
  //      note 1: but counts the number of characters, something like this is really necessary.
  //   example 1: strlen('Kevin van Zonneveld')
  //   returns 1: 19
  //   example 2: ini_set('unicode.semantics', 'on')
  //   example 2: strlen('A\ud87e\udc04Z')
  //   returns 2: 3

  const str = string + ''

  const iniVal = (typeof require !== 'undefined' ? require('../info/ini_get')('unicode.semantics') : undefined) || 'off'
  if (iniVal === 'off') {
    return str.length
  }

  let i = 0
  let lgth = 0

  const getWholeChar = function (str, i) {
    const code = str.charCodeAt(i)
    let next = ''
    let prev = ''
    if (code >= 0xD800 && code <= 0xDBFF) {
      // High surrogate (could change last hex to 0xDB7F to
      // treat high private surrogates as single characters)
      if (str.length <= (i + 1)) {
        throw new Error('High surrogate without following low surrogate')
      }
      next = str.charCodeAt(i + 1)
      if (next < 0xDC00 || next > 0xDFFF) {
        throw new Error('High surrogate without following low surrogate')
      }
      return str.charAt(i) + str.charAt(i + 1)
    } else if (code >= 0xDC00 && code <= 0xDFFF) {
      // Low surrogate
      if (i === 0) {
        throw new Error('Low surrogate without preceding high surrogate')
      }
      prev = str.charCodeAt(i - 1)
      if (prev < 0xD800 || prev > 0xDBFF) {
        // (could change last hex to 0xDB7F to treat high private surrogates
        // as single characters)
        throw new Error('Low surrogate without preceding high surrogate')
      }
      // We can pass over low surrogates now as the second
      // component in a pair which we have already processed
      return false
    }
    return str.charAt(i)
  }

  for (i = 0, lgth = 0; i < str.length; i++) {
    if ((getWholeChar(str, i)) === false) {
      continue
    }
    // Adapt this line at the top of any loop, passing in the whole string and
    // the current iteration and returning a variable to represent the individual character;
    // purpose is to treat the first part of a surrogate pair as the whole character and then
    // ignore the second part
    lgth++
  }

  return lgth
}

How to use

You can install via npm install locutus and require it via require('locutus/php/strings/strlen'). You could also require the strings module in full so that you could access strings.strlen instead.
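As a rough illustration of the require step, the sketch below loads the function and calls it once. It assumes npm install locutus has already been run in the project; the try/catch fallback is purely hypothetical scaffolding so the snippet still runs where the package is missing, and it only mimics the default behavior (unicode.semantics off).

```javascript
// Minimal usage sketch. Assumes `npm install locutus` has been run.
let strlen
try {
  // Path as given in the text above.
  strlen = require('locutus/php/strings/strlen')
} catch (err) {
  // Hypothetical stand-in, NOT part of Locutus: mirrors only the default
  // branch of the function (unicode.semantics off), i.e. String#length.
  strlen = function (string) {
    return (string + '').length
  }
}

console.log(strlen('Kevin van Zonneveld'))
```

Requiring the single file keeps a bundle small; requiring the whole strings module is more convenient when several string functions are needed.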

If you intend to target the browser, you can then use a module bundler such as Parcel, webpack, Browserify, or rollup.js. This can be important because Locutus allows modern JavaScript in the source files, meaning it may not work in all browsers without a build/transpile step. Locutus does transpile all functions to ES5 before publishing to npm.

A community effort

Not unlike Wikipedia, Locutus is an ongoing community effort. Our philosophy follows The McDonald’s Theory. This means that we don't consider it to be a bad thing that many of our functions are first iterations, which may still have their fair share of issues. We hope that these flaws will inspire others to come up with better ideas.

This way of working also means that we don't offer any production guarantees, and recommend using Locutus for inspiration and learning purposes only.

Notes

  • This may look like overkill, but something like it is necessary to stay faithful both to handling all Unicode characters and to this function's behavior in PHP when unicode.semantics is on, where it counts characters rather than bytes.
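To make the bytes-versus-characters distinction concrete, here is a small sketch that measures the same string three ways. The helper names (countCodeUnits, countCodePoints, countUtf8Bytes) are hypothetical and not part of Locutus. The code-point counter relies on the built-in string iterator rather than the manual surrogate checks in the function above, so unlike that function it does not throw on an unpaired surrogate; it simply counts it as one character.

```javascript
// Hypothetical helpers for illustration only; not part of Locutus.

// UTF-16 code units: what String#length reports.
function countCodeUnits (str) {
  return String(str).length
}

// Unicode code points: the string iterator yields a surrogate pair
// as a single element, so each pair is counted once.
function countCodePoints (str) {
  let count = 0
  for (const _ch of String(str)) {
    count++
  }
  return count
}

// UTF-8 bytes, using the standard TextEncoder (a global in browsers
// and in current Node.js releases).
function countUtf8Bytes (str) {
  return new TextEncoder().encode(String(str)).length
}

const sample = 'A\ud87e\udc04Z'
console.log(countCodeUnits(sample)) // 4
console.log(countCodePoints(sample)) // 3
console.log(countUtf8Bytes(sample)) // 6
```

For plain ASCII input such as 'Kevin van Zonneveld', all three measures agree at 19, which is why the difference only shows up with characters outside the Basic Multilingual Plane or other multi-byte text.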

Examples

Please note that these examples are distilled from test cases that automatically verify our functions still work correctly. This could explain some quirky ones.

#  code                                                          expected result
1  strlen('Kevin van Zonneveld')                                 19
2  ini_set('unicode.semantics', 'on') strlen('A\ud87e\udc04Z')   3
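The second example returns 3 rather than 4 because \ud87e\udc04 is a surrogate pair encoding a single character. A minimal sketch that breaks the string apart, using only built-in string methods (the variable names are illustrative):

```javascript
const sample = 'A\ud87e\udc04Z'

// One entry per UTF-16 code unit, as hex.
const codeUnits = []
for (let i = 0; i < sample.length; i++) {
  codeUnits.push(sample.charCodeAt(i).toString(16))
}

// One entry per code point, as hex. Array.from walks the string with
// its iterator, which keeps surrogate pairs together.
const codePoints = Array.from(sample, function (ch) {
  return ch.codePointAt(0).toString(16)
})

console.log(codeUnits) // [ '41', 'd87e', 'dc04', '5a' ]
console.log(codePoints) // [ '41', '2f804', '5a' ]
```

The high surrogate d87e and low surrogate dc04 combine into the single code point U+2F804, so the string holds four code units but three characters.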
