This post is part of my Today I learned series in which I share all my web development learnings.

Unicode is such an interesting topic, and it feels like there are new things to discover every day. Today was one of those days. I was reading a blog post and came across the u flag. I hadn't seen this regular expression flag before, so I ended up reading Axel's chapter on the topic in "Exploring ES6".

So what's this u flag?

In JavaScript, we've got the "problem" that strings are represented in UTF-16, which means that not every character fits into a single 16-bit code unit. Characters with code points above U+FFFF need two. This leads to surprising length values for certain strings, and it becomes tricky when you deal with surrogate pairs.

In short: a surrogate pair is two UTF-16 code units that together represent a single code point.
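To make that concrete, here is a small sketch that takes one emoji apart. The variable names are only for illustration; the methods used (charCodeAt, codePointAt and string iteration) are standard JavaScript.

```javascript
const smiley = '\u{1F60A}'; // 😊, code point U+1F60A

// Two UTF-16 code units: a high surrogate followed by a low surrogate
const codeUnits = [smiley.charCodeAt(0), smiley.charCodeAt(1)]
  .map((unit) => unit.toString(16));
console.log(codeUnits); // ['d83d', 'de0a']

// codePointAt reads the whole pair and returns the actual code point
const codePoint = smiley.codePointAt(0).toString(16);
console.log(codePoint); // '1f60a'

// length counts code units, string iteration walks code points
const unitCount = smiley.length;
const pointCount = [...smiley].length;
console.log(unitCount, pointCount); // 2 1
```

That gap between code units and code points is exactly what trips up regular expressions without the u flag.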


Should the period (.) in a regular expression match a character that needs two code units, then? This is where the u flag comes into play.

Let's have a look at an example:

const emoji = '\u{1F60A}'; // "smiling face with smiling eyes" / "😊"
emoji.length               // 2 -> it's a surrogate pair
/^.$/.test(emoji)          // false
/^.$/u.test(emoji)         // true

Unicode mode (//u) also enables code point escape sequences (\u{1F42A}) in regular expressions, which help when dealing with surrogate pairs. Without the u flag, \u{1F42A} is not read as a code point, so you have to spell out both surrogates instead.

const emoji = '\u{1F42A}';  // "🐪"
/\u{1F42A}/.test(emoji);    // false
/\uD83D\uDC2A/.test(emoji); // true
/\u{1F42A}/u.test(emoji);   // true

Unicode mode helps deal with Unicode in Regular Expressions. Read Axel's book chapter or Mathias Bynens' article on the topic if you want to learn more. Have fun!
