JavaScript RegExp non-capturing groups


I am writing a set of RegExps to translate a CSS selector into arrays of ids and classes.

For example, I would like '#foo#bar' to return ['foo', 'bar'].

I have been trying to achieve this with

"#foo#bar".match(/((?:#)[a-zA-Z0-9\-_]*)/g)

but it returns ['#foo', '#bar'], when the non-capturing prefix ?: should ignore the # character.

Is there a better solution than slicing each one of the returned strings?

You could use .replace() or .exec() in a loop to build an Array.

With .replace():

var arr = [];
"#foo#bar".replace(/#([a-zA-Z0-9\-_]*)/g, function(s, g1) {
    arr.push(g1);
});

With .exec():

var arr = [],
    s = "#foo#bar",
    re = /#([a-zA-Z0-9\-_]*)/g,
    item;

while (item = re.exec(s))
    arr.push(item[1]);

It matches #foo and #bar because the outer group (#1) is capturing. The inner group (#2) is not, but that's probably not what you are checking.

If you were not using global matching mode, an immediate fix would be to use /(?:#)([a-zA-Z0-9\-_]*)/ instead.

In global matching mode the result cannot be had in just one line, because match() then returns only the full matches and drops the capture groups. Using regular expressions only (i.e. no string operations), you would need to do it this way:

var re = /(?:#)([a-zA-Z0-9\-_]*)/g;
var matches = [], match;
while (match = re.exec("#foo#bar")) {
    matches.push(match[1]);
}
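
To make the difference concrete, here is a minimal sketch comparing match() with and without the g flag on the same pattern. The variable names withoutFlag and withFlag are only illustrative.

```javascript
// Without g: match() returns the first full match followed by its capture groups.
const withoutFlag = "#foo#bar".match(/#([a-zA-Z0-9\-_]*)/);
console.log(withoutFlag[0]); // "#foo" (the full match)
console.log(withoutFlag[1]); // "foo"  (capture group 1)

// With g: match() returns every full match and no capture groups at all.
const withFlag = "#foo#bar".match(/#([a-zA-Z0-9\-_]*)/g);
console.log(withFlag); // ["#foo", "#bar"]
```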



I'm not sure if you can do that using match(), but you can do it by using the RegExp's exec() method:

// Inside a string literal the backslash must be doubled to reach the RegExp.
var pattern = new RegExp('#([a-zA-Z0-9\\-_]+)', 'g');
var matches, ids = [];

while (matches = pattern.exec('#foo#bar')) {
    ids.push( matches[1] ); // -> 'foo' and then 'bar'
}

Unfortunately, at the time of writing there is no lookbehind assertion in JavaScript RegExp; otherwise you could do this:

/(?<=#)[a-zA-Z0-9\-_]*/g

Unless it is added in a newer version of JavaScript, I think post-processing the matches (slicing off the #) is your best bet.


You can use a negative lookahead assertion:

"#foo#bar".match(/(?!#)[a-zA-Z0-9\-_]+/g);  // ["foo", "bar"]

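One caveat with the lookahead version, shown as a small sketch: the character class already excludes #, so the pattern matches any run of name characters, whether or not a # precedes it. The input "#foo.baz" and the variable names are only illustrative.

```javascript
const lookahead = /(?!#)[a-zA-Z0-9\-_]+/g;

// Works for a selector made only of ids.
const onlyIds = "#foo#bar".match(lookahead);
console.log(onlyIds); // ["foo", "bar"]

// But a class name is picked up as well, because nothing requires a leading #.
const idAndClass = "#foo.baz".match(lookahead);
console.log(idAndClass); // ["foo", "baz"]
```
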
The lookbehind assertion mentioned some years ago by mVChr was added in ECMAScript 2018. It allows you to do this:

'#foo#bar'.match(/(?<=#)[a-zA-Z0-9\-_]*/g)  // ["foo", "bar"]

(A negative lookbehind is also possible: (?<!#) asserts that the preceding character is not #, without consuming or capturing anything.)
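
Applied to the original goal of splitting a selector into ids and classes, the lookbehind approach could look roughly like this. It is only a sketch: parseSelector is a hypothetical helper, it assumes an engine with ECMAScript 2018 lookbehind support, and it handles nothing beyond plain #id and .class tokens.

```javascript
// Hypothetical helper: collect the ids and classes of a simple selector.
function parseSelector(selector) {
    // match() returns null when nothing matches, hence the || [] fallback.
    const ids = selector.match(/(?<=#)[a-zA-Z0-9\-_]+/g) || [];
    const classes = selector.match(/(?<=\.)[a-zA-Z0-9\-_]+/g) || [];
    return { ids: ids, classes: classes };
}

const parsed = parseSelector("#foo.baz#bar");
console.log(parsed.ids);     // ["foo", "bar"]
console.log(parsed.classes); // ["baz"]
```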


MDN documents that "Capture groups are ignored when using match() with the global /g flag", and recommends using matchAll(). At the time of writing, matchAll() isn't available on Edge or Safari iOS, and you still need to skip the complete match (including the #).
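
For reference, where matchAll() is available, a minimal sketch might look like this. The variable name idsFromMatchAll is only illustrative.

```javascript
// matchAll() requires the g flag and yields one match object per hit.
// Index 0 of each is the complete match ("#foo"); index 1 is capture group 1.
const idsFromMatchAll = Array.from(
    "#foo#bar".matchAll(/#([a-zA-Z0-9\-_]+)/g),
    m => m[1]
);
console.log(idsFromMatchAll); // ["foo", "bar"]
```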

A simpler solution is to slice off the leading prefix, if you know its length - here, 1 for #.

const results = ('#foo#bar'.match(/#\w+/g) || []).map(s => s.slice(1));
console.log(results);

The ... || [] part is necessary in case there is no match: match then returns null, and calling null.map throws a TypeError.

const results = ('nothing matches'.match(/#\w+/g) || []).map(s => s.slice(1));
console.log(results);