JavaScript split String with white space


I would like to split a String but I would like to keep white space like:

var str = "my car is red";

var stringArray [];

stringArray [0] = "my";
stringArray [1] = " ";
stringArray [2] = "car";
stringArray [3] = " ";
stringArray [4] = "is";
stringArray [5] = " ";
stringArray [6] = "red";

How I can proceed to do that?

Thanks !

You could split the string on the whitespace and then re-add it, since you know its in between every one of the entries.

var string = "text to split";
    string = string.split(" ");
var stringArray = new Array();
for(var i =0; i < string.length; i++){
    stringArray.push(string[i]);
    if(i != string.length-1){
        stringArray.push(" ");
    }
}

Update: Removed trailing space.


Using regex:

var str   = "my car is red";
var stringArray = str.split(/(\s+)/);

console.log(stringArray); // ["my", " ", "car", " ", "is", " ", "red"] 

\s matches any character that is a whitespace, adding the plus makes it greedy, matching a group starting with characters and ending with whitespace, and the next group starts when there is a character after the whitespace etc.


For split string by space like in Python lang, can be used:

   var w = "hello    my brothers    ;";
   w.split(/(\s+)/).filter( function(e) { return e.trim().length > 0; } );

output:

   ["hello", "my", "brothers", ";"]

or similar:

   w.split(/(\s+)/).filter( e => e.trim().length > 0)

(output some)


You can just split on the word boundary using \b. See MDN

"\b: Matches a zero-width word boundary, such as between a letter and a space."

You should also make sure it is followed by whitespace \s. so that strings like "My car isn't red" still work:

var stringArray = str.split(/\b(\s)/);

The initial \b is required to take multiple spaces into account, e.g. my car is red

EDIT: Added grouping


Although this is not supported by all browsers, if you use capturing parentheses inside your regular expression then the captured input is spliced into the result.

If separator is a regular expression that contains capturing parentheses, then each time separator is matched, the results (including any undefined results) of the capturing parentheses are spliced into the output array. [reference)

So:

var stringArray = str.split(/(\s+)/);
                             ^   ^
//

Output:

["my", " ", "car", " ", "is", " ", "red"]

This collapses consecutive spaces in the original input, but otherwise I can't think of any pitfalls.


str.split(' ').join('§ §').split('§');