JavaScript regular expressions: lastIndex and other properties

October 31, 2023

lastIndex

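lastIndex is a readable and writable integer property of a regular expression object that specifies the index at which the next match will start. It only takes effect when the regular expression uses the "g" (global) flag or the "y" (sticky) flag.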

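After each match attempt, lastIndex is automatically updated to the index where the next match should start. It can also be set manually to control where the next match begins.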

At the start of each match attempt, if lastIndex is less than or equal to the length of the string, the regular expression starts matching at lastIndex.

If the match fails, or if lastIndex is greater than the length of the string, lastIndex is reset to 0, so the next attempt starts from the beginning of the string.

If the match succeeds, lastIndex is set to the position immediately after the end of the matched text.
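
The rules above can be traced with a short sketch. The pattern and the input string here are hypothetical samples chosen only for illustration: lastIndex is first set by hand, then updated by each successful match, and finally reset to 0 when a match fails.

```javascript
// Hypothetical sample pattern and input, chosen only for illustration.
const re = /o/g;
const str = 'foo boo';

re.lastIndex = 3;                 // skip "foo" by starting the search at index 3
const first = re.exec(str);       // finds the "o" at index 5
console.log(first.index, re.lastIndex);
// 5 6

const second = re.exec(str);      // finds the "o" at index 6
console.log(second.index, re.lastIndex);
// 6 7

const third = re.exec(str);       // nothing left to match: returns null
console.log(third, re.lastIndex);
// null 0
```

Without the "g" or "y" flag, assigning to lastIndex has no effect on where exec starts.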

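One case needs special attention, however: if the next match starts at the end of the string and the regular expression can successfully match the empty string, the regular expression will keep matching at that same position forever.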

const regexp = /(hi)?/g;
console.log(regexp.exec('hi'));
// [ 'hi', 'hi', index: 0, input: 'hi', groups: undefined ]
console.log(regexp.lastIndex);
// 2
console.log(regexp.exec('hi'));
// [ '', undefined, index: 2, input: 'hi', groups: undefined ]
console.log(regexp.lastIndex);
// 2
console.log(regexp.exec('hi'));
// [ '', undefined, index: 2, input: 'hi', groups: undefined ]
console.log(regexp.lastIndex);
// 2

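"?" means the preceding item may match 0 or 1 times, so this pattern can match successfully at any position, even when there is nothing left to consume. As a result, the regular expression matches an empty string at index 2, even though that is the end of the string, and lastIndex stays at 2 no matter how many times the match is repeated.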
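
One common way to keep an exec loop from spinning on such an empty match is to advance lastIndex by hand whenever the matched text is empty. The following is a minimal sketch of that idea, not the only possible approach; the variable names are illustrative.

```javascript
const re = /(hi)?/g;
const input = 'hi';
const matches = [];

let m;
while ((m = re.exec(input)) !== null) {
  matches.push(m[0]);
  if (m[0] === '') {
    // An empty match leaves lastIndex where it was, so move it forward manually.
    re.lastIndex++;
  }
}

console.log(matches);
// [ 'hi', '' ]
```

Once lastIndex is pushed past the end of the string, the next exec call fails, returns null, and ends the loop. String.prototype.matchAll applies the same kind of advance internally, so it may be a simpler choice when it is available.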

source

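The source property returns the text of the regular expression's pattern as a string, without the enclosing slashes and without the flags. It corresponds to the first argument passed to the RegExp constructor.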

const regexp_1 = /123abc/g;
const regexp_2 = new RegExp('123abc', 'g');
console.log(regexp_1.source, regexp_2.source);
// 123abc 123abc

flags

The flags property is a string made up of the flags set on the regular expression object.

const regexp = /abc/gim;
console.log(regexp.flags);
// gim
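
Because source and flags together describe a regular expression completely, they can be used to build an independent copy. The sketch below assumes a sample pattern and also shows that flags is reported in a fixed order, regardless of the order in which the flags were written.

```javascript
const original = /a\/b/mig;       // flags written out of order on purpose
console.log(original.source);
// a\/b
console.log(original.flags);
// gim

// Build an independent copy from the two strings.
const copy = new RegExp(original.source, original.flags);
console.log(copy.test('A/B'), copy !== original);
// true true
```

The copy has its own lastIndex, so using it does not disturb matching state on the original object.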

dotAll

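dotAll is a Boolean property that indicates whether the "s" flag is used in the regular expression. With the "s" flag, the "." metacharacter matches every character, including line terminators.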

const regexp = /./;
const regexp_s = /./s;
console.log(regexp.test('\n'), regexp_s.test('\n'));
// false true
console.log(regexp_s.dotAll);
// true
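
The practical effect shows up when a pattern has to span several lines. A small sketch with a made-up three-line string:

```javascript
// Hypothetical sample text containing line breaks.
const text = 'start\nmiddle\nend';

const plain = /start.*end/;       // "." stops at line terminators
const withS = /start.*end/s;      // "." also matches "\n"

const plainResult = plain.test(text);
const withSResult = withS.test(text);
console.log(plainResult, withSResult);
// false true
console.log(plain.dotAll, withS.dotAll);
// false true
```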

global

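global is a Boolean property that indicates whether the "g" flag is used in the regular expression. With the "g" flag, the regular expression searches the whole string and can find every match rather than only the first one.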

const regexp = /abc/g;
console.log(regexp.global);
// true
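
The difference is easiest to see with string methods that accept a regular expression. In this sketch (sample text assumed), the same pattern is used with and without "g":

```javascript
const text = 'abc abc abc';

const firstOnly = text.match(/abc/);   // one result, with index information
const all = text.match(/abc/g);        // every matched substring

console.log(firstOnly.length, firstOnly.index);
// 1 0
console.log(all);
// [ 'abc', 'abc', 'abc' ]

const replacedOnce = text.replace(/abc/, 'X');
const replacedAll = text.replace(/abc/g, 'X');
console.log(replacedOnce, '|', replacedAll);
// X abc abc | X X X
```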

ignoreCase

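ignoreCase is a Boolean property that indicates whether the "i" flag is used in the regular expression. The "i" flag makes the regular expression ignore differences in letter case when matching.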

const regexp = /abc/i;
console.log(regexp.ignoreCase);
// true

multiline

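multiline is a Boolean property that indicates whether the "m" flag is used in the regular expression. The "m" flag enables multiline mode, in which "^" and "$" match at the start and end of each line rather than only at the start and end of the whole string, which is convenient for processing multi-line text.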

const regexp = /abc\nabc/m;
console.log(regexp.multiline);
// true
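
The flag only matters for patterns that contain "^" or "$". A brief sketch with an assumed three-line string:

```javascript
// Hypothetical sample text with three lines.
const text = 'first line\nabc\nlast line';

const withoutM = /^abc$/.test(text);   // "^" and "$" anchor to the whole string
const withM = /^abc$/m.test(text);     // "^" and "$" anchor to each line
console.log(withoutM, withM);
// false true

const lineStarts = text.match(/^\w+/gm);
console.log(lineStarts);
// [ 'first', 'abc', 'last' ]
```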

sticky

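sticky is a Boolean property that indicates whether the "y" flag is used in the regular expression. The "y" flag makes the regular expression perform sticky matching: a match is attempted only at the position given by lastIndex in the target string, instead of searching forward through the string from that position.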

const regexp = /abc\nabc/y;
console.log(regexp.sticky);
// true
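
A short sketch of sticky behavior, using an assumed sample string. The pattern matches only when the text at lastIndex itself fits; it does not scan ahead to find a later occurrence.

```javascript
const sticky = /abc/y;
const text = 'xxabc';

// lastIndex is 0 and index 0 holds "x", so the match fails without scanning ahead.
const atZero = sticky.test(text);
console.log(atZero, sticky.lastIndex);
// false 0

// Point lastIndex at the exact position where "abc" begins.
sticky.lastIndex = 2;
const atTwo = sticky.test(text);
console.log(atTwo, sticky.lastIndex);
// true 5
```

This makes the "y" flag a natural fit for tokenizers that must consume input strictly from the current position.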

unicode

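unicode is a Boolean property that indicates whether the "u" flag is used, which enables Unicode matching mode. In this mode the pattern may contain Unicode code point escapes such as \u{30}, which makes it much simpler to handle all kinds of Unicode characters, especially characters that need more than one UTF-16 code unit.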

const regexp = /\u{30}/u;
console.log(regexp.test('0'));
// true
console.log(regexp.unicode);
// true
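
The difference becomes visible with characters that occupy two UTF-16 code units. In the sketch below, the emoji is just an assumed sample character:

```javascript
const emoji = '😀';                // U+1F600, stored as two UTF-16 code units
console.log(emoji.length);
// 2

const withoutU = /^.$/.test(emoji);    // "." matches a single code unit
const withU = /^.$/u.test(emoji);      // "." matches a whole code point
console.log(withoutU, withU);
// false true

const byCodePoint = /\u{1F600}/u.test(emoji);
console.log(byCodePoint);
// true
```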

hasIndices

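hasIndices is a Boolean property that indicates whether the "d" flag is used in the regular expression. The flag was added in the ECMAScript 2022 standard. When it is set, match results carry an additional indices array that holds the start and end index of the whole match and of each capture group.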

const regexp = /(\d{2}).(\d{2})/dg;
const text = '北京时间今天上午11时14分,中国成功发射了神舟十七号载人航天飞船。';
console.log(regexp.hasIndices);
// true
console.log(regexp.exec(text));
// [
//   '11时14',
//   '11',
//   '14',
//   index: 8,
//   input: '北京时间今天上午11时14分,中国成功发射了神舟十七号载人航天飞船。',
//   groups: undefined,
//   indices: [ [ 8, 13 ], [ 8, 10 ], [ 11, 13 ], groups: undefined ]
// ]
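
When the pattern uses named capture groups, the same positions are also exposed under indices.groups. A minimal sketch, assuming a made-up time string and the group names hour and minute:

```javascript
// Hypothetical pattern with named groups; requires an ES2022-capable engine.
const re = /(?<hour>\d{2}):(?<minute>\d{2})/d;
const input = 'at 11:14';
const result = re.exec(input);

console.log(result.indices[0]);
// [ 3, 8 ]
console.log(result.indices.groups.hour, result.indices.groups.minute);
// [ 3, 5 ] [ 6, 8 ]

// Each pair is [start, end) and can be passed straight to slice().
const [start, end] = result.indices.groups.minute;
const minuteText = input.slice(start, end);
console.log(minuteText);
// 14
```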