Regular Expression Special Characters

"."---Any single character (a "wildcard")

"["---Begin character class

"]"---End character class

"{"---Begin count

"}"---End count

"("---Begin grouping

")"---End grouping

"\"---Next character has a special meaning

"*"---Zero or more

"+"---One or more

"?"---Optional (zero or one)

"|"---Alternative (or)

"^"---Start of line; negation

"$"---End of line

Example:

case 1:

        ^A*B+C?$

explain 1:

        Starts with zero or more A's, followed by one or more B's, then an optional C, and then the end of the line.

A pattern can be made optional or repeated (the default is exactly once) by adding a suffix:

Repetition

{n}---Exactly n times

{n,}---n or more times

{n,m}---At least n and at most m times

*---Zero or more, that is, {0,}

+---One or more, that is, {1,}

?---Optional (zero or one), that is, {0,1}

Example:

case 1:

        A{3}B{2,4}C*

explain 1:

        Matches, for example, AAABBC or AAABBB

A suffix ? after any of the repetition notations makes the pattern matcher "lazy" or "non-greedy".

That is, when looking for a pattern, it will look for the shortest match rather than the longest.

By default, the pattern matcher always looks for the longest match (similar to C++'s Max Munch rule).

Consider:

    ababab

The pattern (ab)+ matches all of "ababab". However, (ab)+? matches only the first "ab".

The most common character classifications have names:

Character Classes

alnum --- Any alphanumeric character

alpha --- Any alphabetic character

blank --- Any whitespace character that is not a line separator

cntrl --- Any control character

d --- Any decimal digit

digit --- Any decimal digit

graph --- Any graphical character

lower --- Any lowercase character

print --- Any printable character

punct --- Any punctuation character

s --- Any whitespace character

space --- Any whitespace character

upper --- Any uppercase character

w --- Any word character (alphanumeric characters plus the underscore)

xdigit --- Any hexadecimal digit character

Several character classes are supported by shorthand notation:

Character Class Abbreviations

\d --- A decimal digit --- [[:digit:]]

\s --- A space (space, tab, ...) --- [[:space:]]

\w --- A letter (a-z) or digit (0-9) or underscore (_) --- [_[:alnum:]]

\D --- Not \d --- [^[:digit:]]

\S --- Not \s --- [^[:space:]]

\W --- Not \w --- [^_[:alnum:]]

In addition, languages supporting regular expressions often provide:

Nonstandard (but Common) Character Class Abbreviations

\l --- A lowercase character --- [[:lower:]]

\u --- An uppercase character --- [[:upper:]]

\L --- Not \l --- [^[:lower:]]

\U --- Not \u --- [^[:upper:]]

Note that a backslash must be doubled to include it in an ordinary string literal: the pattern \d is written "\\d".

As usual, backslashes can denote special characters:

Special Characters

\n --- Newline

\t --- Tab

\\ --- One backslash

\xhh --- Unicode character expressed using two hexadecimal digits

\uhhhh --- Unicode character expressed using four hexadecimal digits

To add to the opportunities for confusion, two further logically different uses of the backslash are provided:

Special Characters

\b --- The first or last character of a word (a "boundary character")

\B --- Not a \b

\i --- The ith sub_match in this pattern

Here are some examples of patterns:

Ax*             // A, Ax, Axxxx

Ax+             // Ax, Axxx, not A

\d-?\d          // 1-2, 12, not 1--2

\w{2}-\d{4,5}   // Ab-1234, XX-54321, 22-5432

(\d*:)?(\d+)    // 12:3, 1:23, 123, :123, not 123:

(bs|BS)         // bs, BS, not bS

[aeiouy]        // a, o, u     An English vowel, not x

[^aeiouy]       // x, k        Not an English vowel, not e

[a^eiouy]       // a, ^, o, u  An English vowel or ^

Here is the test code:

#include <iostream>
#include <regex>
#include <string>
using namespace std;

int main()
{
    // Zero or more A's, one or more B's, an optional C, then the end of the input.
    const char* reg_esp = "^A*B+C?$";
    regex rgx(reg_esp);
    cmatch match;
    const char* target = "AAAAAAAAABBBBBBBBC";
    if (regex_search(target, match, rgx))
    {
        // match[0] is the whole match; any further entries are sub-matches.
        for (size_t a = 0; a < match.size(); a++)
            cout << string(match[a].first, match[a].second) << endl;
    }
    else
        cout << "No Match Case !" << endl;
    return 0;
}
