Regex Cheat Sheet - PCRE2

A comprehensive guide to regular expressions with patterns, examples, and explanations

PCRE2 Specific Features

(*UTF)                Enable UTF mode
(*UCP)                Use Unicode properties for \d, \s, \w, \b, and POSIX classes
(*CRLF)               Newline is CR followed by LF
(*LF)                 Newline is LF only
(*CR)                 Newline is CR only
(*ANYCRLF)            Newline is any of CR, LF, or CRLF
(*ANY)                Newline is any Unicode newline sequence
(*BSR_ANYCRLF)        \R matches only CR, LF, or CRLF
(*BSR_UNICODE)        \R matches any Unicode newline sequence
(*NO_AUTO_POSSESS)    Disable auto-possessification
(*NO_DOTSTAR_ANCHOR)  Disable automatic anchoring of a leading .*
(*NO_JIT)             Disable JIT matching
(*NO_START_OPT)       Disable start-of-match optimizations
(*LIMIT_MATCH=n)      Limit the match (backtracking) count
(*LIMIT_RECURSION=n)  Obsolete synonym for (*LIMIT_DEPTH=n)
(*LIMIT_DEPTH=n)      Limit nested backtracking depth
(*LIMIT_HEAP=n)       Limit heap memory usage (in kibibytes)
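
These items take effect only when they are grouped at the very start of the pattern and written in upper case. As a minimal sketch of that rule, the hypothetical Python helper below (`with_pcre2_options` is not part of any library) only builds the pattern string; it does not compile or run it, since Python's standard `re` module does not understand PCRE2 syntax.

```python
def with_pcre2_options(pattern, *flags, **limits):
    """Prefix `pattern` with PCRE2 start-of-pattern items.

    Hypothetical helper for illustration: it only assembles a string
    that a PCRE2-based tool could then be given.
    """
    for name in list(flags) + list(limits):
        # PCRE2 requires these item names to be in upper case.
        if name != name.upper():
            raise ValueError(f"option name must be upper case: {name!r}")
    items = [f"(*{flag})" for flag in flags]
    items += [f"(*{name}={value})" for name, value in limits.items()]
    # All items must sit together before the first real pattern character.
    return "".join(items) + pattern


print(with_pcre2_options(r"\w+", "UTF", "UCP", LIMIT_MATCH=1000))
```

The resulting string can be passed to any tool that compiles patterns with PCRE2.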

Enhanced Unicode Support

\p{L}                Any letter
\p{M}                Any mark
\p{N}                Any number
\p{P}                Any punctuation
\p{S}                Any symbol
\p{Z}                Any separator
\p{C}                Any other (control, format, etc.)
\p{Ll}               Lowercase letter
\p{Lu}               Uppercase letter
\p{Lt}               Titlecase letter
\p{Lm}               Modifier letter
\p{Lo}               Other letter
\p{Script=Latin}     Latin script characters (PCRE2 10.40+; older releases accept \p{Latin})
\p{Script=Greek}     Greek script characters
\p{Script=Cyrillic}  Cyrillic script characters
[\x{00}-\x{7F}]      Basic Latin block (PCRE2 has no \p{Block=...}; use a code point range)
[\x{0100}-\x{017F}]  Latin Extended-A block (code point range)
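
The one- and two-letter properties above are Unicode general categories. Python's standard `re` module has no `\p{...}`, so the sketch below uses `unicodedata.category` to show which category a character falls into; `matches_property` is a hypothetical helper that only mimics the category prefix rule, not PCRE2 itself.

```python
import unicodedata


def matches_property(ch, prop):
    """Return True if `ch` would fall under \\p{prop}.

    `prop` is a general category such as "L" or "Lu". A one-letter
    property covers every two-letter category that starts with it.
    """
    return unicodedata.category(ch).startswith(prop)


for ch in ["a", "Σ", "\u01c5", "5", "€", " "]:
    print(repr(ch), unicodedata.category(ch))

print(matches_property("é", "L"))   # letter of any case
print(matches_property("é", "Lu"))  # not an uppercase letter
```

Script properties such as `\p{Script=Greek}` have no equivalent in the Python standard library, so they are not covered by this sketch.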

Advanced Recursion

(?R)       Recurse entire pattern
(?0)       Recurse entire pattern (alternative)
(?1)       Recurse group 1
(?+1)      Recurse relative group +1
(?-1)      Recurse relative group -1
(?&name)   Call named subroutine
(?P>name)  Call named subroutine (Python style)
\g<name>   Call named subroutine (alternative)
\g'name'   Call named subroutine (alternative)
\g<1>      Call numbered subroutine
\g<+1>     Call relative subroutine
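
A common use of `(?R)` is matching balanced delimiters, for example `\((?:[^()]++|(?R))*\)`. Python's standard `re` module does not support recursion, so the sketch below is a hand-written function (`match_balanced`, a hypothetical name) that mirrors what that pattern does when anchored at a given position: the recursive call plays the role of `(?R)`.

```python
def match_balanced(text, pos=0):
    """Return the end index of a balanced (...) group starting at `pos`.

    Mirrors the PCRE2 pattern \\((?:[^()]++|(?R))*\\) anchored at `pos`.
    Returns None when no balanced group starts there.
    """
    if pos >= len(text) or text[pos] != "(":
        return None
    i = pos + 1
    while i < len(text):
        ch = text[i]
        if ch == ")":
            return i + 1                    # closing \) of this level
        if ch == "(":
            end = match_balanced(text, i)   # the (?R) step
            if end is None:
                return None
            i = end
        else:
            i += 1                          # the [^()]++ step
    return None                             # ran out of input: unbalanced


print(match_balanced("(a(b)c)d"))
print(match_balanced("(a(b"))
```

With PCRE2 itself, the same check is a single pattern and no helper function is needed.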

Callouts

(?C)           Callout with number 0 (same as (?C0))
(?C0)          Callout with number 0
(?C255)        Callout with number 255 (the number also used by automatic callouts)
(?C"string")   String callout
(?C'string')   String callout (alternative quotes)
(?C`string`)   String callout (backtick quotes)

Script Runs

(*script_run:pattern)         Script run: all characters from the same script
(*sr:pattern)                 Script run (short form)
(*atomic_script_run:pattern)  Atomic script run
(*asr:pattern)                Atomic script run (short form)

Enhanced Backtracking Control

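(*PRUNE:name)   Prune; the name is passed back to the caller like a mark
(*SKIP:name)    Skip to the position of the most recent (*MARK:name)
(*THEN:name)    Then; the name is passed back to the caller like a mark
(*COMMIT:name)  Commit; the name is passed back to the caller like a mark
(*MARK:name)    Set mark
(*:name)        Set mark (short form)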

Newline Handling

\R         Any Unicode newline sequence
\r\n       CRLF sequence
\n         Line feed
\r         Carriage return
\f         Form feed
\x0B       Vertical tab
\x85       Next line (NEL)
\x{2028}   Line separator
\x{2029}   Paragraph separator
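
By default, `\R` covers every sequence in the table above, with CRLF treated as a single unit. Python's standard `re` module has no `\R`, so the sketch below spells out an equivalent alternation; the names `ANY_NEWLINE`, `ANYCRLF_NEWLINE`, and `split_lines` are illustrative only.

```python
import re

# Expansion of \R in its default (Unicode) mode. CRLF comes first so that
# "\r\n" is consumed as one newline rather than two.
ANY_NEWLINE = re.compile(r"\r\n|[\n\x0b\x0c\r\x85\u2028\u2029]")

# What \R is restricted to under (*BSR_ANYCRLF).
ANYCRLF_NEWLINE = re.compile(r"\r\n|[\n\r]")


def split_lines(text, newline=ANY_NEWLINE):
    """Split `text` on every newline sequence matched by `newline`."""
    return newline.split(text)


print(split_lines("a\r\nb\nc\rd\u2028e"))
print(split_lines("a\u2028b\r\nc", ANYCRLF_NEWLINE))
```

In the second call the line separator U+2028 is left inside the first field, because the CR/LF/CRLF-only form of `\R` does not treat it as a newline.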

Word Boundaries and Anchors

\b   Word boundary
\B   Non-word boundary
\A   Start of subject
\Z   End of subject or before final newline
\z   End of subject
\G   First matching position in subject
\K   Reset match start
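
The difference between `\Z` and `\z` is easy to miss, and it does not carry over to Python unchanged: Python's `\Z` behaves like PCRE2's `\z`. The sketch below emulates both PCRE2 anchors with the standard `re` module, assuming LF is the newline convention, and approximates `\K` with a lookbehind. The pattern names are illustrative.

```python
import re

# Python's \Z matches only at the very end, like PCRE2's \z.
ABSOLUTE_END = re.compile(r"end\Z")

# PCRE2's \Z also matches just before a final newline (LF assumed here).
BEFORE_FINAL_NEWLINE = re.compile(r"end(?=\n?\Z)")

# PCRE2: price: \K\d+  keeps only the digits in the reported match.
# A fixed-width lookbehind gives the same result for this simple case.
DIGITS_AFTER_LABEL = re.compile(r"(?<=price: )\d+")

print(ABSOLUTE_END.search("the end\n"))                  # no match
print(BEFORE_FINAL_NEWLINE.search("the end\n").group())
print(DIGITS_AFTER_LABEL.search("price: 42").group())
```

The lookbehind only stands in for `\K` when the text to the left has a fixed width; `\K` itself has no such restriction.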