A regular expression, often called a pattern, is an expression that describes a set of strings.
Regular expressions are typically used to describe such a set concisely, without listing all of its elements. Unix utilities such as sed and grep make extensive use of regular expressions, and scripting languages such as Perl have regular expression engines built directly into their syntax.
Extensive documentation about regular expressions in Perl can be found at: http://perldoc.perl.org/perlre.html
ROOT has this capability through the use of the P(erl) C(ompatible) R(egular) E(xpression) library (PCRE).
Its functionality can be accessed through the TPRegexp and TString classes. Note that in patterns taken from Perl, every backslash character has to be replaced by two backslashes in C/C++ strings.
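The backslash rule can be checked without ROOT. The sketch below uses only standard C++ and does not involve TPRegexp; the helper names perl_pattern_escaped and perl_pattern_raw are made up for illustration. It assumes a C++11 compiler and shows that a literal with doubled backslashes holds exactly the characters of the Perl pattern, and that a raw string literal is one way to avoid the doubling.

```cpp
#include <string>

// The Perl pattern \bpeper(\w+)koek\b written as an ordinary C++ string
// literal: each backslash is doubled so that the compiler emits a single one.
std::string perl_pattern_escaped()
{
   return "\\bpeper(\\w+)koek\\b";
}

// The same pattern as a C++11 raw string literal: no doubling is needed
// because escape sequences are not processed inside R"( ... )".
std::string perl_pattern_raw()
{
   return R"(\bpeper(\w+)koek\b)";
}
```

Both functions return the same 18-character string, so either form could be handed to a regular expression constructor.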
This macro shows several ways to use the Match/Substitute capabilities of the TPRegexp class. It can be run from the ROOT prompt with .x regexp.C and produces the following output:
Processing /mnt/build/workspace/root-makedoc-v608/rootspi/rdoc/src/v6-08-00-patches/tutorials/regexp.C...
pepernotenkoek
lekkere walnotenboom
two one three
on 24-09-1959 the world stood still
Check if the email address "fons.rademakers@cern.ch" is valid: TRUE
neutrino proton electron neutrino
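void regexp()
{
   // Substitute example: find a word that starts with "peper" and ends with "koek".
   TString s1("lekkere pepernotenkoek");
   // Pattern reconstructed from the printed output (assumption; the original line is missing).
   TPRegexp r1("\\bpeper(\\w+)koek\\b");

   // TString gives access to the TPRegexp functionality through operator().
   cout << s1(r1) << endl;

   // Change the word to "walnotenboom"; $1 is the text captured by the first group.
   r1.Substitute(s1,"wal$1boom");
   cout << s1 << endl;

   // Swap the first two words. Input and pattern are reconstructed (assumption).
   TString s2("one two three");
   TPRegexp("^([^ ]+) +([^ ]+)").Substitute(s2,"$2 $1");
   cout << s2 << endl;

   // Reorder month/day/year into day-month-year. Pattern reconstructed (assumption).
   TString s3("on 09/24/1959 the world stood still");
   TPRegexp("\\b(\\d{1,2})/(\\d{1,2})/(\\d{4})\\b").Substitute(s3,"$2-$1-$3");
   cout << s3 << endl;

   // Match example: extract the protocol and port number from a URL.
   // The MatchS call and its pattern are reconstructed (assumption). With this pattern the
   // protocol line is printed; the output listed above lacks that line, which suggests the
   // original pattern did not match this URL.
   TString s4("http://fink.sourceforge.net:8080/index/readme.html");
   TObjArray *subStrL = TPRegexp("^(\\w+)://[^/]+:(\\d+)/").MatchS(s4);
   const Int_t nrSubStr = subStrL->GetLast()+1;
   if (nrSubStr > 2) {
      const TString proto = ((TObjString *)subStrL->At(1))->GetString();
      const TString port  = ((TObjString *)subStrL->At(2))->GetString();
      cout << "protocol: " << proto << " port: " << port << endl;
   }
   delete subStrL;   // the array returned by MatchS is owned by the caller

   // Match example: check whether a string is a valid email address.
   TString s5("fons.rademakers@cern.ch");
   TPRegexp r5("^([\\w-\\.]+)@((\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.)|(([\\w-]+\\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\\]?)$");
   cout << "Check if the email address \"" << s5 << "\" is valid: " << (r5.MatchB(s5) ? "TRUE" : "FALSE") << endl;

   // Global substitute ("g" modifier): change every "neutron" to "neutrino". Pattern reconstructed (assumption).
   TString s6("neutron proton electron neutron");
   TPRegexp("neutron").Substitute(s6,"neutrino","g");
   cout << s6 << endl;
}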
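For readers without a ROOT installation, the substitute and match ideas used in the macro can be approximated with std::regex from the C++ standard library. This is a stand-in, not TPRegexp: the function names reorder_date, protocol_and_port and neutron_to_neutrino are hypothetical, the patterns are assumptions chosen to reproduce the strings shown in the output, and the default ECMAScript grammar of std::regex is close to but not identical to PCRE. Note that std::regex_replace replaces every match by default, which roughly corresponds to the "g" modifier of TPRegexp::Substitute.

```cpp
#include <regex>
#include <string>

// Reorder a mm/dd/yyyy date into dd-mm-yyyy. $1, $2 and $3 refer to the
// capture groups, counted by their opening parentheses from left to right.
std::string reorder_date(const std::string &text)
{
   static const std::regex date(R"(\b(\d{1,2})/(\d{1,2})/(\d{4})\b)");
   return std::regex_replace(text, date, "$2-$1-$3");
}

// Extract the protocol and port number from a URL.
// Returns false and leaves the outputs untouched if the URL does not match.
bool protocol_and_port(const std::string &url, std::string &proto, std::string &port)
{
   static const std::regex re(R"(^(\w+)://[^/]+:(\d+)/)");
   std::smatch m;
   if (!std::regex_search(url, m, re)) return false;
   proto = m[1].str();   // first capture group: protocol
   port  = m[2].str();   // second capture group: port digits
   return true;
}

// Replace every occurrence of "neutron" with "neutrino".
std::string neutron_to_neutrino(const std::string &text)
{
   static const std::regex re("neutron");
   return std::regex_replace(text, re, "neutrino");
}
```

Called on the strings from the macro, reorder_date yields "on 24-09-1959 the world stood still", protocol_and_port yields "http" and "8080", and neutron_to_neutrino yields "neutrino proton electron neutrino".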
Author: Eddy Offermann
Definition in file regexp.C.