The preg_grep() Function
Similar to the UNIX grep command, the preg_grep() function searches the elements of an array, rather than a single
search string, and returns an array of the values that match a pattern. You can also invert the search and get an array
of all elements that do not contain the pattern being searched for (like UNIX grep -v) by using the PREG_GREP_INVERT flag.
Format
array preg_grep ( string pattern, array input [, int flags] )
Example:
$new_array = preg_grep("/ma/", array("normal", "mama", "man", "plan"));
// $new_array contains: normal, mama, man
$new_array = preg_grep("/ma/", array("normal", "mama", "man", "plan"), PREG_GREP_INVERT);
// $new_array contains: plan
Example 12.13.
<html><head><title>The preg_grep() Function</title></head>
<body bgcolor="lavender">
<font face="verdana" >
<b>
<h2>The preg_grep() Function</h2>
<font size="+1">
<pre>
<?php
1 $regex="/Pat/";
2 $search_array=array("Margaret","Patsy", "Patrick",
"Patricia", "Jim");
sort($search_array);
3 $newarray=preg_grep( $regex, $search_array );
4 print "Found ". count($newarray). " matches\n";
5 print_r($newarray);
6 $newarray=preg_grep($regex,$search_array,
PREG_GREP_INVERT);
print "Found ". count($newarray). " that didn't match\n";
print_r($newarray);
?>
</b>
</pre>
</font>
</body>
</html>
Explanation
1. The variable $regex is assigned the regular expression, /Pat/, that will be used later by preg_grep() as the search pattern.
2. This array will be used as the subject for the search with the preg_grep() function.
3. After the array has been sorted, the preg_grep() function will search for the pattern, /Pat/, in each element of the array, and return and assign the matched array elements to another array called $newarray.
4. The count() function returns the number of elements in the new array; that is, the number of elements where the pattern /Pat/ was found.
5. The found elements are displayed. Note that the index values have been preserved.
6. When the PREG_GREP_INVERT flag is specified, the preg_grep() function returns the elements of the original array that do not match the pattern, as shown in the output in Figure 12.14.
Figure 12.14. The preg_grep() function. Output from Example 12.13.
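The index-preserving behavior noted in the explanation can be checked in isolation. The following is a minimal sketch rather than part of Example 12.13: the array contents mirror the example, while the variable names $matched, $unmatched, and $reindexed are chosen here for illustration only.

```php
<?php
// Same names as in Example 12.13, sorted alphabetically.
$search_array = array("Margaret", "Patsy", "Patrick", "Patricia", "Jim");
sort($search_array);   // Jim, Margaret, Patricia, Patrick, Patsy

// Elements that contain "Pat"; their keys from the sorted array are kept.
$matched = preg_grep("/Pat/", $search_array);

// Elements that do not contain "Pat"; again the keys are kept.
$unmatched = preg_grep("/Pat/", $search_array, PREG_GREP_INVERT);

// array_values() renumbers a result from 0 when contiguous keys are needed.
$reindexed = array_values($matched);

print_r($matched);
print_r($unmatched);
print_r($reindexed);
```

If the result is going to be walked with a for loop and a numeric counter, renumbering with array_values() first avoids gaps in the keys.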
12.2.3. Getting Control—The RegEx Metacharacters
Regular expression metacharacters are characters that do not represent themselves. They are endowed with special
powers to allow you to control the search pattern in some way (e.g., finding a pattern only at the beginning of the line,
or at the end of the line, or if it starts with an upper- or lowercase letter). Metacharacters will lose their special meaning
if preceded with a backslash. For example, the dot metacharacter represents any single character, but when preceded
with a backslash is just a dot or period.
If you see a backslash preceding a metacharacter, the backslash turns off the meaning of the metacharacter, but if you
see a backslash preceding an alphanumeric character in a regular expression, then the backslash is used to create a
metasymbol. A metasymbol provides a simpler way to represent some of the regular expression metacharacters. For
example, [0-9] represents a single digit in the range from 0 to 9, and \d represents the same thing. [0-9] uses the
bracketed character class, whereas \d is a metasymbol (see Table 12.6).
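Both uses of the backslash can be tried out with a few short calls. This is an illustrative sketch only; the subject strings and variable names are made up for the demonstration.

```php
<?php
// An unescaped dot matches any single character except a newline.
$dot_any = preg_match('/3.14/', "3x14");             // "x" satisfies the dot

// A backslash turns off the special meaning, so only a literal period matches.
$dot_literal_miss = preg_match('/3\.14/', "3x14");
$dot_literal_hit  = preg_match('/3\.14/', "3.14");

// The bracketed class [0-9] and the metasymbol \d find the same digits.
preg_match_all('/[0-9]/', "Room 702", $by_class);
preg_match_all('/\d/', "Room 702", $by_symbol);

var_dump($dot_any, $dot_literal_miss, $dot_literal_hit);
print_r($by_class[0]);
print_r($by_symbol[0]);
```

The patterns are written in single quotes so that PHP passes the backslashes through to the regular expression engine unchanged.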
Table 12.6. Metacharacters
Metacharacter      What It Matches

Single characters and digits (for more, see “Matching Single Characters and Digits” on page 524)
.                  Matches any character except a newline.
[a-z0-9]           Matches any single character in a set.
[^a-z0-9]          Matches any single character not in a set.

Single characters and digits: metasymbols (for more, see “Metasymbols” on page 530)
\d                 Matches one digit.
\D                 Matches a nondigit, same as [^0-9].
\w                 Matches an alphanumeric (word) character.
\W                 Matches a nonalphanumeric (nonword) character.

Whitespace characters
\s                 Matches a whitespace character: spaces, tabs, and newlines.
\S                 Matches a nonwhitespace character.
\n                 Matches a newline.
\r                 Matches a return.
\t                 Matches a tab.
\f                 Matches a form feed.
\0                 Matches a null character.

Anchored characters (for more, see “Anchoring Metacharacters” on page 520)
\b                 Matches a word boundary.
\B                 Matches a nonword boundary.
^                  Matches to beginning of line.
$                  Matches to end of line.
\A                 Matches the beginning of the string only.
\Z                 Matches the end of the string or line.

Repeated characters (for more, see “Metacharacters to Repeat Pattern Matches” on page 533)
x?                 Matches 0 or 1 occurrences of the letter x.
x*                 Matches 0 or more occurrences of the letter x.
x+                 Matches 1 or more occurrences of the letter x.
x{m,n}             Matches at least m occurrences of the letter x, and no more than n occurrences of the letter x.

Grouped characters (for more, see “Grouping or Clustering” on page 544)
(xyz)+             Matches one or more patterns of xyz (e.g., xyzxyzxyz).

Alternative characters (for more, see “Metacharacters for Alternation” on page 543)
was|were|will      Matches one of was, were, or will.

Remembered characters (for more, see “Remembering or Capturing” on page 545)
(string)           Used for backreferencing.
\1 or $1           Matches first set of parentheses.
\2 or $2           Matches second set of parentheses.
\3 or $3           Matches third set of parentheses.
(?:x)              Matches x but does not remember the match. These are called noncapturing parentheses.

Positive lookahead and lookbehind (for more, see “Positive Lookahead” on page 550 and “Positive Lookbehind” on page 552)
x(?=y)             Matches x only if x is followed by y. For example, /Jack(?=Sprat)/ matches Jack only if it is
                   followed by Sprat. /Jack(?=Sprat|Frost)/ matches Jack only if it is followed by Sprat or Frost.
                   Neither Sprat nor Frost is kept as part of what was matched.
x(?!y)             Matches x only if x is not followed by y. For example, /\d+(?!\.)/ matches one or more
                   numbers only if they are not followed by a decimal point.
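A few of the table entries are easier to see in action. The sketch below exercises the lookahead, noncapturing parentheses, and x{m,n} rows; the subject strings ("JackHorner", "they were here", and so on) and the variable names are invented for illustration.

```php
<?php
// Positive lookahead: Jack must be followed by Sprat or Frost,
// but the text inside the lookahead is not part of the match itself.
preg_match('/Jack(?=Sprat|Frost)/', "JackFrost", $ahead);
$no_follow = preg_match('/Jack(?=Sprat|Frost)/', "JackHorner");

// Noncapturing parentheses group the alternatives without being
// remembered, so the first remembered group is the (\w+) that follows.
preg_match('/(?:was|were) (\w+)/', "they were here", $groups);

// x{m,n}: at least 2 and no more than 3 occurrences of the letter x.
preg_match('/x{2,3}/', "xxxxx", $repeat);

print_r($ahead);
var_dump($no_follow);
print_r($groups);
print_r($repeat);
```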
The following regular expression contains metacharacters:
/^a...c/

The first metacharacter is a caret (^). The caret metacharacter matches a pattern only if it is at the beginning of the
line. The period (.) matches any single character, including a space. This expression contains three periods,
representing any three characters. To find a literal period or any other character that does not represent itself, the
character must be preceded by a backslash to prevent interpretation.
The expression reads: Search at the beginning of the line for a letter a, followed by any three single characters,
followed by a letter c. It will match, for example, abbbc, a123c, a   c (an a, three spaces, and a c), aAx3c, and so on,
only if those patterns are found at the beginning of the line.
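The reading of the expression given above can be confirmed with a short test. This is a sketch under the assumption that the pattern has three dots between the a and the c, as the text describes; the last two candidate strings are made up to show cases that fail.

```php
<?php
$pattern = '/^a...c/';

// The first four strings should match; the last two should not.
$candidates = array("abbbc", "a123c", "a   c", "aAx3c", "xabbbc", "abc");

$results = array();
foreach ($candidates as $candidate) {
    // preg_match() returns 1 on a match and 0 otherwise.
    $results[$candidate] = preg_match($pattern, $candidate);
}
print_r($results);
```

"xabbbc" fails because the a is not at the beginning, and "abc" fails because there are not three characters between the a and the c.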
In the following examples, we perform pattern matches, searches, and replacements based on the data from a text file
called data.txt. In the PHP program, the file will be opened and, within a while loop, each line will be read. The
functions discussed in the previous section will be used to find patterns within each line of the file. The regular
expressions will contain metacharacters, described in Table 12.6.
Anchoring Metacharacters
Often it is necessary to find a pattern only if it is found at the beginning or end of a line, word, or string. The
“anchoring” metacharacters (see Table 12.7) are based on a position just to the left or to the right of the character that is
being matched. Anchors are technically called zero-width assertions because they correspond to positions, not actual
characters in a string; for example, /^abc/ means find abc at the beginning of the line, where the ^ represents a
position, not an actual character.
Table 12.7. Anchors (Assertions)
Metacharacter      What It Matches
^                  Matches to beginning of line or beginning of string.
$                  Matches to end of line or end of string.
\A                 Matches the beginning of a string.
\b                 Matches a word boundary.
\B                 Matches a nonword boundary.
\Z                 Matches the end of a string.
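The difference between the line anchors and the string anchors shows up only when the subject contains more than one line. The following sketch uses a made-up two-line string and the PCRE m (multiline) modifier, which this section has not introduced; without that modifier, ^ is tied to the start of the whole string.

```php
<?php
// A hypothetical two-line subject string.
$subject = "first line\nsecond line";

// Without the m modifier, ^ matches only at the start of the string.
$caret_default = preg_match('/^second/', $subject);

// With the m modifier, ^ also matches just after each newline.
$caret_multiline = preg_match('/^second/m', $subject);

// \A matches only at the very start of the string, even with m.
$start_only = preg_match('/\Asecond/m', $subject);

// \Z matches at the end of the string (or just before a final newline).
$end_only = preg_match('/line\Z/', $subject);

// \b requires a word boundary; \B requires that there is none.
$boundary     = preg_match('/\bline/', $subject);
$non_boundary = preg_match('/\Bline/', $subject);

var_dump($caret_default, $caret_multiline, $start_only, $end_only, $boundary, $non_boundary);
```

In the file-based examples that follow, each call sees one line at a time, so ^ and $ behave as line anchors without any modifier.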
Beginning-of-Line Anchor
The ^ metacharacter is called the beginning-of-line anchor. When it is the first character in the regular expression, it
matches a pattern only if the pattern is found at the beginning of a line or string.
Example 12.14.
(The file data.txt Contents)
Mama Bear 702
Steve Blenheim 100
Betty Boop 200
Igor Chevsky 300
Norma Cord 400
Jon DeLoach 500
Karen Evich 600
BB Kingson 803
(The PHP Program)
<?php
1 $fh=fopen("data.txt", "r");
2 while( ! feof($fh)){
3 $text = fgets($fh);
4 if (preg_match("/^B/", $text)){
5 echo "$text";
}
}
?>
(Output)
Betty Boop 200
BB Kingson 803
Explanation
1. The file data.txt is opened for reading.
2. As long as the end of file has not been reached, the while loop will continue to execute.
3. For each iteration of the loop, the fgets() function reads in a line of text.
4. The preg_match() function will return TRUE if a pattern consisting of a string beginning with a B is matched.
5. The lines that matched are printed.
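The same search can be run without an existing data.txt by writing a few of the sample lines to a temporary file first. This is a sketch, not the book's program: the temporary file, the $found array, and the loop condition that tests the return value of fgets() directly are choices made here for a self-contained demonstration.

```php
<?php
// Build a throwaway copy of some of the data so the sketch runs on its own.
$path = tempnam(sys_get_temp_dir(), "data");
file_put_contents($path, "Mama Bear 702\nSteve Blenheim 100\nBetty Boop 200\nBB Kingson 803\n");

$found = array();
$fh = fopen($path, "r");
// fgets() returns false at end of file, which ends the loop.
while (($text = fgets($fh)) !== false) {
    if (preg_match("/^B/", $text)) {
        $found[] = rtrim($text);   // strip the trailing newline for display
    }
}
fclose($fh);
unlink($path);   // remove the temporary file

print_r($found);
```

Testing the result of fgets() in the loop condition means the body never runs with a false value after the last line has been read.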
End-of-Line Anchor
The end-of-line anchor, a dollar sign, is used to indicate the ending position in a line. The dollar sign must be the last
character in the pattern, just before the closing forward slash delimiter of the regular expression, or it no longer means
“end-of-line anchor.”[1]
[1] If moving files between Windows and UNIX, the end-of-line anchor might not work. You can use programs such as
dos2unix to address this problem.
Example 12.15.
(The File data.txt Contents)
Mama Bear 702
Steve Blenheim 100
Betty Boop 200
Igor Chevsky 300
Norma Cord 400
Jon DeLoach 500
Karen Evich 600
BB Kingson 803
(The PHP Program)
<?php
1 $fh=fopen("data.txt", "r");
2 while( ! feof($fh)){
3 $text = fgets($fh);
4 if (preg_match("/0$/", $text)){
5 echo "$text";
}
}
?>
(Output)
Steve Blenheim 100
Betty Boop 200
Igor Chevsky 300
Norma Cord 400
Jon DeLoach 500
Karen Evich 600
Explanation
1. The file data.txt is opened for reading.
2. As long as the end of file hasn't been reached, the while loop will continue to execute.
3. For each iteration of the loop, the fgets() function reads in a line of text.
4. The preg_match() function will return TRUE if a pattern consisting of a line ending with a 0 is matched. The $ metacharacter indicates that 0 must be followed by a newline.
5. The lines that matched are printed.
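The footnote's warning about files moved between Windows and UNIX can be reproduced directly. The sketch below assumes the default PCRE newline convention (a line feed) and uses two hand-written copies of one line from data.txt; stripping the line ending with rtrim() is offered as one possible workaround in code, alongside the dos2unix approach.

```php
<?php
$unix_line    = "Steve Blenheim 100\n";     // line ending as stored on UNIX
$windows_line = "Steve Blenheim 100\r\n";   // same line with a Windows line ending

// $ matches at the end of the string or just before a final newline.
$unix_match = preg_match('/0$/', $unix_line);

// Here the character just before the final newline is \r, not 0.
$windows_match = preg_match('/0$/', $windows_line);

// Stripping trailing \r and \n first makes both cases behave the same.
$trimmed_match = preg_match('/0$/', rtrim($windows_line, "\r\n"));

var_dump($unix_match, $windows_match, $trimmed_match);
```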
Word Boundaries
A word boundary is represented in a regular expression by the metasymbol \b. You can search for the word that begins
with a pattern, ends with a pattern, or both begins and ends with a pattern; for example, /\blove/ matches a word
beginning with the pattern love, and would match lover, loveable, or lovely, but would not find glove.
/love\b/ matches a word ending with the pattern love, and would match glove, clove, or love, but not
clover. /\blove\b/ matches a word beginning and ending with the pattern love, and would match only the word
love.
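The three cases can be compared side by side with preg_grep(). This is a small sketch using the words mentioned above; the variable names are illustrative.

```php
<?php
$words = array("love", "lover", "lovely", "glove", "clover");

// Words that begin with love.
$begins = preg_grep('/\blove/', $words);

// Words that end with love.
$ends = preg_grep('/love\b/', $words);

// The word love on its own.
$whole = preg_grep('/\blove\b/', $words);

print_r($begins);
print_r($ends);
print_r($whole);
```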
Example 12.16.
(The File data.txt Contents)
Mama Bear 702
Steve Blenheim 100
Betty Boop 200
Igor Chevsky 300
Norma Cord 400
Jon DeLoach 500
Karen Evich 600
BB Kingson 803
(The PHP Script)
<?php
$fh=fopen("data.txt", "r");
while( ! feof($fh)){
$text = fgets($fh);
1 if (preg_match("/\bbear\b/i", $text)){
2 echo "$text";
}
}
?>
(The Output)
Mama Bear 702
Explanation
1. The preg_match() function will return TRUE if a pattern consisting of the word bear is matched, and it is insensitive to case. Because the regular expression is anchored on both ends of the word with the word boundary metasymbol, \b, only bear is matched in $text, not “unbearable,” “beard,” or “bears.”
2. The lines that matched are printed.
Matching Single Characters and Digits
Some metacharacters match a single character or a single digit, and others match a single nonword character or
nondigit, whether or not the character is in a set.
The Dot Metacharacter
The dot metacharacter matches any single character except the newline character. For example, the
regular expression /a.b/ is matched if the string contains a letter a, followed by any one single character (except the
\n), followed by a letter b, whereas the expression /.../ matches any string containing at least three characters. To
match a literal period, the dot metacharacter must be preceded by a backslash; for example, /love\./ matches
love. but not lover.
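Each of these statements about the dot can be checked with one call to preg_match(). The subject strings below are invented for the purpose; this is a sketch, not one of the book's numbered examples.

```php
<?php
// The dot stands for any one character between a and b.
$a_dot_b = preg_match('/a.b/', "a-b");

// The dot does not match a newline.
$newline_miss = preg_match('/a.b/', "a\nb");

// Three dots need at least three characters in the string.
$three_short = preg_match('/.../', "ab");
$three_ok    = preg_match('/.../', "abc");

// An escaped dot matches only a literal period.
$literal_hit  = preg_match('/love\./', "I love.");
$literal_miss = preg_match('/love\./', "lover");

var_dump($a_dot_b, $newline_miss, $three_short, $three_ok, $literal_hit, $literal_miss);
```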
Example 12.17.
(The File data.txt Contents)
Mama Bear 702
Steve Blenheim 100
Betty Boop 200
Igor Chevsky 300
Norma Cord 400
Jon DeLoach 500
Karen Evich 600
BB Kingson 803
(The PHP Program)
<?php
1 $fh=fopen("data.txt", "r");
2 while( ! feof($fh)){
3 $text = fgets($fh);
4 if( preg_match("/^... /", $text )){
echo "$text";
}
}
?>
(Output)
Jon DeLoach 500
Explanation
1. The file data.txt is opened for reading.
2. As long as the end of file has not been reached, the while loop will continue to execute.
3. For each iteration of the loop, the fgets() function reads in a line of text.
4. The regular expression /^... / contains the dot metacharacter. The regular expression means: go to the beginning (^) of the line and find any three characters, followed by a space. (The dot metacharacter does not match the newline character.) The only line that matched the pattern starts with Jon. It begins with three characters followed by a space.
Example 12.18.
(The File data.txt Contents)
Mama Bear 702
Steve Blenheim 100
Betty Boop 200
Igor Chevsky 300
Norma Cord 400
Jon DeLoach 500
Karen Evich 600
BB Kingson 803
(The PHP Program)
<?php
1 $fh=fopen("data.txt", "r");
2 while( ! feof($fh)){
$text = fgets($fh);
3 $newtext=preg_replace("/J../", "Daniel", $text);
echo "$newtext";
}
?>
(Output)
Mama Bear 702
Steve Blenheim 100
Betty Boop 200
Igor Chevsky 300
Norma Cord 400
Daniel DeLoach 500
Karen Evich 600
BB Kingson 803
Explanation
1. The text file data.txt is opened for reading.
2. Until the end of the file is reached, the while loop will continue looping, reading in one line at a time from the file.
3. The first argument to the preg_replace() function is a regular expression containing the dot metacharacter. If the regular expression (a capital J followed by any two characters) is matched in $text, the found pattern will be replaced with Daniel.
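The replacement can also be tried on individual strings. The sketch below assumes the pattern is a capital J followed by two dots, as the explanation describes; the strings "Jo" and "Jon and Jim" and the use of the optional limit argument of preg_replace() are additions made here for illustration.

```php
<?php
$lines = array("Jon DeLoach 500", "Karen Evich 600", "Jo");

$replaced = array();
foreach ($lines as $line) {
    // Lines without a match are returned unchanged.
    $replaced[] = preg_replace('/J../', "Daniel", $line);
}

// By default every match is replaced; a limit of 1 stops after the first.
$all_matches = preg_replace('/J../', "Daniel", "Jon and Jim");
$first_only  = preg_replace('/J../', "Daniel", "Jon and Jim", 1);

print_r($replaced);
echo $all_matches, "\n", $first_only, "\n";
```

"Jo" is left alone because only one character follows the J, so the two dots cannot both be satisfied.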
The Character Class
A character class represents one character from a set of characters. For example, [abc] matches either an a, b, or c;
[a-z] matches one character in the range from a to z; and [0-9] matches one character in
the range of digits from 0 to 9. If the character class contains a leading caret ^, then the class represents any one
character not in the set; for example, [^a-zA-Z] matches a single character not in the range from a to z or A to Z,
and [^0-9] matches a single character that is not a digit in the range from 0 to 9 (see Table 12.8).
Table 12.8. Character Classes
Metacharacter      What It Matches
[abc]              Matches an a or b or c.
[a-z0-9_]          Matches any single character in a set.
[^a-z0-9_]         Matches any single character not in a set.
PHP provides additional metasymbols to represent a character class. The symbols \d and \D represent a single digit
and a single nondigit, respectively (the same as [0-9] and [^0-9]); \w and \W represent a single word character
and a single nonword character, respectively (the same as [A-Za-z_0-9] and [^A-Za-z_0-9]).
If you are searching for a particular character within a regular expression, you can use the dot metacharacter to
represent a single character, or a character class that matches on one character from a set of characters. In addition to
the dot and character class, PHP supports some backslashed symbols (called metasymbols) to represent single
characters.
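The equivalences between the bracketed classes and the metasymbols can be verified on a short string. This sketch uses a made-up ASCII subject, "Igor_7 #42"; with non-ASCII text or other locale settings, \w may not line up exactly with [A-Za-z_0-9].

```php
<?php
$subject = "Igor_7 #42";

// One character from the set a, b, or c: none of them occurs here.
$abc = preg_match('/[abc]/', $subject);

// A leading caret negates the class: find the first character that is not a letter.
preg_match('/[^a-zA-Z]/', $subject, $not_letter);

// \w and [A-Za-z_0-9] pick out the same characters.
preg_match_all('/\w/', $subject, $word_symbol);
preg_match_all('/[A-Za-z_0-9]/', $subject, $word_class);

// \D and [^0-9] both match any single character that is not a digit.
preg_match_all('/\D/', $subject, $nondigit_symbol);
preg_match_all('/[^0-9]/', $subject, $nondigit_class);

var_dump($abc, $not_letter[0]);
echo implode("", $word_symbol[0]), "\n";
echo implode("", $nondigit_symbol[0]), "\n";
```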
Matching One Character from a Set
A regular expression character class represents one character out of a set of characters, as shown in Example 12.19.
Example 12.19.
(The File data.txt Contents)
Mama Bear 702
Steve Blenheim 100
Betty Boop 200
Igor Chevsky 300
Norma Cord 400
Jon DeLoach 500
Karen Evich 600
BB Kingson 803
(The PHP Program)
<?php
1 $fh=fopen("data.txt", "r");
2 while( ! feof($fh)){
3 $text = fgets($fh);
4 if(preg_match("/^[BKI]/",$text)){