PHP script with regular expressions

Question

I'm trying to get the text between the heading tags using the following PHP script:

$search_string= < h1 >testing here< /h1 >;

$text = preg_match('<%TAG%[^>]*>(.*?)</%TAG%>',$search_string, $matches);

echo $matches[0];

When I try to run this script, no value is returned. Instead there is a warning message: Warning: preg_match() [function.preg-match]: Unknown modifier '(' in C:\xampp\htdocs\check_for_files.php on line 10

Can anyone help with this please?

See [RegEx match open tags except XHTML self-contained tags](stackoverflow.com/questions/1732348/…).
– Matthew Flaschen, Commented Oct 14, 2010 at 3:32
True, you'll want to use a real tag name (e.g. 'h1') in your expression, and quoting your $search_string will also help.
– Blackcoat, Commented Oct 14, 2010 at 3:36

Blackcoat · Accepted Answer · 2010-10-14 03:32:45Z

2

Your expression needs delimiters. / is the most common, but # is more convenient here because the pattern itself contains a /.

$text = preg_match('#<%TAG%[^>]*>(.*?)</%TAG%>#',$search_string, $matches);
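As a minimal sketch of the delimited pattern in use: it assumes the %TAG% placeholder is replaced with a real tag name (h1 here) and that the input is a quoted string, as suggested in the comments on the question. The variable names $tag and $pattern are illustrative.

```php
<?php
// Assumption: the input is a quoted string and %TAG% stands for a real tag name (h1 here).
$search_string = '<h1>testing here</h1>';
$tag = 'h1';

// '#' is the delimiter, so the '/' in the closing tag needs no escaping.
// preg_quote() escapes any regex metacharacters (and the delimiter) in the tag name.
$pattern = '#<' . preg_quote($tag, '#') . '[^>]*>(.*?)</' . preg_quote($tag, '#') . '>#';

if (preg_match($pattern, $search_string, $matches) === 1) {
    // $matches[0] is the whole match (tags included); $matches[1] is the captured text.
    echo $matches[1], PHP_EOL;
}
```

Note that the question echoes $matches[0], which includes the tags themselves; the text between the tags is in $matches[1].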

answered Oct 14, 2010 at 3:32

codaddict · Accepted Answer · 2010-10-14 03:46:08Z

2

The warning appears because you have not enclosed your regex in delimiters. So try:

$text = preg_match('#<%TAG%[^>]*>(.*?)</%TAG%>#',$search_string, $matches);

Understanding the warning.

Consider your regex:

'<%TAG%[^>]*>(.*?)</%TAG%>'
 ^          ^
start      end

Since you have not explicitly put the regex between delimiters, PHP assumes you are using < and > as the delimiter pair, because < is the first character of the regex. Hence, when it sees an unescaped >, it takes that as the end of the pattern. After the closing delimiter we can put a few modifiers, which alter the behavior of the pattern matching. Some common modifiers are:

i for case-insensitive matching
m for multi-line matching

Now in your case there is a ( after the closing delimiter which is not a valid modifier, hence the warning.
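The behavior described above can be sketched as follows. This is an illustration under assumptions, not the asker's exact script: it uses a literal h1 pattern without the [^>]* part, and the helper run_pattern is a hypothetical name introduced only to capture the warning text instead of printing it.

```php
<?php
// Hypothetical helper: runs preg_match and records any warning text instead of printing it.
function run_pattern(string $pattern, string $subject): array
{
    $warning = null;
    set_error_handler(function (int $errno, string $errstr) use (&$warning): bool {
        $warning = $errstr;
        return true; // suppress PHP's default warning output
    });
    $matches = [];
    $result = preg_match($pattern, $subject, $matches);
    restore_error_handler();
    return [$result, $matches, $warning];
}

$subject = '<H1>testing here</H1>';

// No explicit delimiters: PHP reads '<' ... '>' as the delimiter pair,
// so the pattern is just "h1" and "(" is parsed as a modifier.
[$result, , $warning] = run_pattern('<h1>(.*?)</h1>', $subject);
var_dump($result);      // preg_match returns false when the pattern fails to compile
echo $warning, PHP_EOL; // the captured "Unknown modifier" warning text

// With '#' delimiters and the i modifier, the same pattern matches case-insensitively.
[$result, $matches] = run_pattern('#<h1>(.*?)</h1>#i', $subject);
echo $matches[1], PHP_EOL;
```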

edited Oct 14, 2010 at 3:46

answered Oct 14, 2010 at 3:33

mway · Accepted Answer · 2010-10-14 03:34:28Z

1

/^<[^>]+>(.*)<\/[^>]+>$/ should do the trick.

answered Oct 14, 2010 at 3:34

3 Comments

Pavan Over a year ago

Hi, I'm very interested in this approach. Could you please explain it? Thank you.

mway Over a year ago

It's a pretty basic expression: <[^>]+> means 'one or more of any character except >, enclosed within <>'; (.*) matches anything; and <\/[^>]+> is similar to the first in that it means 'one or more of any character except >, enclosed within </>'. The first and the last are structured this way so you don't have to write complex rules to match what might possibly be in the tag (attributes, etc.); we assume > will not be in it (because that's not valid in class names or element ids, for example). Not the most efficient expression, but it gets the job done.

mway Over a year ago

Also: there are parentheses around .* (i.e., (.*)) so that the group is returned as a specific match within the results.
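
A brief sketch of how this expression behaves. It assumes the subject string consists of exactly one element, since the ^ and $ anchors require the tags to sit at the very start and end; the sample strings are illustrative.

```php
<?php
$pattern = '/^<[^>]+>(.*)<\/[^>]+>$/';

// The attribute is absorbed by [^>]+, so it does not need its own rule.
preg_match($pattern, '<h1 class="title">testing here</h1>', $matches);
echo $matches[1], PHP_EOL; // testing here

// Because (.*) is greedy and the tag names are not compared,
// nested markup stays inside the captured group.
preg_match($pattern, '<h1>testing <em>here</em></h1>', $nested);
echo $nested[1], PHP_EOL; // testing <em>here</em>
```

Since the opening and closing tag names are never compared, this pattern would also accept mismatched tags such as <h1>...</h2>; whether that matters depends on the input.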
