I have a template-like system that can load bulk templates (more than one template entry in one file) and store them accordingly. The problem is that the current approach uses preg_replace() and eval(), and it is very error-prone. For example, an improperly placed character can break the regular expression and produce a parse error:

Parse error: syntax error, unexpected '<' in tsys.php: eval()'d code

The code that does the loading is the following:

// Escaping
$this->_buffer = str_replace( array('\\', '\'', "\n"), array('\\\\', '\\\'', ''), $this->_buffer);

// Regular-expression chunk up the input string to evaluative code
$this->_buffer = preg_replace('#<!--- BEGIN (.*?) -->(.*?)<!--- END (.*?) -->#', "\n" . '$this->_tstack[\'\\1\'] = \'\\2\';', $this->_buffer);

// Run the previously created PHP code
eval($this->_buffer);

An example file of this bulk template looks like the following:

<!--- BEGIN foo -->
<p>Some HTML code</p>
<!--- END foo -->

<!--- BEGIN bar -->
<h1>Some other HTML code</h1>
<!--- END bar -->

When the code is run on this input, $this->_tstack receives two elements:

array (
  'foo' => "<p>Some HTML code</p>",
  'bar' => "<h1>Some other HTML code</h1>",
);

This is the expected behavior, but I am looking for a method that removes the need for eval().

  • WTF is that supposed to be doing? Commented Jul 13, 2012 at 14:46
  • @ircmaxell You store the HTML in these files, and this code is supposed to load them into the internal _tstack container, from which you can print templates to the screen. What I am looking for is an approach to parsing multi- (or bulk-) template files without the need for eval(). Commented Jul 13, 2012 at 15:01
  • @ircmaxell For more insight: after a template is loaded, you can prepare it (templates can contain variable placeholders that are filled with values), add it to output buffers and, if needed, print it to the screen. The templates contain the HTML that the system will output. Commented Jul 13, 2012 at 21:28

2 Answers

Well, here goes. Given $template contains:

<!--- BEGIN foo -->
    <p>Some HTML code</p>
<!--- END foo -->

<!--- BEGIN bar -->
    <h1>Some other HTML code</h1>
<!--- END bar -->

Then:

$values = array();
$pattern = '#<!--- BEGIN (?P<key>\S+) -->(?P<value>.+?)<!--- END (?P=key) -->#si';
if ( preg_match_all($pattern, $template, $matches, PREG_SET_ORDER) ) {
    foreach ($matches as $match) {
        $values[$match['key']] = trim($match['value']);
    }
}
var_dump($values);

Results in:

array(2) {
  ["foo"]=>
  string(21) "<p>Some HTML code</p>"
  ["bar"]=>
  string(29) "<h1>Some other HTML code</h1>"
}

If white space preservation is important, remove trim().
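
As a minimal sketch of how this might be packaged, the helper below wraps the same preg_match_all() call in a function. The name parse_bulk_templates() and the sample input are hypothetical, and the key pattern is narrowed to [\w\s-]+? on the assumption that template names contain only word characters, whitespace and hyphens.

```php
<?php
// Hypothetical helper: the name and signature are illustrative only.
function parse_bulk_templates($template)
{
    // key:   lazily matches word characters, whitespace and hyphens (an assumption
    //        about what template names may contain)
    // value: everything up to the END marker that repeats the same key
    // s:     lets "." match newlines so multi-line bodies are captured
    $pattern = '#<!--- BEGIN (?P<key>[\w\s-]+?) -->(?P<value>.*?)<!--- END (?P=key) -->#s';

    $values = array();
    if (preg_match_all($pattern, $template, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            $values[$match['key']] = trim($match['value']);
        }
    }
    return $values;
}

// Sample input (made up): a name with a space, quotes and a backslash in the
// body, and stray text between the two blocks.
$template = <<<'EOT'
<!--- BEGIN main page -->
<p>It's a 'quoted' \ string</p>
<!--- END main page -->
stray text outside any block
<!--- BEGIN bar -->
<h1>Some other HTML code</h1>
<!--- END bar -->
EOT;

print_r(parse_bulk_templates($template));
```

Because nothing is evaluated, quotes and backslashes in a template body need no escaping, and text outside the BEGIN/END markers is simply ignored instead of becoming invalid PHP.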

5 Comments

This worked, with some slight modifications. Because template names can contain whitespace, I needed to change (?P<key>\S+) to (?P<key>\S.+). I have also edited the answer to make it valid PHP code, because there were some syntax errors. Thank you for the answer.
@Whisperity Sorry, that was valid PHP 5.4 code; not 5.3. Your edits make it backward compatible; thank you :) Also, it's probably best to narrow the character class allowance to what you need. If it can only be whitespace and "word characters" (letters, digits, and underscores), then [\w\s]+ should suffice. [\w\s-]+ for hyphens too.
I have just updated to PHP 5.4.4, but when I tested the code I only had 5.3.8. Off topic, but should I try to make the project backward compatible, or drop it all and rewrite it as new code?
@Whisperity That depends on far too many factors. Backward compatibility is obviously a massive asset (read: necessity) for libraries and applications made publicly available for use in the wild. However, if this is purely an in-house solution, I would leverage the features offered by the latest release and start fresh. Again, there are too many other factors; you'll have to decide for yourself.
@Bracketwroks I think I won't fudge on [] instead of array(), but will drop mysql_ in favour of mysqli_. Anyway, before we go more off-topic, I wish to thank you for the solution. I really need to learn RegExp more.

You can use preg_match_all to do that:

// Remove CR and NL
$buffer = str_replace(array("\r", "\n"), '', $this->_buffer);

// Grab interesting parts
$matches = array();
preg_match_all('/\?\?\? BOT (?P<group>[^ ]+) \?\?\?(?P<content>.*)!!! EOT \1 !!!/', $buffer, $matches);

// Build the stack
$stack = array_combine(array_values($matches['group']), array_values($matches['content']));

$stack will then contain:

Array
(
    [foo] => <p>Some HTML code</p>
    [bar] => <h1>Some other HTML code</h1>
)
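
If the templates use the <!--- BEGIN name --> markers shown in the question instead, the same approach might be adapted as in the sketch below. It assumes names without spaces (as in the [^ ]+ pattern above), uses a lazy .*? for the content, and skips array_combine() when nothing matched so it is never called with empty arrays. The inline $buffer is sample data standing in for $this->_buffer.

```php
<?php
// Sample data standing in for $this->_buffer.
$buffer = "<!--- BEGIN foo -->\n<p>Some HTML code</p>\n<!--- END foo -->\n\n"
        . "<!--- BEGIN bar -->\n<h1>Some other HTML code</h1>\n<!--- END bar -->";

// Remove CR and NL, as in the code above
$buffer = str_replace(array("\r", "\n"), '', $buffer);

// Grab interesting parts; \1 requires END to repeat the BEGIN name
$matches = array();
$count = preg_match_all(
    '/<!--- BEGIN (?P<group>[^ ]+) -->(?P<content>.*?)<!--- END \1 -->/',
    $buffer,
    $matches
);

// Build the stack only when at least one block was found
$stack = $count
    ? array_combine($matches['group'], $matches['content'])
    : array();

print_r($stack);
```

With no matching blocks in the input, $stack ends up as an empty array rather than the result of a failed array_combine() call.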

1 Comment

The method did work, but since I asked, the templates have been modified to use a different, more HTML-like format (see the updated question). I am not sure how to modify your preg_match_all() line to stop the system from raising the error "Warning: array_combine() [function.array-combine]: Both parameters should have at least 1 element" and returning FALSE as $stack.
