PHP has a built-in function for this job, libxml_get_errors(), which returns an array of LibXMLError objects. Take a look at the documentation, which also includes an example.
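In its simplest form, the call pattern looks roughly like this. It is a minimal sketch, and the input string is made up for illustration:

```php
<?php
// Collect parser errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical malformed input: <b> is never closed.
$doc = simplexml_load_string('<a><b>text</a>');

// Each entry is a LibXMLError object with level, code, column, message, file and line.
$errors = libxml_get_errors();
foreach ($errors as $error) {
    printf("line %d, column %d: %s\n", $error->line, $error->column, trim($error->message));
}

// Empty the error buffer before the next parse.
libxml_clear_errors();
```

The exact messages depend on the installed libxml2 version, so treat the text as informational and rely on the line and column values instead.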
My test with page body:
<?php
// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$xmlstr = <<<XML
<body>
<h1>Correct tag</h1>
<h2>Tag not closed</h2>
<p>Missing end of paragraph
<br>
<script type="text/javascript">
var test = "Script";
</script>
<img src="some.url" alt="Image title" >
<footer>Some error in footer?<footer>
</body>
XML;

$doc = simplexml_load_string($xmlstr);
$xml = explode("\n", $xmlstr);

// simplexml_load_string() returns false when parsing fails.
if ($doc === false) {
    // Reverse the list so the most recently reported error is shown first.
    $errors = array_reverse(libxml_get_errors());
    echo "<pre>";
    foreach ($errors as $error) {
        echo display_xml_error($error, $xml);
    }
    echo "</pre>";
    libxml_clear_errors();
}

// Format one LibXMLError: source line, column marker, severity and message.
function display_xml_error($error, $xml)
{
    // $error->line is 1-based; fall back to an empty string if it is out of range.
    $return = ($xml[$error->line - 1] ?? '') . "\n";
    $return .= str_repeat('-', $error->column) . "^\n";

    switch ($error->level) {
        case LIBXML_ERR_WARNING:
            $return .= "Warning $error->code: ";
            break;
        case LIBXML_ERR_ERROR:
            $return .= "Error $error->code: ";
            break;
        case LIBXML_ERR_FATAL:
            $return .= "Fatal Error $error->code: ";
            break;
    }

    $return .= trim($error->message);

    if ($error->file) {
        $return .= "\n File: $error->file";
    }

    return "$return\n\n--------------------------------------------\n\n";
}
?>
This produces:
---------^
Fatal Error 77: Premature end of data in tag body line 1
--------------------------------------------
---------^
Fatal Error 77: Premature end of data in tag p line 4
--------------------------------------------
---------^
Fatal Error 77: Premature end of data in tag br line 5
--------------------------------------------
---------^
Fatal Error 77: Premature end of data in tag img line 9
--------------------------------------------
---------^
Fatal Error 77: Premature end of data in tag footer line 10
--------------------------------------------
---------^
Fatal Error 76: Opening and ending tag mismatch: footer line 10 and body
--------------------------------------------
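Don't be confused by the error about body not being closed; it is a side effect of the unclosed tags inside it. If the markup is well-formed, no errors are reported. Keep in mind that this is an XML parser, so void elements such as <br> and <img> must be self-closed. For example, the following code produces no errors according to libxml_get_errors():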
<body>
<h1>Correct tag</h1>
<h2>Tag closed</h2>
<p>Not missing end of paragraph</p>
<br />
<script type="text/javascript">
var test = "Script";
</script>
<img src="some.url" alt="Image title" />
<div class="somediv">
<p>Paragraph nested</p>
<ul>
<li>List element</li>
<li>List element</li>
</ul>
</div>
<footer>No error in footer</footer>
</body>
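To compare well-formed and broken markup programmatically, the check can be wrapped in a small function. This is only a sketch: the helper name collect_xml_errors() and the shortened sample strings are hypothetical, not part of PHP or of the test above.

```php
<?php
/**
 * Hypothetical helper: parse $markup as XML and return the LibXMLError
 * objects that libxml collected. An empty array means no errors were reported.
 */
function collect_xml_errors(string $markup): array
{
    // Remember the caller's setting so it can be restored afterwards.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    simplexml_load_string($markup);
    $errors = libxml_get_errors();

    // Leave the error buffer and the setting as we found them.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $errors;
}

// Shortened, made-up samples in the spirit of the two page bodies above.
$valid  = '<body><p>Closed paragraph</p><br /><img src="some.url" alt="Image title" /></body>';
$broken = '<body><p>Missing end of paragraph<br></body>';

var_dump(count(collect_xml_errors($valid)) === 0);  // well-formed input
var_dump(count(collect_xml_errors($broken)) > 0);   // unclosed <p> and <br>
```

Restoring the previous libxml_use_internal_errors() value keeps the helper from changing error handling for the rest of the script.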