(PHP 5, PHP 7, PHP 8)
DOMDocument::loadHTMLFile — Load HTML from a file
The function parses the HTML document in the file named filename. Unlike loading XML, HTML does not have to be well-formed to load.
This function parses the input using an HTML 4 parser. The parsing rules of HTML 5, which is what modern web browsers use, are different. Depending on the input, this might result in a different DOM structure. Therefore, this function cannot be safely used for sanitizing HTML.
As an example, some HTML elements will implicitly close a parent element when encountered. The rules for automatically closing parent elements differ between HTML 4 and HTML 5, so the DOM structure that DOMDocument sees might differ from the one a web browser sees, possibly allowing an attacker to break the resulting HTML.
If an empty string is passed as the filename or an empty file is named, a warning will be generated. This warning is not generated by libxml and cannot be handled using libxml's error handling functions.
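Because this warning bypasses libxml's error handling, one option is to validate the path before calling the method at all. The following is a minimal sketch under that assumption; the helper name loadHtmlFileIfUsable() is hypothetical and not part of the DOM extension.

```php
<?php
/**
 * Hypothetical helper: loads $filename only if it names a readable,
 * non-empty regular file. Returns null when the file is unusable
 * or when loadHTMLFile() reports failure.
 */
function loadHtmlFileIfUsable(string $filename): ?DOMDocument
{
    // Reject empty names, missing paths, directories and unreadable files.
    if ($filename === '' || !is_file($filename) || !is_readable($filename)) {
        return null;
    }

    // Reject empty files, which would otherwise trigger the warning.
    if (filesize($filename) === 0) {
        return null;
    }

    $doc = new DOMDocument();

    return $doc->loadHTMLFile($filename) ? $doc : null;
}

// An empty filename is rejected before loadHTMLFile() is ever called.
var_dump(loadHtmlFileIfUsable('')); // NULL
?>
```

The checks only cover the two cases named above; warnings caused by bad markup are a separate matter.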
While malformed HTML should load successfully, this function may generate E_WARNING errors when it encounters bad markup. libxml's error handling functions may be used to handle these errors.
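As an illustration of the libxml error handling mentioned above, the sketch below collects parser errors with libxml_use_internal_errors() instead of letting them surface as E_WARNING. The temporary file and its deliberately sloppy markup are assumptions made only for this example, and the messages reported depend on the installed libxml version.

```php
<?php
// Hypothetical input: a temporary file containing deliberately sloppy markup.
$path = tempnam(sys_get_temp_dir(), 'html');
file_put_contents($path, '<html><body><p>Unclosed paragraph<div></span></body></html>');

// Buffer libxml errors internally instead of emitting E_WARNING.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$loaded = $doc->loadHTMLFile($path);

// Report whatever the parser recorded, then empty the error buffer.
foreach (libxml_get_errors() as $error) {
    printf("line %d: %s\n", $error->line, trim($error->message));
}
libxml_clear_errors();

// Restore the previous error handling mode and clean up.
libxml_use_internal_errors($previous);
unlink($path);
?>
```

Restoring the previous mode keeps the change local, so other code that relies on the default warnings is unaffected.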
Version | Description |
---|---|
8.3.0 | This function now has a tentative bool return type. |
8.0.0 | Calling this function statically will now throw an Error. Previously, an E_DEPRECATED was raised. |
Example #1 Creating a Document
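<?php
$doc = new DOMDocument();
// Parse the HTML file; malformed markup is tolerated.
$doc->loadHTMLFile("filename.html");
// Serialize the resulting DOM back to an HTML string.
echo $doc->saveHTML();
?>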
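A slightly more defensive variant checks the boolean result before using the document. This is a sketch only: the temporary file and its contents are assumptions for the sake of a self-contained example. The method is called on an instance, since static calls throw an Error as of PHP 8.0.0.

```php
<?php
// Hypothetical input: write a small HTML document to a temporary file.
$path = tempnam(sys_get_temp_dir(), 'html');
file_put_contents(
    $path,
    '<html><head><title>Demo</title></head><body><p>Hello</p></body></html>'
);

$doc = new DOMDocument();

// loadHTMLFile() returns false on failure, so check before using the DOM.
if ($doc->loadHTMLFile($path) === false) {
    throw new RuntimeException("Could not parse $path");
}

// Read the text of the first <title> element.
$title = $doc->getElementsByTagName('title')->item(0)->textContent;
echo $title, "\n"; // prints "Demo"

unlink($path);
?>
```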