phpQuery is a server-side, chainable, CSS3 selector driven Document Object Model (DOM) API based on the jQuery JavaScript library. It is written in PHP5 and provides an additional Command Line Interface (CLI).

Zend_Dom provides tools for working with DOM documents and structures. Currently, we offer Zend_Dom_Query, which provides a unified interface for querying DOM documents utilizing both XPath and CSS selectors.
If you prefer to use a 3rd-party lib, I'd suggest using a lib that actually uses DOM/libxml underneath instead of string parsing.

FluentDOM provides a jQuery-like fluent XML interface for the DOMDocument in PHP. Selectors are written in XPath or CSS (using a CSS to XPath converter). Current versions extend the DOM implementing standard interfaces and add features from the DOM Living Standard. FluentDOM can load formats like JSON, CSV, JsonML, RabbitFish and others.

Wa72\HtmlPageDom is a PHP library for easy manipulation of HTML documents using DOM. It extends DOM tree traversal by adding methods for manipulating the DOM tree of HTML documents.
The XML Parser library is also based on libxml, and implements a SAX style XML push parser. It may be a better choice for memory management than DOM or SimpleXML, but will be more difficult to work with than the pull parser implemented by XMLReader.

The SimpleXML extension provides a very simple and easily usable toolset to convert XML to an object that can be processed with normal property selectors and array iterators. SimpleXML is an option when you know the HTML is valid XHTML. If you need to parse broken HTML, don't even consider SimpleXML because it will choke. A basic usage example can be found at "A simple program to CRUD node and node values of xml file", and there are lots of additional examples in the PHP Manual.
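To make the SimpleXML description concrete, here is a minimal sketch of property selectors and array iteration on well-formed XHTML. The markup, the variable names, and the `class` attribute values are made up for illustration and are not from any linked example.

```php
<?php
// Assumption: the input is well-formed XHTML without namespaces.
$xhtml = '<html><body><h1>Title</h1>'
       . '<ul><li class="a">One</li><li class="b">Two</li></ul>'
       . '</body></html>';

$doc = simplexml_load_string($xhtml);
if ($doc === false) {
    // SimpleXML gives up on markup that is not well-formed.
    throw new RuntimeException('Input is not well-formed XML');
}

// Child elements are reached with normal property access.
$title = (string) $doc->body->h1;

// Repeated children can be iterated; attributes use array syntax.
$items = [];
foreach ($doc->body->ul->li as $li) {
    $items[(string) $li['class']] = (string) $li;
}

print_r([$title, $items]);
```

The string casts matter: without them you get `SimpleXMLElement` objects rather than plain strings.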
The XML Parser extension lets you create XML parsers and then define handlers for different XML events. Each XML parser also has a few parameters you can adjust.
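As a rough sketch of what "define handlers for different XML events" looks like with the XML Parser extension, the snippet below collects the text of every `title` element. The sample document and the `$titles` / `$inTitle` variables are hypothetical; the case-folding option stands in for the adjustable parameters mentioned above.

```php
<?php
// Hypothetical sample input.
$xml = '<feed><title>Hello</title><title>World</title></feed>';

$parser = xml_parser_create();
// One of the adjustable parameters: keep element names as written
// instead of folding them to upper case (the default).
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

$titles  = [];
$inTitle = false;

// Handlers for element start and element end events.
xml_set_element_handler(
    $parser,
    function ($parser, $name, $attrs) use (&$inTitle, &$titles) {
        if ($name === 'title') {
            $inTitle  = true;
            $titles[] = '';
        }
    },
    function ($parser, $name) use (&$inTitle) {
        if ($name === 'title') {
            $inTitle = false;
        }
    }
);

// Handler for character data; it may fire several times per text node.
xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$inTitle, &$titles) {
        if ($inTitle) {
            $titles[count($titles) - 1] .= $data;
        }
    }
);

// Push the whole document into the parser in one final chunk.
if (xml_parse($parser, $xml, true) !== 1) {
    throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
}

print_r($titles);
```

Because the parser pushes events at you, any state (here the `$inTitle` flag) has to be tracked by hand, which is the extra difficulty compared with a pull parser.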
It takes some time to get productive with DOM, but that time is well worth it IMO. Since DOM is a language-agnostic interface, you'll find implementations in many languages, so if you need to change your programming language, chances are you will already know how to use that language's DOM API. A basic usage example can be found in "Grabbing the href attribute of an A element" and a general conceptual overview can be found at "DOMDocument in php". How to use the DOM extension has been covered extensively on Stack Overflow, so if you choose to use it, you can be sure most of the issues you run into can be solved by searching or browsing Stack Overflow.

The XMLReader extension is an XML pull parser. The reader acts as a cursor going forward on the document stream and stopping at each node on the way. I am not aware of how to trigger the HTML Parser Module, so chances are using XMLReader for parsing broken HTML might be less robust than using DOM, where you can explicitly tell it to use libxml's HTML Parser Module. A basic usage example can be found at "getting all values from h1 tags using php".
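For the XMLReader pull parser, a minimal sketch of the cursor model might look like the following. It is not the linked example; the input string and the `$headings` variable are assumptions, and the input is well-formed on purpose.

```php
<?php
// Hypothetical, well-formed sample input.
$xml = '<html><body><h1>First</h1><p>text</p><h1>Second</h1></body></html>';

$reader = new XMLReader();
$reader->XML($xml);

$headings = [];
// read() moves the cursor to the next node; we decide what to pull from it.
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'h1') {
        // readString() returns the text content of the current element.
        $headings[] = $reader->readString();
    }
}
$reader->close();

print_r($headings);
```

Only the current node is held at any time, which is why this style suits large documents.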
I prefer using one of the native XML extensions since they come bundled with PHP, are usually faster than all the 3rd party libs and give me all the control I need over the markup.

The DOM extension allows you to operate on XML documents through the DOM API with PHP 5. It is an implementation of the W3C's Document Object Model Core Level 3, a platform- and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure and style of documents. DOM is capable of parsing and modifying real world (broken) HTML and it can do XPath queries.
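As an illustration of DOM handling broken HTML and running XPath queries, here is a short sketch. The markup (unclosed `<p>` tags, no `<html>` or `<body>` wrapper) and the variable names are invented for the example.

```php
<?php
// Deliberately sloppy markup: the <p> elements are never closed.
$html = '<div><p>First <a href="/one">one</a>'
      . '<p>Second <a href="/two">two</a></div>';

$dom = new DOMDocument();

// Collect libxml parse errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);
$dom->loadHTML($html); // loadHTML() uses libxml's HTML parser
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Query the repaired tree with XPath.
$xpath = new DOMXPath($dom);
$hrefs = [];
foreach ($xpath->query('//a/@href') as $attribute) {
    $hrefs[] = $attribute->nodeValue;
}

print_r($hrefs);
```

Using `loadHTML()` rather than `loadXML()` is what tells DOM to use the HTML parser, so the missing closing tags are repaired instead of causing a failure.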