XmlDocument

Specializes Document for handling generic XML. It always parses in strict mode and uses the XML MIME type and file header.

class XmlDocument : Document {

this(string data, bool enableHtmlHacks);

this(Utf8Stream data, bool enableHtmlHacks);

}

Constructors

this this(string data, bool enableHtmlHacks)
this(Utf8Stream data, bool enableHtmlHacks): Constructs a stricter-mode XML parser and parses the given data source.
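
A minimal sketch of the string constructor, assuming arsd.dom is part of the build. The second argument is passed explicitly as false to match the signature listed above. The helper name rootTagName and the sample XML are illustrative only, and tagName is a member of Element, which is documented on its own page.

```d
import arsd.dom;

/// Hypothetical helper: parses `xml` with XmlDocument and returns the
/// tag name of the root element.
string rootTagName(string xml) {
    // The constructor parses right away; since XmlDocument is strict,
    // malformed input (such as an unclosed tag) is expected to throw.
    auto doc = new XmlDocument(xml, false);
    return doc.root.tagName;
}

void main() {
    import std.stdio : writeln;
    writeln(rootTagName(`<feed><entry id="1">hello</entry></feed>`));
}
```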

Inherited Members

From Document

processTagOpen void processTagOpen(Element what)
processTagClose void processTagClose(Element what)
processNodeWhileParsing void processNodeWhileParsing(Element parent, Element child): These three functions, processTagOpen, processTagClose, and processNodeWhileParsing, allow you to process elements as they are parsed and choose to not append them to the dom tree.
fromUrl Document fromUrl(string url, bool strictMode): Convenience method for web scraping. Requires arsd.http2 to be included in the build as well as arsd.characterencodings.
opIndex ElementCollection opIndex(string selector): This is just something I'm toying with. Right now, you use opIndex to put in css selectors. It returns a struct that forwards calls to all elements it holds, and returns itself so you can chain it.
contentType string contentType [@property setter]: If you're using this for some other kind of XML, you can set the content type here.
filename string filename [@property getter]: Implements the FileResource interface; useful for sending via HTTP automatically.
contentType string contentType [@property getter]: Implements the FileResource interface; useful for sending via HTTP automatically.
getData immutable(ubyte)[] getData(): Implements the FileResource interface; it calls toString.
enableAddingSpecialTagsToDom void enableAddingSpecialTagsToDom(): Adds objects to the dom representing things normally stripped out during the default parse, like comments, <!instructions>, <% code%>, and <? code?> all at once.
parseSawComment bool delegate(string) parseSawComment;: If the parser sees an HTML comment, it will call this callback. <!-- comment --> will call parseSawComment(" comment "). Return true if you want the node appended to the document. It will be in an HtmlComment object.
parseSawAspCode bool delegate(string) parseSawAspCode;: If the parser sees <% asp code... %>, it will call this callback. It will be passed "% asp code... %" or "%= asp code .. %". Return true if you want the node appended to the document. It will be in an AspCode object.
parseSawPhpCode bool delegate(string) parseSawPhpCode;: If the parser sees <?php php code... ?>, it will call this callback. It will be passed "?php php code... ?" or "?= php code .. ?". Note: dom.d cannot identify the other php <? code ?> short format. Return true if you want the node appended to the document. It will be in a PhpCode object.
parseSawQuestionInstruction bool delegate(string) parseSawQuestionInstruction;: If the parser sees a <?xxx> that is not php or asp, it calls this function with the contents. <?SOMETHING foo> calls parseSawQuestionInstruction("?SOMETHING foo"). Unlike the php/asp ones, this ends on the first > it sees, without requiring ?>. Return true if you want the node appended to the document. It will be in a QuestionInstruction object.
parseSawBangInstruction bool delegate(string) parseSawBangInstruction;: If the parser sees a <! that is not CDATA or a comment (CDATA is handled automatically and comments call parseSawComment), it calls this function with the contents. <!SOMETHING foo> calls parseSawBangInstruction("SOMETHING foo"). Return true if you want the node appended to the document. It will be in a BangInstruction object.
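
The parseSaw callbacks have to be installed before parsing starts, which the XmlDocument constructors do not allow because they parse immediately. The sketch below therefore uses the base Document class and assumes it offers a no-argument constructor that creates an empty document; that constructor is not listed on this page, so treat it as an assumption. The helper name collectComments is hypothetical.

```d
import arsd.dom;

/// Hypothetical helper: returns the text of every comment the parser sees.
string[] collectComments(string xml) {
    string[] seen;

    // Assumption: Document has a no-argument constructor that yields an
    // empty document, so the callback can be set before any parsing.
    auto doc = new Document();

    doc.parseSawComment = (string content) {
        seen ~= content; // content excludes the <!-- and --> delimiters
        return false;    // false: do not append the comment to the DOM
    };

    // Second argument is pureXmlMode, as in the parseStrict signature.
    doc.parseStrict(xml, true);
    return seen;
}
```

Returning true from the callback instead would keep each comment in the tree as an HtmlComment object, as described above.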
parseGarbage void parseGarbage(string data): Given the kind of garbage you find on the Internet, try to make sense of it. Equivalent to document.parse(data, false, false, null); (Case-insensitive, non-strict, determine character encoding from the data.) NOTE: this makes no attempt at added security, but it will try to recover from anything instead of throwing.
parseStrict void parseStrict(string data, bool pureXmlMode): Parses well-formed, case-sensitive UTF-8 XML or XHTML. Will throw exceptions on things like unclosed tags.
parseUtf8 void parseUtf8(string data, bool caseSensitive, bool strict): Parses well-formed UTF-8 in loose mode (by default). Tries to correct tag soup, but does NOT try to correct bad character encodings.
selfClosedElements immutable(string)[] selfClosedElements;: List of elements that can be assumed to be self-closed in this document. The default for a Document is a hard-coded list of ones appropriate for HTML. For XmlDocument, it defaults to empty. You can modify this after construction but before parsing.
rawSourceElements immutable(string)[] rawSourceElements;: List of elements that contain raw CDATA content for this document, e.g. <script> and <style> for HTML. The parser will read until the closing string and put everything else in a RawSource object for future processing, not trying to do any further child nodes or attributes, etc.
inlineElements immutable(string)[] inlineElements;: List of elements that are considered inline for pretty printing. The default for a Document is hard-coded to something appropriate for HTML. For XmlDocument, it defaults to empty. You can modify this after construction but before parsing.
parse void parse(string rawdata, bool caseSensitive, bool strict, string dataEncoding): Take XMLish data and try to make the DOM tree out of it.
title string title [@property getter]: Gets the <title> element's innerText, if one exists.
title string title [@property setter]: Sets the title of the page, creating a <title> element if needed.
getElementById Element getElementById(string id)
requireElementById SomeElementType requireElementById(string id, string file, size_t line)
requireSelector SomeElementType requireSelector(string selector, string file, size_t line)
optionSelector MaybeNullElement!SomeElementType optionSelector(string selector, string file, size_t line)
querySelector Element querySelector(string selector)
querySelectorAll Element[] querySelectorAll(string selector)
getElementsBySelector deprecated Element[] getElementsBySelector(string selector)
getElementsByTagName Element[] getElementsByTagName(string tag)
getElementsByClassName Element[] getElementsByClassName(string tag): These functions all forward to the root element. See the documentation in the Element class.
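
As a rough illustration of the selector functions, the sketch below gathers the text of every element matching a CSS selector. It assumes arsd.dom is in the build; the helper name textOfAll and the sample data are hypothetical, and innerText is an Element member documented on the Element page rather than here.

```d
import arsd.dom;

/// Hypothetical helper: returns the inner text of every element in `xml`
/// that matches the CSS selector `selector`.
string[] textOfAll(string xml, string selector) {
    auto doc = new XmlDocument(xml, false);

    string[] result;
    // querySelectorAll forwards to the root element and returns Element[].
    foreach (element; doc.querySelectorAll(selector))
        result ~= element.innerText;
    return result;
}

void main() {
    import std.stdio : writeln;
    writeln(textOfAll("<list><item>a</item><item>b</item></list>", "item"));
}
```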
getFirstElementByTagName Element getFirstElementByTagName(string tag): FIXME: btw, this could just be a lazy range......
mainBody Element mainBody()
body alias body = mainBody: This returns the <body> element, if there is one. (It is named differently than in JavaScript, where it is called 'body', because body used to be a keyword in D.)
getMeta string getMeta(string name): Looks up a meta tag by name. The lookup matches [name=] if name contains no colon and [property=] if it contains one.
setMeta void setMeta(string name, string value): Sets a meta tag in the document header. It is kinda hacky, so that it works easily for both Facebook Open Graph and traditional HTML meta tags.
forms Form[] forms()
createForm Form createForm()
createElement Element createElement(string name)
createFragment Element createFragment()
createTextNode Element createTextNode(string content)
findFirst Element findFirst(bool delegate(Element) doesItMatch)
clear void clear()
prolog string prolog [@property getter]
setProlog void setProlog(string d): Returns or sets the string before the root element. This is, for example, <!DOCTYPE html>\n or similar.
toString string toString(): Returns the document in string form. Please note that anything in piecesAfterRoot is discarded. If you want to add those pieces to the file, loop over them and append them yourself (but remember XML isn't supposed to have anything after the root element).
toPrettyString string toPrettyString(bool insertComments, int indentationLevel, string indentWith): Writes it out with whitespace for easier eyeball debugging.
root Element root;: The root element, like <html>. Most of the methods on Document forward to this object.
piecesBeforeRoot Element[] piecesBeforeRoot;: If these were kept, this is the stuff that appeared before the root element, such as <?xml version ?> declarations and <!DOCTYPE>s.
piecesAfterRoot Element[] piecesAfterRoot;: Stuff after the root. It is only stored in non-strict mode and is not used in toString, but is available in case you want it.
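
The two output functions can be compared with a short sketch like the following, assuming arsd.dom is in the build. All three toPrettyString arguments are passed explicitly to match the signature listed above; the values chosen (no inserted comments, indentation level 0, tab indent) and the helper names compact and pretty are illustrative assumptions, not documented defaults.

```d
import arsd.dom;

/// Hypothetical helper: parses `xml` and returns the compact serialization.
string compact(string xml) {
    auto doc = new XmlDocument(xml, false);
    return doc.toString();
}

/// Hypothetical helper: parses `xml` and returns the indented serialization.
string pretty(string xml) {
    auto doc = new XmlDocument(xml, false);
    // Arguments: insertComments, indentationLevel, indentWith.
    return doc.toPrettyString(false, 0, "\t");
}

void main() {
    import std.stdio : writeln;
    enum sample = `<config><option name="a">1</option></config>`;
    writeln(compact(sample));
    writeln(pretty(sample));
}
```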
loose bool loose;
