Extensible Markup Language (XML)
A meta-language that describes the content of the document
Provides a portable method for encapsulating and describing data
Defines markup languages
Java = portable programs so XML =
portable data
Tag or Grammar of the language
XML does not specify the tag or grammar of the language
Tag
Markup tags that have meaning to a language processor
Grammar
Defines correct usage of a language’s tags
Tags in XML vs Tags in HTML
Tags in XML are defined by the author while tags in HTML are predefined by the W3C standard
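As a small illustration of author-defined tags, the sketch below parses a document whose tag names (recipe, title, servings) are invented for this example. The parser accepts them without any predefined vocabulary.

```python
import xml.etree.ElementTree as ET

# The tag names here are hypothetical; XML itself assigns them no meaning.
doc = "<recipe><title>Pancakes</title><servings>4</servings></recipe>"

root = ET.fromstring(doc)

# Collect the root tag followed by the tags of its direct children.
tags = [root.tag] + [child.tag for child in root]
print(tags)
```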
XML Components
Prolog
Components of the document
Prolog
Defines the XML version, entity definitions, and DOCTYPE
Components of the document
Tags and attributes
CDATA (character data)
Entities
Processing instructions
Comments
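The components listed above can be seen together in one small document. This is a minimal sketch using Python's standard xml.dom.minidom; the note document, the author entity, and the render processing instruction are all made up for illustration.

```python
from xml.dom import Node, minidom

# Prolog: XML declaration plus a DOCTYPE with one entity definition.
# Body: a processing instruction, a comment, tags, an attribute,
# an entity reference, and a CDATA section.
doc = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note [
  <!ENTITY author "J. Doe">
]>
<?render mode="compact"?>
<!-- a comment -->
<note priority="high">
  <from>&author;</from>
  <body><![CDATA[Use <b> & </b> literally here]]></body>
</note>"""

dom = minidom.parseString(doc)

kinds = {
    Node.DOCUMENT_TYPE_NODE: "DOCTYPE",
    Node.PROCESSING_INSTRUCTION_NODE: "processing instruction",
    Node.COMMENT_NODE: "comment",
    Node.ELEMENT_NODE: "element",
}
# Top-level nodes of the document, in document order.
top_level = [kinds[n.nodeType] for n in dom.childNodes]

note = dom.documentElement
priority = note.getAttribute("priority")

# CDATA content is delivered verbatim, with no escaping needed.
body = note.getElementsByTagName("body")[0]
cdata_text = "".join(n.data for n in body.childNodes)

print(top_level, priority, cdata_text)
```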
XML Elements
XML tag with encapsulated data
Element content must be character data in the document's character encoding; binary data is not allowed
Tag name rules:
Case sensitive
Start with a letter or underscore
After the first character, digits, hyphens (-), and periods (.) are also allowed
Cannot contain whitespace
Avoid colons except to indicate namespaces (discussed later)
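The naming rules above can be approximated with a regular expression. This is a simplified, ASCII-only sketch: the XML specification also permits many non-ASCII letters, and the function name is_simple_xml_name is hypothetical. Colons are rejected here to follow the advice about namespaces.

```python
import re

# First character: letter or underscore.
# Later characters: letters, digits, underscore, period, or hyphen.
# ASCII-only approximation; real XML names allow more characters.
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")

def is_simple_xml_name(name: str) -> bool:
    """Return True if name satisfies the simplified rules above."""
    return NAME_RE.fullmatch(name) is not None

for candidate in ["item", "_id", "item-1.2", "1item", "my item", "ns:item"]:
    print(candidate, is_simple_xml_name(candidate))
```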
XML Element Attributes
Provide metadata for the element
Attribute names must adhere to the same rules as element names
For every attribute, there must be a value, even if the value is an empty string
No duplicate attributes within a single element
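A quick way to see the attribute rules in action is to feed a parser three variants of the same element. This sketch uses Python's xml.etree.ElementTree; the book element and isbn attribute are invented for the example.

```python
import xml.etree.ElementTree as ET

def attributes_of(xml_text):
    """Return the attribute dict, or None if the parser rejects the text."""
    try:
        return ET.fromstring(xml_text).attrib
    except ET.ParseError:
        return None

empty_value = attributes_of('<book isbn=""/>')           # empty string is allowed
missing_value = attributes_of('<book isbn/>')            # no value: rejected
duplicate = attributes_of('<book isbn="1" isbn="2"/>')   # duplicate: rejected

print(empty_value, missing_value, duplicate)
```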
Document Entities
Entities refer to a data item, typically text
Entities are user definable
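As a sketch of a user-defined entity, the document below declares an entity named company (a made-up name) in its DOCTYPE and references it in the text. Python's xml.etree.ElementTree expands entities declared in the internal subset.

```python
import xml.etree.ElementTree as ET

# The entity "company" is hypothetical and defined by the document itself.
doc = """<!DOCTYPE memo [
  <!ENTITY company "Example Corp">
]>
<memo>Welcome to &company;!</memo>"""

root = ET.fromstring(doc)
text = root.text  # the reference &company; is replaced by its definition
print(text)
```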
Well-formed vs Valid XML Documents
An XML document is well-formed if it follows the basic XML syntax rules.
An XML document is valid if it is well-formed and its structure conforms to a Document Type Definition (DTD) or XML Schema.
Commonly violated XML syntax rules:
Element and attribute names must be legal XML names
The characters < and & must be escaped as entity references (&lt; and &amp;) when used in text
Every element must be closed
Attributes must have values and values must be delimited with quotation marks
Every element except the root element must be the child of exactly one element
Comments must be properly formed; in particular, a comment may not contain the string “--”
Two solutions for validating XML documents
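Several of the rules above can be checked by handing small samples to a parser. This is a minimal sketch with Python's xml.etree.ElementTree; it tests well-formedness only, not validity, and the helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True

samples = {
    "escaped ampersand": "<p>fish &amp; chips</p>",
    "raw ampersand": "<p>fish & chips</p>",
    "unclosed element": "<p><b>bold</p>",
    "unquoted attribute value": "<p id=1>text</p>",
    "two root elements": "<a/><b/>",
    "double hyphen in comment": "<p><!-- a -- b --></p>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {ok}")
```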
Document Type Definition (DTD)
XML Schema
Document Type Definition (DTD)
A sequence of declarations, either enclosed in a DOCTYPE declaration or stored in a separate file and referenced from a DOCTYPE declaration
Defines the structure of the document
What DTD defines
Allowable tags and their attributes
Attribute value constraints
Nesting of tags
Number of occurrences for tags
Entity definitions
Formal Public Identifier (FPI) has four parts:
Connection of DTD to a formal standard
Group responsible for the DTD
Description and type of document
Language used in the DTD
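The four parts of an FPI are separated by double slashes, so they can be pulled apart with a simple split. The sketch below uses the well-known XHTML 1.0 Strict identifier as sample input; the variable names are chosen for this example.

```python
fpi = "-//W3C//DTD XHTML 1.0 Strict//EN"

# Part 1: "-" means the DTD is not tied to a formal (registered) standard.
# Part 2: the group responsible for the DTD.
# Part 3: the type and description of the document.
# Part 4: the language used in the DTD.
registration, owner, description, language = fpi.split("//")

print(registration, owner, description, language, sep=" | ")
```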
Types of Elements
ANY
EMPTY
PCDATA
elements
mixed
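To make the five content types concrete, the sketch below declares one element of each kind in an internal DTD and asks Python's xml.parsers.expat to report each declaration. The library, book, and related element names are invented, and the classify helper is a hypothetical mapping from expat's content-model codes to the labels used above. Expat reads the declarations but does not validate the document against them.

```python
import xml.parsers.expat
from xml.parsers.expat import model as M

doc = """<!DOCTYPE library [
  <!ELEMENT library (book+)>
  <!ELEMENT book (title, note?, cover, extra)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT note (#PCDATA | em)*>
  <!ELEMENT em (#PCDATA)>
  <!ELEMENT cover EMPTY>
  <!ELEMENT extra ANY>
]>
<library><book><title>T</title><cover/><extra/></book></library>"""

def classify(content_model):
    """Map an expat content model tuple to one of the five labels."""
    ctype, _quant, _name, children = content_model
    if ctype == M.XML_CTYPE_ANY:
        return "ANY"
    if ctype == M.XML_CTYPE_EMPTY:
        return "EMPTY"
    if ctype == M.XML_CTYPE_MIXED:
        # Text only when no child element names are listed.
        return "mixed" if children else "PCDATA"
    return "elements"

declared = {}

def on_element_decl(name, content_model):
    declared[name] = classify(content_model)

parser = xml.parsers.expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.Parse(doc, True)

print(declared)
```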
Attribute Modifiers
#IMPLIED
#REQUIRED
#FIXED
Default Value
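The four attribute modifiers can be illustrated with one ATTLIST declaration. This sketch again uses Python's xml.parsers.expat, which reports the default value and a required flag for each declared attribute; the book element and its attributes are made up for the example.

```python
import xml.parsers.expat

doc = """<!DOCTYPE book [
  <!ELEMENT book EMPTY>
  <!ATTLIST book
    isbn     CDATA #REQUIRED
    subtitle CDATA #IMPLIED
    format   CDATA #FIXED "print"
    lang     CDATA "en">
]>
<book isbn="123"/>"""

attrs = {}

def on_attlist_decl(elname, attname, type_, default, required):
    # default is None when the declaration supplies no value;
    # required is true for #REQUIRED and #FIXED attributes.
    attrs[attname] = (default, bool(required))

parser = xml.parsers.expat.ParserCreate()
parser.AttlistDeclHandler = on_attlist_decl
parser.Parse(doc, True)

for name, (default, required) in attrs.items():
    print(name, default, required)
```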
Limitations of DTD
DTD itself is not in XML format
Does not express data types
Cannot constrain the format in which data appears
No namespace support