A markup language specified using XML looks a lot like HTML; a
document consists of a single root element, which contains
sub-elements, which can in turn contain further sub-elements.
Elements are indicated by tags in the text. Tags are always
inside angle brackets, < and >. Elements can either contain
content, or they can be empty.
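
As a rough sketch of this nesting, the following Python snippet uses the standard library's xml.etree.ElementTree module to parse a small document and list its elements by depth. The hero, name, deed, and monster element names and the outline helper are made up for illustration; they do not come from any real markup language.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one top-level element with nested sub-elements.
doc = "<hero><name>Perseus</name><deed><monster>Medusa</monster></deed></hero>"
root = ET.fromstring(doc)


def outline(element, depth=0):
    """Return the element names in document order, indented by depth."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


tree_outline = outline(root)
print("\n".join(tree_outline))

# Two top-level elements do not form a single document element,
# so the parser rejects the text.
try:
    ET.fromstring("<a/><b/>")
    two_roots_ok = True
except ET.ParseError:
    two_roots_ok = False
print(two_roots_ok)
```
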
An element can contain content between opening and closing
tags, as in <name>Euryale</name>, which is a name
element containing the data "Euryale". This content may be text
data, other XML elements, or a mixture of both.
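
A minimal sketch of how text and mixed content might be read back, again assuming Python's xml.etree.ElementTree. The p element and its sentence are invented for the example. Note that ElementTree stores text that follows a child element in that child's tail attribute, which is a design choice of that library rather than something XML itself dictates.

```python
import xml.etree.ElementTree as ET

# An element whose content is plain text data.
name = ET.fromstring("<name>Euryale</name>")
print(name.tag, name.text)

# Hypothetical mixed content: text data and a sub-element side by side.
mixed = ET.fromstring("<p>Sister of <name>Medusa</name> and Stheno</p>")
leading_text = mixed.text        # text before the first sub-element
child_text = mixed[0].text       # text inside the <name> sub-element
trailing_text = mixed[0].tail    # text after the </name> closing tag

# itertext() walks all text data in document order.
full_text = "".join(mixed.itertext())
print(full_text)
```
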
Elements can also be empty, containing nothing, and are represented as
a single tag ended with a slash. For example, <stop/> is an
empty stop element. Unlike HTML, XML element names are
case-sensitive; stop and Stop are two different
elements.
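
The short sketch below, assuming Python's xml.etree.ElementTree, shows an empty element and the effect of case sensitivity. The enclosing route element is hypothetical and exists only to hold the examples.

```python
import xml.etree.ElementTree as ET

# An empty element has no text and no sub-elements.
stop = ET.fromstring("<stop/>")
print(stop.text, len(stop))

# Hypothetical container holding elements whose names differ only in case.
route = ET.fromstring("<route><stop/><Stop/><stop/></route>")
tags = [child.tag for child in route]
lower_count = len(route.findall("stop"))
upper_count = len(route.findall("Stop"))
print(tags, lower_count, upper_count)

# Because names are case-sensitive, </Stop> does not close <stop>.
try:
    ET.fromstring("<stop></Stop>")
    mismatch_ok = True
except ET.ParseError:
    mismatch_ok = False
print(mismatch_ok)
```
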
Opening tags and empty-element tags can also contain attributes, which specify
values associated with an element. For example, in the XML text
<name lang='greek'>Herakles</name>, the name element
has a lang attribute which has a value of "greek".
In <name lang='latin'>Hercules</name>,
the attribute's value is "latin".
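
One way to read these attribute values, sketched with Python's xml.etree.ElementTree. The variable names and the "era" attribute lookup are illustrative only.

```python
import xml.etree.ElementTree as ET

samples = [
    "<name lang='greek'>Herakles</name>",
    "<name lang='latin'>Hercules</name>",
]

# Collect (attribute value, element content) pairs for each sample.
pairs = []
for text in samples:
    element = ET.fromstring(text)
    pairs.append((element.get("lang"), element.text))
print(pairs)

# All attributes of an element are available as a dictionary.
greek = ET.fromstring(samples[0])
attributes = dict(greek.attrib)

# Asking for an attribute that is not present returns None by default.
missing = greek.get("era")
print(attributes, missing)
```
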
XML also includes entities as a shorthand for including a
particular character or a longer string. Entity references always
begin with a "&" and end with a ";". For example, a
particular Unicode character can be written as &#4660; using
its character code in decimal, or as &#x1234; using
hexadecimal. It's also possible to define your own entities, making
&title; expand to "The Odyssey", for example. If you want to
include the "&" character in XML content, it must be written as
&amp;.
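
The sketch below shows how a parser might resolve these references, assuming Python's xml.etree.ElementTree and xml.sax.saxutils. The c, book, and pair element names and the "Scylla & Charybdis" string are invented for the example, and the title entity is declared in an internal DOCTYPE subset, which is only one possible way of defining it.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Decimal and hexadecimal references to the same Unicode character.
decimal = ET.fromstring("<c>&#4660;</c>").text
hexadecimal = ET.fromstring("<c>&#x1234;</c>").text
print(decimal == hexadecimal)

# A user-defined entity, declared here in the document's internal DOCTYPE.
doc = '<!DOCTYPE book [<!ENTITY title "The Odyssey">]><book>&title;</book>'
title = ET.fromstring(doc).text
print(title)

# &amp; is read back as a literal ampersand.
ampersand = ET.fromstring("<pair>Scylla &amp; Charybdis</pair>").text
print(ampersand)

# A bare ampersand in content is not well-formed and is rejected.
try:
    ET.fromstring("<pair>Scylla & Charybdis</pair>")
    bare_ampersand_ok = True
except ET.ParseError:
    bare_ampersand_ok = False
print(bare_ampersand_ok)

# When generating XML by hand, escape() produces the required form.
escaped = escape("Scylla & Charybdis")
print(escaped)
```
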