SAX2 supports XML namespaces. If namespace processing is active, parsers won't call startElement(), but instead will call a method named startElementNS(). The default of this setting varies from parser to parser, so you should always set it to a safe value (unless your handler supports both namespace-aware and -unaware processing).
For example, our FindIssue content handler described in the previous section doesn't implement the namespace-aware methods, so we should request that namespace processing be deactivated before beginning to parse XML:
    from xml.sax import make_parser
    from xml.sax.handler import feature_namespaces

    # Create a parser
    parser = make_parser()

    # Disable namespace processing
    parser.setFeature(feature_namespaces, 0)
The second argument to setFeature() is the desired state of the feature, most commonly a Boolean. You would call parser.setFeature(feature_namespaces, 1) to enable namespace processing.
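As a minimal sketch of toggling the feature, the snippet below sets it in both directions and reads it back with getFeature(). It assumes the default parser returned by make_parser() (normally the expat-based one). The variable names are illustrative only, and the result is wrapped in bool() because a parser may hand back whatever value was stored.

```python
from xml.sax import make_parser
from xml.sax.handler import feature_namespaces

parser = make_parser()

# Explicitly turn namespace processing on, then read the setting back.
parser.setFeature(feature_namespaces, True)
namespaces_on = bool(parser.getFeature(feature_namespaces))

# Turn it off again, as a namespace-unaware handler would require.
parser.setFeature(feature_namespaces, False)
namespaces_off = bool(parser.getFeature(feature_namespaces))

print(namespaces_on, namespaces_off)
```

Features have to be set before parsing starts, so it is safest to do this right after creating the parser.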
Namespaces in XML work by first defining a namespace prefix that maps to a given URI, which is chosen by the relevant specification, and then using that prefix to mark elements and attributes that come from that vocabulary. For example, the XLink specification says that the namespace URI is "http://www.w3.org/1999/xlink". The following XML snippet includes some XLink attributes:
    <root xmlns:xlink="http://www.w3.org/1999/xlink">
      <elem xlink:href="http://www.python.org" />
    </root>
The xmlns:xlink attribute on the root element declares that the prefix "xlink" maps to the given URI. The elem element therefore has one attribute named href that comes from the XLink namespace. Namespace-aware methods expect (URI, name) tuples instead of just element and attribute names; instead of "xlink:href", they would receive ('http://www.w3.org/1999/xlink', 'href').
Note that the actual value of the prefix is immaterial, and software shouldn't make assumptions about it. The XML document would have exactly the same meaning if the root element said xmlns:pref1="http://..." and the attribute name was given as "pref1:href".
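To illustrate that the prefix doesn't matter, here is a small sketch that parses the same document twice, once with the "xlink" prefix and once with "pref1", and collects the attribute names seen by a namespace-aware handler. The AttrNameCollector class and the attribute_names() helper are hypothetical names invented for this example, and the default expat-based parser is assumed.

```python
import io
from xml.sax import make_parser
from xml.sax.handler import ContentHandler, feature_namespaces

XLINK = "http://www.w3.org/1999/xlink"

class AttrNameCollector(ContentHandler):
    """Hypothetical helper: records namespace-aware attribute names."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElementNS(self, name, qname, attrs):
        # With namespace processing on, the keys are (URI, name) pairs.
        self.names.extend(attrs.keys())

def attribute_names(document):
    """Parse a bytes document and return every attribute name seen."""
    parser = make_parser()
    parser.setFeature(feature_namespaces, True)
    handler = AttrNameCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(document))
    return handler.names

# Same document, differing only in the prefix chosen for the XLink URI.
DOC_XLINK = (b'<root xmlns:xlink="http://www.w3.org/1999/xlink">'
             b'<elem xlink:href="http://www.python.org" /></root>')
DOC_PREF1 = (b'<root xmlns:pref1="http://www.w3.org/1999/xlink">'
             b'<elem pref1:href="http://www.python.org" /></root>')

names_xlink = attribute_names(DOC_XLINK)
names_pref1 = attribute_names(DOC_PREF1)

print(names_xlink)
print(names_xlink == names_pref1)
```

Both documents yield the same (URI, name) pair for the href attribute, which is why handlers should compare URIs rather than prefixes.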
If namespace processing is turned on, you have to write startElementNS() and endElementNS() methods that look like this:
    def startElementNS(self, name, qname, attrs):
        # name is a (uri, localname) tuple
        uri, localname = name
        ...

    def endElementNS(self, name, qname):
        uri, localname = name
        ...
The first argument is a 2-tuple containing the URI and the name of the element within that namespace. qname is a string containing the original qualified name of the element, such as "xlink:a", and attrs is a dictionary-like object holding the attributes. The keys of this dictionary will be (URI, attribute_name) pairs. If no namespace is specified for an element or attribute, the URI will be given as None.
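The following sketch puts these pieces together: a namespace-aware handler that records each start and end event while parsing the XLink snippet shown earlier. The NamespaceLogger class and its events list are hypothetical names chosen for illustration. The handler ignores qname, since its value can differ between parsers, and the default expat-based parser is assumed.

```python
import io
from xml.sax import make_parser
from xml.sax.handler import ContentHandler, feature_namespaces

class NamespaceLogger(ContentHandler):
    """Hypothetical handler that records namespace-aware events."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElementNS(self, name, qname, attrs):
        uri, localname = name
        # Copy the attributes into a plain dict keyed by (URI, name).
        self.events.append(("start", uri, localname, dict(attrs.items())))

    def endElementNS(self, name, qname):
        uri, localname = name
        self.events.append(("end", uri, localname))

DOCUMENT = b"""<root xmlns:xlink="http://www.w3.org/1999/xlink">
  <elem xlink:href="http://www.python.org" />
</root>"""

parser = make_parser()
parser.setFeature(feature_namespaces, True)
handler = NamespaceLogger()
parser.setContentHandler(handler)
parser.parse(io.BytesIO(DOCUMENT))

for event in handler.events:
    print(event)
```

Here root and elem carry no namespace of their own, so their URI is None, while the href attribute is keyed by the XLink URI.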