The Computer Education Techniques knowledge base is a service for answering questions, inclusive of the research and validation of the accuracy of information in the public domain. Citation of source documentation and examples are used to provide answers to the questions. Utilization of the information of this service and reliance on the answers, information or other materials received through this web site is done at your own risk.
| Q | I have been informed that there are alternatives to DTDs. What is a schema? |
| A | The W3C XML Schema
recommendation provides a means of specifying formal data typing and
validation of element content in terms of data types, so that document type
designers can provide criteria for checking the data content of elements as
well as the markup itself. Schemas are written in XML Document Syntax, like
XML documents are, avoiding the need for processing software to be able to
read XML Declaration Syntax (used for DTDs).
The term vocabulary is sometimes used to refer to DTDs and Schemas together.
Schemas are aimed at e-commerce, data control, and database-style applications where character data content requires validation, where stricter data control is needed than is possible with DTDs, or where strong data typing is required. They are usually unnecessary for traditional text document publishing applications.
Unlike DTDs, Schemas cannot be specified in an XML Document Type Declaration. They can be specified in a Namespace, where Schema-aware software should pick them up, but this is optional. |
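Because a Schema is itself an XML document, an ordinary XML parser can read it. The sketch below illustrates that point with Python's standard library; the schema and its element names (`quantity`, `unit-price`, `ship-date`) are hypothetical and are not taken from this FAQ. It only inspects the declarations and does not validate anything, since the standard library has no Schema validator.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the W3C XML Schema language itself.
XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema for three invoice fields (illustrative names only).
SCHEMA_TEXT = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="quantity" type="xs:positiveInteger"/>
  <xs:element name="unit-price" type="xs:decimal"/>
  <xs:element name="ship-date" type="xs:date"/>
</xs:schema>
"""


def declared_types(schema_text):
    """Return {element name: declared data type} for top-level declarations."""
    root = ET.fromstring(schema_text)
    return {
        element.get("name"): element.get("type")
        for element in root.findall(f"{{{XSD_NS}}}element")
    }


types = declared_types(SCHEMA_TEXT)
for name, xsd_type in types.items():
    print(f"{name}: {xsd_type}")
```

The data types listed here (`xs:date`, `xs:decimal`, and so on) are the kind of content constraint that a DTD cannot express.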
| Q | What's an XML namespace? |
| A | A namespace is a collection of
element and attribute names identified by a Uniform Resource Identifier
reference. The reference may appear in the root element as a value of the
xmlns attribute.
For example, the namespace reference for an XML document with a root element x might appear like this:
<x xmlns="http://www.company.com/company-schema">
More than one namespace may appear in a single XML document, to allow a name to be used more than once. Each reference can declare a prefix to be used by each name, so the previous example might appear as:
<x xmlns:spc=
which would nominate the namespace for the ‘spc’ prefix:
<spc:name>Mr. Big</spc:name> |
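A minimal sketch of how a parser treats a prefixed name, using Python's standard library. It assumes the `spc` prefix is bound to the same URI as the first example above; that binding is an assumption made for illustration. The unprefixed `name` element is added only to show that the two names do not collide.

```python
import xml.etree.ElementTree as ET

# Assumption: the 'spc' prefix is bound to the URI from the first example.
NS_URI = "http://www.company.com/company-schema"

DOC = f"""\
<x xmlns:spc="{NS_URI}">
  <spc:name>Mr. Big</spc:name>
  <name>unqualified name</name>
</x>
"""

root = ET.fromstring(DOC)

# Look up the prefixed element by mapping the prefix to its URI.
prefixed = root.find("spc:name", {"spc": NS_URI})
# The element with no prefix lives outside that namespace.
plain = root.find("name")

print(prefixed.tag)   # the parser expands the prefix to {URI}name
print(prefixed.text)
print(plain.text)
```

The parser identifies the element by the URI, not by the prefix, so a different prefix bound to the same URI would name the same element.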
| Q | What's my information? DATA or TEXT? |
| A | Some important distinctions
exist between the major classes of XML applications and the way in which
they are used:
Two classes of applications are usually referred to as ‘document’ and ‘data’ applications, and this is reflected in the software, which is usually (but not always) aimed at one class or the other.
There is a third major area, Web Development, whose requirements are often hybrid, and span the features of both document and data applications because they contain partly static descriptive text and partly dynamic data. While in theory it would be possible to use data-class software to write a novel, or document-class software to create invoices, it would probably be severely suboptimal. Because of the nature of the information used by the two classes, data-class applications tend to use Schemas, and document-class applications tend to use DTDs, but there is a considerable degree of overlap. |
| Q | Can XML use non-Latin characters? |
| A | Yes, the XML Specification
explicitly says XML uses ISO 10646, the international standard character
repertoire which covers most known languages.
Unicode is an identical repertoire, and the two standards correspond with each other. The specification states:
Some of the common encodings supported by software include:
|
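As a rough illustration of non-Latin content, the sketch below builds a small document containing Greek and Japanese text, serializes it as UTF-8 and as UTF-16, and parses both back. The element names are hypothetical, and the two encodings are simply the ones chosen for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with Greek and Japanese character data.
root = ET.Element("greeting")
ET.SubElement(root, "el").text = "Καλημέρα"
ET.SubElement(root, "ja").text = "こんにちは"

# Serialize the same tree in two encodings, each with an XML declaration.
utf8_bytes = ET.tostring(root, encoding="utf-8", xml_declaration=True)
utf16_bytes = ET.tostring(root, encoding="utf-16", xml_declaration=True)

# Parse each byte string back and collect the character data.
round_trips = []
for data in (utf8_bytes, utf16_bytes):
    parsed = ET.fromstring(data)
    round_trips.append((parsed.find("el").text, parsed.find("ja").text))

for greek, japanese in round_trips:
    print(greek, japanese)
```

The bytes on disk differ between the two encodings, but the parsed text is the same in both cases.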
| Q | What is a CDATA section in XML? |
| A1 | CDATA stands for Character
DATA. CDATA sections provide the ability to escape blocks of text
containing markup.
CDATA sections take the general form: <![CDATA[....put text containing markup here...]]> |
|
Example: Suppose you want to print out the following line of text: "The left angled bracket '<' and the ampersand '&' must be replaced by their entities &lt; and &amp; respectively".
Written with entities, every '<' and '&' in that line has to be replaced, which gives: "The left angled bracket '&lt;' and the ampersand '&amp;' must be replaced by their entities &amp;lt; and &amp;amp; respectively". By escaping the text using CDATA, this can be simplified to:
<![CDATA["The left angled bracket '<' and the ampersand '&' must be replaced by their entities &lt; and &amp; respectively".]]> |
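A short sketch, using Python's standard library, that compares a CDATA section with the equivalent entity-escaped text. The `note` element and its wording are hypothetical.

```python
import xml.etree.ElementTree as ET

# The same character data written two ways (hypothetical 'note' element).
WITH_CDATA = "<note><![CDATA[Use '<' and '&' freely here.]]></note>"
WITH_ENTITIES = "<note>Use '&lt;' and '&amp;' freely here.</note>"

cdata_text = ET.fromstring(WITH_CDATA).text
entity_text = ET.fromstring(WITH_ENTITIES).text

print(cdata_text)
print(cdata_text == entity_text)
```

Once parsed, the application sees identical text either way; CDATA only changes how the author has to type it.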
| Q | What are the disadvantages of XML in terms of size and performance? |
| A | Despite the advantages, XML
does sometimes cause a significant increase in data size and processing
time. These disadvantages are the result of design decisions and tradeoffs
made by XML's original designers. For example, to make XML fully
internationalized, the designers chose to require Unicode support, which can
increase the memory required for processing and storing information from XML
documents. The designers also chose the robustness of redundant labels in
start and end tags, increasing the amount of space XML requires in disk
storage or the amount of bandwidth for moving it over a network. The most
serious performance risk, however, is one that people do not often worry
about: XML's ability to include external resources.
XML repeats every element and attribute name for every element and attribute instance; in fact, it repeats the element name twice for every nonempty element, once in the start tag and once in the end tag. If a long XML document contains 20,000 nonempty elements named maintenance-entry, the 17-character string maintenance-entry will appear in the document 40,000 times, consuming between 680,000 bytes (at one byte per character) and 2,720,000 bytes (at four bytes per character) of storage space, depending on the character encoding. For loosely structured XML, such as human-readable documents, this overhead is often not a problem, but for highly structured XML, such as a database dump, these repeated names represent a significant overhead.
There is a temptation to use short, cryptic element and attribute names, such as c183, instead of workflow-approval, destroying XML's advantage of transparency. There is also a temptation to reduce the amount of tagging, using whitespace and line ends to delimit some fields. These solutions are not particularly good, but they do show the desperation people face when dealing with enormous XML data files. |
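The storage figures quoted above can be checked with a few lines of arithmetic. This sketch assumes UTF-8 as the one-byte-per-character case and UTF-32 as the four-bytes-per-character case; the answer does not name the encodings, so those two are an assumption.

```python
NAME = "maintenance-entry"
ELEMENTS = 20_000
# Each nonempty element carries the name in its start tag and its end tag.
OCCURRENCES = ELEMENTS * 2


def name_overhead(encoding):
    """Bytes spent on the repeated element name alone, in the given encoding."""
    return OCCURRENCES * len(NAME.encode(encoding))


# 'utf-32-le' is used so that no byte-order mark is counted per occurrence.
low = name_overhead("utf-8")
high = name_overhead("utf-32-le")

print(f"{OCCURRENCES:,} occurrences of a {len(NAME)}-character name")
print(f"UTF-8:  {low:,} bytes")
print(f"UTF-32: {high:,} bytes")
```

These totals count only the name itself; the angle brackets and slashes around it add a few more bytes per tag.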
| Q | What are the return values of XPath expressions? |
| A | They return a value. In XPath 1.0, that value is one of four types: a node-set, a boolean, a number, or a string. |
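A small sketch of evaluating a path expression from Python. Note the limitation: the standard library's ElementTree supports only a subset of XPath, and that subset always yields a list of elements. Numbers and strings are therefore derived in ordinary Python here, whereas a full XPath engine could return them directly. The `orders` document is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical order data used only for this illustration.
DOC = """\
<orders>
  <order id="1" status="open"><total>10.50</total></order>
  <order id="2" status="closed"><total>99.00</total></order>
  <order id="3" status="open"><total>5.25</total></order>
</orders>
"""

root = ET.fromstring(DOC)

# A path with an attribute predicate; the result is a list of elements.
open_orders = root.findall("./order[@status='open']")

# String and number results are computed from the selected elements.
open_ids = [order.get("id") for order in open_orders]
open_total = sum(float(order.findtext("total")) for order in open_orders)

print(open_ids)
print(open_total)
```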
| Q | What is XML? Does my company need to modify our data and applications to use it? |
| A | XML is an open standard; it
is not owned by any one company, unlike proprietary products such as Windows from
Microsoft, WebSphere from IBM, or the Oracle database engine from Oracle
Corporation. The fundamental objective of the XML standard is to enable
generic SGML to be served, received, and processed on the Web in the same
way that HTML currently is. SGML is the Standard Generalized Markup
Language, the international standard for defining descriptions of the
structure of different types of electronic documents. The XML specification
can be downloaded. XML provides the following functionality:
All data on the web will at some point in the not so distant future be XML! |
| Q | I work for a small company and we are in the process of building our E-Commerce web site. Should I use XML instead of HTML? Needless to say, I don't have a huge budget. What do you suggest for XML training? |
| A | When designing web content, use XML. At the most fundamental level, XML is a format for storing data, while HTML is a means of displaying data. XML is superior to HTML for the following reasons:
FYI, our HTML courses provide a demonstration and overview of XML. And our XML courses teach how to convert and retrofit HTML to XML. |
| Q | Are any parts of an XML document case-sensitive? |
| A | All of an XML document is case-sensitive, both markup and text. This is significantly different from HTML and most other SGML applications. |
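A brief sketch of what case sensitivity means in practice, using Python's standard library. The `memo`, `Title`, and `title` names are hypothetical.

```python
import xml.etree.ElementTree as ET

# 'Title' and 'title' are two different element names.
root = ET.fromstring("<memo><Title>Upper</Title><title>lower</title></memo>")
upper = root.findtext("Title")
lower = root.findtext("title")
print(upper, lower)

# A start tag and end tag that differ only in case do not match,
# so the parser rejects the document.
try:
    ET.fromstring("<Title>oops</title>")
    mismatch_accepted = True
except ET.ParseError:
    mismatch_accepted = False
print(mismatch_accepted)
```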
| Q | What is a DTD? |
| A | DTD stands for Document Type Definition. It is a formal description in XML Declaration Syntax of a particular type of document. It defines what names are to be used for the different types of elements, where they may occur, and how they all fit together. |
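To make this concrete, here is a sketch of a document that carries a small DTD in its Document Type Declaration, parsed with Python's standard library. The `memo` document type is hypothetical. The standard-library parser reads the declaration but does not validate the document against it, so this only shows where a DTD sits and what it declares.

```python
from xml.dom import minidom

# Hypothetical 'memo' document type: a memo holds one 'to' followed by
# one 'body', and both contain plain character data.
DOC = """\
<?xml version="1.0"?>
<!DOCTYPE memo [
  <!ELEMENT memo (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<memo><to>Staff</to><body>Meeting at noon.</body></memo>
"""

doc = minidom.parseString(DOC)

# The declared document type name should match the root element name.
doctype_name = doc.doctype.name
root_name = doc.documentElement.tagName

print(doctype_name, root_name)
```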
| Q | Is it possible to get XML data into or out of a database, and how hard is it to implement? |
| A | Yes. The major database vendors provide XML import and export modules. |
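Where no vendor module is at hand, a hand-rolled export and import is not much code. The sketch below uses Python's built-in `sqlite3` and `xml.etree.ElementTree`; the `product` table, its columns, and the `row` element name are all hypothetical, and a real vendor module would likely use a different XML layout.

```python
import sqlite3
import xml.etree.ElementTree as ET


def export_table(conn, table, columns):
    """Serialize every row of a table as <table><row><col>..</col></row></table>."""
    root = ET.Element(table)
    # Table and column names are trusted identifiers here, not user input.
    query = f"SELECT {', '.join(columns)} FROM {table}"
    for row in conn.execute(query):
        record = ET.SubElement(root, "row")
        for column, value in zip(columns, row):
            ET.SubElement(record, column).text = str(value)
    return ET.tostring(root, encoding="unicode")


def import_table(conn, xml_text, columns):
    """Insert each <row> of the XML into the table named by the root element."""
    root = ET.fromstring(xml_text)
    placeholders = ", ".join("?" for _ in columns)
    rows = [
        tuple(record.findtext(column) for column in columns)
        for record in root.findall("row")
    ]
    conn.executemany(f"INSERT INTO {root.tag} VALUES ({placeholders})", rows)


# Export from one in-memory database and import into another.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE product (sku TEXT, price REAL)")
source.executemany(
    "INSERT INTO product VALUES (?, ?)", [("A-100", 9.5), ("B-200", 12.0)]
)
xml_text = export_table(source, "product", ["sku", "price"])

target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE product (sku TEXT, price REAL)")
import_table(target, xml_text, ["sku", "price"])
rows = target.execute("SELECT sku, price FROM product ORDER BY sku").fetchall()

print(xml_text)
print(rows)
```

The values travel through the XML as text; in this sketch the `REAL` column type in SQLite converts the price strings back to numbers on insert.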
| Q | I have both heard and read about "well-formed" documents. What are they? |
| A | A well-formed XML document follows the syntax rules of the XML specification, which makes it
easy for a program to read, and ready for network delivery. The following characteristics are typical of a well-formed document:
|
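One practical way to test for well-formedness is to hand the text to a parser and see whether it is accepted. The sketch below does that with Python's standard library. The sample documents are hypothetical, and the three failures shown (overlapping tags, an unquoted attribute value, and more than one root element) are examples chosen from the XML specification's rules for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Hypothetical samples: one valid, three breaking a well-formedness rule.
samples = {
    "single root, properly nested": "<a><b>text</b></a>",
    "overlapping tags": "<a><b>text</a></b>",
    "unquoted attribute value": "<a id=1/>",
    "two root elements": "<a/><b/>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'not well-formed'}")
```

Well-formedness is a syntax check only; it says nothing about whether the document conforms to a particular DTD or Schema.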