This tutorial explains how to escape special characters in XML. An XML document is made up of the following components:

  • Tags
  • Text
  • Attributes
  • CDATA
  • Comments

Learn which characters must be escaped in each of these components.

Why are XML Escape Characters required?

Let’s go through some examples that show why escaping is required in XML.

  • Tag content that contains special characters
<result>
     example <Text> with "special" characters content
</result>

In the above example, the content of the result tag contains the characters <, > and ". A processor parsing this XML assumes that <Text> is a start tag and expects a matching end tag. It does not find one, so parsing fails. The double quotes are harmless in text content, but they cause the same kind of ambiguity inside attribute values.

These characters need special handling: they must be escaped.
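The failure described above can be reproduced with an XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed is made up for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# <Text> is read as a start tag that is never closed, so parsing fails.
unescaped = '<result>example <Text> with "special" characters content</result>'

# With < and > escaped, the same content is plain text.
escaped = '<result>example &lt;Text&gt; with "special" characters content</result>'

print(is_well_formed(unescaped))  # False
print(is_well_formed(escaped))    # True
```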

  • XML attributes contain quotation marks

For example, an attribute value can contain both single and double quotes.

<user message='sales' and "hr" department'>
    Some text content.
</user>

Here, the message attribute is meant to hold the value sales' and "hr" department, which contains both single and double quotes.

An attribute value is enclosed in single or double quotes, and the processor treats everything up to the next matching quote as the value. In this example the value ends at the quote after sales, so the processor cannot tell where the intended value ends and reports an error.

To avoid these processing errors, we need to escape some characters.
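As a quick check of the attribute case, the sketch below feeds the broken user element to Python's standard xml.etree.ElementTree parser. The variable names are only for illustration.

```python
import xml.etree.ElementTree as ET

# The value is cut off at the quote after "sales"; the rest is not valid markup.
broken = """<user message='sales' and "hr" department'>Some text content.</user>"""

try:
    ET.fromstring(broken)
    parsed = True
except ET.ParseError:
    parsed = False

print(parsed)  # False
```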

Which characters need to be escaped in XML documents?

The following characters need to be escaped:

Name              Character   Escape sequence
Double quotation  "           &quot;
Single quotation  '           &apos;
Less than         <           &lt;
Greater than      >           &gt;
Ampersand         &           &amp;
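The table can be applied programmatically. The following is a minimal sketch using escape() and unescape() from Python's standard xml.sax.saxutils module. The names QUOTES and UNQUOTES and the sample text are assumptions made for this example.

```python
from xml.sax.saxutils import escape, unescape

# escape() always handles &, < and >; the two quote characters are passed
# in explicitly so that all five rows of the table are covered.
QUOTES = {'"': "&quot;", "'": "&apos;"}
UNQUOTES = {"&quot;": '"', "&apos;": "'"}

text = """<a href="x">Tom & Jerry's</a>"""
escaped = escape(text, QUOTES)
print(escaped)
# &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&apos;s&lt;/a&gt;

# unescape() reverses the mapping.
restored = unescape(escaped, UNQUOTES)
print(restored == text)  # True
```

Note that & is replaced first, so the ampersands introduced by the other escape sequences are not escaped a second time.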

Whether a character has to be escaped depends on where it appears in the document.

Let’s see some examples where escaping is required and where it is not.

XML Escape Characters Examples

The examples below cover text content, attributes, CDATA sections, and comments.

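  • Escape Characters in XML Text Content

    • The character < must be escaped as &lt;. Otherwise, the processor assumes it is the start of a tag such as <users/>.
    • Always replace & with &amp;, except where the & already begins an entity reference such as &lt;.
    • The remaining characters (", ', >) do not need to be escaped in text content. Escaping them is optional. The one exception is the sequence ]]>, where > must be written as &gt;.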
<result>
     example &lt;Text&gt; with &quot;special&quot; characters content
</result>
<errors>
    <!-- Valid text content: escaping these characters is optional -->
"'>
</errors>

In the above example, the characters <, > and " are replaced with the escape sequences &lt;, &gt; and &quot;.

The characters ", ' and > do not need to be escaped in text content.

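The text content rules can be seen in how a serializer behaves. The sketch below uses Python's standard xml.etree.ElementTree to write an element whose text contains special characters. It shows one library's behavior and is not the only valid way to escape text.

```python
import xml.etree.ElementTree as ET

result = ET.Element("result")
result.text = 'example <Text> with "special" characters content'

# The serializer escapes <, > and & in text; quotes are left as they are.
xml_text = ET.tostring(result, encoding="unicode")
print(xml_text)
# <result>example &lt;Text&gt; with "special" characters content</result>

# Parsing the output gives back the original text.
print(ET.fromstring(xml_text).text == result.text)  # True
```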
  • Escape Characters in XML attributes

In the following cases, escaping is not required.

  • If the attribute value is enclosed in double quotes, single quotes inside the value are valid.
<user message="'"/> <!-- Valid: the single quote does not need to be escaped -->
  • If the attribute value is enclosed in single quotes, double quotes inside the value are valid.
<user message='"'/> <!-- Valid: the double quote does not need to be escaped -->
  • Escaping is not required for the character >.
<user message='>'/> <!-- Valid: greater than does not need to be escaped -->
  • In the remaining cases, escape the quote character that matches the enclosing quotes, using &apos; or &quot;. The characters < and & must always be escaped in attribute values.

In the example below, the value is enclosed in double quotes, so the double quotes inside it are replaced with &quot; and the single quote is left unescaped.

<user message="sales' and &quot;hr&quot; department">
    text example
</user>
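The attribute rules above can be sketched with quoteattr() from Python's standard xml.sax.saxutils module, which picks the enclosing quote style and escapes only what is needed. The variable names here are only for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

value = """sales' and "hr" department"""

# quoteattr() returns the value already wrapped in quotes.
print(quoteattr("'"))    # "'"
print(quoteattr('"'))    # '"'
print(quoteattr(value))  # "sales' and &quot;hr&quot; department"

# The quoted value can be placed directly after the attribute name.
xml_text = "<user message=%s>text example</user>" % quoteattr(value)
user = ET.fromstring(xml_text)
print(user.attrib["message"] == value)  # True
```

When the value contains both kinds of quotes, quoteattr() encloses it in double quotes and escapes the inner double quotes, leaving the single quote as it is.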
  • CDATA content Escape characters

Characters inside a CDATA section do not need to be escaped. The only sequence that cannot appear inside a CDATA section is ]]>, because it marks the end of the section.

<?xml version="1.0"?>
<user>
<![CDATA[ Characters not required to escape"'<>&]]>
</user>
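To confirm that a parser returns CDATA content unchanged, here is a small sketch with Python's standard xml.etree.ElementTree. The helper to_cdata is hypothetical and shows one common way to handle text that itself contains ]]>, by splitting it across two CDATA sections.

```python
import xml.etree.ElementTree as ET

xml_text = """<user><![CDATA[ Characters not required to escape"'<>&]]></user>"""
user = ET.fromstring(xml_text)
print(user.text)  # the CDATA content, with all five characters intact


def to_cdata(text):
    """Hypothetical helper: wrap text in CDATA, splitting any ]]> it contains."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


tricky = "a ]]> b"
wrapped = "<user>" + to_cdata(tricky) + "</user>"
print(ET.fromstring(wrapped).text == tricky)  # True
```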
  • Comments

None of these five characters needs to be escaped in comments.
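The comment rule can be checked in the same way. The sketch below uses Python's standard xml.etree.ElementTree. It also shows the one restriction that does apply to comments: a double hyphen (--) is not allowed inside them.

```python
import xml.etree.ElementTree as ET

# The five characters appear unescaped inside the comment.
xml_text = """<user><!-- "'<>& -->text</user>"""
user = ET.fromstring(xml_text)  # parses without error
print(user.text)  # text

# A double hyphen inside a comment is rejected by the parser.
try:
    ET.fromstring("<user><!-- a -- b --></user>")
    double_hyphen_ok = True
except ET.ParseError:
    double_hyphen_ok = False
print(double_hyphen_ok)  # False
```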