Escape special characters in HTML pre tags with Python and regular expressions.

This article shows how to use Python and regular expressions to replace the contents of pre tags with escaped HTML.

Let's say you have some HTML like this:

<pre><div>inline test</div></pre>

or this:

<pre>
<p>
    multi
    line
    test
</p>
</pre>

but you want escaped HTML like this:

<pre>&lt;div&gt;inline test&lt;/div&gt;</pre>

<pre>&lt;p&gt;
    multi
    line
    test
&lt;/p&gt;</pre>
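
The escaping step on its own can be done with html.escape from Python's standard library. As a minimal sketch, using the inline example from above as input:

```python
import html

# The inner HTML of the first example, before escaping.
snippet = "<div>inline test</div>"

# html.escape replaces &, < and > with their entities.
escaped = html.escape(snippet)
print(escaped)  # &lt;div&gt;inline test&lt;/div&gt;
```

Note that html.escape also converts double and single quotes by default. If quotes should stay as they are, pass quote=False.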

The following Python code takes a string and replaces the contents of each pre tag with escaped HTML:

import re
import html

text = """<pre><div>inline test</div></pre>

<pre>
<p>
    multi
    line
    test
</p>
</pre>
"""

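# re.DOTALL lets "." match newlines, so the lazy group can span several lines.
# The optional \n on each side drops the newline right after <pre> and right before </pre>.
pattern = r"<pre>\n?(.*?)\n?</pre>"
result = re.sub(pattern, lambda m: "<pre>" + html.escape(m.group(1)) + "</pre>", text, flags=re.DOTALL)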
print(result)
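
The snippet above only matches a bare <pre> tag. If your HTML contains pre tags with attributes or in upper case, a variation like the following might help. This is a sketch, not a full HTML parser: the helper name escape_pre_contents and the attribute handling are my own assumptions, and it will not cope with nested pre tags or attribute values that contain a > character.

```python
import html
import re

# Hypothetical helper, not part of the original snippet.
# Group 1: the opening tag including any attributes, group 2: the contents,
# group 3: the closing tag. \b keeps tags such as <prefix> from matching.
PRE_PATTERN = re.compile(
    r"(<pre\b[^>]*>)\n?(.*?)\n?(</pre>)",
    re.DOTALL | re.IGNORECASE,
)


def escape_pre_contents(text: str) -> str:
    """Escape the contents of every pre element, keeping the tags as written."""

    def replace(match: re.Match) -> str:
        opening, body, closing = match.groups()
        return opening + html.escape(body) + closing

    return PRE_PATTERN.sub(replace, text)


sample = '<pre class="code"><b>bold</b></pre>'
print(escape_pre_contents(sample))
# <pre class="code">&lt;b&gt;bold&lt;/b&gt;</pre>
```

Keep in mind that running the replacement twice on the same text escapes the entities again, so &lt; becomes &amp;lt;. Apply it only to input that has not been escaped yet.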
Written by Loek van den Ouweland on 2021-08-09. Questions regarding this article? You can send them to the address below.