
The drop_tree() method is exactly what you need:

Drops the element and all its children. Unlike el.getparent().remove(el) this does not remove the tail text; with drop_tree the tail text is merged with the previous element.

Find all br elements inside pre, prepend \n to each one's tail text and drop the element:

from lxml import etree
import lxml.html

text = """
<div>
    <pre>
        <br>
        test
        <br>
    </pre>
    <br>
</div>
"""

root = lxml.html.fromstring(text)
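for action, el in etree.iterwalk(root):
    if el.tag == 'pre':
        for br in el.xpath('br'):
            # tail is None when no text follows the <br>, so fall back to ''
            br.tail = '\n' + (br.tail or '')
            # drop_tree() removes the <br> but merges its tail into the preceding text
            br.drop_tree()

print(etree.tostring(root, encoding='unicode'))

Prints:
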
<div>
    <pre>


        test


    </pre>
    <br/>
</div>
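
To make the tail-text difference quoted above concrete, here is a minimal sketch that strips br elements both ways and compares the remaining text. The helper names strip_with_remove and strip_with_drop_tree and the sample markup are made up for illustration; only fromstring(), xpath(), getparent().remove(), drop_tree() and text_content() come from lxml itself.

```python
import lxml.html

def strip_with_remove(html):
    # Hypothetical helper: remove() takes the element's tail text with it.
    root = lxml.html.fromstring(html)
    for br in root.xpath('.//br'):
        br.getparent().remove(br)
    return root.text_content()

def strip_with_drop_tree(html):
    # Hypothetical helper: drop_tree() keeps the tail and merges it
    # into the preceding text.
    root = lxml.html.fromstring(html)
    for br in root.xpath('.//br'):
        br.drop_tree()
    return root.text_content()

sample = '<p>one<br>two</p>'
print(strip_with_remove(sample))     # one
print(strip_with_drop_tree(sample))  # onetwo
```

With remove() the text after the br is lost, which is why the answer above uses drop_tree() once the newline has been put into the tail.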

python - How to replace an HTML tag with text inside an lxml iterwalk ...


I understand that you have lxml as a requirement, but parsing and modifying HTML with BeautifulSoup is much easier and more pleasant. If speed really matters here, you can use lxml as the underlying parser:

from bs4 import BeautifulSoup

text = """
<div>
    <pre>
        <br>
        test
        <br>
    </pre>
    <br>
</div>
"""

soup = BeautifulSoup(text, "lxml")
for pre in soup.find_all('pre'):
    for br in pre.find_all('br'):
        br.replace_with('\n')

print(soup.prettify())

Prints:

<html>
 <body>
  <div>
   <pre>


        test


    </pre>
   <br/>
  </div>
 </body>
</html>


I don't think you want to just change the text node of the element. I think what you want is to modify the text node of your Element, add a SubElement named br to your lxml_element, and then set the tail attribute of that subelement to the second part of the string you are parsing. I found the tutorial at http://lxml.de/tutorial.html#the-element-class very useful.
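
A minimal sketch of those steps, assuming the text should be split at some marker string: the function name split_text_with_br and the '|' separator are hypothetical, chosen only for the example. It uses insert(0, ...) rather than SubElement so the br lands directly after the text even if the element already has children; for a childless element the two are equivalent.

```python
from lxml import etree

def split_text_with_br(element, separator):
    """Replace the first `separator` in element.text with a <br> child."""
    text = element.text or ''
    before, found, after = text.partition(separator)
    if not found:
        return element            # nothing to split, leave the element alone
    element.text = before         # first part stays as the element's text
    br = etree.Element('br')
    br.tail = after               # second part becomes the tail of the <br>
    element.insert(0, br)         # place the <br> right after element.text
    return element

p = etree.Element('p')
p.text = 'first line|second line'
split_text_with_br(p, '|')
print(etree.tostring(p, encoding='unicode'))
# <p>first line<br/>second line</p>
```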

python - Replace text with HTML tag in LXML text element - Stack Overf...
