I'm trying to replace all HTML character codes in my HTML file using a for loop (not sure if this is the easiest approach) without changing the formatting of the original file. When I run the code below, the codes are not replaced. Does anyone know what could be wrong?
    import re

    tex=open('ALICE.per-txt.txt', 'r')
    tex=tex.read()
    for i in tex:
        if i =='&otilde;':
            i=='õ'
        elif i == '&ccedil;':
            i=='ç'
    with open('Alice1.replaced.txt', "w") as f:
        f.write(tex)
        f.close()
You can use html.unescape from the standard library:
    >>> import html
    >>> html.unescape('&otilde;')
    'õ'
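As a slightly broader illustration (the sample strings are made up for this demo, not taken from the question's file), html.unescape should handle named, decimal and hexadecimal character references in one pass while leaving all other text, including line breaks and indentation, untouched:

```python
import html

# Sample strings are illustrative only.
named = html.unescape('&otilde;')        # named reference
decimal = html.unescape('&#245;')        # decimal numeric reference
hexadecimal = html.unescape('&#xf5;')    # hexadecimal numeric reference

# Multi-line text: only the references change, the layout stays as it was.
sample = 'cora&ccedil;&atilde;o\n  n&atilde;o'
converted = html.unescape(sample)

print(named, decimal, hexadecimal)
print(converted)
```

Because the function only touches the references themselves, the formatting of the original file is preserved.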
With your code:
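    import html

    # Read the whole file; the with block closes it automatically.
    with open('ALICE.per-txt.txt', 'r') as f:
        html_text = f.read()

    # Convert every HTML character reference to its character.
    html_text = html.unescape(html_text)

    # Note: this writes the result back to the same file.
    with open('ALICE.per-txt.txt', 'w') as f:
        f.write(html_text)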
Please note that I opened the files with a with statement. This takes care of closing the file after the with block, something you forgot to do when reading the file.
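As for why the loop in the question changes nothing: `for i in tex` visits one character at a time, so `i` can never equal a multi-character code, and `i=='õ'` is a comparison whose result is thrown away. Strings are also immutable, so even an assignment to `i` would not alter `tex`. If you only want to replace a hand-picked set of codes, a minimal sketch with str.replace could look like this (the helper name `replace_codes` and the mapping are hypothetical, chosen for illustration):

```python
# Hypothetical mapping; extend it with whichever codes you need.
REPLACEMENTS = {
    '&otilde;': 'õ',
    '&ccedil;': 'ç',
}


def replace_codes(text, replacements=REPLACEMENTS):
    """Return a copy of text with each listed code replaced."""
    for code, char in replacements.items():
        # str.replace returns a new string, so the result must be reassigned.
        text = text.replace(code, char)
    return text


# Codes missing from the mapping (here &uacute;) are left as they are.
print(replace_codes('p&otilde;e a&ccedil;&uacute;car'))
```

Since html.unescape already knows every named and numeric reference, it is usually the simpler choice; the mapping approach mainly makes sense when some codes must be left alone.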
Answered By – Matthias
Answer Checked By – Marie Seifert (AngularFixing Admin)