I want to remove dangling attributes (attribute names that have no =value part) from HTML elements.
I use the regex re.sub(r'(<[\S]+.*\s)[^=]+[\s]', r'\1', x) to find attributes without an = and drop them:
>>> import re
>>> string_list = ['<tag valid1="o n e" valid2=two some dangling></tag>', '<tag valid1="o n e" valid2=two some dangling/>']
>>> map(lambda x: re.sub(r'(<[\S]+.*\s)[^=]+[\s]', r'\1', x), string_list)
['<tag valid1="o n e" valid2=two dangling></tag>', '<tag valid1="o n e" valid2=two dangling/>']
But this only removes the first dangling attribute (some); the second one (dangling) is left in place. How can I remove all of them?
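One possible direction, offered only as a rough sketch: the greedy .* inside the first group means the pattern matches once per tag, so a single re.sub call drops one token per tag. Re-running the same pattern looks risky, because [^=]+ can also match pieces of a quoted value that contains spaces. The sketch below instead tokenizes the attribute area of each opening tag and keeps only name=value pairs. The names TAG_RE, ATTR_RE and strip_dangling are made up for this illustration, and the patterns assume simple markup (no comments, no > inside unquoted values).

```python
import re

# Opening or self-closing tag: group 1 is the tag name, group 2 is everything
# up to the closing ">", skipping over quoted attribute values.
TAG_RE = re.compile(r'''<(\w[\w:.-]*)((?:"[^"]*"|'[^']*'|[^>"'])*)>''')

# One attribute token: group 1 is the name, group 2 is the optional "=value"
# part (double-quoted, single-quoted, or unquoted).
ATTR_RE = re.compile(r'''([^\s=/]+)(\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'>]+))?''')


def strip_dangling(html):
    """Drop attributes that have no =value part (hypothetical helper)."""
    def fix_tag(match):
        name, attrs = match.group(1), match.group(2).rstrip()
        closing = ''
        if attrs.endswith('/'):
            # Remember the self-closing slash and keep it out of the tokens.
            closing = '/'
            attrs = attrs[:-1]
        # Keep only the tokens whose "=value" group actually matched.
        kept = [m.group(0) for m in ATTR_RE.finditer(attrs) if m.group(2)]
        return '<' + ' '.join([name] + kept) + closing + '>'
    return TAG_RE.sub(fix_tag, html)


string_list = ['<tag valid1="o n e" valid2=two some dangling></tag>',
               '<tag valid1="o n e" valid2=two some dangling/>']
print([strip_dangling(x) for x in string_list])
```

On the two sample strings this should leave only valid1 and valid2 in each tag. For real-world HTML, a tolerant HTML parser is probably a safer choice than regular expressions.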
I want to use ElementTree to parse it, but it only supports XML, which does not allow dangling attributes. That is why I want to do this.
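To make the ElementTree limitation concrete, here is a minimal sketch. The helper name parses_as_xml is hypothetical, and the sample markup is simplified so that the bare attribute is the only difference between the two inputs.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(markup):
    """Return True if ElementTree accepts the markup, False on a parse error."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Every attribute has a quoted value, so this is well-formed XML.
print(parses_as_xml('<tag valid1="one"></tag>'))
# "dangling" has no value, so the XML parser rejects the whole element.
print(parses_as_xml('<tag valid1="one" dangling></tag>'))
```

The first call prints True and the second prints False, which is why the dangling attributes need to be gone before the string reaches ElementTree.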