
Is it possible in Python to remove a single line from a string variable without concatenating the two chunks around it? I want to remove the cachedLength line.

input_columns = f'''\
<inputColumn 
    refId="Package\DFT\DST.Inputs[DST Input].Columns[{tbl}]"
    cachedDataType="{dt_type}"
    cachedName="{tbl}"
    cachedLength="{dt_lngh}"
    cachedPrecision="{dt_prc}"
    cachedScale="{dt_scl}"
    externalMetadataColumnId="Package\DFT\DST.Inputs[DST Input].ExternalColumns[{tbl}]"
    lineageId="Package\DFT\SRC.Outputs[SRC Output].Columns[{tbl}]" />
'''

Work in progress:

i = input_columns.split("\n")[4]
print(input_columns.replace(i, ""))
  • Not possible. Strings are immutable in Python. Use str.replace or re.sub to create a new string instead. Commented Sep 21, 2021 at 21:34
  • Your choices are usually: split and rejoin the string, replace a fixed substring with str.replace(), replace with re.sub(), or use a library to manage a particular kind of data (HTML, XML, etc.). Commented Sep 21, 2021 at 21:36
  • You can use a combination of str.split and str.join, but I think that would be "concatenating" again. Commented Sep 21, 2021 at 21:42
  • What you have is an XML element. You shouldn't be trying to remove lines from an arbitrarily formatted string; you should parse it, remove an attribute, and reserialize the result. Commented Sep 21, 2021 at 21:43
  • To be specific, this is an XML/DTSX file whose data I need to modify. I know there is BIML/EzAPI, but they don't support Python. Commented Sep 21, 2021 at 21:52
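The split-and-rejoin option mentioned in the comments could look roughly like the sketch below. The helper name drop_lines_containing and the shortened sample string are made up for illustration; the sample only stands in for the template from the question.

```python
def drop_lines_containing(text: str, marker: str) -> str:
    """Return a copy of text without any line that contains marker."""
    # keepends=True preserves each line's original newline characters,
    # so the surviving lines can be joined back without adding separators.
    kept = [line for line in text.splitlines(keepends=True) if marker not in line]
    return "".join(kept)


# Shortened, hypothetical stand-in for the template in the question.
sample = (
    '<inputColumn\n'
    '    cachedName="col_a"\n'
    '    cachedLength="50"\n'
    '    cachedPrecision="0" />\n'
)

result = drop_lines_containing(sample, "cachedLength=")
print(result)
```

Unlike indexing into split("\n")[4], this does not depend on the position of the line, and it leaves no empty line behind. It still builds a new string, as any approach must.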

2 Answers


You can remove the line with the re module:

import re
input_columns = re.sub(r'\s*cachedLength="[^"]*"', '', input_columns)

As mentioned in the comments, this still creates a new string, because Python strings are immutable.
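To make the immutability point concrete, here is a minimal sketch that wraps the substitution in a function and checks that the original string is left untouched. The function name remove_attribute_line and the sample values are hypothetical, and the pattern assumes attribute values never contain a double quote.

```python
import re


def remove_attribute_line(text: str, attr: str) -> str:
    """Return a new string with attr="..." and its leading whitespace removed."""
    # re.escape guards against regex metacharacters in the attribute name;
    # [^"]* stops at the closing quote instead of running to the end of the line.
    pattern = r'\s*' + re.escape(attr) + r'="[^"]*"'
    return re.sub(pattern, "", text)


# Shortened, hypothetical stand-in for the template in the question.
original = (
    '<inputColumn\n'
    '    cachedName="col_a"\n'
    '    cachedLength="50"\n'
    '    cachedScale="0" />\n'
)

stripped = remove_attribute_line(original, "cachedLength")

print(stripped)
# The source string still contains the attribute; re.sub returned a new object.
print("cachedLength" in original)
```

Because the leading \s* also consumes the newline and indentation before the attribute, no blank line is left in the result.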


This doesn't exactly answer the question, but if modifying the XML is what you're after, you could use a package for that, e.g. lxml:

from lxml import etree

xml = """\
<inputColumn 
    refId="Package\DFT\DST.Inputs[DST Input].Columns[{tbl}]"
    cachedDataType="{dt_type}"
    cachedName="{tbl}"
    cachedLength="{dt_lngh}"
    cachedPrecision="{dt_prc}"
    cachedScale="{dt_scl}"
    externalMetadataColumnId="Package\DFT\DST.Inputs[DST Input].ExternalColumns[{tbl}]"
    lineageId="Package\DFT\SRC.Outputs[SRC Output].Columns[{tbl}]" />
"""

e = etree.fromstring(xml)

del e.attrib["cachedLength"]

print(etree.tostring(e, pretty_print=True, encoding="unicode"))
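If installing lxml is not an option, the same idea can probably be expressed with the standard library's xml.etree.ElementTree. This is a sketch under that assumption, using a shortened sample with made-up values rather than the full template.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical stand-in for the element in the question.
xml_text = '''<inputColumn
    cachedName="col_a"
    cachedLength="50"
    cachedScale="0" />'''

elem = ET.fromstring(xml_text)

# pop with a default avoids a KeyError if the attribute is already absent.
elem.attrib.pop("cachedLength", None)

# encoding="unicode" makes tostring return str instead of bytes.
output = ET.tostring(elem, encoding="unicode")
print(output)
```

Note that reserializing does not keep the one-attribute-per-line layout of the input; ElementTree writes the attributes on a single line.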
