I have a question that I'm pretty much stuck on.

I have a file called xml_data.txt and another file called entry.txt.

I want to replace everything between <core:topics> and </core:topics> in xml_data.txt with the corresponding content from entry.txt.

I have written the script below:

$test = Get-Content -Path ./xml_data.txt
$newtest = Get-Content -Path ./entry.txt
$pattern = "<core:topics>(.*?)</core:topics>"
$result0 = [regex]::match($test, $pattern).Groups[1].Value
$result1 = [regex]::match($newtest, $pattern).Groups[1].Value
$test -replace $result0, $result1

When I run the script, it prints the content to the console, but it doesn't look like any change was made.

Can someone please help me out?

Note: typo fixed.

  • Use $test = Get-Content -Path ./xml_data.txt -Raw, and $pattern = "(?s)<core:topics>(.*?)</core:topics>" Commented Aug 20, 2019 at 11:25
  • DO NOT parse XML with regex. For manipulating XML data, use PowerShell's built-in XML parser. Beware that your data apparently uses namespaces, so you need to take care of that. Commented Aug 20, 2019 at 11:26
  • $test -replace [regex]::Escape($result0), $result1 Commented Aug 20, 2019 at 11:44
  • @WiktorStribiżew What if I have multiple entries? How would I loop over them? Commented Aug 20, 2019 at 12:48
  • Sorry, I think your question is rather unclear and a bit too broad. Commented Aug 20, 2019 at 14:10

1 Answer

There are three main issues here:

  • You read the file line by line, but the blocks of text are multiline strings
  • Your regex does not match across newlines, as . does not match a newline by default
  • Also, a literal string used as a regex pattern must be escaped, and when replacing with a dynamic replacement pattern, you must always dollar-escape the $ symbol. Alternatively, use the plain string .Replace method.
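
The second and third issues can be reproduced in isolation. The snippet below is only a sketch: the sample string is a made-up stand-in for what Get-Content -Raw would return for xml_data.txt, and the variable names are illustrative.

```powershell
# Hypothetical sample standing in for the raw content of xml_data.txt.
$raw = "<core:topics>`n  <topic>old (v1)</topic>`n</core:topics>"

# Without (?s), '.' stops at every newline, so the multiline block is not matched.
$noDotAll = [regex]::Match($raw, '<core:topics>(.*?)</core:topics>').Success

# With (?s), '.' also matches newlines and the block is found.
$m      = [regex]::Match($raw, '(?s)<core:topics>(.*?)</core:topics>')
$dotAll = $m.Success

# The captured text contains regex metacharacters such as ( and ).
# Used unescaped as a pattern, "(v1)" becomes a group and no longer matches itself.
$captured  = $m.Groups[1].Value
$unescaped = $raw -replace $captured, 'X'
$escaped   = $raw -replace [regex]::Escape($captured), 'X'

$noDotAll            # False
$dotAll              # True
$unescaped -eq $raw  # True: nothing was replaced
$escaped             # <core:topics>X</core:topics>
```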

So, you need to

  • Read the whole file into a single variable: $test = Get-Content -Path ./xml_data.txt -Raw
  • Use the $pattern = "(?s)<core:topics>(.*?)</core:topics>" regex (if it turns out to be too slow, it can be unrolled to <core:topics>([^<]*(?:<(?!/?core:topics>)[^<]*)*)</core:topics>)
  • Use $test -replace [regex]::Escape($result0), $result1.Replace('$', '$$') to "protect" $ chars in the replacement, or use the literal $test.Replace($result0, $result1).
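
Put together, the steps above might look like the following sketch. It assumes inline sample strings in place of the two files so that it runs on its own; with the real files, $test and $newtest would come from Get-Content -Raw as described. The names $viaRegex and $viaString are illustrative only.

```powershell
# Hypothetical stand-ins for the files. With real data, use:
#   $test    = Get-Content -Path ./xml_data.txt -Raw
#   $newtest = Get-Content -Path ./entry.txt -Raw
$test    = "<root>`n<core:topics>`n  <topic>old (v1)</topic>`n</core:topics>`n</root>"
$newtest = "<core:topics>`n  <topic>price `$1</topic>`n</core:topics>"

# (?s) lets '.' match newlines so the multiline block is captured.
$pattern = '(?s)<core:topics>(.*?)</core:topics>'
$result0 = [regex]::Match($test, $pattern).Groups[1].Value
$result1 = [regex]::Match($newtest, $pattern).Groups[1].Value

# Option 1: regex replace, escaping the pattern and doubling $ in the replacement.
$viaRegex = $test -replace [regex]::Escape($result0), $result1.Replace('$', '$$')

# Option 2: plain string replace, which needs no escaping at all.
$viaString = $test.Replace($result0, $result1)

$viaRegex -eq $viaString  # True
$viaString

# -replace only returns the new string. To persist it (assuming the file
# should be overwritten), write it back explicitly, for example:
#   Set-Content -Path ./xml_data.txt -Value $viaString -NoNewline
```

Both options produce the same text here. The plain .Replace call is the simpler choice when the old block is known exactly, while the regex form is useful when the match itself needs pattern features.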