I am looking for a way to randomize a specific string in a huge file using predefined strings from an array, without having to write a temporary file to disk.
There is a file which contains the same string, e.g. "ABC123456789", in many places:
<Id>ABC123456789</Id><tag1>some data</tag1><Id>ABC123456789</Id><Id>ABC123456789</Id><tag2>some data</tag2><Id>ABC123456789</Id><tag1>some data</tag1><tag3>some data</tag3><Id>ABC123456789</Id><Id>ABC123456789</Id>
I am trying to randomize that "ABC123456789" string using an array or list of defined strings, e.g. "@('foo','bar','baz','foo-1','bar-1')". Each ABC123456789 should be replaced by a randomly picked string from the array/list.
I have ended up with the following solution, which works "fine". But it is definitely not the right approach, as it saves the file to disk once for each replaced string and is therefore very slow:
$inputFile = Get-Content 'c:\temp\randomize.xml' -raw
$checkString = Get-Content -Path 'c:\temp\randomize.xml' -Raw | Select-String -Pattern '<Id>ABC123456789'
[regex]$pattern = "<Id>ABC123456789"
while($checkString -ne $null) {
$pattern.replace($inputFile, "<Id>$(Get-Random -InputObject @('foo','bar','baz','foo-1','bar-1'))", 1) | Set-Content 'c:\temp\randomize.xml' -NoNewline
$inputFile = Get-Content 'c:\temp\randomize.xml' -raw
$checkString = Get-Content -Path 'c:\temp\randomize.xml' -Raw | Select-String -Pattern '<Id>ABC123456789'
}
Write-Host All finished
The output is randomized, e.g.:
<Id>foo
<Id>bar
<Id>foo
<Id>foo-1
However, I would like to achieve this kind of output without having to write the file to disk at each step. With thousands of occurrences of the string, it takes a lot of time. Any idea how to do it?
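One possible direction, shown only as a minimal sketch: do every replacement in memory with a single call to [regex]::Replace, passing a script block that is invoked once per match. The sample text and the variable names ($text, $replacements, $result) are assumptions made for illustration, and this still treats the file as plain text rather than as XML.

```powershell
# Hypothetical sample text standing in for the real file content.
$text = '<Id>ABC123456789</Id><tag1>some data</tag1><Id>ABC123456789</Id>'
$replacements = @('foo', 'bar', 'baz', 'foo-1', 'bar-1')

# The script block is converted to a MatchEvaluator delegate and runs once per match,
# so each occurrence gets its own random pick. The lookarounds keep the tags untouched.
$result = [regex]::Replace($text, '(?<=<Id>)ABC123456789(?=</Id>)', {
    param($match)
    Get-Random -InputObject $replacements
})

$result
```

With the real file, $text would be read once with Get-Content -Raw and $result written once with Set-Content -NoNewline, so the disk is touched only twice regardless of the number of occurrences.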
========================= Edit 2023-02-16
I tried the solution from zett42 and it works fine with a simple XML structure. In my case there is a complication which did not matter in my text-processing approach: the root element and some other element names in the processed XML file contain a colon (a namespace prefix), so "-XPath" presumably needs some special setting for this situation. Or maybe the solution is outside the scope of PowerShell.
<?xml version='1.0' encoding='UTF-8'?>
<C23A:SC777a xmlns="urn:C23A:xsd:$SC777a" xmlns:C23A="urn:C23A:xsd:$SC777a" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:C23A:xsd:$SC777a SC777a.xsd">
<C23A:FIToDDD xmlns="urn:iso:std:iso:20022:tech:xsd:pacs.008.001.02">
<CxAAA>
<DxBBB>
<ABC>
<Id>ZZZZZZ999999</Id>
</ABC>
</DxBBB>
<CxxCCC>
<ABC>
<Id>ABC123456789</Id>
</ABC>
</CxxCCC>
</CxAAA>
<CxAAA>
<DxBBB>
<ABC>
<Id>ZZZZZZ999999</Id>
</ABC>
</DxBBB>
<CxxCCC>
<ABC>
<Id>ABC123456789</Id>
</ABC>
</CxxCCC>
</CxAAA>
</C23A:FIToDDD>
<C23A:PmtRtr xmlns="urn:iso:std:iso:20022:tech:xsd:pacs.004.001.02">
<GrpHdr>
<TtREEE Abc="XV">123.45</TtREEE>
<SttlmInf>
<STTm>ABCA</STTm>
<CLss>
<PRta>SIII</PRta>
</CLss>
</SttlmInf>
</GrpHdr>
<TxInf>
<OrgnlTxRef>
<DxBBB>
<ABC>
<Id>YYYYYY888888</Id>
</ABC>
</DxBBB>
<CxxCCC>
<ABC>
<Id>ABC123456789</Id>
</ABC>
</CxxCCC>
</OrgnlTxRef>
</TxInf>
</C23A:PmtRtr>
</C23A:SC777a>
-Replace) is a bad idea. Instead you should use the related parser for searching and replacing. See e.g.: Powershell regex for replacing text between two strings
Select-Xml with the -Namespace parameter like this: Select-Xml -XPath '//a:Id/text()' -Namespace @{a = 'urn:iso:std:iso:20022:tech:xsd:pacs.008.001.02'}
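A minimal sketch of how the Select-Xml suggestion might be applied to a namespaced document, assuming a cut-down version of the sample above. The reduced XML, the variable names ($doc, $ns, $nodes, $replacements) and the XPath predicate on the text value are illustrative assumptions, not part of the original comment.

```powershell
# Reduced, hypothetical document that keeps the default namespace from the sample.
# The single-quoted here-string prevents $SC777a from being expanded as a variable.
[xml]$doc = @'
<C23A:SC777a xmlns:C23A="urn:C23A:xsd:$SC777a">
  <C23A:FIToDDD xmlns="urn:iso:std:iso:20022:tech:xsd:pacs.008.001.02">
    <DxBBB><ABC><Id>ZZZZZZ999999</Id></ABC></DxBBB>
    <CxxCCC><ABC><Id>ABC123456789</Id></ABC></CxxCCC>
  </C23A:FIToDDD>
</C23A:SC777a>
'@

$replacements = @('foo', 'bar', 'baz', 'foo-1', 'bar-1')
# Map an arbitrary prefix "a" to the default namespace that the Id elements live in.
$ns = @{ a = 'urn:iso:std:iso:20022:tech:xsd:pacs.008.001.02' }

# Collect the matching text nodes first, then overwrite each one in memory.
$nodes = @(Select-Xml -Xml $doc -XPath '//a:Id/text()[. = "ABC123456789"]' -Namespace $ns)
foreach ($item in $nodes) {
    $item.Node.Value = Get-Random -InputObject $replacements
}

$doc.OuterXml
```

The full sample also has Id elements under the pacs.004.001.02 namespace; one way to cover them might be a second prefix in the hashtable and an XPath union such as '//a:Id/text() | //b:Id/text()'. The modified document can then be written once with $doc.Save().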