I want to extract certain strings from some text files. After examining the files, I found a pattern in where those strings appear.

With the help of Google, I put together a short PowerShell script. It takes two parameters (the text file path and the keyword to extract on) and extracts strings from the text file.

The script finds and extracts the target strings from the file $tpath\temp.txt and saves them to another file, $tpath\tmpVI.txt.

Set-PSDebug -Trace 2 -step
$txtpath=$args[0]
$exkey=$args[1]
$tfile=gc "$tpath\temp.txt"
$savextracted="$tpath\tmpVI.txt"

$tfile -replace '&amp;', '&' -replace '^.*$exkey', '' -replace '\s.*$', '' -replace '\\.*$','' | out-file "$savextracted" -encoding ascii

But so far the extracted and saved result has been wrong; it is never the strings I want.

From debugging in PowerShell, it seems that the regular expressions in the last line are causing trouble, as is the variable $exkey inside the quoted -replace pattern. But I don't know how to fix this. What should I do?

1 Answer

If you're looking to capture lines that have your match, here's a snippet that solves that problem:

Function Get-Matches
{
    Param(
        [Parameter(Mandatory,Position=0)]
        [String] $Path,

        [Parameter(Mandatory,Position=1)]
        [String] $Regex
    )

    @(Get-Content -Path $Path) -match $Regex
}
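
The `@( ... )` wrapper matters when a file has only one line. As a minimal sketch with hypothetical sample strings (not taken from the asker's files): `-match` returns a Boolean when its left operand is a single string, but returns the matching elements when the left operand is an array.

```powershell
# Hypothetical sample data, for illustration only.
$single = 'id=42 name=foo'
$lines  = @('id=42 name=foo', 'id=43')

# Scalar left operand: -match yields a Boolean and fills the automatic $Matches table.
$isMatch  = $single -match 'name=(\w+)'
$captured = $Matches[1]

# Array left operand: -match acts as a filter and yields the matching elements.
$matching = @($lines -match 'name=')

$isMatch     # True
$captured    # foo
$matching    # id=42 name=foo
```

Since Get-Content on a one-line file yields a single string, wrapping it in `@( )` keeps the filtering behavior, so the function returns the line itself rather than True/False.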
5 Comments

Those text files are all single-line files. In fact, I had already extracted just the one keyword-matching line from the original text file with the FINDSTR command in Windows CMD and saved it to the file "temp.txt" that my PS script reads. I wrote the PS script above only to extract the keyword-related strings from those one-line files. The script is called from a batch file; it performs the extraction and saves the extracted strings to another text file for later use. But it still does not extract correctly.
@ThmLee If they're all on one line, are you just looking for whether or not the word exists in the file then? A true/false evaluation?
Well, but those are quite huge lines, with 30k+ characters on a single line.
@ThmLee If it's ASCII, that's still only ~30 KB (about 240 kilobits)
The text file is UTF-8, and the extracted strings are saved as ASCII to the file "$tpath\tmpVI.txt". Saving the strings is no problem; the problem is that unwanted strings are extracted, so I think the trouble must be in the regular expressions, the variables inside quotes, or the escaping in the last line. The extraction process in the script is as follows: step 1) replace every "&amp;" with "&"; step 2) remove all characters in front of the value of the variable "$exkey"; step 3) remove everything from the first whitespace onward; step 4) remove everything from the first backslash "\" onward.
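
The four steps described in the last comment could be sketched roughly as follows. This is not a confirmed fix, only a sketch under stated assumptions: the sample line and keyword are hypothetical, and step 1 is assumed to decode the HTML entity `&amp;`. Two things in the question's script look likely to matter. It assigns `$txtpath` but then reads from `$tpath`, which is never set. And the pattern `'^.*$exkey'` is single-quoted, so PowerShell does not expand `$exkey` there; the sketch uses a double-quoted pattern and escapes the keyword with `[regex]::Escape`.

```powershell
# Hypothetical sample line and keyword, for illustration only.
$line  = 'junk &amp; more junk KEY=wanted\path rest of line'
$exkey = 'KEY='

# Escape the keyword so any regex metacharacters in it are matched literally.
$key = [regex]::Escape($exkey)

# Step 1: decode the HTML entity (assumed intent of the first -replace).
$result = $line -replace '&amp;', '&'
# Step 2: drop everything up to and including the keyword.
# Double quotes are needed here so that $key is expanded into the pattern.
$result = $result -replace "^.*$key", ''
# Step 3: drop everything from the first whitespace onward.
$result = $result -replace '\s.*$', ''
# Step 4: drop everything from the first backslash onward.
$result = $result -replace '\\.*$', ''

$result    # wanted
```

Patterns that contain the end anchor `$` stay in single quotes so PowerShell leaves the `$` alone. In the real script, the result could then be piped to `Out-File` as in the question, with the path variable name made consistent.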