5

I'm trying to obfuscate a string, but need to preserve a couple of patterns. Basically, every alphanumeric character needs to be replaced with a single character (say 'X'), but the following (example) patterns need to be preserved (note that each pattern has a single space at the beginning):

  • QQQ"
  • RRR"

I've looked through a few samples on negative lookaheads/lookbehinds, but still haven't had any luck with this (only testing QQQ).

var test = @"""SOME TEXT       AB123 12XYZ QQQ""""empty""""empty""1A2BCDEF";
var regex = new Regex(@"((?!QQQ)(?<!\sQ{1,3}))[0-9a-zA-Z]");            
var result = regex.Replace(test, "X");  

The correct result should be:

"XXXX XXXX       XXXXX XXXXX QQQ""XXXXX""XXXXX"XXXXXXXX

This works for an exact match, but will fail with something like ' QQR"', which returns

"XXXX XXXX       XXXXX XXXXX XQR""XXXXX""XXXXX"XXXXXXXX

2 Answers

4

You can use this:

var regex = new Regex(@"((?> QQQ|[^A-Za-z0-9]+)*)[A-Za-z0-9]");            
var result = regex.Replace(test, "$1X");

The idea is to match everything that must be preserved first and to put it in a capturing group.

Since the target characters are always preceded by zero or more things that must be preserved, you only need to write this capturing group before [A-Za-z0-9] and put it back with $1 in the replacement string.
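One case the pattern above does not seem to cover is a preserved token that has no alphanumeric character anywhere after it (for example, a string that ends with ' QQQ"'): the overall match fails at the space, the engine retries one position later, and the Qs are then replaced one by one. A minimal sketch of a variant of the same "match what must be preserved first" idea, using an alternation and a MatchEvaluator instead of a capturing group, is shown here. The `Obfuscator` class and `Obfuscate` method are hypothetical names, and the sketch assumes the preserved tokens are exactly ' QQQ"' and ' RRR"'.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Obfuscator
{
    // Assumption: the preserved tokens are a space, QQQ or RRR, and a double quote.
    // They come first in the alternation, so at any position they are tried
    // before the single-character branch.
    private static readonly Regex TokenOrAlnum =
        new Regex(@" (?:QQQ|RRR)""|[A-Za-z0-9]");

    // A one-character match is a lone alphanumeric and gets masked;
    // a longer match is a preserved token and is returned unchanged.
    public static string Obfuscate(string input) =>
        TokenOrAlnum.Replace(input, m => m.Length == 1 ? "X" : m.Value);
}
```

Calling `Obfuscator.Obfuscate(test)` on the sample string from the question gives the expected result, while ' QQR"' is masked to ' XXX"'.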

4 Comments

@hwnd: no you can't because a single Q is not matched.
I must have misunderstood what he wants to do. I thought he wants to preserve QQQ and RRR.
+1 Perhaps you should modify the regex a bit to become ((?> (?:QQQ|RRR)""|[^A-Z0-9]+)*)[A-Z0-9] just so it accepts the double quote and ' RRR"' as well (and use ignorecase).
This is correct, I do want to preserve QQQ and RRR. The only issue with this is that a " needs to follow QQQ or RRR, but that's easily addressed. Thanks, I was on the right track but missing a fundamental piece!
2

Here's a non-regex solution. It works quite nicely, although it fails when a pattern occurs more than once in the input; it would need a better algorithm for finding every occurrence. You can compare it with the regex solution on large strings.

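// Needs: using System; using System.Collections.Generic; using System.Linq; using System.Text;
// As an extension method, it must be declared inside a static class.
public static string ReplaceWithPatterns(this string input, IEnumerable<string> patterns, char replacement)
{
    // First occurrence of each pattern; IndexOf returns -1 when the pattern is absent.
    var patternsPositions = patterns
           .Select(p => new { Pattern = p, Index = input.IndexOf(p, StringComparison.Ordinal) })
           .Where(i => i.Index >= 0);

    // Mask letters and digits only, so spaces and quotes stay in place.
    var result = new StringBuilder(input.Length);
    foreach (var c in input)
        result.Append(char.IsLetterOrDigit(c) ? replacement : c);

    // Overwrite the masked span with the pattern so the length does not change.
    foreach (var p in patternsPositions)
        result.Remove(p.Index, p.Pattern.Length).Insert(p.Index, p.Pattern);

    return result.ToString();
}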
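The limitation mentioned above (a pattern that occurs more than once) could be lifted by scanning for every occurrence instead of only the first. The following is a minimal sketch under that assumption, not the original method: `MaskExcept` and `MaskingExtensions` are hypothetical names, and "alphanumeric" is taken to mean ASCII letters and digits, as in the question's [0-9a-zA-Z].

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class MaskingExtensions
{
    // Hypothetical helper: masks ASCII letters and digits, except inside
    // any occurrence of any of the given patterns.
    public static string MaskExcept(this string input, IEnumerable<string> patterns, char replacement)
    {
        // Mark every index covered by an occurrence of a pattern.
        var keep = new bool[input.Length];
        foreach (var pattern in patterns)
        {
            if (pattern.Length == 0) continue; // an empty pattern covers nothing

            var start = input.IndexOf(pattern, StringComparison.Ordinal);
            while (start >= 0)
            {
                for (var i = start; i < start + pattern.Length; i++)
                    keep[i] = true;
                // Continue searching one character after the last hit.
                start = input.IndexOf(pattern, start + 1, StringComparison.Ordinal);
            }
        }

        var result = new StringBuilder(input.Length);
        for (var i = 0; i < input.Length; i++)
        {
            var c = input[i];
            var isAsciiAlnum = (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
            result.Append(isAsciiAlnum && !keep[i] ? replacement : c);
        }
        return result.ToString();
    }
}
```

For example, `test.MaskExcept(new[] { " QQQ\"", " RRR\"" }, 'X')` keeps every ' QQQ"' and ' RRR"' in the input and masks the remaining letters and digits.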

2 Comments

I debated doing a non-regex version of this, since the strings will never be huge, but deep down inside I felt there had to be a way to do it with regex...I just have to make sure I comment it or I'll forget what it does by next week.
@tencntraze Yeah, that's exactly my problem with regex :D
