I would like to know a way to remove duplicate words or strings (not lines) in a text file using the Notepad++ regex find tool.

I have only seen ways to remove duplicate lines using TextFX, and that is not what I am looking for.

Example -

123 / 789 123 / 321

Removing the duplicate 123 would result in

123 / 789 / 321

  • So what exactly are you looking for? Do you mean duplicates like "It's the the mailman!" (duplicate "the"), or "The cat chased the dog" (duplicate "the" with intervening words), or "banana" (duplicate "na")? Or something else? Commented Jan 13, 2013 at 10:11
  • Use this regex: \b(\w+)(?:\s+\1\b)+ (courtesy of StackExchange: superuser.com/questions/454046/…) Commented Aug 19, 2015 at 10:35
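
For comparison, here is a minimal Python sketch of what the pattern from the comment above does. The replacement `\1` is an assumption (the comment gives only the search pattern), and Python's `re` module stands in for Notepad++'s regex engine, which may differ in details. The helper name `collapse_adjacent` is hypothetical. The pattern only collapses a word repeated back to back, so it leaves the question's example unchanged.

```python
import re

# Pattern from the comment: a word followed by one or more
# whitespace-separated repeats of that same word.
ADJACENT = re.compile(r"\b(\w+)(?:\s+\1\b)+")

def collapse_adjacent(text: str) -> str:
    """Keep a single copy of any word repeated back to back (assumed replacement: \\1)."""
    return ADJACENT.sub(r"\1", text)

# Adjacent repeat: "the the" collapses to "the".
collapsed = collapse_adjacent("It's the the mailman!")

# The question's example has other tokens between the two 123s, so nothing matches.
unchanged = collapse_adjacent("123 / 789 123 / 321")
```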

1 Answer

I'm not familiar with Notepad++, but assuming it uses standard syntax, replace

\b(\w+)\b([\w\W]*)\b\1\b

with

$1$2
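
A rough Python sketch of this replacement, assuming Python's `re` module as a stand-in for Notepad++'s engine (`\1\2` is Python's spelling of `$1$2`). The helper name `remove_repeated_words` is hypothetical. A single pass can leave further repeats behind, so the sketch reapplies the substitution until the text stops changing. The spaces around the removed word are kept, so a double space remains where the second 123 was.

```python
import re

# Same pattern as above: a word, anything in between, then the same word again.
PATTERN = re.compile(r"\b(\w+)\b([\w\W]*)\b\1\b")

def remove_repeated_words(text: str) -> str:
    """Reapply the substitution until no repeated word is left."""
    while True:
        # Keep the first occurrence and the text in between; drop the repeat.
        new_text = PATTERN.sub(r"\1\2", text)
        if new_text == text:
            return text
        text = new_text

result = remove_repeated_words("123 / 789 123 / 321")

# Optional cleanup (an addition, not part of the answer): collapse runs of spaces.
cleaned = re.sub(r" {2,}", " ", result)
```

In Notepad++ the equivalent would presumably be to press Replace All repeatedly until no more matches are reported.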

2 Comments

There is no lookahead in your code. You probably mean backreference.
Oops. Thanks. It was late. In fact I meant to offer a lookahead-based solution, but then I remembered reading somewhere that NPP doesn't support lookaheads, wrote a different solution, and forgot to change the description.
