31

Ok, so this is something completely stupid, but it's something I simply never learned to do and it's a hassle.

How do I specify a string that does not contain a sequence of other characters? For example, I want to match all lines that do NOT end in '.config'.

I would think that I could just do

.*[^(\.config)]$

but this doesn't work (why not?)

I know I can do

.*[^\.][^c][^o][^n][^f][^i][^g]$

but please please please tell me that there is a better way

5
  • 2
    What regex engine are you using? They can vary in the features they support. You should tag your question with the engine you are using. Commented Dec 28, 2009 at 22:01
  • 1
    why not use grep -v "\.config"? Commented May 23, 2010 at 6:32
  • 6
    @Lazer - because not everything in the world is a *nix system? Commented May 23, 2010 at 18:09
  • Or are THEY the duplicate!!!!??? Commented Oct 26, 2016 at 17:47
  • This is the duplicate. The other question was asked (and made community) on Jan 2 '09 at 7:30, yours was asked on Dec 28 '09 at 21:47 (almost a year later). I am tagging this question. Commented Nov 9, 2016 at 8:04

7 Answers

56

You can use negative lookbehind, e.g.:

.*(?<!\.config)$

This matches all strings except those that end with ".config"
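
As a quick check of the lookbehind pattern, here is a minimal Python sketch. The sample names are made up for illustration, and it assumes an engine with fixed-width lookbehind support (Python's `re` module has it):

```python
import re

# Pattern from the answer: any text, provided the 7 characters just
# before the end of the string are not ".config".
not_config = re.compile(r".*(?<!\.config)$")

# Hypothetical sample names, not taken from the question.
names = ["app.config", "readme.txt", "test.hg", "config"]

kept = [name for name in names if not_config.fullmatch(name)]
print(kept)
```

Note that "config" without the leading dot is kept, since the lookbehind only rejects the full ".config" suffix.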

5 Comments

This works but .*(?!=\.config)$ does not - I thought the two syntaxes were equivalent. Any clue?
They are NOT equivalent. (?<!...) checks the text preceding the current position (lookbehind), while (?!...) checks the text following it (lookahead).
No, they are not. Negative lookahead is (?!matchthis), and your example can't work because you're looking ahead at a moment when you're already at the end of the string ($).
Actually, that link overstates the difficulty. Regex matching already produces a DFA from the regex, so the scary exponential expansion step referred to is already occurring every time you use the original regex. Once you've paid that price, it's straightforward to (1) complement the set of states that are considered Accepting states, and (2) declare success if you ever "fall off" the automaton by encountering a symbol in a state that has no transition for that symbol.
36

Your question contains two questions, so here are a few answers.

Match lines that don't contain a certain string (say .config) at all:

^(?:(?!\.config).)*$\r?\n?

Match lines that don't end in a certain string:

^.*(?<!\.config)$\r?\n?

and, as a bonus: Match lines that don't start with a certain string:

^(?!\.config).*$\r?\n?

(Each pattern also consumes the trailing newline characters, if present.)
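
The three cases can be compared side by side. This is only a sketch in Python: the sample lines are invented, and the trailing `\r?\n?` is dropped because each line is tested on its own rather than inside a larger text:

```python
import re

# The three patterns from the answer, minus the newline-consuming tail.
no_config_anywhere = re.compile(r"^(?:(?!\.config).)*$")
not_ending_in_config = re.compile(r"^.*(?<!\.config)$")
not_starting_with_config = re.compile(r"^(?!\.config).*$")

# Hypothetical sample lines.
lines = ["web.config", ".config.bak", "my.config.old", "notes.txt"]

def keep(pattern, candidates):
    """Return the candidates that the pattern matches."""
    return [line for line in candidates if pattern.match(line)]

contains_free = keep(no_config_anywhere, lines)
end_free = keep(not_ending_in_config, lines)
start_free = keep(not_starting_with_config, lines)

print(contains_free)
print(end_free)
print(start_free)
```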

Oh, and to answer why your version doesn't work: [^abc] means "any one (1) character except a, b, or c". Your other solution would also fail on test.hg, because it also ends in the letter g: your regex looks at each character individually instead of at the entire .config string. That's why you need lookaround to handle this.
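
To make the difference concrete, here is a small Python sketch that runs both attempts from the question next to the lookbehind version. The file names are hypothetical:

```python
import re

# First attempt from the question: the brackets form a negated character
# class, so only the LAST character is tested against ( . c o n f i g ).
char_class = re.compile(r".*[^(\.config)]$")
# Second attempt: seven independent single-character tests.
per_char = re.compile(r".*[^\.][^c][^o][^n][^f][^i][^g]$")
# Lookbehind version: tests ".config" as a whole.
lookbehind = re.compile(r".*(?<!\.config)$")

names = ["readme.txt", "app.config", "test.hg", "a.txt"]

# For each name: (char_class matches, per_char matches, lookbehind matches)
results = {
    name: (
        bool(char_class.match(name)),
        bool(per_char.match(name)),
        bool(lookbehind.match(name)),
    )
    for name in names
}

for name, flags in results.items():
    print(name, flags)
```

Both attempts wrongly reject test.hg, and the per-character version also rejects a.txt, since it requires at least seven characters before the end.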

4
(?<!\.config)$

:)

2

By using the [^] construct, you have created a negated character class, which matches any single character except those you have named. The order of the characters in the class does not matter, so this will fail on any string that ends with any of the characters in [(\.config)] (or, equivalently, [)gi.\onc(]).

Use negative lookahead (with Perl-style regexes), anchored at the start of the string, like so: ^(?!.*\.config$). This will match all strings that do not end with the literal ".config".
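
Where the lookahead is anchored matters. As a rough Python sketch (the sample strings are made up), compare an unanchored `(?!\.config$)` with `^(?!.*\.config$)`:

```python
import re

unanchored = re.compile(r"(?!\.config$)")
anchored = re.compile(r"^(?!.*\.config$)")

# The unanchored form succeeds at index 0 of "app.config", because
# ".config" does not start at that position. It therefore rejects nothing.
unanchored_hit = bool(unanchored.search("app.config"))

# The anchored form looks ahead over the whole string from the start.
anchored_config = bool(anchored.search("app.config"))
anchored_txt = bool(anchored.search("readme.txt"))

print(unanchored_hit, anchored_config, anchored_txt)
```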

2

Unless you are "grepping" ... since you are not using the result of a match, why not search for the strings that do end in .config and skip them? In Python:

import re
# Use search(), not match(): match() only tests at the start of the string
isConfig = re.compile(r'\.config$')
# List lst is given
filteredList = [f.strip() for f in lst if not isConfig.search(f.strip())]

I suspect that this will run faster than a more complex re.

2 Comments

Unless you are grepping, why use regex at all? Python has in for a reason. Other languages I'm sure have similar solutions.
Yeah this is what I do now, but it is best to know how to do it both ways. I've run into situations where this has forced some awkward syntax.
2

As you have asked for a "better way": I would try a "filtering" approach. I think it is quite easy to read and to understand:

#!/usr/bin/perl

while(<>) {
    next if /\.config$/; # ignore the line if it ends with ".config"
    print;
}

As you can see, I have used Perl code as an example, but I think you get the idea.

Added: this approach can also chain several filter patterns and still remain readable and easy to understand:

    next if /\.config$/; # ignore the line if it ends with ".config"
    next if /\.ini$/;    # ignore the line if it ends with ".ini"
    next if /\.reg$/;    # ignore the line if it ends with ".reg"

    # now we have filtered out all the lines we want to skip
    # ... process only the lines we want to use ...
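
For readers not using Perl, the same chained-filter idea might look like this in Python. This is a sketch only: `skip_patterns`, `wanted`, and the sample lines are hypothetical names introduced for illustration:

```python
import re

# Hypothetical list of patterns to skip, mirroring the Perl filters above.
skip_patterns = [re.compile(p) for p in (r"\.config$", r"\.ini$", r"\.reg$")]

def wanted(lines):
    """Yield only the lines that match none of the skip patterns."""
    for line in lines:
        line = line.rstrip("\r\n")
        if any(p.search(line) for p in skip_patterns):
            continue  # same role as Perl's "next if ..."
        yield line

# Hypothetical input lines.
sample = ["a.config\n", "b.ini\n", "c.txt\n", "d.reg\n", "e.cfg\n"]
result = list(wanted(sample))
print(result)
```

Adding another suffix is then a matter of appending one more pattern to the list.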

0

I used Regexpal before finding this page and came up with the following solution when I wanted to check that a string doesn't contain a file extension:

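^(.(?!\.[a-zA-Z0-9]{3,}))*$

I used the m checkbox option so that I could present many lines and see which of them did or did not match.

So, to find a string that doesn't contain another: "^(.(?!" + expression you don't want + "))*$"

One caveat: because the lookahead comes after the dot, an unwanted expression at the very start of the string is not caught. Putting the lookahead first, as in "^((?!" + expression you don't want + ").)*$", covers that case too.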
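
To see how the position of the lookahead affects the result, here is a small Python sketch. The helper names and sample strings are hypothetical; the first helper builds the template exactly as written in this answer, the second places the lookahead before the dot:

```python
import re

def lookahead_after(expr):
    # Template from this answer: the lookahead runs AFTER each character.
    return re.compile("^(.(?!" + expr + "))*$")

def lookahead_before(expr):
    # Variant with the lookahead BEFORE each character.
    return re.compile("^((?!" + expr + ").)*$")

ext = r"\.[a-zA-Z0-9]{3,}"
after = lookahead_after(ext)
before = lookahead_before(ext)

# Each entry: (after-form matches, before-form matches)
checks = {
    s: (bool(after.match(s)), bool(before.match(s)))
    for s in ["readme", "file.txt", ".txt"]
}

for s, flags in checks.items():
    print(repr(s), flags)
```

The two forms agree except when the unwanted expression sits at the very start of the string, as in ".txt", where only the second form rejects it.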

My article on the uses of this particular regex
