I am trying to write a regex that matches a valid CSS class name structure. I have this so far:

$pattern = "([A-Za-z]*\.[A-Za-z]+\s*{)";

$regex = preg_match_all($pattern, $html, $matches);

However, a class name can be in the following formats that my regex won't match:

p.my_class{
}
p.thisclas45{
}

These are just some of the cases. I've looked around for the rules on how a class can be named in a style block but couldn't find anything. Does anyone know where the naming rules for classes are documented?

Are there any more cases that I need to consider? What regex would you use to match a class name?

I have already narrowed it down to a style block using the PHP DOM Document class.

  • Where are your delimiters? And did you mean CSS? Commented Jun 13, 2011 at 10:12
  • In CSS, class names begin with a dot. Your regexp does not match so. :-? Commented Jun 13, 2011 at 10:14
  • @Tomalak - sorry, I removed them for some reason; it's a #. @Álvaro - it does match, as I have a * (0 or more characters) in front of the dot. Commented Jun 13, 2011 at 10:17
  • @Abs: Are you after: (a) classname [you don't have this right atm]; (b) selector including classname [looks about right!]; or (c) any selector [you're missing loads of cases]? Commented Jun 13, 2011 at 10:20
  • ( and ) (as well as {}, [] and <>) may be used as delimiters, i.e., "([A-Za-z]*\.[A-Za-z]+\s*{)" is a fully valid pattern. If another delimiter is used, there's no need to wrap the whole pattern in (): it can be either (something) or #something#. Writing #(something)# only makes the whole pattern a subpattern as well. Commented Jun 13, 2011 at 10:30
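A minimal sketch of the delimiter point made in the last comment, assuming a small hypothetical `$css` sample. It runs the question's pattern once with `(` `)` acting as delimiters and once with `#` delimiters. Both calls should return the same full matches, and neither produces a capture group, because the outer parentheses are consumed as delimiters.

```php
<?php
// Hypothetical sample input, not taken from the question.
$css = "p.intro {\n}\n.note{\n}";

// The question's pattern: "(" and ")" are the delimiters here, not a group.
$withParens = '([A-Za-z]*\.[A-Za-z]+\s*{)';

// The same expression with "#" as the delimiter.
$withHashes = '#[A-Za-z]*\.[A-Za-z]+\s*{#';

preg_match_all($withParens, $css, $m1);
preg_match_all($withHashes, $css, $m2);

// Each result holds only index 0 (the full matches); there is no group 1.
print_r($m1);
print_r($m2);
```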

3 Answers

Have a look at http://www.w3.org/TR/CSS21/grammar.html#scanner

According to this grammar and the post "Which characters are valid in CSS class names/selectors?", this should be the right pattern for finding CSS classes:

\.-?[_a-zA-Z]+[_a-zA-Z0-9-]*\s*\{

Note: A tag name is not required in front of a class selector in CSS. Just .hello { border: 1; } is also valid.
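As a rough illustration, here is one way the pattern above might be used from PHP. The `/` delimiters, the added capture group around the name, and the `$css` sample are assumptions made for this sketch, not part of the answer.

```php
<?php
// Hypothetical sample: three valid class rules and one invalid name (.9lives).
$css = <<<'CSS'
p.my_class{
}
p.thisclas45{
}
.modern-trade { border: 0; }
.9lives { color: red; }
CSS;

// The answer's pattern inside "/" delimiters; group 1 is added here to
// capture the class name without the leading dot or the trailing brace.
$pattern = '/\.(-?[_a-zA-Z]+[_a-zA-Z0-9-]*)\s*\{/';

preg_match_all($pattern, $css, $matches);

// $matches[1] holds the captured class names in document order.
print_r($matches[1]);
```

With this sample, the names captured are my_class, thisclas45 and modern-trade; .9lives is skipped because a name may not start with a digit. Escaped and non-ASCII characters, which the CSS grammar also allows, are not handled by this pattern.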


7 Comments

Wow, that was fast! I tested the above with a few variations, including invalid class names, and it works. Thank you very much!
What about using [\w-] instead of [_a-zA-Z0-9-]? \w matches any word character, i.e. any letter or digit or the underscore character (from docs).
s/prefix for classes/prefix for selectors/
That's not going to match .modern-trade. It should be \.-?[_a-zA-Z\-]+[\w\-]*\s*\{.
Thanks, for me this will do :) But this regex doesn't take some weird rules into account – escaped and unicode characters. Here's a good read about that: mathiasbynens.be/notes/css-escapes

This regex:

/(\w+)?(\s*>\s*)?(#\w+)?\s*(\.\w+)?\s*{/gm

will match any of the following:

p.my_class{}
p.thisclas45{}
.simple_class{}
tag#id.class{}
tag > #id{}

You can experiment with it on RegExr.
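A hedged sketch of how the capture groups of this pattern could be read in PHP. The `g` and `m` flags are dropped (PHP has no `g` modifier), the brace is escaped for clarity, and `splitSelector` is a hypothetical helper introduced only for this example.

```php
<?php
// The answer's pattern in PHP form: no "g" flag, "{" escaped for clarity.
$pattern = '/(\w+)?(\s*>\s*)?(#\w+)?\s*(\.\w+)?\s*\{/';

// Hypothetical helper: returns the tag, combinator, id and class of one rule.
function splitSelector(string $pattern, string $rule): array
{
    preg_match($pattern, $rule, $m);
    // Trailing groups that did not take part in the match are absent from $m,
    // so each lookup falls back to an empty string.
    return [
        'tag'   => $m[1] ?? '',
        'child' => trim($m[2] ?? ''),
        'id'    => $m[3] ?? '',
        'class' => $m[4] ?? '',
    ];
}

$a = splitSelector($pattern, 'tag#id.class{}');
$b = splitSelector($pattern, 'tag > #id{}');
$c = splitSelector($pattern, '.modern-trade{}');

print_r($a);
print_r($b);
print_r($c);
```

One caveat the third call shows: `\w` does not include the hyphen, so for `.modern-trade{}` the match starts at `trade` and reports it as a tag rather than capturing the class.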

3 Comments

@Tomalak do you mean the capture groups? Please take a look at the regexr link, the "replace tab" shows where they go: $1 is the tag, $2 the ancestor (>), $3 the id and $4 the class name (without the '{'). If you mean the full regexp, it is /(\w+)?(\s*>\s*)?(#\w+)?\s*(\.\w+)?\s*{/gm
No, your delimiters, as I said. You didn't include them in your answer. Delimiters are part of the expression, and enough people don't use them properly that leaving them out of the answer is dangerous.
But it will not match something like this: .icon-something:before { content: "\e935"; }

This regex will select all class names in a CSS file, however complex the CSS code is.

/(?<=\.)([a-zA-Z0-9_-]+)(?![^\{]*\})/g

Example:

.class-1:focus > :is(button, a, div) > :first-child > .class2:first-child > .class_3 #id-1 + * { 
    padding: 8.3px;
    -webkit-box-align: center;
    color: #ff4834 !important;
}
@keyframes shimmer {
    0% {
        -webkit-transform: translateX(-100%);
        transform: translateX(-100%);
    }
    to {
        -webkit-transform: translateX(100%);
        transform: translateX(100%);
    }
}

Output:

['class-1', 'class2', 'class_3']
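Since the question is about PHP, here is a sketch of how the same idea might look with preg_match_all. It assumes two changes from the pattern above: the `g` flag is dropped (PHP does not accept it), and `\.\K` is swapped in for the lookbehind so the full match is the bare class name. The `$css` sample is hypothetical. The negative lookahead rejects a dot-prefixed token when a `}` can be reached before the next `{`, which is how values inside a declaration block such as `8.3px` are skipped.

```php
<?php
// Hypothetical sample, shortened from the example above.
$css = <<<'CSS'
.class-1:focus > .class2:first-child > .class_3 #id-1 + * {
    padding: 8.3px;
    color: #ff4834 !important;
}
CSS;

// "\.\K" matches the dot and then resets the match start, so only the name
// is returned. "(?![^{]*\})" fails when a "}" appears before the next "{",
// i.e. when the token sits inside a declaration block.
$pattern = '/\.\K[a-zA-Z0-9_-]+(?![^{]*\})/';

preg_match_all($pattern, $css, $matches);

// $matches[0] holds the class names found in the selector.
print_r($matches[0]);
```

For this sample the result is class-1, class2 and class_3; the `.3px` inside the block is not reported.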

3 Comments

g is not an appropriate pattern modifier in PHP. The lookbehind can be replaced by \.\K -- this should largely improve performance because the regex engine won't need to look backward after every matching word or hyphen substring. The capture group is unneeded because it will be identical to the full-string match. Your answer does not explain the reason for that negated lookahead subpattern.
@mickmackusa is correct on every count, HOWEVER I picked up the regex between the two slashes and it worked great for a Find in Files in vscode across a very complex set of CSS files.
Thank you @raghavan-vidhyasagar - I used your code with a small improvement. Your code did not handle a CSS class name that contains a . - for example, .is-gap-0\.5 appears in my CSS file. This works for me now: /(?<=\.)([a-zA-Z0-9_-]+(?:\\\.[a-zA-Z0-9_-]+)*)(?![^\{]*\})/g
