I'm building a CSS parser in C#, and I need to "normalize" the case of CSS selectors. By that I mean I want to make the tag names lowercase, but keep the classes and IDs the way they are.

For example, if I had a string such as:

.Header Td.selected

I want to normalize the case and change it to:

.Header td.selected

I want to preserve the case of the classes and IDs, because in CSS they are case-sensitive. And I need to change the case-insensitive parts to lowercase, to avoid storing duplicate CSS rules in my parser.

Therefore, I need some code to be able to distinguish the case-insensitive parts and change them to lower case. How do I do that?

  • It seems your question already contains the answer. You need to differentiate HTML tags from the rest, which shouldn't be difficult given that there is only a limited number of tags. Commented Dec 30, 2016 at 16:52
  • HTML tag names are the only things in CSS not preceded by a period, a hash, bracket etc. So it shouldn't be too hard. Commented Dec 30, 2016 at 16:57
  • Keep in mind though that HTML is not case sensitive, but XML is. Treating a stylesheet meant for an XML file this way will mutilate it beyond recognition. Commented Dec 30, 2016 at 16:58
  • I need it for one application and it won't be applied to XML. Commented Dec 30, 2016 at 17:07
  • Try Regex.Replace(s, @"\s\w+", m => m.Value.ToLower()) Commented Dec 30, 2016 at 18:28

1 Answer

selector = Regex.Replace(
    selector,
    @"(?<![#.:])(-?\b[^\W\d][-\w]*)",
    m => m.Value.ToLowerInvariant());

It looks for identifiers that are not preceded by #, . or :.

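-?\b[^\W\d][-\w]* (or -?[^\W\d][-\w]* without the word boundary) approximates a CSS identifier, but it does not handle escape sequences. Note that in .NET \w also matches non-ASCII letters and digits by default, so the match is not limited to Basic Latin (U+0000-U+007F). For reference, the CSS 2.1 grammar defines an identifier as: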

h           [0-9a-f]
nonascii    [\240-\377]
unicode     \\{h}{1,6}(\r\n|[ \t\r\n\f])?
escape      {unicode}|\\[^\r\n\f0-9a-f]
nmstart     [_a-z]|{nonascii}|{escape}
nmchar      [_a-z0-9-]|{nonascii}|{escape}
ident       -?{nmstart}{nmchar}*
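
As a quick check of the single-selector pattern, here is a minimal sketch that wraps it in a helper. The class and method names (SelectorCase, Normalize, NormalizeStrict) are made up for illustration. NormalizeStrict is my own assumption, not part of the pattern above: it also rejects matches that start right after a word character or a hyphen, because the original pattern lowercases the tail of a hyphenated class name such as .my-Class. ToLowerInvariant is used so the result does not depend on the current culture.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SelectorCase
{
    // Pattern from the answer: lowercase identifiers that are not
    // directly preceded by '#', '.' or ':'.
    public static string Normalize(string selector) =>
        Regex.Replace(
            selector,
            @"(?<![#.:])(-?\b[^\W\d][-\w]*)",
            m => m.Value.ToLowerInvariant());

    // Hypothetical stricter variant (an assumption, not from the answer):
    // the lookbehind also excludes a preceding word character or hyphen,
    // so a match can only begin at the start of an identifier.
    public static string NormalizeStrict(string selector) =>
        Regex.Replace(
            selector,
            @"(?<![#.:\w-])-?[^\W\d][-\w]*",
            m => m.Value.ToLowerInvariant());
}
```

SelectorCase.Normalize(".Header Td.selected") returns ".Header td.selected". For "Td.my-Class", Normalize returns "td.my-class" (the class name is altered), while NormalizeStrict returns "td.my-Class". Neither version treats attribute selectors specially, so names and quoted values inside [...] are lowercased as well.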

If the selector is embedded in a full CSS document, you could use

css = Regex.Replace(
    css,
    @"(?<![#.:])(-?\b[^\W\d][-\w]*)(?=(?:\s*(?:[+>,]|[#.:]?-?[^\W\d][-\w]*|\[.*?]))*\s*\{)",
    m => m.Value.ToLowerInvariant());

The lookahead makes sure the word is part of a selector and not of a declaration: for a word to match, the rest of its selector has to be followed by {.
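
To illustrate the difference between selectors and declarations, here is a small sketch that applies the full-document pattern to a two-rule stylesheet. The names StylesheetCase and NormalizeSelectors are hypothetical, and the sample CSS is made up for the example. The regex is the one shown above, stored in a static field so it is only constructed once.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StylesheetCase
{
    // Same pattern as above: an identifier not preceded by '#', '.' or ':',
    // followed (via lookahead) only by selector tokens up to the next '{'.
    private static readonly Regex SelectorIdent = new Regex(
        @"(?<![#.:])(-?\b[^\W\d][-\w]*)(?=(?:\s*(?:[+>,]|[#.:]?-?[^\W\d][-\w]*|\[.*?]))*\s*\{)");

    // Lowercases tag names inside selectors and leaves declarations alone.
    public static string NormalizeSelectors(string css) =>
        SelectorIdent.Replace(css, m => m.Value.ToLowerInvariant());
}
```

Given "UL LI.Item > A:hover { Color: Red; }\nDIV#Main { Margin: 0; }", NormalizeSelectors returns "ul li.Item > a:hover { Color: Red; }\ndiv#Main { Margin: 0; }". The tag names are lowercased, while the class, the ID, and everything inside the braces keep their case.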
