I am working in Node.js on a script that reads a CSS file and extracts the class names/ID names together with their associated properties. For that purpose, I have used the following regexes (data is the file content that I receive from the callback function):

  // `data` is the CSS file content received in the callback
  data = data.replace(/\}/gm,"}\n")
  data = data.replace(/[\r\n|\n|\r]*\}[\r\n|\n|\r]*/gm,"}~")
  data = data.split("~")
  var regex = /[\.#a-z][a-z0-9\-]*\{.*\}/gi
  var results = []
  var result
  for(var i = 0;i<data.length;i++)
  {
    // strip all whitespace inside the chunk before matching
    data[i] = data[i].replace(/([\r\n|\n|\r|\s]*)/gm,"")
    while ( (result = regex.exec(data[i])) ) {
      results.push(result[0]);
    }
  }

This reads the following file content:

@color:#ffeedd;
.circle{
background:red;
}

#big-circle{color:green;}#small-circle{
color:yellow;
}

mango{
  color:brown;
}

and gives this output:

[ '.circle{background:red;}',
  '#big-circle{color:green;}',
  '#small-circle{color:yellow;}',
  'mango{color:brown;}' ]

A brief of what I have done:

  1. I have divided the whole CSS file on the closing bracket } of each rule by adding a \n after every }: data = data.replace(/\}/gm,"}\n")
  2. I have used a regex to replace every } together with any newline characters around it with }~: data = data.replace(/[\r\n|\n|\r]*\}[\r\n|\n|\r]*/gm,"}~")
  3. Then I have split the data on ~ to get an array of classes/IDs: data = data.split("~")
  4. Then I have removed all whitespace from inside each entry: data[i] = data[i].replace(/([\r\n|\n|\r|\s]*)/gm,"")

However, there is an issue. This works well only if every class is properly terminated with a }. If the file contains an error, it will not work properly. My question is: what regex or step can I apply to ensure that such errors are caught and shown to the user (much like the lessc compiler does)? I guess it takes much more than simple bracket matching (which can be implemented using a stack).

For example:

@color:#ffeedd;
.circle{
background:red;
}

#big-circle{color:green;}#small-circle{
color:yellow;


mango{
  color:brown;
}

lessc gives the following error for this input:

ParseError: Unrecognised input. Possibly missing something in /less/style.less on line 13, column 1:
12 }
13 
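The stack-based bracket matching mentioned above can be sketched as follows. This is only a minimal illustration, not what lessc does internally: the function name findBraceErrors and the shape of the returned objects are made up for this example, and the scan ignores comments and quoted strings, so a brace inside either would be miscounted.

```javascript
// Hypothetical helper: reports unbalanced braces with 1-based line/column.
// It only checks brace balance; it knows nothing about selectors or properties.
function findBraceErrors(source) {
  const open = [];    // stack of positions of "{" that are still unmatched
  const errors = [];
  let line = 1;
  let column = 0;
  for (const ch of source) {
    if (ch === "\n") {
      line += 1;
      column = 0;
      continue;
    }
    column += 1;
    if (ch === "{") {
      open.push({ line, column });
    } else if (ch === "}") {
      if (open.length === 0) {
        errors.push({ kind: "unexpected }", line, column });
      } else {
        open.pop();
      }
    }
  }
  // Anything left on the stack was opened but never closed.
  for (const pos of open) {
    errors.push({ kind: "unclosed {", line: pos.line, column: pos.column });
  }
  return errors;
}

// Reports one unclosed "{" at line 1, column 8.
console.log(findBraceErrors(".circle{\nbackground:red;\n"));
```

Note that with nesting allowed, a missing } is attributed to the earliest block still open at the end of the file, which may not be where a human would say the mistake is.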

Thanks

    A regex works on the assumption that the input is valid. A regex can reject erroneous input by failing to match it, but it would not be able to identify the precise error. Commented Mar 7, 2016 at 6:16
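To illustrate the point of this comment, here is a small sketch. The pattern is an anchored variant of the regex from the question (the ^ and $ anchors and the helper name looksLikeRule are assumptions added for the example). A failed match only says "no", and a successful match does not mean the rule is valid.

```javascript
// Anchored variant of the question's pattern (anchors are an assumption).
const rulePattern = /^[.#a-z][a-z0-9-]*\{.*\}$/i;

// Hypothetical helper: true if the text has the rough shape "selector{...}".
function looksLikeRule(text) {
  return rulePattern.test(text);
}

console.log(looksLikeRule(".circle{background:red;}"));    // true
console.log(looksLikeRule("#small-circle{color:yellow;")); // false, but with no hint about why
console.log(looksLikeRule(".circle{background red}"));     // true, although the colon is missing
```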

1 Answer

Take the simple approach... use a LESS parser rather than trying to roll one yourself.

Regex is useful for matching well-known text formats. It's terrible for matching input whose structure is not known in advance, and it cannot diagnose arbitrary malformed input. You have noticed one particular error and want to make your ad-hoc parser roll with it, but what about other possible errors? Are you going to try and catch all of them?

Too many brackets

.circle{{
    background:red;
}

or

.circle{
    background:red;
}
}

Forgotten semi-colons

.circle{
    background:red
    color: yellow
}

Forgotten colons

.circle{
    background red
}

Mixed-up programmers

.circle{
    background=red
}

or

.circle(
    background:red
)

The list of possible errors is nearly endless, and your regex is never going to be smart enough to catch them all. Use a proper LESS parser (possibly the client-side version with error reporting turned on).
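As a rough sketch of what delegating to a parser could look like in Node.js: the helper below assumes the less npm package is installed (npm install less) and is passed in as the first argument. The helper name checkLess and its return shape are made up for this example. To my knowledge less.render returns a promise when called without a callback, and its error objects carry message, line and column, but check the documentation of the version you install.

```javascript
// Hypothetical helper: lets the LESS parser validate the source and
// turns a parse failure into a plain object that can be shown to the user.
// `less` is expected to be the module returned by require("less").
async function checkLess(less, source, filename) {
  try {
    const output = await less.render(source, { filename: filename });
    return { ok: true, css: output.css };
  } catch (err) {
    // line and column are assumed to be set by the parser on its errors.
    return { ok: false, message: err.message, line: err.line, column: err.column };
  }
}

// Usage (requires the less package, so it is left commented out here):
// const less = require("less");
// checkLess(less, ".circle{\nbackground:red;\n", "style.less")
//   .then(function (result) { console.log(result); });
```

Passing the module in as a parameter is only a convenience for this sketch; requiring it at the top of the file works just as well.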
