Intro:
I'm fairly new to RegEx, so bear with me. We have a client with an extremely large CSS file, verging on 27k lines in total: roughly 20k lines are pure CSS and the rest is written in SCSS. I am attempting to cut this down and, despite using more than my allotted hours on it, I found the problem extremely interesting, so I wrote a little PHP script to do it for me! Unfortunately it's not quite there, because the RegEx is a little troublesome.
Context
remove.txt - A text file listing, one per line, the selectors that are redundant on our site and can be removed.
main.scss - The big SCSS file.
PHP script - Reads remove.txt line by line, finds each selector in main.scss and adds an "UNUSED" string before it, so I can then go through the file and remove each flagged rule.
Issue
The main reason this is troublesome is that we have to account for the selector occurring at the start of a CSS rule, towards the end of it, or alongside other selectors.
Example scenarios for .foo-bar (bold indicates what should match):
.foo-bar {}
.foo-bar, .bar-foo {}
.foo-bar .bar-foo {}
.boo-far, .foo-bar {}
.foo-bar,.bar-foo {}
.bar-foo.foo-bar {}
PHP Script
<?php
$unused = 'main.scss';
if ($file = fopen("remove.txt", "r")) {
    // Stop an endless loop if file doesn't exist
    if (!$file) {
        die('plz no loops');
    }
    // Begin looping through redundant selectors line by line
    while (!feof($file)) {
        $line = trim(fgets($file));
        // Append the rule-body part of the regex to the selector
        $line = $line . '([\s\S][^}]*})';
        // Add the delimiters, the start-of-line anchor and the multiline flag
        $line = '/^' . $line . '/m';
        // Echo the output for reference and debugging
        echo ('<p>' . $line . '</p>');
        // Find the rule and prefix it with UNUSED
        $dothings = preg_replace($line, 'UNUSED $0', file_get_contents($unused), 1);
    }
    fclose($file);
} else {
    echo ('<p>failed</p>');
}
?>
RegEx
From the above you can gather that my RegEx ends up as:
/^REDUNDANTRULE([\s\S][^}]*})/m
It currently has a hard time dealing with the indentation that typically occurs within media queries, and also with cases where other selectors are applied to the same rule.
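To make those failure modes concrete, here is a small sketch that builds the pattern the same way the script does (including the unescaped leading dot) and runs it against a few rules. The helper name `matchesOriginal` and the sample rules are made up for illustration; they are not taken from main.scss.

```php
<?php
// Hypothetical helper: builds the pattern the same way the script above does
// and reports whether it matches anywhere in $css.
function matchesOriginal(string $selector, string $css): bool
{
    $pattern = '/^' . $selector . '([\s\S][^}]*})/m';
    return preg_match($pattern, $css) === 1;
}

// Illustrative sample rules (assumptions, not taken from main.scss).
$samples = [
    '.foo-bar {}',                       // plain rule
    "@media print {\n  .foo-bar {}\n}",  // indented inside a media query
    '.boo-far, .foo-bar {}',             // another selector comes first
    '.foo-bar-baz {}',                   // a longer, different selector
];

foreach ($samples as $css) {
    echo var_export(matchesOriginal('.foo-bar', $css), true), "\t", json_encode($css), "\n";
}
```

With these samples, only the first and last report true: the indented rule and the rule with a selector in front are missed because of the `^` anchor, while `.foo-bar-baz` is a false positive because `[\s\S]` accepts any character after the selector.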
To address this, I tried adding the following to the start of the pattern (to allow for leading whitespace, and for the selector appearing as part of a longer selector):
^[0a-zA-Z\s]
I also tried adding this to the end (to allow for commas separating selectors):
\,
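One detail worth checking about the first attempt: a character class such as `[0a-zA-Z\s]` consumes exactly one character, so it cannot absorb indentation of arbitrary width, and it also prevents a match when nothing precedes the selector. The sketch below uses made-up sample lines and an escaped `\.foo-bar`, and compares the class with `\s*` (zero or more whitespace characters), which is only an assumption about what the whitespace case needs.

```php
<?php
// Illustrative lines only (assumptions): no indent, one space, four spaces.
$lines = ['.foo-bar {}', ' .foo-bar {}', '    .foo-bar {}'];

$oneChar = '/^[0a-zA-Z\s]\.foo-bar/m'; // class matches exactly one character
$anyWs   = '/^\s*\.foo-bar/m';         // zero or more whitespace characters

foreach ($lines as $line) {
    printf(
        "%-18s one-char class: %d, \\s*: %d\n",
        json_encode($line),
        preg_match($oneChar, $line),
        preg_match($anyWs, $line)
    );
}
```

The one-character class only matches the line with exactly one leading space, whereas `\s*` matches all three.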
Could any RegEx/PHP wizards point me in the right direction? Thank you for reading regardless!
Thanks @ctwheels for the fantastically explained answer. I encountered a couple of other issues, one being that full stops in the redundant selectors I read in were not being escaped. I've now updated my script to escape them before doing the find and replace, as seen below. This is my most up-to-date, working script:
<?php
$unused = 'main.scss';
if ($file = fopen("remove.txt", "r")) {
    if (!$file) {
        die('plz no loops');
    }
    while (!feof($file)) {
        $line = trim(fgets($file));
        // Escape full stops so they match literally rather than "any character"
        if (strpos($line, '.') !== false) {
            echo ". found in $line, escaping characters";
            $line = str_replace('.', '\.', $line);
        }
        // Match the selector at the start of a line or after a comma,
        // and only when it is followed by a comma or an opening brace
        $line = '/(?:^|,\s*)\K(' . $line . ')(?=\s*(?:,|{))/m';
        echo ('<p>' . $line . '</p>');
        var_dump(preg_match_all($line, file_get_contents($unused)));
        $dothings = preg_replace($line, 'UNUSED $0', file_get_contents($unused), 1);
        // Write the updated stylesheet back to disk
        var_dump(file_put_contents($unused, $dothings));
    }
    fclose($file);
} else {
    echo ('<p>failed</p>');
}
?>
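For anyone adapting this, a possible variation: `preg_quote()` escapes every regex metacharacter in the selector (not only full stops), and working on the stylesheet as a single string avoids re-reading main.scss for every selector. The sketch below keeps the same lookaround pattern as the script above. The function name `markUnused` and the sample input are hypothetical, and it deliberately drops the replacement limit of 1 so that every occurrence is flagged, which is an assumption about the intent.

```php
<?php
// Hypothetical helper: prefixes each listed selector with "UNUSED " wherever it
// appears as a complete entry in a selector list.
function markUnused(string $scss, array $selectors): string
{
    foreach ($selectors as $selector) {
        $selector = trim($selector);
        if ($selector === '') {
            continue; // skip blank lines, which would otherwise build an empty pattern
        }
        // preg_quote escapes ".", "[", "*", "+" and other metacharacters;
        // the second argument also escapes the "/" delimiter.
        $quoted  = preg_quote($selector, '/');
        $pattern = '/(?:^|,\s*)\K(' . $quoted . ')(?=\s*(?:,|{))/m';
        $scss    = preg_replace($pattern, 'UNUSED $0', $scss);
    }
    return $scss;
}

// Sample input built from some of the example scenarios in the question.
$scss = ".foo-bar {}\n.boo-far, .foo-bar {}\n.foo-bar .bar-foo {}\n.bar-foo.foo-bar {}\n";
echo markUnused($scss, ['.foo-bar']);
```

On this sample the first two rules are flagged and the descendant and compound selectors are left alone. In a real run the input would come from a single `file_get_contents('main.scss')` call and the result would be written back once with `file_put_contents()`.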