0

For a project I have to extract the RGB values from a file, where they are defined as follows:

#71=IFCCOLOURRGB($,0.75,0.73,0.6800000000000001);
#98=IFCCOLOURRGB($,0.26,0.22,0.18);

I want to return the RGB data and write it to a new file like this:

0.75 0.73 0.68

0.26 0.22 0.18

So far I've created this for loop:

import re 

IfcFile = open('IfcOpenHouse.ifc', 'r')

IfcColourRGB = re.compile('ifccolourrgb', re.IGNORECASE)


for rad_rgb_data in IfcFile:
    if re.search(IfcColourRGB, rad_rgb_data):
        print(IfcColourRGB.sub('', rad_rgb_data))

This returns:

#71=($,0.75,0.73,0.6800000000000001);

#98=($,0.26,0.22,0.18);

I am quite new to programming and want to know whether I've chosen the right approach for my task. I've been reading about regular expressions, but I don't fully understand how to get rid of all the #=(,: characters, or how to specify exactly which numbers should be returned and which should not. Is it possible to define each regular expression explicitly and individually, and then combine them in one for loop, so that they are easier to understand?

2
  • You might not need regex at all; you could just split the string on the ",". Commented Jan 8, 2015 at 17:27
  • Agreed, but regex has the advantage of being less tied to the exact format of the string. You can change the way the data is stored without having to change this part of the code (the reading part). Commented Jan 8, 2015 at 17:31
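
The split-based idea from the first comment could look roughly like the sketch below. The helper name `parse_rgb_by_split` is hypothetical, and the code assumes that the first field inside the parentheses (the `$` in the sample lines) never contains a comma.

```python
def parse_rgb_by_split(line):
    """Return (red, green, blue) as floats, or None if the line is not a colour."""
    if "IFCCOLOURRGB" not in line.upper():
        return None
    # Keep only the text between the parentheses, e.g. "$,0.75,0.73,0.68"
    inner = line[line.index("(") + 1 : line.rindex(")")]
    # Assumption: the first field is the name ("$" in the sample), then R, G, B
    _name, red, green, blue = inner.split(",")
    return float(red), float(green), float(blue)


sample = "#98=IFCCOLOURRGB($,0.26,0.22,0.18);"
print(parse_rgb_by_split(sample))
```

This avoids regular expressions entirely, but as the second comment points out, it depends more heavily on the exact layout of each line.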

3 Answers 3

2

You can use re.findall() with a positive look-behind pattern, (?<=\$,), to grab the run of digits, dots, and commas that follows "$,". Then split each match on "," and convert the pieces to float:

>>> s="""#71=IFCCOLOURRGB($,0.75,0.73,0.6800000000000001);
... #98=IFCCOLOURRGB($,0.26,0.22,0.18);"""
>>> import re
>>> l=re.findall(r'(?<=\$,)[\d\.,]+',s)
>>> [list(map(float, i.split(','))) for i in l]  # list() is needed on Python 3, where map() is lazy
[[0.75, 0.73, 0.68], [0.26, 0.22, 0.18]]
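
Since the question asks whether the pieces of a pattern can be spelled out individually, here is a minimal sketch using `re.VERBOSE`, which lets each part of the expression sit on its own commented line. The names `NUMBER` and `COLOUR_LINE` are made up for this illustration, and the pattern is a different one from the look-behind above: it uses three capturing groups instead.

```python
import re

# Digits and dots, the same character class the other answers use
NUMBER = r"[0-9.]+"

COLOUR_LINE = re.compile(
    rf"""
    IFCCOLOURRGB      # the entity name
    \(                # a literal opening parenthesis
    [^,]*,            # the first field ("$" in the sample data) and its comma
    ({NUMBER}),       # group 1: red
    ({NUMBER}),       # group 2: green
    ({NUMBER})        # group 3: blue
    \)                # a literal closing parenthesis
    """,
    re.VERBOSE | re.IGNORECASE,
)

text = """#71=IFCCOLOURRGB($,0.75,0.73,0.6800000000000001);
#98=IFCCOLOURRGB($,0.26,0.22,0.18);"""

# One tuple of three floats per matching line
colours = [tuple(float(v) for v in m.groups()) for m in COLOUR_LINE.finditer(text)]
print(colours)
```

In verbose mode, whitespace and `#` comments inside the pattern are ignored, so the expression reads top to bottom in the same order as the text it matches.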

0

I think you are overthinking this :^) You can loop through the lines and perform this search on each.

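import re

# Raw string, so the backslashes reach the regex engine unchanged
Searcher = re.compile(r"IFCCOLOURRGB\(\$,([\d.]+),([\d.]+),([\d.]+)")

with open('IfcOpenHouse.ifc', 'r') as IfcFile:
    for Line in IfcFile:
        Result = Searcher.search(Line)
        if Result:
            # groups() is a tuple of the three captured strings
            print(Result.groups())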

If you are just writing the values back out to a file, you don't need to convert them to float afterwards, except to get rid of the trailing 00000001 by formatting each value to two decimal places.
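
As a rough sketch of that last step, the loop can format each captured value to two decimal places and write one colour per line. The function name `write_colours` is hypothetical, and an in-memory `io.StringIO` buffer stands in for the real output file so the example runs on its own.

```python
import io
import re

SEARCHER = re.compile(r"IFCCOLOURRGB\(\$,([\d.]+),([\d.]+),([\d.]+)", re.IGNORECASE)


def write_colours(lines, out):
    """Write 'R G B' (two decimals each) to out for every colour line; return the count."""
    count = 0
    for line in lines:
        result = SEARCHER.search(line)
        if result:
            # float() parses the text; the .2f format drops the trailing digits
            out.write(" ".join(f"{float(v):.2f}" for v in result.groups()) + "\n")
            count += 1
    return count


sample_lines = [
    "#71=IFCCOLOURRGB($,0.75,0.73,0.6800000000000001);",
    "#98=IFCCOLOURRGB($,0.26,0.22,0.18);",
]
buffer = io.StringIO()
write_colours(sample_lines, buffer)
print(buffer.getvalue())
```

With real files, the same function could be called with two handles from `open()`, for example the input opened for reading as `lines` and the output opened with mode `'w'` as `out`.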


0

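To extract the colors, use the following pattern. Note the \$, after the opening parenthesis: the sample lines start with a "$" field, and the pattern will not match without it.

IFCCOLOURRGB\(\$,(?P<Red>\.[0-9]{1,16}|[0-9]+(?:\.[0-9]{1,16})?),(?P<Green>\.[0-9]{1,16}|[0-9]+(?:\.[0-9]{1,16})?),(?P<Blue>\.[0-9]{1,16}|[0-9]+(?:\.[0-9]{1,16})?)\)

Capturing groups:

Red: value of red, Green: value of green, Blue: value of blue

import re

# Example input; in practice this would be one line read from the file
subject = "#71=IFCCOLOURRGB($,0.75,0.73,0.6800000000000001);"

match = re.search(r"IFCCOLOURRGB\(\$,(?P<Red>\.[0-9]{1,16}|[0-9]+(?:\.[0-9]{1,16})?),(?P<Green>\.[0-9]{1,16}|[0-9]+(?:\.[0-9]{1,16})?),(?P<Blue>\.[0-9]{1,16}|[0-9]+(?:\.[0-9]{1,16})?)\)", subject)
if match:
    result1 = match.group("Red")
    result2 = match.group("Green")
    result3 = match.group("Blue")
else:
    # No colour on this line: fall back to empty strings
    result1 = result2 = result3 = ""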
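
For a whole file rather than a single line, the named groups can be collected with `finditer()` and `groupdict()`. The sketch below uses a deliberately simplified number pattern (`[\d.]+`) and assumes the first field is always `$`, as in the sample data; the variable names are illustrative only.

```python
import re

COLOUR = re.compile(
    r"IFCCOLOURRGB\(\$,(?P<Red>[\d.]+),(?P<Green>[\d.]+),(?P<Blue>[\d.]+)\)"
)

subject = """#71=IFCCOLOURRGB($,0.75,0.73,0.6800000000000001);
#98=IFCCOLOURRGB($,0.26,0.22,0.18);"""

# One dict per colour, keyed by the group names Red, Green and Blue
colours = [
    {name: float(value) for name, value in m.groupdict().items()}
    for m in COLOUR.finditer(subject)
]
print(colours)
```

Named groups make the later code self-describing: `colour["Red"]` is harder to mix up than a positional index.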
