3

I'm studying an elementary course in programming in python and I have a question.

I have a string in python which includes math information about points, lines and segments.

The string contains a lot of information that has to be ignored. But whenever the words "point","line" or "segment" are found, I need to extract the coordinates from such elements and copy them into a list.

I need three different lists. One to store co-ords from points, a second one for lines and a third one for segments.

Since the string is too long, I'll only paste an example of every type of element (I have excluded non relevant parts of the string):

<element type="point" label="D">
    <show object="true" label="true"/>
    <objColor r="0" g="0" b="255" alpha="0.0"/>
    <layer val="0"/>
    <labelMode val="0"/>
    <animation step="1" speed="1" type="1" playing="false"/>
    <coords x="6.14" y="3.44" z="1.0"/>
    <pointSize val="3"/>
    <pointStyle val="0"/>
</element>
<element type="segment" label="a">
    <show object="true" label="true"/>
    <objColor r="153" g="51" b="0" alpha="0.0"/>
    <layer val="0"/>
    <labelMode val="0"/>
    <auxiliary val="false"/>
    <coords x="2.68" y="3.44" z="-12.0192"/>
    <lineStyle thickness="2" type="0" typeHidden="1"/>
    <outlyingIntersections val="false"/>
    <keepTypeOnTransform val="true"/>
</element>
<element type="line" label="b">
    <show object="true" label="true"/>
    <objColor r="0" g="0" b="0" alpha="0.0"/>
    <layer val="0"/>
    <labelMode val="0"/>
    <coords x="-1.3563823178988361" y="3.7135968534106922" z="-13.20117532177342"/>
    <lineStyle thickness="2" type="0" typeHidden="1"/>
    <eqnStyle style="implicit"/>
</element>

Is there any easy way to do what I want? Thanks in advance

3
  • 3
    You need to use xml parser like BeautifulSoup and then filter you document by class "type". Commented Sep 10, 2015 at 12:33
  • 1
    Take a look at built-in Python XML parser. It may be sufficient without need of installing external libs. Commented Sep 10, 2015 at 12:42
  • 1
    You shouldn't be thinking of this as a string at all. It's an XML document, you should use suitable tools that know how to parse those kinds of documents. Commented Sep 10, 2015 at 12:55

2 Answers 2

5

Have a look at BeautifulSoup. It allows you to get elements by their IDs or tags. It is very useful for basic XML parsing.
You can just call beutiful soup with the XML string and then you can call the BeautifulSoup methods

Sign up to request clarification or add additional context in comments.

Comments

-2

You can also make you own parser if what you want is really specific. You need lists, I organize it in dictionaries to make it general. The parser will check the element type, and the label to store it in the right dictionary.

#!/usr/bin/python

f = open("xml_to_parse")

#Init lists
coord_point = {}
coord_line = {}
coord_segment = {}

#Parsing file
for line in f:
#Define flag for element type
    if "type=\"point\"" in line:
        flag = "point"
        #Extract label of the element
        label = line.strip().split()[2].split("\"")[1]
    elif "type=\"line\"" in line:
        flag = "line"
        label = line.strip().split()[2].split("\"")[1]
    elif "type=\"segment\"" in line:
        flag = "segment"
        label = line.strip().split()[2].split("\"")[1]

    #Assign the coords
    if "<coords" in line:
        #line.strip() to remove space and return char. and .split to separate elements
        sp_line = line.strip().split()
        #split the item with x coord to extract the value between quotes
        x_value = float(sp_line[1].split("\"")[1])
        y_value = float(sp_line[2].split("\"")[1])
        z_value = float(sp_line[3].split("\"")[1])

        #Now we have coords, we have to assign it to the right element
        if flag == "point":
            coord_point[label] = [x_value, y_value, z_value]
        elif flag == "line":
            coord_line[label] = [x_value, y_value, z_value]
        elif flag == "segment":
            coord_segment[label] = [x_value, y_value, z_value]

f.close()

print coord_point
print coord_line
print coord_segment

The output:

{'D': [6.14, 3.44, 1.0]}
{'b': [-1.3563823178988361, 3.7135968534106922, -13.20117532177342]}
{'a': [2.68, 3.44, -12.0192]}

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.