I am trying to parse a fairly complex XML file and store its content in a DataFrame. I tried xml.etree.ElementTree and managed to retrieve some elements, but each one came back multiple times, as if the file contained more objects than it does. For each cell-line I want to extract the following: category, created, last_updated, accession type, the name with type "identifier", and the names with type "synonym" as a list. A sample of the file:
<cellosaurus>
<cell-line category="Hybridoma" created="2012-06-06" last_updated="2020-03-12" entry_version="6">
<accession-list>
<accession type="primary">CVCL_B375</accession>
</accession-list>
<name-list>
<name type="identifier">#490</name>
<name type="synonym">490</name>
<name type="synonym">Mab 7</name>
<name type="synonym">Mab7</name>
</name-list>
<comment-list>
<comment category="Monoclonal antibody target"> Cronartium ribicola antigens </comment>
<comment category="Monoclonal antibody isotype"> IgM, kappa </comment>
</comment-list>
<species-list>
<cv-term terminology="NCBI-Taxonomy" accession="10090">Mus musculus</cv-term>
</species-list>
<derived-from>
<cv-term terminology="Cellosaurus" accession="CVCL_4032">P3X63Ag8.653</cv-term>
</derived-from>
<reference-list>
<reference resource-internal-ref="Patent=US5616470"/>
</reference-list>
<xref-list>
<xref database="CLO" category="Ontologies" accession="CLO_0001018">
<url><![CDATA[https://www.ebi.ac.uk/ols/ontologies/clo/terms?iri=http://purl.obolibrary.org/obo/CLO_0001018]]></url>
</xref>
<xref database="ATCC" category="Cell line collections" accession="HB-12029">
<url><![CDATA[https://www.atcc.org/Products/All/HB-12029.aspx]]></url>
</xref>
<xref database="Wikidata" category="Other" accession="Q54422073">
<url><![CDATA[https://www.wikidata.org/wiki/Q54422073]]></url>
</xref>
</xref-list>
</cell-line>
</cellosaurus>
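
A minimal sketch of one way to do this with xml.etree.ElementTree, assuming the duplicates come from searching the whole tree inside a loop rather than searching relative to each `<cell-line>` (the original code is not shown, so this is a guess). The helper name `cell_line_rows`, the dictionary keys, and the trimmed `SAMPLE` string are illustrative, and "accession type" is read here as the text of the accession whose type is "primary".

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample above. For a file on disk,
# ET.parse(path).getroot() would give the same kind of root element.
SAMPLE = """<cellosaurus>
<cell-line category="Hybridoma" created="2012-06-06" last_updated="2020-03-12" entry_version="6">
<accession-list>
<accession type="primary">CVCL_B375</accession>
</accession-list>
<name-list>
<name type="identifier">#490</name>
<name type="synonym">490</name>
<name type="synonym">Mab 7</name>
<name type="synonym">Mab7</name>
</name-list>
</cell-line>
</cellosaurus>"""


def cell_line_rows(root):
    """Return one dict per <cell-line> element (hypothetical helper)."""
    rows = []
    # Visit each <cell-line> exactly once, then search only inside it
    # with relative paths, so nothing is collected twice.
    for cell in root.iter("cell-line"):
        synonyms = [
            name.text
            for name in cell.findall("name-list/name[@type='synonym']")
        ]
        rows.append({
            "category": cell.get("category"),
            "created": cell.get("created"),
            "last_updated": cell.get("last_updated"),
            # Assumption: the wanted value is the primary accession's text.
            "accession_primary": cell.findtext(
                "accession-list/accession[@type='primary']"
            ),
            "identifier": cell.findtext("name-list/name[@type='identifier']"),
            "synonyms": synonyms,
        })
    return rows


rows = cell_line_rows(ET.fromstring(SAMPLE))
print(rows)
```

Passing `rows` to `pandas.DataFrame(rows)` should then give one row per cell-line, with the synonyms kept as a list in a single column.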