I am trying to extract data from a SOAP file (XML format) which has many children.
xml_find_all (from the xml2 package) is a great function for getting data out of a complex structure. However, it simply skips nodes that are missing instead of returning a placeholder for them.
Here is a simple example:
Read a simple example document with two customers. The second customer is missing the name.
library(xml2)
x <- read_xml("<Customers> <Customer> <ID> 01 </ID> <Name> Bla </Name> </Customer> <Customer> <ID> 02 </ID> </Customer> </Customers>")
This finds both IDs:
xml_find_all(x, ".//ID")
[1] 01 [2] 02
But this finds only one name, with nothing to show that the second customer has none:
xml_find_all(x, ".//Name")
[1] Bla
How can I get an NA or something that can tell me which data is missing?
In the end, I want to build a data frame. Please keep in mind this is just a simple example. The real data has 4,000 "customers" and 100 attributes.
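To make the desired result concrete, here is a minimal sketch of one approach that may work. It assumes that xml2's xml_find_first(), when applied to the set of Customer nodes, returns one result per customer with a "missing" node where nothing matches, and that xml_text() turns such a missing node into NA. The helper name field is hypothetical and only there for illustration.

```r
library(xml2)

x <- read_xml("<Customers> <Customer> <ID> 01 </ID> <Name> Bla </Name> </Customer> <Customer> <ID> 02 </ID> </Customer> </Customers>")

# One node per customer; every lookup below is relative to these nodes,
# so the results stay aligned row by row.
customers <- xml_find_all(x, ".//Customer")

# Hypothetical helper: text of the first matching child of each customer,
# NA where the child is absent.
field <- function(nodes, name) {
  xml_text(xml_find_first(nodes, paste0("./", name)), trim = TRUE)
}

df <- data.frame(
  ID   = field(customers, "ID"),
  Name = field(customers, "Name"),
  stringsAsFactors = FALSE
)
```

With many fields, the same helper could probably be applied over a character vector of field names (for example with lapply) rather than writing each column by hand. I have not tried this at the scale of the real data.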