I'm having a tough time getting the value of a particular node. I'm pulling my XML data from the URL http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=25092882,25260646,25242549&retmode=xml and I'm using the following code:

function loadDoc() {
var xhttp = new XMLHttpRequest();
xhttp.onreadystatechange = function() {
    if (xhttp.readyState == 4 && xhttp.status == 200) {
        myFunction(xhttp);
    }
};
xhttp.open("GET",
    "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=25092882,25260646,25242549&retmode=xml",
    true);
xhttp.send();
}

function myFunction(xml) {
var txt = '';
var i;
var affiliation;
var aff;
var pmcid = '';
var xmlDoc = xml.responseXML;
var x = xmlDoc.getElementsByTagName("PubmedArticle");
var authors = "";
for (i = 0; i < x.length; i++) {
    var pmid = 'PMID: ' + x[i].getElementsByTagName("PMID")[0].childNodes[
        0].nodeValue;
    txt += pmid + "</br> ";
    var author = x[i].getElementsByTagName("Author");
    for (var a = 0; a < author.length; a++) {
        authors += author[a].getElementsByTagName("LastName")[0].childNodes[
            0].nodeValue;
        authors += " ";
        authors += author[a].getElementsByTagName("Initials")[0].childNodes[
            0].nodeValue;
        affiliation = author[a].getElementsByTagName("AffiliationInfo");
        authors += " Author Affiliation: " + affiliation[0].getElementsByTagName(
            "Affiliation")[0].childNodes[0].nodeValue;
        authors += " " + "</br> ";
    }
    txt += authors + " ";
    var articleTitle = 'Article Title: ' + x[i].getElementsByTagName(
        "ArticleTitle")[0].childNodes[0].nodeValue;
    txt += articleTitle + "</br> ";
    var journal = 'Journal Title: ' + x[i].getElementsByTagName("Title")[
        0].childNodes[0].nodeValue;
    txt += journal + "</br> ";
    var yearPub = 'Date Published: ';
    txt += yearPub + "</br> "
    var AbstractText = 'Abstract Text: ' + x[i].getElementsByTagName(
        "AbstractText")[0].childNodes[0].nodeValue;
    txt += AbstractText + "</br> ";
    txt += "PMCID: " + pmcid + "</br> "
    txt += "</br> "
}
document.getElementById("demo").innerHTML += txt;

}

The line I'm having trouble with is the affiliation. The value is inside the Author node that I'm looping over: each Author contains an AffiliationInfo node, which in turn contains Affiliation. If I take the affiliation lookup out, the function runs fine, but I need to get the Affiliation values.

Thanks ahead of time.

2 Answers

Not all Author nodes have an AffiliationInfo node. When it is missing, affiliation[0] is undefined and the call on it throws, so you need to check that the node exists before reading it.

   affiliation = author[a].getElementsByTagName("AffiliationInfo");
   // Only read the value when this Author actually has an AffiliationInfo node
   if (affiliation.length > 0) {
       authors += " Author Affiliation: " + affiliation[0].getElementsByTagName("Affiliation")[0].childNodes[0].nodeValue;
       authors += " " + "</br> ";
   }
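
The same existence check is needed for every optional element, so it may be worth factoring it into a small helper. The sketch below is only an illustration: getText is a hypothetical name that does not appear in the question, and the fallback value is an assumption about what you want to display when a node is missing.

```javascript
// Hypothetical helper (not from the original code): returns the text of the
// first descendant element named tagName, or fallback when that element is
// missing or has no child nodes.
function getText(parent, tagName, fallback) {
    var nodes = parent.getElementsByTagName(tagName);
    if (nodes.length === 0 || nodes[0].childNodes.length === 0) {
        return fallback;
    }
    return nodes[0].childNodes[0].nodeValue;
}

// Possible use inside the question's author loop (author[a] comes from the question):
// authors += " Author Affiliation: " + getText(author[a], "Affiliation", "n/a");
```

With a helper like this, a missing node yields the fallback text instead of stopping the whole function.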

The same applies to Month: store the result of the Month lookup in a variable and check its length before reading the value.

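var pubDate = x[i].getElementsByTagName("PubDate");
var pubMonth = pubDate[0].getElementsByTagName("Month");
if (pubMonth.length > 0) {
    // Month exists for this record, so it is safe to read its value
    txt += pubMonth[0].childNodes[0].nodeValue;
}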

If you really wanted to get serious with XML parsing, I would suggest using XPath. There is a lot of extra code you're writing just to get node values and traverse the tree.

https://developer.mozilla.org/en/docs/Web/API/Document/evaluate
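
As a rough sketch of what the XPath approach could look like, assuming a browser where the XML document supports evaluate: the function name xpathValues and the example expression are illustrative assumptions, not part of the question. The numeric result type 7 is used in place of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE so the function does not depend on that global.

```javascript
// Sketch only: collects the text of every node matching an XPath expression,
// evaluated relative to contextNode. xpathValues is a hypothetical name.
function xpathValues(doc, contextNode, expression) {
    // 7 is the numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE
    var result = doc.evaluate(expression, contextNode, null, 7, null);
    var values = [];
    for (var i = 0; i < result.snapshotLength; i++) {
        values.push(result.snapshotItem(i).textContent);
    }
    return values;
}

// Possible use inside the question's loop (xmlDoc and x[i] come from the question):
// var affiliations = xpathValues(xmlDoc, x[i], ".//Author/AffiliationInfo/Affiliation");
```

When no node matches, the snapshot is empty and the function returns an empty array, so authors without an affiliation need no separate existence check.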

If you don't mind pulling in a library that removes a lot of the headache, jQuery does it nicely.

https://api.jquery.com/jQuery.parseXML/


4 Comments

That works great. How about the Month in the PubDate section? I can get the Year but not the Month: v = x[i].getElementsByTagName("PubDate"); y += v[0].getElementsByTagName("Year")[0].childNodes[0].nodeValue; Sorry, I don't know how to add a code block in the comments.
I got it: some of the PubDate records don't have a Month, so the code does not run. If I only process records that have both Year and Month, it works fine. How do I test whether Month exists?
Thanks, I got everything I need. Now I need to work on speed. I'd like to pull 100 publications and parse them at once. I'll use the plain JavaScript XML processing as my initial benchmark, then try jQuery and see which is faster. Do you have any recommendations as to what will produce the fastest results?
I wouldn't use getElementsByTagName without first drilling down the XML tree. Be as specific as possible so the code isn't searching through loads of data. At the moment you're searching from <PubmedArticle> for items that are nested 5-6 levels below that node.

Here's a little example with JsJaxy (https://github.com/riversun/JsJaxy). It makes it easy to parse an XML document and convert it into a JavaScript object.

var xmlParser = new org.riversun.jsjx.XmlParser();

var xhr = new XMLHttpRequest();
xmlParser.addArrayElementName('PubmedArticleSet.PubmedArticle');
xmlParser.addArrayElementName('PubmedArticleSet.PubmedArticle.CommentsCorrectionsList.CommentsCorrections');


xhr.open('GET', 'http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=25092882,25260646,25242549&retmode=xml', true);
xhr.onreadystatechange = function () {

    if (xhr.readyState == 4) {

        if (xhr.status == 200) {
            var doc = xhr.responseXML;

            //do parse
            var root = xmlParser.parseDocument(doc);

            //show element
            console.log(root.PubmedArticleSet.PubmedArticle[0].MedlineCitation.CommentsCorrectionsList.CommentsCorrections[0].RefSource);
        }
    }

};
xhr.send(null);
