
I have a problem with the XmlService.parse function in Google Apps Script. I am writing a script that needs to parse the emails in my inbox. I sent myself several test emails whose bodies have this format:

<div dir="ltr">test 1<div><br></div></div>

but if I use this line

var doc = XmlService.parse(messages[j].getBody());

I get this error

Error on line 1: The element type "br" must be terminated by the matching end-tag "</br>". (line 18, file "Code")

This is understandable, because the message contains only an unclosed <br> tag. Is there a way to solve this problem, or do I have to parse the message some other way? Thank you in advance.

Edit: I have the same problem with the img tag:

Error Occured: Error on line 38: The element type "img" must be terminated by the matching end-tag "</img>".

I need to parse the text inside the red frame (screenshot: email to parse).

The old script used this function:

Xml.parse(messag.getBody(),true)

However, this function is deprecated. I tried to use

XmlService.parse(messages.getBody());

as mentioned above, but I get errors on unpaired HTML tags. The message returned by .getBody() is shown here (screenshot: getbody email).

Could someone help me? Thanks once more.

  • I'm not sure that the XML Service is the tool you want to be using. The message body is going to be HTML (and not always well-formed). Commented Jun 2, 2017 at 10:46
  • Are you just trying to get at the message text? If so, you can request the plain message text. Commented Jun 3, 2017 at 3:54
  • XmlService can not parse HTML. It can only parse Canonical XML. Commented Jun 5, 2017 at 14:14
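As a minimal sketch of the plain-text suggestion in the comments above: if only the message text is needed, the HTML body can be skipped entirely. `getPlainBody()` is the GmailMessage method that returns the body without HTML formatting; the helper name `collectPlainBodies` is hypothetical and not part of the original script.

```javascript
// Hypothetical helper (not from the original script): returns the plain-text
// body of each message, so no XML or HTML parsing is needed at all.
function collectPlainBodies(messages) {
  var bodies = [];
  for (var j = 0; j < messages.length; j++) {
    // getPlainBody() returns the message body without HTML formatting.
    bodies.push(messages[j].getPlainBody().trim());
  }
  return bodies;
}
```

In Apps Script, `messages` could come from something like `GmailApp.getInboxThreads()[0].getMessages()`. This only helps when the markup itself carries no information you need; if the structure matters, an HTML parser is still required.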

1 Answer


XmlService cannot parse HTML; it can only parse Canonical XML. However, there are HTML parsing libraries for Node.js. You can take one of those modules, run it through browserify, make a minor modification to the generated source, and get an Apps Script library that parses HTML.

https://github.com/fb55/htmlparser2

My generated library:

1TLbGgQBCztnB0lOhcTYKg2UpXtpdDwocvfcx44w1tqFnHDJC5ZXy_BDo
https://github.com/Spencer-Easton/Apps-Script-htmlparser2-library

Example code modified from htmlparser2 readme:

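function myFunction() {
  // init() exposes the bundled htmlparser2 module from the Apps Script library.
  var htmlparser = htmlparser2.init();
  // The parser is event-driven: it calls these handlers as it reads the markup.
  var parser = new htmlparser.Parser({
    // Called for every opening tag, including unclosed ones such as <br>.
    onopentag: function(name, attribs){
      if(name === "div"){
        Logger.log("found div");
      }
    },
    // Called for the text between tags.
    ontext: function(text){
      Logger.log("-->" + text);
    },
    // Called for every closing tag.
    onclosetag: function(tagname){
      if(tagname === "div"){
        Logger.log("End Div");
      }
    }
  }, {decodeEntities: true});
  // Feed the HTML from the question to the parser, then signal the end of input.
  parser.write('<div dir="ltr">test 1<div><br></div></div>');
  parser.end();
}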
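The handlers above only log what they see. To actually extract the message text, one option is to collect the text nodes into an array instead. The following is a minimal sketch under that assumption; `makeTextCollector` is a hypothetical helper name, and only the `ontext` handler from the example above is used.

```javascript
// Hypothetical helper: builds an htmlparser2-style handler object that
// collects non-empty text nodes into an array instead of logging them.
function makeTextCollector() {
  var texts = [];
  var handlers = {
    ontext: function (text) {
      var trimmed = text.trim();
      // Skip whitespace-only text between tags.
      if (trimmed !== "") {
        texts.push(trimmed);
      }
    }
  };
  return { handlers: handlers, texts: texts };
}

// Intended use inside Apps Script, assuming the library is set up as above:
//   var collector = makeTextCollector();
//   var parser = new htmlparser.Parser(collector.handlers, {decodeEntities: true});
//   parser.write(messages[j].getBody());
//   parser.end();
//   // collector.texts now holds the collected text pieces
```

Keeping the collected state in a closure means each message can get its own collector, so text from different emails does not get mixed together.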

2 Comments

Thanks. I am new to Google Apps Script and Node.js, so it took me quite some time to find your answer, which is what I need. Unlike the document.getElementById() I can use in a browser, htmlparser2 is not that intuitive, but it does work for me.
By the way, is there any good resource on how to wrap a node.js library so that I can use it in GAS?
