2

I am creating a 'generic' web scraper that would scrape any page containing a list of entries. I would like the config to drive which tags it extracts.

For example, with the following config:

{ 
    name : "price",
    valueJQueryExpression : ".mt9 > .mt7.b"
},

... I parse the page the following way:

const $ = require('cheerio');
let jquery = getQuery("price");
// pass the page HTML as the context, as in the second example
let keys = $(jquery, html);

However, I have trickier expressions to handle, e.g. this one:

let location = $('.mt9 > .b', html).not('.mt5').not('.mt7').text().trim()

In such a case I thought of using eval() and passing the full expression in the config. However, this is not recommended because of security issues.

Would you have any recommendation on handling this differently?

1
  • Try using XPath instead of a CSS selector; you won't have to chain jQuery functions. Commented Aug 30, 2019 at 6:25

2 Answers

3

You should be able to use the :not() pseudo-class here. Try the following:

$('.mt9 > .b:not(.mt5):not(.mt7)', html).text().trim()

It works the same way as in jQuery: elements matching the selector inside :not() are excluded from the result.

You can see it in action below:

.mt9 > .b:not(.mt5):not(.mt7) {
  color: red;
}

<div class="mt9">
  <div class="b">This should be red</div>
  <div class="b mt7">This should not be red</div>
  <div class="b mt5">This should not be red</div>
</div>
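
Applied to the config-driven setup from the question, this means the whole expression fits into the selector string, so eval() is not needed. Below is a minimal sketch of that idea, not a definitive implementation. The "location" config entry, the extractFields helper, and the stand-in selector function with its sample values are all assumptions made up for illustration; with cheerio you would pass something like (sel) => $(sel, html) instead of the stand-in.

```javascript
// Config entries in the shape used in the question.
// The "location" entry is hypothetical, added for illustration.
const fields = [
  { name: 'price', valueJQueryExpression: '.mt9 > .mt7.b' },
  { name: 'location', valueJQueryExpression: '.mt9 > .b:not(.mt5):not(.mt7)' },
];

// `select` is any function that maps a selector string to an object
// exposing text(). With cheerio this could be: (sel) => $(sel, html)
function extractFields(select, fieldConfigs) {
  const result = {};
  for (const { name, valueJQueryExpression } of fieldConfigs) {
    // The selector string is only ever used as a selector, never executed as code.
    result[name] = select(valueJQueryExpression).text().trim();
  }
  return result;
}

// Stand-in for a parsed page so the sketch runs without cheerio installed.
// The values are invented sample data.
const fakePage = new Map([
  ['.mt9 > .mt7.b', ' 100 EUR '],
  ['.mt9 > .b:not(.mt5):not(.mt7)', ' Paris '],
]);
const fakeSelect = (sel) => ({ text: () => fakePage.get(sel) || '' });

const extracted = extractFields(fakeSelect, fields);
console.log(extracted);
```

Because every field is described by a plain selector string, adding a new field only requires a new config entry.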

0

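// Browser-only: runs the string by injecting it as an inline <script> element.
// The string executes as arbitrary code, so the security concerns of eval() still apply.
var command = 'console.log("Hello")';
var s = document.createElement("script");
s.textContent = command;
document.head.appendChild(s);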
