0

I need a web crawler that makes requests and returns complete responses as quickly as possible.

I come from Java. I used two "frameworks" there, and neither fully met my needs.

Jsoup was fast at request/response but returned incomplete data when the page had a lot of information. Apache HttpClient was exactly the opposite: reliable data but very slow.

I've looked over some Python modules and I'm testing Scrapy. In my searches, I was unable to conclude whether it is the fastest option and returns data consistently, or whether there is something better, even if it is more verbose or difficult to use.

Second, is Python a good language for this purpose?

Thank you in advance.

2 Answers

5

+1 for Scrapy. For the past several weeks I have been writing crawlers for massive car forums, and Scrapy is absolutely incredible: fast and reliable.

0

Looking for something to "do requests and bring the responses complete and quickly" makes no sense.

A. Any HTTP library will give you the complete headers/body the server responds with.

B. How "quickly" a web request completes is generally dictated by your network connection and the server's response time, not the client you are using.

So with those requirements, anything will do.
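If each request is mostly waiting on the network and the server, the usual way to make a crawl faster overall is to have several requests in flight at once rather than to switch clients. Below is a minimal sketch of that idea using only the standard library. The names `fetch` and `fetch_all`, the timeout, and the worker count are my own assumptions for illustration, not something from the question.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download one URL and return the full response body as bytes."""
    with urlopen(url, timeout=timeout) as resp:
        # read() with no argument reads until the end of the body.
        return resp.read()


def fetch_all(urls, fetcher=fetch, max_workers=8):
    """Fetch many URLs in parallel threads; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() overlaps the network waits of up to max_workers requests.
        return list(pool.map(fetcher, urls))
```

Calling `fetch_all(list_of_urls)` returns one body per URL. The `fetcher` parameter is only there so a different download function can be swapped in. Keep `max_workers` modest so you do not hammer the target server.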

Check out the requests package. It is an excellent HTTP client library for Python.
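As a rough starting point, here is a small sketch of how requests might be set up for crawling: one shared `Session` (so connections are reused) with automatic retries on transient server errors. The helper names `make_session` and `get_page`, the retry count, the backoff factor, and the status codes are assumptions chosen for illustration; tune them for your target site.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_session(retries=3):
    """Build a Session that reuses connections and retries transient errors."""
    session = requests.Session()
    retry = Retry(
        total=retries,
        backoff_factor=0.5,                # wait longer between successive retries
        status_forcelist=[502, 503, 504],  # retry on these server-side errors
    )
    adapter = HTTPAdapter(max_retries=retry)
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


def get_page(session, url):
    """Return the decoded body of url, raising on 4xx/5xx responses."""
    resp = session.get(url, timeout=10)
    resp.raise_for_status()
    return resp.text
```

Typical use would be `session = make_session()` once, then `get_page(session, url)` for every page, so the underlying connections are shared between requests.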

2 Comments

Thanks for the reply. To be practical: the fact is that one library is significantly faster than the other. One may, in its internal implementation, prioritize data consistency over a quick return. What I need to know is whether there is one that strikes a good balance between the two. I am interested in your link; could you post it again, please?
Even if this was not the one, I liked this link. I'm reading it now, thanks @furas
