
I want to help a friend analyze posts on social networks (Facebook, Twitter, LinkedIn, etc.) as well as several blogs and websites.

I have several questions, which I have tried to categorize:

For scraping the data, my plan is to collect social media posts through the platforms' APIs, and to collect website content through RSS feeds or by crawling with the Scrapy library. Is Scrapy efficient enough to give good results quickly and with low resource usage?
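To make the RSS part of this plan concrete, here is a minimal sketch that uses only the Python standard library. The feed content, the `parse_rss_items` helper, and the field names are illustrative assumptions, not taken from any particular site; a real feed would be downloaded first and may use Atom or namespaced elements that this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed, inlined so the example runs without network access.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>First post</title>
      <link>https://example.com/first</link>
      <pubDate>Mon, 01 Jan 2024 00:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.com/second</link>
    </item>
  </channel>
</rss>"""


def parse_rss_items(xml_text):
    """Return one dict per <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            # findtext returns the default when the child element is missing.
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
        })
    return items


if __name__ == "__main__":
    for entry in parse_rss_items(SAMPLE_FEED):
        print(entry["title"], entry["link"])
```

When a site publishes a feed, parsing it like this is usually much lighter than crawling its pages, so a crawler such as Scrapy is only needed for sites without one.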

1 Answer

Technically, Scrapy should do the job just fine, as long as you code it correctly and work out the paths you need, either from the APIs or by analyzing the sites' source code.

Be aware, though, that using "automated means" to crawl or scrape data from these sites is generally a breach of their terms of use unless you have permission (Twitter is fairly lax about this, though). That means that if they see a burst of requests coming from your IP address and suspect you are either (a) running a bot or (b) performing a DoS attack, they will block you quickly, and in the worst case you could have law enforcement knocking on (or down) your door.
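One way to lower the risk of being blocked is to check each site's `robots.txt` and space out requests before fetching anything. The sketch below uses the standard library's `urllib.robotparser`; the rules in `SAMPLE_ROBOTS`, the `research-bot` user agent, and the `build_parser` and `fetch_plan` helpers are hypothetical names made up for illustration. Following `robots.txt` is a courtesy and does not replace a site's terms of use.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the example runs offline.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def build_parser(robots_text):
    """Build a RobotFileParser from robots.txt text instead of a live URL."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def fetch_plan(parser, user_agent, urls, default_delay=1.0):
    """Return (allowed_urls, delay_seconds) for the given user agent."""
    delay = parser.crawl_delay(user_agent)
    if delay is None:
        # The site did not declare a Crawl-delay; fall back to our own pause.
        delay = default_delay
    allowed = [url for url in urls if parser.can_fetch(user_agent, url)]
    return allowed, float(delay)


if __name__ == "__main__":
    parser = build_parser(SAMPLE_ROBOTS)
    urls = [
        "https://example.com/blog/post",
        "https://example.com/private/data",
    ]
    allowed, delay = fetch_plan(parser, "research-bot", urls)
    print(allowed, delay)
```

In a Scrapy project, the `ROBOTSTXT_OBEY` and `DOWNLOAD_DELAY` settings serve a similar purpose, so a check like this would not need to be written by hand.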

Many of these sites do offer ways to request permission, but I doubt they grant it to just anybody.
