Web scraping

What is web scraping?

In my experience, people often confuse scraping with using an API (application programming interface). Scraping means literally extracting data from an HTML page, typically with a parsing library such as Beautiful Soup or with browser automation such as Selenium. An API, by contrast, is provided intentionally so that you, the user and researcher, can obtain data in an official (and usually simpler and more structured) way. A Python library commonly used to call APIs is requests; it is also often used to download the pages you want to scrape.
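
To make the difference concrete, here is a minimal sketch in Python that gets the same two post titles in both ways. Everything in it is made up for illustration: the HTML page, the JSON response, and the names PostParser, scrape_titles and api_titles are hypothetical. To keep it runnable without a network connection or extra installs, it uses only the standard library (html.parser and json); in a real project you would more likely fetch the content with requests and parse the HTML with Beautiful Soup.

```python
import json
from html.parser import HTMLParser

# Hypothetical page content. In practice this would be downloaded,
# for example with requests.get(url).text.
HTML_PAGE = """
<html><body>
  <h1>Posts</h1>
  <ul>
    <li class="post">First post</li>
    <li class="post">Second post</li>
    <li class="ad">Buy now</li>
  </ul>
</body></html>
"""

# Hypothetical API response body carrying the same data as JSON.
API_RESPONSE = '{"posts": [{"title": "First post"}, {"title": "Second post"}]}'


class PostParser(HTMLParser):
    """Collect the text of every <li class="post"> element."""

    def __init__(self):
        super().__init__()
        self.in_post = False
        self.posts = []

    def handle_starttag(self, tag, attrs):
        # Only list items marked as posts are of interest; ads are skipped.
        if tag == "li" and dict(attrs).get("class") == "post":
            self.in_post = True

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_post = False

    def handle_data(self, data):
        if self.in_post and data.strip():
            self.posts.append(data.strip())


def scrape_titles(html):
    """Scraping: dig the titles out of markup that was written for browsers."""
    parser = PostParser()
    parser.feed(html)
    return parser.posts


def api_titles(body):
    """API: read the titles from data that was structured for programs."""
    return [post["title"] for post in json.loads(body)["posts"]]


if __name__ == "__main__":
    print(scrape_titles(HTML_PAGE))
    print(api_titles(API_RESPONSE))
```

Note how the scraping half depends on details of the page layout (the li tag and the class name), so it breaks when the site is redesigned, while the API half only depends on the documented structure of the response.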

Can you scrape social media data?

This can be a challenging task (depending on the scale), but it’s not impossible. You might want to consider using an existing tool such as 4CAT, which is developed at the university. We can set it up for you.

What does it look like?

Example project