How Zeeschuimer works
This page is work in progress :)
At its core, Zeeschuimer operates on a relatively simple principle: for any data requested from a platform by your browser, Zeeschuimer inspects that data, extracts the relevant parts, and makes them available for download.
To do so, it uses the webRequest API, a feature of modern browsers that allows browser extensions to ask that data be redirected to them before it is 'delivered' to the browser window. The data can then be inspected, blocked, or even changed. Some ad blockers also use this feature to block or filter requests from known advertisement servers.
Zeeschuimer does not block or edit any data, but it does process all requests made within a platform's website, if that platform is enabled in the extension. Each platform then has its own code module that parses the data, extracts metadata of posts from it, and re-packages it for download by the user. You can see the code per platform here.
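The flow described above can be sketched roughly as follows. This is a minimal illustration, not Zeeschuimer's actual code. It assumes a Firefox-style `webRequest.filterResponseData` and models only the parts of that API it uses through the locally declared `StreamFilterLike` and `WebRequestLike` interfaces. The names `PlatformModule`, `exampleModule`, `modules`, `captured` and `capture`, as well as the `example.com` domain and its `{"items": [...]}` response shape, are hypothetical.

```typescript
// Minimal shapes of the browser API used here, modelled on Firefox's
// webRequest.filterResponseData. Declared locally so the sketch is self-contained.
interface StreamFilterLike {
  ondata: ((event: { data: ArrayBuffer }) => void) | null;
  onstop: (() => void) | null;
  write(data: ArrayBuffer): void;
  close(): void;
}

interface WebRequestLike {
  filterResponseData(requestId: string): StreamFilterLike;
}

// Hypothetical per-platform module: decides whether a response is relevant
// and extracts post objects from its body.
interface PlatformModule {
  name: string;
  matches(url: string): boolean;
  parse(body: string): unknown[];
}

// Hypothetical module for an imaginary platform that answers with {"items": [...]}.
const exampleModule: PlatformModule = {
  name: "example",
  matches: (url) => /(^|\.)example\.com$/.test(new URL(url).hostname),
  parse: (body) => {
    try {
      const json = JSON.parse(body);
      return Array.isArray(json.items) ? json.items : [];
    } catch {
      return []; // not JSON, so there is nothing to capture
    }
  },
};

const modules: PlatformModule[] = [exampleModule];
const captured: unknown[] = [];

// Observe one response: pass every chunk through unchanged, and hand the
// complete body to the matching platform modules once the response has ended.
function capture(api: WebRequestLike, details: { requestId: string; url: string }): void {
  const relevant = modules.filter((m) => m.matches(details.url));
  if (relevant.length === 0) return; // platform not handled, leave the request alone

  const filter = api.filterResponseData(details.requestId);
  const decoder = new TextDecoder("utf-8");
  let body = "";

  filter.ondata = (event) => {
    body += decoder.decode(event.data, { stream: true });
    filter.write(event.data); // deliver the data to the page untouched
  };
  filter.onstop = () => {
    body += decoder.decode(); // flush any bytes still buffered in the decoder
    filter.close();
    for (const m of relevant) captured.push(...m.parse(body));
  };
}
```

In a Firefox extension, a function like this would typically be called from a blocking `browser.webRequest.onBeforeRequest` listener, with `browser.webRequest` passed as `api`. Taking the API as a parameter is only a choice made for this sketch, so that it can be exercised outside a browser.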
This approach has a few important details:
- Data is mostly captured as-is, i.e. in the same format it was sent by the platform itself. This is often not the most convenient format for processing by researchers! It's designed to be parsed by the website, not by a (tool-assisted) human with different goals. The data is often structured in ways that seem nonsensical to a neutral observer. This is why Zeeschuimer is specifically designed to export data to 4CAT, which is a tool that is well-equipped to then parse the data into a more useful format and process it further.
- Zeeschuimer can only capture what is sent to your browser. That usually means that if you see something in the interface, it can be captured; if you don't, it can't.
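
To make the first point more concrete, here is an invented example of a post as a platform might send it, together with a small function that flattens it into a row better suited to analysis. All field names (`node`, `owner`, `uname`, `edge_caption`, `ts`) are made up for illustration; they do not correspond to any real platform, and `flatten` is not 4CAT's actual mapping code.

```typescript
// Invented example of a post in "as-is" form: nested, with terse keys
// tailored to the website's own front-end code rather than to researchers.
const rawPost = {
  node: {
    id: "123",
    owner: { uname: "alice" },
    edge_caption: { edges: [{ node: { text: "Hello world" } }] },
    ts: 1700000000, // seconds since the Unix epoch
  },
};

// A flat shape that is easier to put in a spreadsheet or CSV file.
interface FlatPost {
  id: string;
  author: string;
  text: string;
  timestamp: string;
}

// Pick the relevant values out of the nested structure and give them clear names.
function flatten(raw: typeof rawPost): FlatPost {
  const node = raw.node;
  const firstCaption = node.edge_caption.edges[0];
  return {
    id: node.id,
    author: node.owner.uname,
    text: firstCaption ? firstCaption.node.text : "",
    timestamp: new Date(node.ts * 1000).toISOString(),
  };
}
```

Keeping the raw object untouched at capture time and doing this kind of mapping later means that no information is lost if the mapping turns out to be incomplete.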