Chapter 4: Big Stimulus Sets


This chapter focuses on how to create big datasets by thinking like a data scientist. It begins by discussing examples of impactful open-access datasets. It then teaches the reader the basics of data scraping so that they can create their own datasets, including an introduction to client-side web coding. The chapter concludes with a discussion of the ethical questions around data scraping and of current Open Science practices for making your datasets publicly available.


  • Collection of large datasets
  • Here is a list of big stimulus sets in the field of psychology. I will update this list as I come across more datasets:

  • Learn to code a webpage
  • A fantastic resource to learn HTML, JavaScript, and CSS is W3Schools.
    I am also a big fan of HTML Goodies, which I used even back in the 90s (!!!) to learn how to make webpages.
    If you are getting comfortable with coding, but need help making pretty webpages, HTML5 UP has beautiful free-to-use CSS templates (I used one for this site!)

  • Example HTML, JavaScript, and CSS code
  • Here is code for a simple webpage that combines HTML, JavaScript, and CSS to make a page where a button click changes the appearance of a piece of text. I coded this all in a simple text editor (Notepad++), in just a single file. To look at the code, either right-click on the link and choose "Save As", or right-click on the page and choose "View Source".

  • Learn to scrape from a website
  • Here is simple code I wrote in MATLAB to scrape flights from SkyScanner. It uses Java libraries, and you could easily adapt this code for any other programming language (in fact, this is not a very common use of MATLAB). The code takes a point-and-click approach: it moves the mouse and executes keyboard commands to navigate the site automatically.
    Beautiful Soup is a very helpful Python library for navigating the HTML of a web page and scraping data from it.
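
As a rough illustration of what scraping involves, here is a minimal Python sketch that pulls image links out of a page. It is not the MATLAB code linked above, and it uses Python's standard-library html.parser rather than Beautiful Soup so that it runs without installing anything. The page content, the file paths, and the names LinkCollector, extract_links, and image_links are all hypothetical, made up for this example.

```python
from html.parser import HTMLParser

# Hypothetical page content, inlined so the sketch runs without network access.
SAMPLE_HTML = """
<html><body>
  <h1>Face stimuli</h1>
  <ul>
    <li><a href="/stimuli/face_001.jpg">Face 1</a></li>
    <li><a href="/stimuli/face_002.jpg">Face 2</a></li>
    <li><a href="/about.html">About</a></li>
  </ul>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current_href = None  # href of the <a> tag we are inside, if any
        self._text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current_href = dict(attrs).get("href")
            self._text_parts = []

    def handle_data(self, data):
        # Only keep text that appears inside a link.
        if self._current_href is not None:
            self._text_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_href is not None:
            text = "".join(self._text_parts).strip()
            self.links.append((self._current_href, text))
            self._current_href = None


def extract_links(html):
    """Return a list of (href, text) pairs found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


def image_links(html, extensions=(".jpg", ".png")):
    """Keep only the links whose target looks like an image file."""
    return [href for href, _ in extract_links(html)
            if href.lower().endswith(extensions)]


if __name__ == "__main__":
    for href in image_links(SAMPLE_HTML):
        print(href)
```

With Beautiful Soup installed, the same job is usually shorter: BeautifulSoup(html, "html.parser").find_all("a") returns the link tags directly. In a real project the HTML would come from a downloaded page rather than an inline string, which is where the ethical questions discussed in this chapter come in.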

  • Learn about regular expressions
  • Here is an online tutorial by Jan Goyvaerts on how to make and understand regular expressions. Here is a debugger that can help you generate and test your regular expressions.
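
To give a flavour of what regular expressions are good for, here is a small Python sketch using the built-in re module. It pulls a stimulus number and an emotion label out of filenames. The filenames and the naming scheme (face_<3 digits>_<emotion>.jpg or .png) are assumptions invented for this example, as are the names PATTERN, parse_filename, and parse_all.

```python
import re

# Hypothetical filenames, as they might look after scraping a stimulus set.
FILENAMES = [
    "face_001_happy.jpg",
    "face_002_sad.png",
    "face_017_neutral.jpg",
    "README.txt",
]

# Assumed naming scheme: face_<3-digit id>_<emotion>.<jpg or png>
#   ^ and $        anchor the pattern to the whole filename
#   (?P<id>\d{3})  captures exactly three digits under the name "id"
#   [a-z]+         matches one or more lowercase letters
#   (?:jpg|png)    matches either extension without capturing it
PATTERN = re.compile(r"^face_(?P<id>\d{3})_(?P<emotion>[a-z]+)\.(?:jpg|png)$")


def parse_filename(name):
    """Return (id, emotion) for a matching filename, or None otherwise."""
    match = PATTERN.match(name)
    if match is None:
        return None
    return int(match.group("id")), match.group("emotion")


def parse_all(names):
    """Parse every filename and drop the ones that do not match."""
    parsed = (parse_filename(name) for name in names)
    return [item for item in parsed if item is not None]


if __name__ == "__main__":
    for stimulus_id, emotion in parse_all(FILENAMES):
        print(stimulus_id, emotion)
```

Pasting the pattern into a regular expression debugger is a good way to see which part of it matches which part of each filename.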

  • Learn about Bayesian statistics
  • Here is the textbook Statistical Rethinking by Prof. Richard McElreath on how to think using a more Bayesian framework.
    Here is also a podcast teaching you Bayesian Statistics (use at your own risk - I have not tried this yet!)
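
For a first taste of Bayesian updating, here is a minimal Python sketch of grid approximation: it estimates a proportion from a count of successes by weighing a grid of candidate values by how well each one explains the data. The data (6 successes in 9 trials), the flat prior, and the function names grid_posterior and posterior_mean are assumptions chosen for this example, not material taken from the resources above.

```python
def grid_posterior(successes, trials, grid_size=101):
    """Approximate the posterior over a proportion p on an evenly spaced grid.

    Assumes a flat prior, so every candidate value of p starts equally plausible.
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    prior = [1.0] * grid_size
    failures = trials - successes

    # Binomial likelihood of the data for each candidate p. The binomial
    # coefficient is omitted because it cancels out when normalizing.
    likelihood = [p ** successes * (1 - p) ** failures for p in grid]

    unnormalized = [lik * pri for lik, pri in zip(likelihood, prior)]
    total = sum(unnormalized)
    posterior = [value / total for value in unnormalized]
    return grid, posterior


def posterior_mean(grid, posterior):
    """Posterior mean: each candidate p weighted by its posterior probability."""
    return sum(p * weight for p, weight in zip(grid, posterior))


if __name__ == "__main__":
    # Hypothetical data: 6 successes in 9 trials.
    grid, posterior = grid_posterior(successes=6, trials=9)
    best = max(range(len(grid)), key=lambda i: posterior[i])
    print("most plausible p:", grid[best])
    print("posterior mean:", round(posterior_mean(grid, posterior), 3))
```

The point of the sketch is that the result is a whole distribution over plausible values of p, not a single estimate; changing the prior list is all it takes to see how prior beliefs shift that distribution.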