The Birthday Problem Illustrated in R

In this episode of the Quantitude podcast, the hosts discuss how unintuitive probabilities can be. The well-known birthday problem asks for the probability that at least two people in a group share a birthday. This obviously depends on the group size: if two randomly selected people meet, it's very unlikely that they …
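As a minimal sketch of the calculation (not necessarily the code used in the post), the exact probability can be computed in base R. The helper name `p_shared` is hypothetical, and the sketch assumes 365 equally likely birthdays with no leap years. A small simulation for a group of 23 is included as a sanity check.

```r
# Exact probability that at least two of k people share a birthday.
# Assumptions: 365 equally likely birthdays, no leap years.
# p_shared is a hypothetical helper name used only for this illustration.
p_shared <- function(k, days = 365) {
  if (k > days) return(1)  # pigeonhole: a match is certain
  # P(no match) = (days/days) * ((days - 1)/days) * ... * ((days - k + 1)/days)
  1 - prod((days - seq_len(k) + 1) / days)
}

p_two     <- p_shared(2)   # two people: 1/365, about 0.0027
p_twenty3 <- p_shared(23)  # 23 people: just over one half

# Simulation check: share of 10,000 random groups of 23 with a duplicate birthday
set.seed(1)
sim_twenty3 <- mean(replicate(
  10000,
  any(duplicated(sample(365, 23, replace = TRUE)))
))
```

The simulated share should land close to the exact value for 23 people, which is why that group size is the usual illustration of how quickly the probability grows.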

Citizenship, Bees, and Zucchini (ggplot2)

Opendata.swiss currently provides about 8000 open government data sets from agriculture to health to culture. Here, I'll be looking at a data set from the Federal Statistical Office containing the 500 most successful Swiss films by theater admissions. This post is mostly about preparing data for ggplot and customizing figures in R. You can download …

Media Technology Adoption in Europe

This post is inspired by the #tidytuesday CHAT data set and focuses on the diffusion and adoption rates of media technologies since 1992. Most interesting is probably the current data about internet users, whereas the statistics on radios, television sets, and newspaper circulation are only available up to about the year 2000. On the technical …

Analyzing Between-Person and Within-Person Associations

Explanations and implied causal mechanisms for digital media use often operate at the individual level. For example, the hypothesis "photo sharing with friends increases social connectedness" implies that when people share more photos, they will feel more connected. A typical test of such a hypothesis might rely on a linear regression with a count measure …
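One common way to separate the two kinds of association is person-mean centering, sketched below in base R. This is an illustration under assumptions, not necessarily the approach taken in the post: the data frame `d`, its column names, and all values are made up.

```r
# Hypothetical long-format panel data: 3 people observed at 4 waves each.
# All values are invented for illustration.
d <- data.frame(
  id     = rep(1:3, each = 4),
  photos = c(2, 4, 3, 5,  10, 12, 11, 9,  0, 1, 2, 1)
)

# Between-person component: each person's own mean across waves
d$photos_between <- ave(d$photos, d$id)

# Within-person component: deviation of each observation from that person's mean
d$photos_within <- d$photos - d$photos_between
```

The between component captures stable differences across people, while the within component captures whether a person shares more or less than usual at a given wave, which is the level the example hypothesis speaks to.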

What’s in a Library?

After not using a reference manager at first (2014–2016) and later being very frustrated with Mendeley after a couple of years, I started using Zotero in 2018. I am extremely happy with the software and its features – it just works very well for everything I do. The browser plugin to import the full citation …

Simulating Sample Size Effects

Simulate and plot data in R to see the effects of sample size differences.

Results: https://twitter.com/MoritzBuchi/status/1394967444209471488

library(truncnorm) # modified version of rnorm() to allow min and max specification
n <- 20 # base n
f <- 1:75 # sample size multiplication vector
N <- n * f # vector of 75 different sample sizes (20 …
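The excerpt stops before the simulation itself, so the following is only a rough sketch of the general idea and not the post's code. It reuses `n`, `f`, and `N` from the excerpt but, as an assumption, draws from a standard normal with base R's `rnorm()` instead of the truncnorm package; `se_sim` and `se_theory` are hypothetical names.

```r
set.seed(42)

n <- 20     # base n
f <- 1:75   # sample size multiplication vector
N <- n * f  # 75 sample sizes

# Assumption: standard normal data via rnorm(), not the post's truncated normal.
# For each sample size, draw 200 samples and record how much the sample mean varies.
se_sim <- sapply(N, function(size) sd(replicate(200, mean(rnorm(size)))))

# Theoretical standard error of the mean for sd = 1
se_theory <- 1 / sqrt(N)
```

Calling `plot(N, se_sim)` followed by `lines(N, se_theory)` shows the simulated spread of the sample mean shrinking roughly with the square root of the sample size.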

Quantifying Internet Use

This post summarizes key findings from our article "How Long and What For? Tracking a Nationally Representative Sample to Quantify Internet Use," published in the Journal of Quantitative Description: Digital Media. Read more about this new journal here. The internet is increasingly used across multiple devices, often on the go, and very much integrated into …