I have been training my cat, Maja, to learn some tricks. So far, she’s learned to sit, come, and jump up onto or down from surfaces. She seems to have mastered some of these tricks, but then again, she doesn’t have a 100% success rate. Before I go around bragging about how many tricks my cat has mastered, I should probably do an experiment and a statistical test to be sure she has actually mastered them.

1 Experiment

For this experiment, I want to see if Maja understands the word “up.” I tested her 30 times in a single day: I said the word “up” and tapped the surface I wanted her to jump onto, then recorded whether or not she jumped up.
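The raw data file isn’t shown here, but as the code below assumes, it’s just one row per trial with a success column coded 1/0. If you want to play along without the original file, you could simulate comparable data; the trial column and the 70% success probability are stand-ins of my own:

# hypothetical stand-in for maja_up.csv: 30 Bernoulli trials
set.seed(123)
fake_up <- data.frame(trial   = 1:30,
                      success = rbinom(30, size = 1, prob = 0.7))
head(fake_up)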

2 Analysis

2.1 Visualization

Let’s take a look at the data. Maja jumped up on 21 of the 30 trials, so it certainly looks like she knows the word “up.” But does that mean she’s really mastered the skill? If so, I’d expect her to jump onto the surface more often than she would by chance alone. To figure that out, we’ll need a statistical test!

# load the packages used below
library(dplyr)
library(ggplot2)

# read in the data
up <- read.csv("maja_up.csv")

# add a column with text for the outcome variable for plotting
up <- up %>%
  mutate(jumped_up = ifelse(success == 1, "yes", "no"))

# create the count plot
ggplot(up, aes(x = jumped_up)) +
  geom_bar(fill = c("#F8766D", "#00BFC4")) + 
  geom_text(stat = "count", aes(label = after_stat(count)), vjust = -1) +
  labs(x = "Jumped up")
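If you just want the raw counts without a plot, a quick table() call does the same job:

# tabulate the outcome counts
table(up$jumped_up)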

2.2 Statistical model

Our outcome variable is binary (yes/no, i.e., success/failure), so a logistic regression model is a suitable way to analyze these data.

We’ll set up our model as follows: I coded the dependent variable (whether Maja jumped up or not) as 1 for a success (she jumped up) and 0 for a failure (she did not). Since I only want to see whether her performance was above chance, we just need an intercept in the model. On the log-odds scale, chance performance (a 50% success rate) corresponds to an intercept of exactly zero, so if the intercept is significantly greater than zero, Maja performed above chance, suggesting she has indeed mastered the trick.

# fit model
m_up <- glm(success ~ 1, data = up, family = binomial())

# extract coefficients
table_coefs <- summary(m_up)$coefficients

# round numbers and make nice kable table
knitr::kable(
  round(table_coefs, 2),
  caption = "Model coefficients",
  align = "r"
)
Model coefficients

             Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)      0.85         0.4     2.13      0.03
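As a quick sanity check: for an intercept-only logistic regression, the intercept is simply the log odds of the observed success proportion, which we can verify with the 21-out-of-30 counts from the plot:

# log odds of 21 successes vs. 9 failures
log(21 / 9)  # ≈ 0.85, matching the model’s intercept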

3 Conclusion

The estimate from the model, 0.85, is on the log-odds scale. I think it’s more intuitive to think about this as a proportion or percentage, so let’s apply the inverse-logit transformation to get it there:

# apply the inverse logit to the intercept: p = exp(b) / (1 + exp(b))
int <- exp(coef(m_up)[1]) / (1 + exp(coef(m_up)[1]))
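Equivalently, base R’s plogis() function computes the same inverse logit:

# same result with the built-in logistic CDF
int <- plogis(coef(m_up)[1])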

Once we do that, we can see that Maja’s success rate was 70%. That’s not bad, and as we can see from the model output, it’s significantly above chance (p = 0.03)!
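To get a rough sense of the uncertainty around that 70%, we can also back-transform a Wald confidence interval for the intercept; with the rounded coefficients above, that works out to roughly 52% to 84%:

# 95% Wald CI on the log-odds scale, back-transformed to proportions
plogis(confint.default(m_up))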

I guess it’s safe to say that my cat has mastered at least one trick! I knew she was awesome. ;)