Resources and Set-up

Author: Daniela Palleschi

Affiliation: Humboldt-Universität zu Berlin

Publication date: 29 April 2024

Resources

This course is mainly based on Winter (2019), which is an excellent introduction to regression for linguists. For even more introductory tutorials, I recommend going through Winter (2013) and Winter (2014). For a more intermediate textbook, I’d recommend Sonderegger (2023).

If you’re interested in the foundational writings on the topic of (frequentist) linear mixed models in (psycho)linguistic research, I’d recommend reading Baayen (2008); Baayen et al. (2008); Barr et al. (2013); Bates et al. (2015); Jaeger (2008); Matuschek et al. (2017); Vasishth (2022); Vasishth & Nicenboim (2016).

Assumptions about you

For this course, I assume that you are familiar with more classical statistical tests, such as the t-test, Chi-square test, etc. I also assume you are familiar with measures of central tendency (mean, median, mode), measures of dispersion/spread (standard deviation), and the concept of a normal distribution. Lacking this knowledge will not impede your progress in the course, but it is an important foundation on which we’ll be building. We can review these concepts in class as needed.

Software

R: a statistical programming language (the underlying language)
RStudio: a program that facilitates working with R; our preferred integrated development environment (IDE)
LaTeX: a typesetting system that generates documents in PDF format
why R?
- R and RStudio are open-source and free software
- they are widely used in science and business

Install R

we need the free and open source statistical software R to analyze our data
download and install R: https://www.r-project.org
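Once R is installed, a quick check in the R console can confirm which version is running. The following is a minimal sketch; the threshold `4.0.0` and the variable names are illustrative assumptions, not requirements stated in the course materials.

```r
# Print the version string of the running R installation
print(R.version.string)

# Compare against a minimum version (4.0.0 is an illustrative
# threshold, not a requirement from the course materials)
min_version <- "4.0.0"
r_is_recent <- getRversion() >= min_version
print(r_is_recent)
```

If `r_is_recent` is `FALSE`, downloading the current release from the link above is probably the simplest fix.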

Install RStudio

we need RStudio to work with R more easily
Download and install RStudio: https://rstudio.com
it can be helpful to keep English as language in RStudio
- we will find more helpful information if we search error messages in English on the internet
If you have problems installing R or RStudio, check out this help page (in German): http://methods-berlin.com/wp-content/uploads/Installation.html
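On the point about English error messages: separately from the RStudio interface language, R itself can translate its messages depending on your system settings. A minimal sketch of one way to request English messages for the current session is shown here, assuming your platform respects the `LANGUAGE` environment variable (most do, but this is not guaranteed).

```r
# Ask R to use English for its own messages in the current session
# (this affects R's messages, not the RStudio menus)
Sys.setenv(LANGUAGE = "en")

# Confirm the setting
Sys.getenv("LANGUAGE")
```

The setting only lasts until R is restarted, so it would need to be run again in each new session.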

Install LaTeX

we will not work with LaTeX directly, but it is needed in the background
Download and install LaTeX: https://www.latex-project.org/get/
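Because LaTeX only runs in the background, it can be hard to tell whether the installation worked. One rough check, sketched here under the assumption that a working installation places at least one of the usual engines (`pdflatex`, `xelatex`, `lualatex`) on the system search path, is to ask R where those programs are.

```r
# Look for common LaTeX engines on the system search path;
# an empty string means the program was not found
latex_paths <- Sys.which(c("pdflatex", "xelatex", "lualatex"))
print(latex_paths)

# TRUE if at least one engine is available
latex_found <- any(nzchar(latex_paths))
print(latex_found)
```

A result of `FALSE` does not necessarily mean LaTeX is missing; it may simply not be visible to R until the computer or RStudio has been restarted.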

resources

many aspects of this course are inspired by (nordmann_applied_2022?) and (wickham_r_nodate?)
- both freely available online (in English)
for German-language resources, visit the website of Methodengruppe Berlin

Troubleshooting

Error messages are very common in programming, at all levels.
Finding solutions to these error messages is an art in itself.
Google is your friend! If possible, search in English to get more information.
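To search for an error message, it helps to have its exact text. The sketch below uses a deliberately broken call (`log("ten")`, chosen only for illustration) and base R's `tryCatch()` to capture the message as a string rather than stopping the script.

```r
# Run code that fails and capture the error instead of stopping
result <- tryCatch(
  log("ten"),  # deliberately wrong: log() needs a number
  error = function(e) conditionMessage(e)
)

# The captured text is what you would paste into a search engine
print(result)
```

In everyday use you would usually just copy the message from the console; capturing it like this is mainly useful when the message scrolls past or needs to be saved.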

Session Information

The current version of this Quarto book was developed using R version 4.4.0 (2024-04-24) (Puppy Cup) in RStudio version 2023.3.0.386 (Cherry Blossom). At the bottom of each chapter is a list of the packages (and version info) used in that chapter (under Session Information). I highly recommend you do the same at the bottom of each script that you write. You can easily do this by writing the following at the bottom of any R Markdown (.Rmd) or Quarto (.qmd) script:

# Session Info

```{r}
sessionInfo()
```

References

American Psychological Association. (2022). APA Style numbers and statistics guide. American Psychological Association.

Baayen, R. H. (2008). Analyzing Linguistic Data: A Practical Introduction to Statistics using R.

Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59(4), 390–412. https://doi.org/10.1016/j.jml.2007.12.005

Baayen, R. H., & Shafaei-Bajestan, E. (2019). languageR: Analyzing linguistic data: A practical introduction to statistics. https://CRAN.R-project.org/package=languageR

Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255–278. https://doi.org/10.1016/j.jml.2012.11.001

Bates, D., Kliegl, R., Vasishth, S., & Baayen, H. (2015). Parsimonious Mixed Models. arXiv Preprint, 1–27. https://doi.org/10.48550/arXiv.1506.04967

Biondo, N., Soilemezidi, M., & Mancini, S. (2022). Yesterday is history, tomorrow is a mystery: An eye-tracking investigation of the processing of past and future time reference during sentence reading. Journal of Experimental Psychology: Learning, Memory, and Cognition, 48(7), 1001–1018. https://doi.org/10.1037/xlm0001053

Brauer, M., & Curtin, J. J. (2018). Linear mixed-effects models and the analysis of nonindependent data: A unified framework to analyze categorical and continuous independent variables that vary within-subjects and/or within-items. Psychological Methods, 23(3), 389–411. https://doi.org/10.1037/met0000159

Clark, H. H. (1973). The language-as-fixed-effect fallacy: A critique of language statistics in psychological research. Journal of Verbal Learning and Verbal Behavior, 12(4), 335–359. https://doi.org/10.1016/S0022-5371(73)80014-3

Gelman, A., & Loken, E. (2013). The garden of forking paths: Why multiple comparisons can be a problem, even when there is no “fishing expedition” or “p-hacking” and the research hypothesis was posited ahead of time.

Jaeger, T. F. (2008). Categorical data analysis: Away from ANOVAs (transformation or not) and towards logit mixed models. Journal of Memory and Language, 59(4), 434–446. https://doi.org/10.1016/j.jml.2007.11.007

Kuznetsova, A., Brockhoff, P. B., & Christensen, R. H. B. (2017). lmerTest package: Tests in linear mixed effects models. Journal of Statistical Software, 82(13), 1–26. https://doi.org/10.18637/jss.v082.i13

Lüdecke, D., Ben-Shachar, M. S., Patil, I., Waggoner, P., & Makowski, D. (2021). performance: An R package for assessment, comparison and testing of statistical models. Journal of Open Source Software, 6(60), 3139. https://doi.org/10.21105/joss.03139

Matuschek, H., Kliegl, R., Vasishth, S., Baayen, H., & Bates, D. (2017). Balancing Type I error and power in linear mixed models. Journal of Memory and Language, 94, 305–315. https://doi.org/10.1016/j.jml.2017.01.001

Meteyard, L., & Davies, R. A. I. (2020). Best practice guidance for linear mixed-effects models in psychological science. Journal of Memory and Language, 112, 104092. https://doi.org/10.1016/j.jml.2020.104092

Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant. Psychological Science, 22(11), 1359–1366. https://doi.org/10.1177/0956797611417632

Sonderegger, M. (2023). Regression Modeling for Linguistic Data.

Troyer, M., & Kutas, M. (2020). To catch a Snitch: Brain potentials reveal variability in the functional organization of (fictional) world knowledge during reading. Journal of Memory and Language, 113, 104111. https://doi.org/10.1016/j.jml.2020.104111

Vasishth, S. (2022). Some right ways to analyze (psycho)linguistic data [Preprint]. PsyArXiv. https://doi.org/10.31234/osf.io/y54va

Vasishth, S., & Nicenboim, B. (2016). Statistical methods for linguistic research: Foundational Ideas. Language and Linguistics Compass, 10(11), 591–613. https://doi.org/10.1111/lnc3.12207

Winter, B. (2011). Pseudoreplication in phonetic research.

Winter, B. (2013). Linear models and linear mixed effects models in R: Tutorial 1.

Winter, B. (2014). A very basic tutorial for performing linear mixed effects analyses (Tutorial 2).

Winter, B. (2019). Statistics for Linguists: An Introduction Using R. Routledge. https://doi.org/10.4324/9781315165547

Winter, B., & Grice, M. (2021). Independence and generalizability in linguistics. Linguistics, 59(5), 1251–1277. https://doi.org/10.1515/ling-2019-0049

Yarkoni, T. (2022). The generalizability crisis. Behavioral and Brain Sciences, 45, e1. https://doi.org/10.1017/S0140525X20001685