1 Introduction

1.1 Overview

Researchers and practitioners often use Monte Carlo simulations to answer a variety of research questions. Over the past decade, R (R Core Team 2019) has been one of the most popular programming languages for conducting Monte Carlo simulation studies. R (https://www.r-project.org/) is a free, open-source programming language for statistical computing and data visualization. Built-in functions and many user-contributed packages in R allow researchers and practitioners to design and implement simulation studies that range from very simple to very comprehensive.

This short book explains the major steps in conducting Monte Carlo simulations using R. Here is the outline of the book [1]:

Part  Description
1     Introduction
        Why Simulations?
        Typical Simulation Scenarios
        Additional Resources
2     Designing Simulations
        Simulation Factors
        Evaluation Criteria
        Other Design Elements
3     Running Simulation
        Custom Functions
        Debugging the Code
        Putting the Functions Together
        Benchmarking
4     Summarizing Simulation Results
        Tables and Figures
        Exporting the Results

1.2 Why Simulations?

There are many reasons to conduct Monte Carlo simulations. Researchers and practitioners often choose to simulate data instead of collecting empirical data because:

  • it is impractical and costly to collect empirical data while manipulating several conditions;
  • it is not possible to investigate the real impact of the study conditions without knowing the characteristics of the target population and of the variables of interest; and
  • empirical data are more difficult to work with because they typically include missing values, which may be extensive and nonrandom.

1.3 Typical Simulation Scenarios

We can use Monte Carlo simulations to answer various research questions. Typical research questions in which Monte Carlo simulations can be useful are:

  • Does a particular type of estimation (e.g., maximum likelihood) yield accurate results?
    • What is the level of bias?
    • What is the standard error of the estimates?
    • What conditions would affect the accuracy of the estimation?
    • Does the estimation remain robust when assumptions are violated?
  • Which estimation method (e.g., maximum likelihood, expected a posteriori [EAP], or maximum a posteriori [MAP]) is the most accurate?
    • Does the performance of these methods vary across conditions?
    • Which estimator, method, or model is the most robust?
  • Can a statistical method or model (e.g., logistic regression) successfully detect an effect of interest (e.g., differential item functioning)?
    • How accurate is the method when the null hypothesis is false (i.e., what is its statistical power)?
    • How accurate is the method when the null hypothesis is true (i.e., what is its Type I error rate)?
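
To make these questions concrete, here is a minimal sketch in base R. It is an illustration only, not a design taken from this book: the estimator (a sample mean), the test (a one-sample t-test), the helper name `one_rep`, and all settings (2,000 replications, samples of size 30, a true mean of 0.5, a 0.05 significance level) are assumptions chosen for the example. Because the data are simulated, the true mean is known, so the bias, the standard error, and the rejection rates under a false and a true null hypothesis can all be computed directly.

```r
# Minimal Monte Carlo sketch; all settings below are assumed for illustration.
set.seed(123)              # make the run reproducible

n_reps  <- 2000            # number of replications
n       <- 30              # sample size in each replication
true_mu <- 0.5             # known population mean
true_sd <- 1               # known population standard deviation
alpha   <- 0.05            # nominal significance level

# One replication (hypothetical helper): draw a sample, then return the
# estimated mean and the p-value of a one-sample t-test of H0: mu = mu0.
one_rep <- function(n, mu, sd, mu0) {
  x <- rnorm(n, mean = mu, sd = sd)
  c(estimate = mean(x), p_value = t.test(x, mu = mu0)$p.value)
}

# Null hypothesis false: data have mean 0.5 but are tested against mu0 = 0.
res_alt <- t(replicate(n_reps, one_rep(n, true_mu, true_sd, mu0 = 0)))

# Null hypothesis true: data have mean 0.5 and are tested against mu0 = 0.5.
res_null <- t(replicate(n_reps, one_rep(n, true_mu, true_sd, mu0 = true_mu)))

# Evaluation criteria computed across replications
bias         <- mean(res_alt[, "estimate"]) - true_mu   # average estimate minus truth
empirical_se <- sd(res_alt[, "estimate"])               # spread of the estimates
power        <- mean(res_alt[, "p_value"] < alpha)      # rejection rate when H0 is false
type1_error  <- mean(res_null[, "p_value"] < alpha)     # rejection rate when H0 is true

round(c(bias = bias, empirical_se = empirical_se,
        power = power, type1_error = type1_error), 3)
```

In a setup like this one, the bias should be close to zero, the empirical standard error should be close to the theoretical value of 1 / sqrt(30) (about 0.183), and the Type I error rate should be close to the nominal 0.05. Changing `n`, `true_sd`, or the data-generating distribution is one simple way to explore how study conditions affect these quantities.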

1.4 Additional Resources

If you are interested in learning more about Monte Carlo simulations, there are many online resources available. Several tutorials and R packages on the topic appear in the References below.

References

Bulut, O., and O. Sunbul. 2017. “Monte Carlo Simulation Studies in Item Response Theory with the R Programming Language.” Journal of Measurement and Evaluation in Education and Psychology 8 (3): 266–87. https://doi.org/10.21031/epod.30582.

Chalmers, Phil. 2020. SimDesign: Structure for Organizing Monte Carlo Simulation Designs. https://CRAN.R-project.org/package=SimDesign.

Hallgren, K. A. 2013. “Conducting Simulation Studies in the R Programming Environment.” Tutorials in Quantitative Methods for Psychology 9 (2): 43–60. https://doi.org/10.20982/tqmp.09.2.p043.

Leschinski, Christian Hendrik. 2019. MonteCarlo: Automatic Parallelized Monte Carlo Simulations. https://CRAN.R-project.org/package=MonteCarlo.

R Core Team. 2019. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.


  1. This book was created using the bookdown (Xie 2020a) and knitr (Xie 2020b) packages.