1  Review of the Literature

1.1 Introduction

Sports Scientists are increasingly required to collect large volumes of data and subsequently to perform complex statistical analyses to support their decision making (Garfield & Ben-Zvi, 2008). Indeed, wearable technologies, such as receivers using the Global Navigation Satellite System (GNSS), local positioning systems, heart-rate monitors, and inertial measurement unit sensors, have provided the opportunity to collect large amounts of data (Crang et al., 2021; Seshadri et al., 2021). Recognising the need for Sports Scientists to possess the statistical analysis skills necessary to robustly analyse sports science data, Exercise & Sports Science Australia (ESSA) has now included ‘Data Handling and Management’ as one of its six standards for Sports Science accreditation (Exercise and Sports Science Australia, 2019). This standard comprises three requirements: i. Assesses data critically to identify meaningful effects; ii. Uses data to evaluate and develop programs for service users; and iii. Translates the outcomes of data analysis into meaningful information for service users and other relevant stakeholders (Exercise and Sports Science Australia, 2019). While these requirements for accreditation emphasise the importance of statistical literacy, it is unclear whether meeting them alone will substantially improve the statistical literacy of Sports Scientists; additional training, practical experience, and ongoing professional development may also be necessary.

1.2 Defining statistical literacy

While the etymological root of the word literacy refers to ‘an ability to read and write letters’, the Queensland Department of Education and Training (2023) defines literacy as:

“… the ability to read, view, write, design, speak and listen in a way that allows us to communicate effectively and to make sense of the world.”

This definition moves past the idea of simply reading and writing alphabetic letters, allowing the word literacy to be used to describe a level of competency in other concepts (Queensland Department of Education and Training, 2023). For example, Whitehead (2010) argued that the concept of physical literacy comprised more than skill per se, stating that:

“Physical literacy describes the motivation, confidence, physical competence, understanding and knowledge that individuals develop in order to maintain physical activity at an appropriate level throughout their life.”

(Whitehead, 2010)

Whitehead’s articulation of physical literacy is now a key component of physical education globally (Scott et al., 2021) and captures four key attributes of physical literacy: i. Knowledge and understanding of why physical activity is necessary; ii. Competence in performing a physical activity; iii. Confidence in knowing they are proficient in the physical activity and aware of its associated benefits; and iv. Motivation to develop and improve in physical activity. In a similar vein, the idea of ‘statistical literacy’ was developed by Gal (2002), who proposed that:

“\[Statistical literacy is\] people’s ability to interpret and critically evaluate statistical information … and their ability to discuss or communicate their reactions to such statistical information such as their understanding of the meaning of the information, their opinions about the implications of this information, or their concerns regarding the acceptability of the given conclusions.”

Within this definition there are two key components: i. The ability to interpret and critically evaluate statistical information, and ii. The ability to discuss or communicate reactions to statistical information. As Gal writes, these two abilities move past the minimal standard of what once qualified as literacy, to a deeper understanding built on inter-related knowledge bases.

As Sports Scientists seek to use data sets to make meaningful inferences, a minimum requisite level of statistical literacy is required to correctly analyse sports science data. However, Sports Scientists may lack this statistical literacy if they have not received adequate formal training in appropriate statistical methods. Adopting the four key attributes of Whitehead’s physical literacy framework and drawing on Gal’s definition of statistical literacy, we propose the following framework for statistical literacy for Sports Scientists:

  1. Knowledge and understanding of statistical principles and robust modeling techniques;

  2. Competence in performing the required statistical analyses;

  3. Confidence in performing statistical methods on unfamiliar new data sets and the subsequent interpretation of the analysis;

  4. Motivation to ensure robustness and validity in the statistical analysis.

As undergraduate sports science programs typically contain only an introductory statistics course, ‘Statistics for Sports and Exercise Science’ (Newell et al., 2014) provided Sports Scientists with complete data sets and software guides to assist readers through appropriate execution of more complex statistical analyses in a variety of situations. Such resources are vital in providing Sports Scientists with tutorials on how to execute robust statistical analysis. They also promote the idea of ‘reproducible research’, in which the code used to run the analysis can be published as supplementary material and critiqued for statistical rigour. However, due to some idiosyncrasies of commonly encountered sports science data sets, Sports Scientists have been utilising alternative methods, such as ‘customised spreadsheets’ (Batterham & Hopkins, 2006), to provide the statistical inferences for their decision making. By relying on customised spreadsheets, a Sports Scientist needs to trust the underlying mathematics implemented by the authors of the spreadsheets, which, in recent times, has occasionally been shown to be flawed (Barker & Schofield, 2008; Sainani, 2018; Welsh & Knight, 2015). The subsequent conversations and rebuttals have challenged the status quo within sports science and have encouraged Sports Scientists to focus on understanding the data set and correctly identifying suitable robust statistical methods.

This thesis presents the analyses of data collected in an applied setting using statistical principles, techniques, and methods that are not commonplace in sports-science research literature. Its intention is to contribute to the statistical literacy of Sports Scientists by presenting, and encouraging the further adoption of, statistical principles, techniques, and methods that could be utilised by Sports Scientists.

1.3 Common characteristics of sports science data sets

Data obtained by Sports Scientists often do not conform to the rigid assumptions required to perform traditional parametric statistics. There are four key characteristics of data commonly collected by Sports Scientists that require careful consideration when conducting statistical analyses:

Repeated observations are measures taken on multiple occasions from the same participant, either within a discrete session such as a single match (Newans et al., 2019) or across multiple sessions (Quinn et al., 2020). For example, high-speed locomotion data from GNSS receivers are routinely collected on athletes during match play, in training, and across multiple seasons (Griffin et al., 2021). In this instance, the assumption of ‘independence of observations’, a requirement for many parametric inferential statistics, is not met (Hopkins et al., 2009). Consequently, statistical methods that can accommodate repeated observations are required.

Missing data points arise when measurements are taken on several occasions in a participant group (i.e., repeated observations), but not all participants have the same number of observations (Nakai & Ke, 2011). For example, some athletes may play all matches across one season, ensuring their GNSS data set is complete, while other athletes may have missing GNSS data for one or more games due to injury or squad selection. In this case, the data are classified as Missing at Random (MAR), where the missing data could bias the results (Borg et al., 2021). Alternatively, there could be equipment malfunction, leading to data that are Missing Completely at Random (MCAR), which is less likely to bias the results (Borg et al., 2021). This creates an imbalanced data set, in which some athletes have more observations than others, ruling out statistical methods, such as the repeated-measures ANOVA, that require a ‘complete case analysis’ data set (Nakai & Ke, 2011). Thus, Sports Scientists require methods that can accommodate data sets with irregular missing data points.

Small sample sizes refers to the small number of participants included in a study. In sport, the number of athletes able to participate at the highest level of competition is relatively small, and Sports Scientists conducting studies at the elite level typically have access to only one or two teams/squads (Bernards et al., 2017; Glassbrook et al., 2019); thus, small sample sizes are commonplace in sports science research. Indeed, Abt et al. (2020) randomly sampled 120 studies from the Journal of Sports Sciences and found that the median sample size was 19 participants. Importantly, small samples may lack sufficient statistical power to extrapolate the statistical analysis results to the overall population (Abt et al., 2020; Speed & Andersen, 2000); therefore, methods that accommodate such small sample sizes are required.

Small effect sizes refers to the small true between-individual differences in competitive performances (Mengersen et al., 2016). As the highest level of competition within a given sport is necessarily exclusive, the athletes within these competitions are at, or near, the peak of their sporting abilities. Consequently, when conducting research on these athletes, the improvements sought through interventions or ergogenic aids are relatively small compared to those that could be seen in a general population. For example, in the 2020 Tokyo Olympics men’s 100-metre final, only 0.18 s separated the field. While a change of 0.18 s would be deemed relatively immaterial for a person from the general population improving their 100-metre sprint time, it would be the difference between a gold medal and no medal at all for an Olympic athlete. As a result, statistical methods that can detect small differences within a sample are also required.

1.4 Recurrent flaws in the statistical approach to sports science data sets

1.4.1 Inappropriate analysis of imbalanced data

One of the most commonly violated assumptions in general linear models when analysing elite athletes’ data is ‘independence of observations’. This can be violated when describing the movement patterns of a sport in which there are repeated measurements of the same participants (i.e., the athletes) across multiple matches (i.e., time points). The ‘independence of observations’ assumption of general linear models states that, within each condition, an individual can only be included once. This assumption is often violated when one athlete has been recorded in more matches than another, such that the data set is weighted more heavily towards the athletes with more observations. These MAR data points create an ‘imbalanced’ data set, further compounded by possible MCAR data points from errors during data collection (e.g., sensor malfunction, user error) (Borg et al., 2021). Consequently, for general linear models, unless the Sports Scientist is stringent in the inclusion of data, assumptions can easily be violated.

Repeated-measures analysis of variance (RM-ANOVA) is a commonly used general linear model in these studies (Dalton-Barron et al., 2020) due to the repeated measurements of the same participants (i.e., the athletes); however, unless strict guidelines are adhered to, it can produce misleading results. RM-ANOVA requires a ‘complete’ data set: every participant must have a value at every observation, and, within a condition (e.g., position), every participant must have contributed an equal number of observations (Kenny & Judd, 1986). This is troublesome, as elite sports data sets are rarely ‘complete’, owing to frequent missing data points arising from injuries, player selection, and limited time, resources, and expertise. As a result, this requirement severely hampers the ability to make inferences from the data set.

To account for missing data, data sets need to be manipulated before general linear models can be performed correctly. Whenever there is missing data within an RM-ANOVA, either the time point or the athlete must be excluded, or an imputed data point must be inserted, to retain a ‘complete case analysis’ data set (Nakai & Ke, 2011). While removing an athlete without an observation at every time point, or removing a time point for which many athletes are missing, is statistically appropriate if the missing data are shown to be random (i.e., there is no bias in the missing data), this approach can cause unnecessary deletion of large amounts of data. Further, if there are many participants and many time points, it can sometimes eliminate so much data that no analysis can be performed on the remaining data set. Therefore, relying solely on complete cases is often improper, and different methods need to be explored.

The second method of handling missing data commonly used in sports science is data imputation (Borg et al., 2021). Sports Scientists employ different methods of ‘imputing’ estimated data in place of missing data: some utilise the time-point mean, some use the athlete’s mean, while others employ the ‘last observation carried forward’ method, in which the last observed value is imputed for the missing data. While imputing data can be useful when there was an error in data collection (e.g., a GNSS receiver loses signal during a match), imputing data when an athlete never played in a match is inappropriate. Similarly, the ‘last observation carried forward’ method is dangerous, as it carries the strong assumption that there was no change in conditions between the previously observed time point and the missing time point. For example, if the previously observed time point was in the middle of the day, while the missing time point was during the night, it would be improper to assume that the same conditions were present at both time points. Therefore, Sports Scientists need methods that can appropriately account for missing data.
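To illustrate how strong this assumption is, the ‘last observation carried forward’ method can be sketched in a few lines of base R; the distances below are invented for illustration.

```r
# Hypothetical distances (km) covered by one athlete across five matches;
# matches 3 and 4 are missing (e.g., the athlete was not selected).
distance <- c(9.8, 10.2, NA, NA, 7.1)

# A minimal last-observation-carried-forward: each NA is replaced by the
# most recent non-missing value.
locf <- function(x) {
  for (i in seq_along(x)[-1]) {
    if (is.na(x[i])) x[i] <- x[i - 1]
  }
  x
}

locf(distance)  # 9.8 10.2 10.2 10.2 7.1
```

Both missing matches inherit the value 10.2, regardless of how conditions may have changed in the interim, which is precisely the assumption criticised above.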

1.4.2 Univariate analyses of multivariate concepts

When conducting inferential statistics, the goal is to translate information learned from a sample into inferences about a given population. Because the population parameters are never fully known, inferences are made from the sample statistics. Typically, the mean and standard deviation are the two key statistics gathered from the sample, allowing inferences regarding the population mean to be made via a sampling distribution. This suits most research applications, which focus on the population mean as the parameter of interest to answer a given research question. However, in sports science and talent identification, practitioners are rarely focused on finding the mean of a population; rather, they are focused on identifying extreme values, that is, athletes who may perform well above the average of a cohort (Johnston et al., 2018). Consequently, statistical methods that can elucidate information regarding the extreme values of a given data set are also of use within a Sports Scientist’s statistical toolbox.

A stumbling block for Sports Scientists in identifying talent is the proliferation of testing protocols, leading to an abundance of metrics concerning multiple attributes of an athlete’s talent (e.g., physical, physiological, psychological, tactical) (Dodd & Newans, 2018). While factor-reduction techniques can be used to develop a composite of these various attributes (Vaeyens et al., 2008), there is still a need to balance each of these attributes if each is equally desirable in an athlete. For example, assume there are three attributes of interest and that each attribute is normally distributed and completely independent of the others. If an athlete at least 1 standard deviation above the mean (i.e., approximately the top 16%) is deemed ‘good’, then the probability that a randomly selected athlete is ‘good’ in all three attributes is approximately 0.4%. However, if an athlete at least 2 standard deviations above the mean (i.e., approximately the top 2.5%) is deemed ‘exceptional’, the likelihood that a randomly selected athlete would be exceptional in all three attributes drops to approximately 0.002%. Unless a sporting organisation had the testing capability to identify the 2 in 100,000 who possess an exceptional standard in all three attributes, it will necessarily have to compromise on some metrics during recruitment. As a result, athletes who possess the best compromise of the attributes, yet are only ‘good’ in all three, may be missed in talent identification processes if scouts and coaches screen using a univariate analysis for each attribute.
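These probabilities follow from the multiplication rule for independent attributes and can be checked in base R (using exact normal tails gives a slightly smaller figure than the rounded 2.5%³ ≈ 0.002% quoted above):

```r
# Probability that a randomly selected athlete exceeds a z-score
# threshold in all k independent, normally distributed attributes.
p_all <- function(z, k) (1 - pnorm(z))^k

p_all(z = 1, k = 3)  # ~0.004  ('good', top ~16%, in all three)
p_all(z = 2, k = 3)  # ~1.2e-05 ('exceptional', top ~2.5%, in all three)
```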

1.4.3 Under-powered studies

Given the exclusiveness of the highest level of competition in sport, and that athletes within these competitions are of similar standard, Sports Scientists are often faced with small sample sizes and/or small effect sizes when conducting research on these athletes (Atkinson et al., 2012). As classical statistics (commonly referred to as ‘frequentist’ statistics) revolves around traditional null hypothesis statistical testing (NHST), providing meaningful inferences for Sports Scientists’ decision making is inherently difficult (Batterham & Hopkins, 2006). Statistical methods taught in undergraduate statistics courses assume moderate-to-large samples that are amply powered; Sports Scientists’ sample sizes are typically much smaller, due to the aforementioned exclusivity of elite sport and their association with typically only one team (Bernards et al., 2017; Glassbrook et al., 2019). For example, in team sports such as basketball, teams are limited to squads of 15 athletes, drastically limiting a Sports Scientist’s sample size. While sample size and effect size issues are not limited to sports science (Bacchetti et al., 2011; Ploutz-Snyder et al., 2014), they present a barrier to performing robust statistical tests. Similarly, as the slightest difference in score, time, or distance can affect the result of a sport, any marginal gain is seen as a necessity (Batterham & Hopkins, 2006); however, a small effect size in the true population may go undetected (Borenstein, 2009; Mengersen et al., 2016; Speed & Andersen, 2000). Consequently, under-powered studies are pervasive within sports science research (Speed & Andersen, 2000), and methods that accommodate small sample sizes and effect sizes are required.
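The scale of the problem can be illustrated with base R’s power.t.test; the effect size and squad size here are illustrative only.

```r
# Participants needed per group to detect a small standardised effect
# (Cohen's d = 0.2) with 80% power in a two-sample t-test:
ceiling(power.t.test(delta = 0.2, sd = 1, power = 0.80)$n)  # 394 per group

# Power actually available with a 15-athlete squad per group:
power.t.test(n = 15, delta = 0.2, sd = 1)$power  # ~0.08
```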

1.4.4 Probabilistic Statements of Inference

The use of probabilistic statements of inference within the sports science community has been of increasing interest over the past 15 years (Batterham & Hopkins, 2006; Borg et al., 2018; Mengersen et al., 2016; Sainani, 2018; Welsh & Knight, 2015). Of particular note, Batterham & Hopkins (2006) argued that NHST is unnecessarily restrictive because it assumes the parameter of interest (e.g., the population mean) is ‘fixed’, and calculates the probability that the data have arisen from this fixed parameter. If it is unlikely that the data originated from this parameter, the null hypothesis is rejected and the parameter is deemed to differ from what was hypothesised. Given the small samples and small effect sizes seen in sports science, there has been a push for more clarity around probabilistic distributions than NHST offers. Consequently, Batterham and Hopkins provided an alternative to NHST named ‘magnitude-based inferences’ (Batterham & Hopkins, 2006). This method, aiming to provide probabilistic inference statements, quickly gained traction, amassing over 1,700 citations and being incorporated into the author guidelines of some journals (Hopkins et al., 2009). While the method received early criticism (Barker & Schofield, 2008), this criticism gained little traction. Further challenges came from Welsh & Knight (2015), who provided a statistical review of the method and illustrated its unacceptably high Type I error rates. However, it was not until Sainani (2018) weighed in on the debate that the sports science community critically assessed whether magnitude-based inferences should be accepted within the literature. As a consequence, the Medicine & Science in Sports & Exercise (MSSE) editorial board has explicitly stated that studies should not use magnitude-based inferences for any statistical inferences (Medicine & Science in Sports & Exercise, 2023). This has left a need for robust probabilistic statements of inference to help inform sports science practitioners and researchers.

1.4.5 Transparency in data

An additional area in which statistical literacy is required is the reproducibility crisis in sports science research (Caldwell et al., 2020). In a standard sports science research paper, researchers are expected to give a thorough explanation of the methods used to gather their data, or otherwise cite an already-published, explicit methods study. This process allows another researcher to gather data in conditions as homogeneous as possible to the original research. While this is traditional practice, the same rigour is not currently applied to the implementation of statistical methods, where merely naming the statistical method used can pass editorial review. This lack of rigour results in replication and reproducibility problems (John et al., 2012). With manipulation of hypotheses after results are known, p-hacking, data dredging, cherry picking, and the file drawer problem all distorting the view of reality (Franco et al., 2014), the need for greater transparency in the way researchers explain their statistical approach within studies is paramount.

Borg et al. (2020) sampled 299 ‘sports science’ studies to assess the availability of the data and the computer code used, and found that none of the articles provided the code used to perform the statistical tests and only 4.3% shared the data used in the study. A shift towards transparency within sports science could help alleviate the commonly faced issues previously mentioned. While researchers may only have access to the data of one or two teams (Bernards et al., 2017; Glassbrook et al., 2019), by adopting a more transparent mindset, researchers can collaborate with other teams working on similar projects, thereby increasing their sample size and strengthening their research impact by alleviating the struggles observed with small sample sizes (Batterham & Hopkins, 2006).

1.5 Being comfortable with variability and uncertainty

Within sports, the winner is typically determined using a quantitative scale, whether that be a tally of points, a judge’s score, a duration, or a distance. While declaring the winner can be determined through simple mathematics (e.g., one team scored higher than the other team), it does not take into consideration the events leading to the outcome. That is, if each of the events within a match were to be replicated, would it be predicted that the outcome would be the same? Consequently, there is a need for statistics to provide the context around the outcome of the sport. This difference between mathematics and statistics has been discussed by Cobb & Moore (1997) who state that:

“Statistics is a methodological discipline. It exists not for itself, but rather to offer to other fields of study a coherent set of ideas and tools for dealing with data. The need for such a discipline arises from the omnipresence of variability.”

Statistical literacy is required to appropriately handle data that exhibit inherent variability and, thus, to understand the uncertainty in the inferences arising from those data: identifying the correct statistical approach (knowledge), skilfully coding and interpreting the output (competence), and tackling an unfamiliar data set (confidence). The recognition and acceptance of variability in a data set is at the forefront of the GAISE report (Franklin et al., 2007), which outlines that the objective of statistics education is for students to develop the ability to deal with the ubiquity of variability and the decision-making ability to interpret the variability in the data.

As a result, for Sports Scientists to be statistically literate, they need to understand there is an inherent variability surrounding that quantitative scale, often which can be the difference between winning and losing (Batterham & Hopkins, 2006). Similarly, when quantifying the response to a given external load, there will be an inherent variability due to the plethora of external factors contributing to the load within a session (Dalton-Barron et al., 2020). Consequently, there is a level of uncertainty with decision making as a function of the variability within the data being collected, most plainly seen in the standard error of the mean formula:

\[ SE_x = \frac{s_x}{\sqrt{n}} \]

The level of uncertainty (in this case, the standard error, denoted \(SE_x\)) is determined by two variables: the level of variability (the standard deviation, denoted \(s_x\)) and the sample size (denoted \(n\)). As a result, there are two contributing factors to a high level of uncertainty: a) a high level of variability, and b) a small sample size. Unfortunately, sports science data sets often feature both factors, which makes inferences intrinsically difficult to deduce. As a result, statistical methods that can robustly account for the variability within sports science data sets, and provide estimates that reflect the uncertainty arising from the data, are required for Sports Scientists to perform high-quality research.
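The formula’s behaviour is easily verified in base R; the standard deviations and sample sizes below are arbitrary.

```r
# Standard error of the mean: uncertainty grows with variability (s_x)
# and shrinks with the square root of the sample size (n).
se <- function(s_x, n) s_x / sqrt(n)

se(s_x = 5,  n = 100)  # 0.50 -- modest variability, large sample
se(s_x = 5,  n = 16)   # 1.25 -- same variability, small sample
se(s_x = 10, n = 16)   # 2.50 -- high variability and small sample
```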

1.6 Mixed Models

When quantifying the external load of an athlete within a match, researchers will often gather data from the GNSS for multiple athletes across multiple matches. When the data set is compiled, it is highly unlikely that the same athletes played for the team every week (due to injury, non-selection etc.). As a result, three possible options exist to ensure that the assumptions of a RM-ANOVA are not violated:

  1. Eliminate any athlete who did not play every match;

  2. Eliminate any match that contained athletes who did not play many matches;

  3. Impute missing data for any athlete that did not play in a given match.

Options 1 and 2 unnecessarily eliminate crucial data, while option 3 improperly imputes data into matches that a player never played in. Despite these limitations, RM-ANOVA remains a common statistical method for analysing GNSS-derived data (Dalton-Barron et al., 2020; Mara et al., 2017; Russell et al., 2016; Vigh-Larsen et al., 2018). These limitations severely hamper the ability of a general linear model such as RM-ANOVA to make inferences in the presence of frequent missing data points. In contrast, there are few examples of Sports Scientists who have incorporated mixed models into their research to accommodate the challenges of sports science data sets (Delaney et al., 2016; Kempton et al., 2017; Newans et al., 2019; Quinn et al., 2020).

Mixed models provide an improved alternative to general linear models for sports science data sets. Mixed models can estimate the population mean while accounting for the variability within each athlete as well as the variability at each time point. Given the differing numbers of observations for each athlete, the mixed model ‘shrinks’ the estimates for each athlete based on their number of observations. This concept, known as ‘partial pooling’, means that athletes with few observations are shrunk closer to the mean intercept and slope than athletes with more observations (McElreath, 2018). Similarly, mixed models can also account for correlations both between and within athletes (Kwon et al., 2014). For example, within rugby league there are nine distinct positions (plus interchange players); however, some of these positions have more similar movement patterns than others. If a five-eighth were to play at halfback for some matches, there would be minimal change in movement patterns; however, if they were to play fullback or lock, a marked change in their movement patterns would be apparent (Glassbrook et al., 2019). Consequently, the properties of mixed models are particularly useful in data sets in which athletes differ substantially in baseline physiological measures, as well as fluctuate in those measures under differing training conditions. By accounting for the athlete’s baseline, the condition (e.g., position in rugby league) can be more easily interrogated. As a result, mixed models can provide information on athletes, time points, and conditions far more robustly than general linear models.
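As a sketch of how such a model is specified, the following fits a random-intercept mixed model to simulated, deliberately imbalanced GNSS-style data using the nlme package (distributed with R); all variable names and values are hypothetical, and the equivalent lme4 formula would be `distance ~ 1 + (1 | athlete)`.

```r
library(nlme)

# Simulate 20 athletes observed in differing numbers of matches
# (an imbalanced data set), each with their own baseline distance.
# All values are hypothetical.
set.seed(1)
n_matches <- sample(3:10, 20, replace = TRUE)
athlete   <- rep(seq_len(20), times = n_matches)
baseline  <- rnorm(20, mean = 9, sd = 0.8)                       # athlete-level intercepts
distance  <- baseline[athlete] + rnorm(length(athlete), 0, 0.5)  # km per match
df <- data.frame(athlete = factor(athlete), distance = distance)

# Random-intercept model: estimates the population mean distance while
# partially pooling each athlete's estimate towards it.
fit <- lme(distance ~ 1, random = ~ 1 | athlete, data = df)
fixef(fit)    # population-level mean distance
VarCorr(fit)  # between-athlete versus residual variability
```

Unlike an RM-ANOVA, the model runs on the imbalanced data as-is: no athletes are deleted and no values imputed.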

1.7 Pareto Frontiers

Statistical methods that can evaluate the trade-off relationship between equally desirable attributes are required when identifying talent. This problem is known as ‘multi-objective optimisation’; that is, we aim to optimise the value of each variable at the least expense to the others. The solution is therefore a set of values for which it is not possible to improve one variable without detracting from another, that is, the set of values possessing the best compromise between the variables of interest. This set is deemed the ‘Pareto frontier’ and is easily visualised, as shown by the red line in Figure 1.1.

library(tidyverse)
library(rPref)
set.seed(99.94)
df <- runif(35, 40, 80) %>%
  data.frame(x = .) %>%
  mutate(y = (1 / x) * rnorm(35, 100, 50)) %>%
  psel(high(x) * high(y), top_level = 99) ## Generate Pareto frontier for data set of 35 random points

ggplot(df, aes(x = x, y = y, color = as.factor(`.level`), alpha = as.factor(`.level`))) +
  geom_point(size = 3) +
  geom_line(data = df %>% filter(`.level` == 1),
            color = "red",
            linewidth = 1) +
  scale_color_manual(values = c("red", rep("grey20", 10))) +
  scale_alpha_manual(values = c(1.0, rep(0.4, 10))) +
  coord_cartesian(xlim = c(45, 80),
                  ylim = c(1, 4.5)) +
  labs(x = "Variable 1",
       y = "Variable 2") +
  theme_minimal() +
  theme(
    legend.position = "none",
    axis.text = element_blank(),
    axis.title = element_text(size = 10, color = "black", face = "bold")
  ) ## Plot the Pareto frontier of the points

Figure 1.1: Illustration of the Pareto frontier, highlighted in red.

The Pareto frontier allows decision-makers to assess the series of values and determine what trade-offs would be required if an individual were to improve in a certain variable. While originally explored in economics by Vilfredo Pareto in the early 1900s to understand the distribution of wealth in societies, the applications have been wide-reaching, with extensive use cases within economics (Horn et al., 1994; Ponsich et al., 2012; Tapia & Coello, 2007) and engineering disciplines (Deb & Datta, 2012; Gunantara, 2018; Marler & Arora, 2004; Mastroddi & Gemma, 2013).

For example, Ottosson et al. (2009) explored the trade-off problem in radiation therapy for cancer treatment, which involves two key objectives: i. maximising the radiation dose to the targeted area of cancerous cells and ii. minimising the radiation damage to nearby healthy cells. Given the interdependency of these objectives, no single best solution exists; rather, a set of possible solutions is returned, each optimal in the sense that no other solution is superior in both variables of interest. This set of solutions can then be provided to decision-makers to determine which is most suitable for the given use case.

With the abundance of metrics concerning multiple attributes of an athlete’s talent, understanding the Pareto frontier of the relationship between these metrics allows Sports Scientists to more easily identify those with the best compromise between the attributes. For example, when batting in Twenty20 cricket, the objective is to score as many runs as possible; but, given the limited number of balls available (120 balls in a standard Twenty20 innings), there is an impetus for the batter to score their runs as quickly as possible. This creates a trade-off relationship between the number of runs scored per dismissal (referred to as the batting average) and the number of runs scored per 100 balls faced (referred to as the batting strike rate). Univariate analyses would identify only the batters with either the highest batting average or the highest strike rate, disregarding the batters that possess the optimal trade-off between these metrics (i.e., the ‘hybrid’ athlete). By constructing the Pareto frontier on the relationship between these metrics, not only are the best batters for batting average and for strike rate identified, but so too is the set of batters for whom no other batter scores more runs per dismissal at a faster strike rate.
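The dominance logic described above can be sketched in a few lines. The following Python sketch uses hypothetical batters and invented numbers purely for illustration: a batter sits on the Pareto frontier if no other batter is at least as good on both metrics and strictly better on at least one.

```python
# Hypothetical batting data: (name, batting average, strike rate).
# Values are illustrative only, not real player statistics.
batters = [
    ("A", 52.0, 125.0),
    ("B", 48.0, 140.0),
    ("C", 38.0, 155.0),
    ("D", 45.0, 120.0),  # dominated by A: lower average AND slower scoring
    ("E", 30.0, 150.0),  # dominated by C: lower average AND slower scoring
]

def pareto_frontier(points):
    """Return the names of points not dominated in both metrics by any other point."""
    frontier = []
    for name, avg, sr in points:
        dominated = any(
            a >= avg and s >= sr and (a > avg or s > sr)  # at least as good on both, strictly better on one
            for other, a, s in points
            if other != name
        )
        if not dominated:
            frontier.append(name)
    return frontier

print(pareto_frontier(batters))  # → ['A', 'B', 'C']
```

Note that a univariate ranking would surface only batter A (highest average) or batter C (highest strike rate); the frontier additionally recovers the ‘hybrid’ batter B.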

1.8 Bayesian Modeling

Throughout the rebuttals of magnitude-based inferences (Barker & Schofield, 2008; Borg et al., 2018; Mengersen et al., 2016; Sainani, 2018; Welsh & Knight, 2015), the recurring recommendation is for Sports Scientists to adopt a fully Bayesian approach when making probabilistic statements to infer conclusions. In response to the growing use of magnitude-based inferences, Mengersen and colleagues (2016) provided a worked example and template for performing Bayesian statistics on exercise and sports science data sets. The paper clearly articulated the case for Bayesian analysis when probabilistic statements, similar to those provided by magnitude-based inferences, are required (Mengersen et al., 2016). With tutorials now available to equip readers with the skills to execute Bayesian modelling (Kruschke, 2014; Quintana & Williams, 2018), there appears to be a growing trend in the use of Bayesian statistical methods in sports science (Santos-Fernandez et al., 2019). However, accessible explanations of why Bayesian inference is necessary remain scarce for Sports Scientists (Mengersen et al., 2016). Therefore, the pathway from recognising the lack of statistical rigour in customised spreadsheets to performing fully Bayesian analyses needs continued improvement.

In a frequentist framework, the parameter is treated as fixed, and the probability of observing data at least as extreme as that collected, given the null parameter value, is calculated. As a result, only interpretations with respect to the null parameter can be made. In a Bayesian framework, by contrast, the data are fixed, and probability distributions for each parameter can be calculated, providing much richer information to interpret. A further key benefit of Bayesian inference is the ability to incorporate prior information into the model, providing estimates with more certainty than frequentist models can offer. For example, consider the earlier example of the men’s 100 m final at the 2020 Tokyo Olympics. When estimating the true population mean race time for elite male 100 m sprinters, a frequentist model assumes no prior knowledge and therefore treats any race time from negative infinity to positive infinity as a possible result. While a different distribution could limit the sample space to positive times only, that remains untenable, as the margins of victory (e.g., 0.18 s separating first and last place) are minute compared with that sample space. Sports Scientists, however, have expert knowledge of the plausible sample space (e.g., 8 to 12 s, to be overly conservative); inserting this prior knowledge into a Bayesian model restricts the sample space, so the model need only iterate over a narrower range to establish where the true population mean lies.
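The effect of such a prior can be sketched under a normal-normal conjugacy assumption (a deliberate simplification relative to the MCMC-based models used in practice). The prior parameters, observation noise, and race times below are illustrative values encoding the conservative 8 to 12 s expert range from the text, not actual Tokyo results.

```python
import statistics

# Prior: elite 100 m mean race time ~ Normal(10.0, 1.0^2), a conservative
# encoding of the 8-12 s expert range (most mass between 8 and 12 s).
prior_mean, prior_sd = 10.0, 1.0
obs_sd = 0.1  # assumed known spread of individual elite race times

# Illustrative (invented) race times in seconds.
times = [9.80, 9.84, 9.89, 9.93, 9.95, 9.96, 9.98, 10.00]
n = len(times)
sample_mean = statistics.fmean(times)

# Conjugate update: posterior precision is the sum of the prior
# precision and the data precision.
prior_prec = 1 / prior_sd**2
data_prec = n / obs_sd**2
post_var = 1 / (prior_prec + data_prec)
post_mean = post_var * (prior_prec * prior_mean + data_prec * sample_mean)
post_sd = post_var**0.5

print(f"Posterior mean: {post_mean:.3f} s, posterior sd: {post_sd:.3f} s")
```

Even this crude sketch shows the mechanism: the posterior mean is a precision-weighted compromise between the prior and the sample mean, and the posterior standard deviation shrinks well below the prior’s, reflecting the narrower range the model must consider.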

1.9 Thesis Aims

The aim of this thesis is to provide Sports Scientists with access to applications of statistical methods that will expand their statistical toolkit to accommodate data sets regularly seen in a sports science context.

The specific aims of the experimental chapters were to:

  1. Demonstrate the utility of mixed models for Sports Scientists when analysing longitudinal data sets;

  2. Introduce Pareto frontiers to the sports science community and illustrate how they can identify players with the optimal balance of attributes that can be obfuscated when performing univariate analysis across each metric;

  3. Explore how various prior distributions in a Bayesian framework can affect the inferences relating to the research hypothesis;

  4. Examine the position-specific demographics, technical match statistics, and movement patterns of the National Rugby League Women’s (NRLW) Premiership;

  5. Utilise Pareto frontiers to visualise the trade-off relationship between short- and long-duration running intensities.

1.10 Thesis Structure

There are five experimental chapters in this thesis. The first three are methodology-focused, with each chapter highlighting a different statistical principle: Chapter 3 explores mixed models, Chapter 4 explores Pareto frontiers, and Chapter 5 explores Bayesian inference. Two further application-focused chapters follow, building on the three methodology chapters. Chapter 6 presents a relatively simple mixed-model application study, while Chapter 7 serves as the ‘capstone’ study, presenting a Pareto frontier of conditional effects arising from a mixed model in a Bayesian framework.