CrossFit® training constantly varies daily workouts to promote general physical preparedness [1]. While this strategy appears to be useful for eliciting adaptations across a variety of fitness domains [8, 9], it makes gauging sport-specific progress and proficiency difficult. Traditional field and laboratory measures (e.g., aerobic capacity, anaerobic threshold, peak power) are commonly accepted tools for monitoring athletic progress [2], and a few have been related to CrossFit® performance [3, 10]. However, in most instances, their precision depends on the availability of expensive equipment, and it may not be logistically feasible to assess several individuals at a single location or across locations without sacrificing the validity and/or reliability of the measures. It is also difficult to simulate actual workouts or competitive environments with traditional assessment tools (e.g., metabolic carts, cycle ergometers, force plates) because they are likely to impair natural movement. Thus, CrossFit® practitioners commonly use standardized workouts to monitor sport-specific adaptations. These common benchmark workouts are identifiable by name (e.g., Fran, Grace), and their requirements are standardized across affiliates. Although these workouts are commonly practiced, little information is available to help practitioners judge the quality of their performance in them. Here, we provide normative values for self-reported performance scores in five common benchmark workouts for male and female practitioners across the three primary age classifications (i.e., teens, individuals, and master’s) of the CrossFit® Open. Practitioners can use these data to estimate their standing among their peers, as well as to monitor their individual progress and set realistic training goals.
In terms of absolute intensity, CrossFit® workouts prescribed for IM are the most challenging. For instance, in the workouts examined in the present study, men were typically required to lift more weight, jump onto a higher box, or throw a heavier medicine ball to a higher target than women. Workout prescription may be further scaled to accommodate less experienced and/or older individuals, but this does not occur in the common benchmark workouts (i.e., only one workout design exists for each sex, regardless of age). Accordingly, we observed that IM and IF performed better than their master’s counterparts in all workouts aside from F50 (i.e., no differences were found between IF and MF). This is not surprising because younger practitioners would be expected to perform better when given the same task [11, 12]. However, within the individual and master’s age classifications, men reported better scores than women for each workout. This is interesting because appropriate scaling should equate workout difficulty and result in similar scores between men and women. Typically, clear differences exist between men and women when comparisons are made with absolute values for traditional measures of strength and endurance, but not when using relative figures (e.g., percentage of one-repetition maximum, per kilogram of body mass) [13, 14, 15]. Though comparisons between sexes are not common in CrossFit®, they may be possible if relative standards are used when prescribing intensity. Another possible explanation is that more men (n = 7352) than women (n = 2546) in the individual and master’s age classifications possessed a profile account and reported their performance scores. Likewise, only 102 teenage practitioners possessed an account in the present sample.
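The distinction between absolute and relative intensity drawn above can be illustrated with a short calculation. The following is a minimal sketch in Python, not a procedure used in this study; the function names and every numeric value are hypothetical and chosen only for illustration.

```python
def percent_of_1rm(load_kg: float, one_rm_kg: float) -> float:
    """Express a prescribed load as a percentage of one-repetition maximum."""
    if one_rm_kg <= 0:
        raise ValueError("one_rm_kg must be positive")
    return 100.0 * load_kg / one_rm_kg


def load_per_kg(load_kg: float, body_mass_kg: float) -> float:
    """Express a prescribed load per kilogram of body mass."""
    if body_mass_kg <= 0:
        raise ValueError("body_mass_kg must be positive")
    return load_kg / body_mass_kg


# Hypothetical athletes (illustrative values only, not data from this study).
athlete_a = {"load_kg": 40.0, "one_rm_kg": 80.0, "body_mass_kg": 80.0}
athlete_b = {"load_kg": 30.0, "one_rm_kg": 50.0, "body_mass_kg": 60.0}

for name, a in (("A", athlete_a), ("B", athlete_b)):
    rel_1rm = percent_of_1rm(a["load_kg"], a["one_rm_kg"])
    rel_mass = load_per_kg(a["load_kg"], a["body_mass_kg"])
    print(f"Athlete {name}: {rel_1rm:.1f}% of 1RM, {rel_mass:.2f} kg per kg body mass")
```

In this hypothetical case, athlete B lifts the lighter absolute load but works at a higher percentage of one-repetition maximum, which is the kind of discrepancy that absolute prescriptions can conceal.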
Individuals who participate in CrossFit® and similar exercise forms are not required to create a profile on the CrossFit® website and have alternative platforms for tracking progress (e.g., Wodify, Zen Planner, Beyond the Whiteboard). Consequently, our findings may be limited to CrossFit® athletes who also possess an account on the CrossFit® website. Further, because athletes report these data as their personal best performance in each workout, our findings may be most representative of peak fitness within each individual workout and not necessarily of ability across all workouts simultaneously.
These data may also be useful for developing more accurate inclusion/exclusion criteria in research. Currently, physiological research on CrossFit® is limited, and most studies have used training experience (i.e., the number of years of participation) as the primary indicator of training status. Though years of experience would likely indicate a degree of familiarity with the nuances of this training strategy, its use as an indicator of proficiency is complicated by individual variability in training frequency, regularity in utilizing prescribed (versus scaled) workouts, athletic talent, and previous experience in other sports. Put simply, unless potential participants are recruited from a pool of individuals who have been previously ranked in international competitions (e.g., the Reebok CrossFit Games™), it is difficult to accurately identify their proficiency in the sport from experience alone. For instance, male and female participants have been previously recruited based on their experience (number of years was not reported) with CrossFit® to determine their physiological responses to two common benchmark workouts (including “Fran”) [16]. However, it may not be correct to extrapolate their findings to all CrossFit® practitioners. Based on our findings, the “Fran” scores for male (331 ± 82.4 s) and female (331 ± 92.1 s) participants in that study would have placed them within the 20th and 50th percentiles, respectively. It may have been more appropriate to describe those individuals as beginner or intermediate CrossFit® practitioners, rather than simply stating they had experience. Likewise, Butcher and colleagues (2015) recruited participants who had previously progressed to the regional round of the Reebok CrossFit Games™ or at least participated in the CrossFit® Open, and who possessed at least 1 year of experience (~3.7–4.3 years). However, based on their measured performances in “Fran” (203 ± 48 s; range = 130–289 s) and “Grace” (136 ± 32 s; range = 93–194 s), and depending on sex category (not specified), they could have ranked above the 70th percentile for “Fran” or as low as the 20th percentile for “Grace”. Comparatively, less variability in reported performance scores can be observed in the study conducted by Serafini and colleagues (2017). In that study, the authors utilized final rankings in the 2016 CrossFit® Open to examine differences in benchmark workout scores reported by the top 1500 male and 1500 female athletes (i.e., the top ~1%). Although the reported scores would still vary by specific workout and sex, male and female participants typically ranked above the 80th and 70th percentiles, respectively. As more research is conducted on CrossFit®, it will become increasingly necessary to utilize more specific methods for participant recruitment to make accurate inferences across studies.
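Placing a reported score within a percentile band, as done for the studies discussed above, amounts to locating that score in a reference distribution. The following Python sketch shows one way this could be done for a time-scored workout, where a lower time is better. It is an illustration under stated assumptions, not the procedure used to derive the normative values in this study: the function `percentile_rank`, its tie-handling rule (ties count as half), and the reference times are all hypothetical.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(score, reference_scores, lower_is_better=True):
    """Return the percentage of reference scores that `score` beats.

    Ties are counted as half, a common convention for percentile ranks.
    """
    if not reference_scores:
        raise ValueError("reference_scores must not be empty")
    ordered = sorted(reference_scores)
    below = bisect_left(ordered, score)           # scores strictly smaller
    ties = bisect_right(ordered, score) - below   # scores equal to `score`
    above = len(ordered) - below - ties           # scores strictly larger
    beaten = above if lower_is_better else below
    return 100.0 * (beaten + 0.5 * ties) / len(ordered)


# Hypothetical completion times in seconds (illustrative only, not study data).
reference_times = [180, 210, 240, 270, 300, 330, 360, 390, 420, 450]

# A 240 s effort beats the seven slower times and ties with one.
print(percentile_rank(240, reference_times))  # 75.0
```

For workouts scored by repetitions or rounds, where a higher score is better, the same function could be called with `lower_is_better=False`.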