Performance on the Single-Leg Squat Task Indicates Hip Abductor Muscle Function

Kay M. Crossley, PhD, Wan-Jing Zhang, MBBS, Anthony G. Schache, PhD, Adam Bryant, PhD, and Sallie M. Cowan, PhD
Investigation performed at Biomechanics Laboratory, Department of Physiotherapy, The University of Melbourne, Melbourne, Australia

Anterior knee pain (AKP) is the most frequent cause of chronic knee pain in adults and results in poorly localized pain around the patellofemoral joint, usually of insidious onset. The high incidence of AKP among active populations is well documented, with incidence rates varying from 9% to 15%. The condition is characterized by a gradual onset of peripatellar pain and is frequently aggravated by common activities of daily living (eg, stair climbing, squatting, ambulation, and kneeling). Therefore, AKP affects many aspects of daily life, including the ability to perform exercise or work-related activities without pain. Accordingly, treatments with the capacity to reduce the burden of AKP are clearly required. A growing body of contemporary evidence indicates that hip muscle function is compromised in people with AKP. This is highlighted by a recent systematic review that found strong evidence for deficits in hip muscle strength (abduction, external rotation, extension) in women with AKP compared with uninjured controls. Similar patterns have also been observed in men. An alternative measure of hip muscle function is hip muscle electromyography (EMG) activation patterns. Two studies observed delayed onset of gluteus medius EMG activity in people with AKP compared with healthy controls. Hence, the available evidence, combined with current clinical consensus, indicates that treatments aimed at improving hip muscle function may result in greater symptom relief (reductions in pain, improvements in physical function) if they are targeted toward those who would most benefit. Consequently, there is a need to be able to clinically identify subgroups of people with chronic AKP who display compromised hip muscle function. A clinical assessment tool often used in screening or patient assessment to determine whether hip muscle function is compromised is the single-leg squat task. 
Currently, no study has evaluated whether clinical assessment of performance on the single-leg squat task reflects hip muscle function. The relationship between hip muscle strength and control of hip and knee motions during a single-leg squat task remains unclear. Discrepancies between study results may relate to the measurement of hip muscle strength (including how data are scaled or normalized to body size) and the methodology for measuring hip and knee joint motions. Furthermore, in prior studies, hip and knee joint motions were recorded using sophisticated equipment only available in gait laboratories. Consequently, there is a need to develop a simpler and more clinically applicable procedure to evaluate performance on the single-leg squat task. Visual analysis of movement patterns during a variety of single-leg tasks, designed to assess lower limb neuromuscular control, is currently employed in clinical practice. Studies of clinician agreement on rating performance of single-leg tasks reveal inconsistent results.5,12,20 Accordingly, the authors of these studies identified a need for future research in this area. We sought to determine whether the single-leg squat task could be used as a valid and reliable clinical assessment tool capable of identifying subgroups of people with compromised hip muscle function. Therefore, the aims of this study were to devise a clinical rating of performance on the single-leg squat task based on the consensus of a panel of experienced physical therapists, examine the intra- and interrater reliability for physical therapists when performing a clinical rating of the single-leg squat, and determine whether people who received a ‘‘poor’’ rating on the single-leg squat exhibited different hip muscle function (onset of gluteus medius activity and hip/trunk muscle strength) compared to those who received a ‘‘good’’ rating on the single-leg squat.



We recruited 34 healthy adults, in the age range likely to develop AKP (mean ± SD: age, 24 ± 5 y; height, 1.69 ± 0.10 m; weight, 65.0 ± 10.7 kg). The mean ± SD number of exercise sessions per week (duration 30 minutes) was 4.4 ± 4.8. Participants were excluded if they had any history of lower limb injury or other disorder that might affect their hip muscle function or capacity to perform the single-leg squat task. The study was approved by the University of Melbourne Human Research Ethics Committee. All participants provided written informed consent. Fifteen of these participants (mean ± SD: age, 25 ± 5 y; height, 1.73 ± 0.09 m; weight, 66.8 ± 9.8 kg; exercise sessions/wk, 3.3 ± 2.9) were included in the reliability study.

Single-Leg Squat Task

All participants were provided with standard instructions on how to complete the single-leg squat task. Participants wore shorts, singlet, and running sandals (provided) and were asked to stand on their dominant leg on a 20-cm box. Leg dominance was determined as the leg with which the participant would kick a ball. Participants were instructed to fold their arms across their chest and to squat down as far as possible 5 times consecutively, in a slow, controlled manner, maintaining their balance, at a rate of approximately 1 squat per 2 seconds. A single investigator demonstrated the single-leg squat procedure. All participants were allowed up to 3 practice attempts. During the single-leg squat trials, the participant was captured on digital video, with the video camera placed approximately 3 m in front of the participant on a tripod at the height of the patient’s pelvis. Digital images were stored in a coded (de-identified) manner and transferred to DVD for assessment.

Electromyographic Procedure

Anterior gluteus medius (AGM) and posterior gluteus medius (PGM) EMG activity was recorded in a manner previously described. Briefly, pairs of silver-silver chloride surface electrodes (Graphics Control Corporation, c/o Medical Equipment Services Pty Ltd, Richmond, Australia) were placed over the AGM muscle belly, with a 22-mm interelectrode distance, after skin preparation to reduce electrical impedance below 5 kΩ. For the PGM, bipolar intramuscular electrodes were fabricated from Teflon-coated stainless steel wire 75 μm in diameter (AM Systems, Carlsborg, Washington), with 1 mm of insulation removed and a hook formed by bending the tips of the wire back approximately 1 to 2 mm. After being threaded into a hypodermic needle (0.7 × 38 mm), the electrodes were sterilized and then inserted into the PGM under the guidance of real-time ultrasound imaging (7.5-MHz curved linear array transducer; Dornier Performa, Acoustic Imaging Technologies Corp, Phoenix, Arizona).15 The ground electrode was placed over the iliac crest of the non–test leg. Electromyographic data were sampled at 2000 Hz and bandpass filtered at 20 to 1000 Hz using a Power1401 data acquisition system and Spike5 software (Cambridge Electronic Design, Cambridge, United Kingdom) and analyzed using IGOR Pro (Igor Pro 5, Wavemetrics, Inc, Lake Oswego, Oregon). Participants completed a visual choice reaction time stair-stepping task, where they faced a force plate (model 9286AA [400 × 600 mm], Kistler, Winterthur, Switzerland; software Bioware version 3.21) placed on a step (combined height 22 cm). This task has been used previously by our research group and is capable of identifying neuromotor control differences between people with and without AKP. Participants stepped up as quickly as possible in response to a light, indicating left or right leg. Data were collected for 5 stepping repetitions.
Electromyographic data were expressed relative to foot contact, determined from the vertical component of the ground-reaction force.

Hip External Rotation, Abduction, and Trunk Side Flexion Strength

A handheld dynamometer (Nicholas Manual Muscle Tester; Lafayette Instrument, Lafayette, Indiana) was used to measure isometric hip strength using published methods. For hip external rotation strength, the participants sat with their arms folded and thighs strapped together to maintain neutral hip adduction. The dynamometer was placed over the tibia, 10 cm above the medial malleolus. After 2 warm-up trials, 3 maximum trials were performed, and the peak force was recorded. The distance from the knee joint line to the position of the dynamometer was recorded (m) and used to convert the force (N) data to a torque (Nm). For hip abduction strength, participants lay supine with arms folded across their chest. Straps were used to stabilize the opposite thigh. The dynamometer was placed 10 cm above the lateral knee joint line. After 2 warm-up trials, the peak force from 3 maximal trials was recorded. The distance from the greater trochanter to the position of the dynamometer was recorded (m) and used to convert the force (N) data to a torque (Nm). The trunk side bridge test was used to provide an indication of lateral trunk muscle capacity. In this test, participants lay on one side, supported on their elbow with their opposite hand crossed over the chest. While maintaining a neutral trunk alignment (shoulders, hips, and ankles aligned), the dynamometer was placed just proximal to the greater trochanter. Participants were instructed to push up maximally, lifting their pelvis and trunk. After 2 warm-up trials, the maximum of 3 trials was recorded. Because of the difficulties in determining the moment arm to generate a torque for this measure, force (N) was used. This test was performed on each side.
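The force-to-torque conversion described above can be illustrated with a minimal sketch. It assumes that the peak of the maximal trials is multiplied by the recorded lever arm, and it adds the body-weight normalization described under Data Analysis. The function names, the example forces, the lever arm, and the use of kilograms for body weight are all hypothetical; the unit of body weight used for normalization is not stated in the text.

```python
import math


def peak_torque(trial_forces_n, lever_arm_m):
    """Return peak torque (Nm): largest trial force (N) times lever arm (m)."""
    if not trial_forces_n:
        raise ValueError("at least one trial force is required")
    if lever_arm_m <= 0:
        raise ValueError("lever arm must be positive")
    return max(trial_forces_n) * lever_arm_m


def normalize_to_body_weight(value, body_weight):
    """Divide a strength value by body weight (unit assumed, not stated in the text)."""
    if body_weight <= 0:
        raise ValueError("body weight must be positive")
    return value / body_weight


# Hypothetical hip abduction data for one participant (not study data).
abduction_forces_n = [118.0, 125.0, 121.0]  # three maximal trials, in N
lever_arm_m = 0.32                          # greater trochanter to dynamometer, in m
body_weight_kg = 65.0                       # assumed normalization unit

torque_nm = peak_torque(abduction_forces_n, lever_arm_m)
torque_normalized = normalize_to_body_weight(torque_nm, body_weight_kg)
print(f"peak torque = {torque_nm:.1f} Nm, normalized = {torque_normalized:.3f} Nm per unit body weight")
```

For the trunk side bridge test, only the normalization step would apply, because force (N) rather than torque was analyzed.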

Determining Criteria for Clinical Rating of Performance on the Single-Leg Squat Task 

A panel of 5 experienced physical therapists (4 with musculoskeletal expertise and 1 with neurological expertise) formed the consensus panel. These physical therapists had different training and clinical experiences. The panel met to discuss the task and to determine the criteria that would be used for future rating.

Clinical Rating of Single-Leg Squat Performance

All ratings were performed with no knowledge of the participants’ results on the hip muscle function tasks. Each member of the consensus panel reviewed a DVD with all digital recordings from all participants. The single-leg squat performance for each participant was assessed by each panel member independently. The panel then met as a group to review their decisions. When there was a discrepancy in the rating, the group reviewed the images and then reached consensus on whether the participant could be graded as ‘‘good,’’ ‘‘fair,’’ or ‘‘poor.’’ Of the participants recruited for this study, 9 were rated by the consensus panel as having good performance on the single-leg squat, 12 were rated as having poor performance, and 13 were rated as fair.

Reliability of Clinical Rating of Single-Leg Squat Performance 

Three different physical therapists, 2 with postgraduate musculoskeletal qualifications (average of 10 years of physical therapy practice) from different universities and 1 graduate from a third university, participated in the reliability study. These physical therapists had no formal interaction with the members of the consensus panel in the preceding 5 years. Each physical therapist was provided with a DVD containing digital images from 15 participants: 5 who had been rated good, 5 who had been rated fair, and 5 who had been rated poor. There were no differences in participant characteristics between the 3 groups (P > .05). The 15 participants were selected from those participants whose performance on the single-leg squat task was rated the same by all consensus panel members (ie, there was no discrepancy in the initial ratings). The physical therapists participating in the reliability study were also provided with the clinical rating criteria and one example of a typical participant from each of the 3 rating categories. The participants in the DVD were de-identified. After conducting their rating, the physical therapists returned their scoring sheet and the DVD to 1 examiner. One week later, they were provided with a second DVD. In this DVD, digital images of the same participants were included but in a different order and with different codes. The scoring sheets and DVDs were returned when scoring was completed.

Comparison of Hip Muscle Function Between Participants With Good and Poor Performance on the Single-Leg Squat Task

Those participants rated as fair were excluded from this comparison. Hip muscle function (onset of AGM and PGM EMG activity and hip abduction, external rotation, and trunk side flexion strength) was compared between the participants who were rated as good and poor. There were no between-group mean differences for age, height, or weight (Table 1). The ‘‘good’’ group contained 5 women and 4 men, whereas the ‘‘poor’’ group contained 8 women and 4 men (χ² = 0.269, P = .604).
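As a worked check of the sex distribution comparison, the sketch below computes a Pearson chi-square statistic for the 2 × 2 table of women and men in the good and poor groups. It assumes that no continuity correction was applied, which the text does not state; under that assumption, the counts reported above give the reported statistic and P value. The function name is hypothetical.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) and P value for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: good group, poor group. Columns: women, men (counts from the text).
counts = [[5, 4], [8, 4]]
chi2, p_value = chi_square_2x2(counts)
print(f"chi-square = {chi2:.3f}, P = {p_value:.3f}")  # chi-square = 0.269, P = 0.604
```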


Data Analysis

The examiner responsible for data extraction and processing was blind to single-leg squat performance ratings. The onset of EMG activity was identified visually from the raw data as the point at which EMG activity increased above the baseline activity.7,8 Traces were displayed in a de-identified manner, with no reference to group, muscle, or trial number. Onset times for each muscle were assessed by 2 separate examiners using data from 15 trials in 2 participants. There were no significant differences between the 2 examiners (P = .62), and the 2 scores were closely matched, with a mean absolute difference of 1.2 milliseconds and a standard deviation of 2.4 milliseconds. Muscle strength data were normalized to the participant’s body weight to enable comparisons between participants.

Statistical Analysis

To determine the reliability of the clinical rating, the agreement between the 3 physical therapists and the consensus panel was determined for each physical therapist using a kappa coefficient. The agreement between the repeated clinical ratings for each physical therapist was also determined using a kappa coefficient. A kappa coefficient of 0.60 was considered acceptable, 0.60 to 0.80 substantial, and 0.80 to 1.00 excellent.22 Electromyographic and muscle strength data were analyzed for normality and homogeneity of variance. Independent samples t tests were used to compare muscle onset, force, and hip and trunk muscle strength between the good and poor performance groups. An analysis of covariance was also performed to evaluate whether gender influenced the outcomes of the between-group comparisons. The alpha level was set at 0.05.
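The agreement statistic can be sketched as follows, assuming that an unweighted Cohen kappa was used (the text specifies only "a kappa coefficient"). The rater's ratings below are hypothetical and are constructed only to illustrate the arithmetic; the panel ratings mirror the reliability design of 5 good, 5 fair, and 5 poor participants.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen kappa for two equal-length sequences of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: proportion of participants given the same rating.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the two raters' marginal proportions, summed.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Consensus panel ratings for the 15 reliability participants.
panel = ["good"] * 5 + ["fair"] * 5 + ["poor"] * 5

# Hypothetical rater who disagrees with the panel on 2 of 15 participants.
rater = list(panel)
rater[0] = "fair"  # panel: good
rater[5] = "poor"  # panel: fair

kappa = cohen_kappa(panel, rater)
print(f"agreement = {13 / 15:.0%}, kappa = {kappa:.3f}")
```

Because the panel ratings are balanced across the 3 categories, chance agreement with the panel is 1/3 whatever the rater's own distribution of ratings, so agreement on 13 of 15 participants corresponds to a kappa of 0.8 under this assumption.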

Clinical Rating of Performance on the Single-Leg Squat Task

The expert panel of physical therapists reached consensus on the criteria to rate the performance on the single-leg squat task (Table 2). The 5 criteria were (1) overall impression for the 5 trials, (2) posture of the trunk over the pelvis, (3) posture of the pelvis, (4) hip joint posture and movement, and (5) knee joint posture and movement. The ankle joint was not included in the criteria because it was considered to be reflected in the posture and movement of the knee joint. On the basis of these criteria, the panel determined that a person’s performance could be rated as good, fair, or poor (Figure 1; see video supplement, available online). The descriptions of the requirements to be considered ‘‘good’’ for every component of each criterion are listed in Table 2. To be considered good, the participant needed to achieve all of the requirements for 4 of the 5 criteria for all of the 5 trials. The participant’s performance was considered poor if he or she did not meet all of the requirements for at least 1 criterion for all of the trials. Those participants who could not be rated as good or poor were rated as fair. Importantly, the panel believed that the rating system reflected clinical decision making. For example, a participant rated as good would be considered not to require any intervention to address hip or trunk muscle function. Alternatively, a person rated as poor would require an intervention aimed at improving hip or trunk muscle function. A person rated as fair would be considered to not require any treatment to address the hip or trunk muscle function as the first priority, but the therapist may need to address these factors at some point during the rehabilitation process.
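The decision rule for the 3 categories can be sketched under one possible reading of the wording above: a criterion counts as met only when all of its requirements are achieved on all 5 trials; a rating of good requires at least 4 of the 5 criteria to be met; a rating of poor applies when no criterion is met; and all other cases are fair. This reading is an assumption, the function and criterion labels are hypothetical, and the sketch does not replace the clinical judgment on which the panel's rating relied.

```python
CRITERIA = (
    "overall impression",
    "trunk posture",
    "pelvis posture",
    "hip joint",
    "knee joint",
)


def rate_single_leg_squat(criteria_met):
    """Return 'good', 'fair', or 'poor' from a mapping of criterion -> bool.

    A True value means all requirements for that criterion were achieved on
    all 5 trials (assumed interpretation of the published wording).
    """
    if set(criteria_met) != set(CRITERIA):
        raise ValueError("exactly the 5 rating criteria must be supplied")
    n_met = sum(bool(criteria_met[name]) for name in CRITERIA)
    if n_met >= 4:
        return "good"
    if n_met == 0:
        return "poor"
    return "fair"


# Hypothetical participant who meets every criterion except the knee joint.
example = {name: True for name in CRITERIA}
example["knee joint"] = False
print(rate_single_leg_squat(example))  # good
```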



Agreement From Physical Therapists on Clinical Rating of the Single-Leg Squat

Concurrency with the consensus panel was excellent for the 2 more experienced raters (agreement 80%-87%; κ = 0.700-0.800) and substantial for the least experienced rater (agreement 73%; κ = 0.600). Similarly, intrarater agreement was excellent for rater A (agreement 87%; κ = 0.800) and rater B (agreement 80%; κ = 0.692) and substantial for rater C (agreement 73%; κ = 0.613).

Difference in Hip Muscle Function Between Persons With Poor and Good Rating on the Single-Leg Squat

Participants rated as good performers had significantly earlier onset timing of AGM (mean difference, −152; 95% confidence interval [CI], −258 to −48 ms) and PGM (mean difference, −115; 95% CI, −227 to −3 ms) than those rated as poor performers (Table 3). Between-group comparisons of hip muscle strength were performed on 7 good performers and 12 poor performers (testing could not be completed on 2 of the good performers because of equipment malfunction). Participants rated as good performers exhibited greater hip abduction torque than poor performers (mean difference, 0.47; 95% CI, 0.10-0.83 Nm·Bw⁻¹). There was no difference in hip external rotation torque (mean difference, 0.14; 95% CI, −0.20 to 0.48 Nm·Bw⁻¹) between the 2 groups. On the trunk side bridge test, participants rated as good performers produced more force than poor performers on the weightbearing side (mean difference, 1.08; 95% CI, 0.25-1.91 N·Bw⁻¹). The inclusion of gender as a covariate to the between-group comparisons affected results only for the onset of PGM EMG (P = .055).



Anterior knee pain is a leading cause of pain and reduced physical function worldwide that is characterized by reduced quality of life and by limited pain-free performance of work- or exercise-related activities. Importantly, there is evidence that knee pain in adults is not self-limiting and that it may precede the development of osteoarthritis. Therefore, it is imperative to study means to reduce the burden of AKP. Nonoperative interventions are recommended to reduce pain and physical limitations associated with AKP. However, people with AKP have heterogeneous presentations, and there is limited information available to guide treatments that are targeted to subgroups of people with AKP. Contemporary clinical expertise, combined with emerging research in AKP, indicates that treatment of hip muscle function may result in greater effects (reductions in pain, improvements in physical function) if such treatments are primarily targeted toward the subgroup of people with AKP who have compromised hip muscle function. Thus, it is imperative to develop and evaluate a clinical assessment tool that is capable of identifying people who have altered hip muscle function. This study developed rating criteria for the clinical assessment of the single-leg squat and established its reliability and validity to detect altered hip muscle function (delayed gluteus medius EMG onset and reduced hip muscle strength).

Clinical Rating of the Single-Leg Squat

The consensus panel, consisting of 5 highly experienced physical therapists, formulated rating criteria for the clinical assessment of the single-leg squat. In contrast to other rating scales, the criteria did not award points for specific components but asked the assessor to evaluate the performance of the participant as a whole. The panel concurred that this was a method typically employed by physical therapists when clinically evaluating a patient’s performance. For example, a poor rating needed to reflect the situation whereby an intervention would be considered necessary to address the patient’s hip/trunk muscle function. By adopting this type of clinical reasoning approach, it was thought that physical therapists would have more confidence with their rating of performance. In contrast, physical therapists may be less likely to estimate ranges of motion from visual observation. Therefore, this holistic approach to the rating criteria facilitated a faster, more efficient assessment of performance, which may be readily transferred to the clinical environment.

Clinical Rating of Performance on the Single-Leg Squat Task Has Acceptable Agreement

The excellent to substantial agreement in the rating of the single-leg squat performance observed between the 3 raters and the consensus panel is considered to be ‘‘acceptable’’ reliability.22 This is an important finding because for the clinical assessment tool to be useful, agreement between and within raters is required. Previous studies in this area have found varying levels of reliability. In the current study, we attempted to maximize our reliability by incorporating a consensus panel to devise rating criteria for the clinical assessment. Although the rating categories (‘‘good,’’ ‘‘fair,’’ and ‘‘poor’’) are similar to those used in previous studies,5,12 the current study incorporated a clinical reasoning element to the rating criteria, which may have enhanced the reliability. Furthermore, in the absence of a ‘‘gold-standard’’ measure of single-leg squat performance, the consensus panel established a gold-standard rating of the assessment of single-leg squat performance. This differentiates our study from previous studies, in which the individual raters were compared with other individual raters. We also used a training DVD, with examples of the different rating categories. The lower concurrence noted by the least experienced rater may indicate that further training is required, perhaps incorporating a feedback element, to enhance reliability in less experienced assessors. Although further studies are required to confirm these results, the assessment of the single-leg squat performance appears to be a tool that can be used with confidence by experienced practitioners in a clinical setting.

Performance on the Single-Leg Squat Indicates Hip Muscle Function

The onset of AGM EMG was delayed in people who were rated as having a poor performance compared with those with good performance on the single-leg squat. This is the first study to evaluate whether clinical rating of performance on a functional task can indicate hip muscle function. This is an important finding because hip muscle dysfunction is considered a factor in the development and persistence of AKP. Because of the unique biomechanics of the patellofemoral joint and its intimate relationship with the femur, several hip muscle factors have the potential to influence the magnitude and distribution of patellofemoral joint stress. The prevailing theory is that aberrant hip muscle function manifests as altered hip and/or knee biomechanics during functional tasks (eg, walking, running, stair ambulation, jumping, and landing),23 but currently there is conjecture surrounding the precise nature of this relationship. The many methodological issues that vary between studies (including study populations, measurement methods, and task choice) make direct comparisons difficult. Notwithstanding the need for more research to confirm the relationship between hip muscle function and hip and/or knee biomechanics during functional activities, hip muscle dysfunction may warrant targeted interventions if it can be identified. Increasingly, the timing (or coordination) of muscle activations is considered important in musculoskeletal pain conditions, and delayed onset of gluteus medius activity has been identified in people with AKP. Measurement of EMG requires sophisticated laboratory equipment and considerable expertise. Thus, the ability of a simple clinical test to provide an indication of the gluteus medius EMG timing has useful applications for physical therapists. Hip abduction strength was 29% lower and lateral trunk strength (measured with the side bridge test) was 23% lower in those with poor compared with good performance on the single-leg squat. 
Therefore, the clinical assessment of performance on a functional task was able to indicate hip and trunk muscle strength. This finding is important because hip muscle strength is a feature of AKP, with potential for change with an intervention. Furthermore, in a study of college athletes, Leetun and colleagues 17 observed that reduced hip muscle strength was a predictor of sustaining a lower limb injury over one athletic season. Further studies are required to confirm the role of hip muscle dysfunction in the development of AKP in diverse populations. Nonetheless, the ability to indicate hip muscle strength from a single clinical test has considerable implications for the assessment of at-risk persons or to guide and monitor treatments. The interrelationships between hip muscle function (strength or neuromotor control), trunk muscle function, performance on a functional task, and the development or persistence of AKP require further elucidation. However, the weight of current evidence suggests that altered hip or trunk muscle function may be an important factor in AKP. Our study results, if confirmed in future studies, indicate that the clinical assessment of the single-leg squat may be capable of identifying patients with hip muscle dysfunction and hence may be a tool that can be used by clinicians when selecting treatment options (eg, strengthening or retraining hip muscle function) targeted to their patients’ findings. In addition, these results may be used in clinical research to establish subgroups of people with hip muscle dysfunction and evaluate targeted treatments.

Limitations and Future Studies

In the current study, assessment of performance on a single-leg squat task used a protocol that reflected routine clinical practice. Greater standardization of the test protocol (eg, squat depth, speed of squat, and the use of markers to identify anatomical landmarks) may enhance the reliability and validity of this test. In addition, because foot posture and motions may influence lower limb function via the closed kinetic chain, future studies could also include an evaluation of the foot and ankle. The study used only 5 experienced physical therapists, from different training and experiential backgrounds, in the consensus panel. The results of the consensus panel could be strengthened in future studies by evaluation of the assessment criteria by a broader cohort of physical therapists. Similarly, the reliability study could be repeated, using different physical therapists and different training techniques. Although we used a training DVD, future studies should evaluate the additional benefits of a more intensive training program, perhaps incorporating interactive feedback. Despite the study’s limitations, the results of this study indicate that the clinical assessment of the single-leg squat has clinical utility to determine hip muscle function. Future studies are required to establish the repeatability of a person’s performance on such a task and to determine the effects of targeted interventions on the performance of the single-leg squat. Such studies may identify whether the single-leg squat task is sensitive to change. Furthermore, additional studies are required to identify whether the results observed in the current study differ between men and women. In the current study, healthy participants were chosen because people with AKP frequently report pain with a single-leg squat that may affect the results. Research is now needed to explore whether similar results are also evident in a group of people with AKP.

Clinical Relevance

This study identified that the clinical assessment of performance on the single-leg squat is a reliable tool that may be used to identify people with hip muscle dysfunction.