This study sought to determine the feasibility of collecting physiologic data in thoracic surgery residents and whether it would correlate with burnout and burnout with performance.
This was a prospective study of thoracic surgery residents over a 5-month period. Participants were evaluated with a wearable biometric device (heart rate variability and sleep) and the Maslach Burnout Inventory. Resident performance was quantified using Accreditation Council for Graduate Medical Education Milestones (scale, 1-5) normalized to program-designated targets (3 for postgraduate year 6 or lower residents and 4 for postgraduate year 7 residents).
The cohort consisted of 71% female participants (5/7) with 86% of residents having 1 or more children. High levels of emotional exhaustion (median, 30 [interquartile range, 20-36], where >26 is high) and high levels of depersonalization (median, 16 [interquartile range, 14-22], where >12 is high) were common, but personal accomplishment was also uniformly high (median, 43 [interquartile range, 41-46], where >38 is high). There was a significant correlation between heart rate variability and emotional exhaustion (r(12) = 0.65, P = .01) but not depersonalization (P = .28) or personal accomplishment (P = .24). Depersonalization and personal accomplishment did not correlate with resident performance (P = .12 and P = .75, respectively); however, increased emotional exhaustion showed a significant correlation with higher resident performance during periods when burnout was reported (r(6) = 0.76, P = .047).
Dynamic measurement of resting heart rate variability may offer an objective measure of burnout in thoracic surgery residents. Thoracic surgery residents who report high levels of burnout in this cohort maintained the ability to meet program-designated milestones at or above the level expected of their postgraduate year.
Despite progress in recognizing burnout and efforts to mitigate its effects, it has become endemic among cardiothoracic surgeons. The complex etiology and limited methods of measurement have made definitive interventions for burnout elusive. This proof-of-concept study shows that dynamic measurement of resting HRV may offer an objective measure of burnout in thoracic surgery residents.
Physician burnout, defined as emotional exhaustion, depersonalization, and decreased sense of personal accomplishment, affects the majority of thoracic surgical residents and approximately half of the thoracic surgeons in practice.
Burnout not only affects the health and performance of physicians but also affects our patients and healthcare systems. Physician burnout is associated with twice the likelihood of patient safety incidents and decreased quality of care.
The data amassed over the last decade documenting physician burnout are extensive. Despite this, there are no preventive measures or interventions that consistently address burnout, and it continues to rise among physicians. The cause of burnout is multifaceted, and the significance of each contributing variable may change as the level of training increases and differs between medical specialties.
The MBI is a scale designed to assess various aspects of the burnout syndrome. Although this is a well-validated resource to assess the prevalence of burnout among physicians, the nature of the instrument limits its sensitivity and reliability in measuring the impact of interventions. There is a critical need to identify a minimally invasive approach to objectively measure the physiologic as well as the psychologic impact of interventions aimed at reducing physician burnout.
Prolonged exposure to stressful environments is one of the key drivers of burnout.
Stressful stimuli induce a shift toward overriding sympathetic input (tone) that alters the cardiac rhythm, increasing the heart rate (HR) and decreasing the variability in the time between heartbeats. Heart rate variability (HRV) quantifies the variation in the time interval between consecutive heartbeats in milliseconds. The heart does not beat rhythmically like a metronome; there is constant variability, and that variability mirrors autonomic nervous system activity. In healthy adults exposed to an acute stressor, sympathetic tone is quickly downregulated by parasympathetic input, slowing the HR and increasing the HRV as the stress is reduced. In situations of prolonged stress, decreased parasympathetic tone and lower HRV occur as the individual can no longer balance sympathetic stimulation appropriately.
As a measurable parameter, HRV offers a noninvasive biomarker of stress and recovery. With sympathetic activation during stress, HRV decreases. With increased parasympathetic tone during recovery, HRV increases. HRV has been correlated with burnout in cardiac patients, college students, and athletes who use HRV to design training regimens to improve performance.
We hypothesized that HRV data could be applied to a cohort of thoracic surgical residents and might offer an objective physiologic measure of burnout. The first objective of this study was to report the feasibility of using the Whoop strap to collect physiologic data from thoracic surgical trainees and to determine whether these data correlate with burnout. The second objective was to determine whether burnout would correlate with performance as described by the Accreditation Council for Graduate Medical Education (ACGME) milestones.
Materials and Methods
This was a prospective study of full-time clinical thoracic surgery residents in an early specialization (4 + 3) or standard integrated (5 + 2) training program over a 5-month period ending January 2022. Participants were evaluated with a wearable biometric device (Whoop strap), the MBI, and the ACGME Milestones scores. Demographic data (sex, ethnicity, marital status, and number of children) and academic data (clinical training year, training focus, ACGME milestones report) were collected via survey and residency files. Physiologic data (resting HR, HRV, and sleep parameters) were measured using the Whoop strap. Selection of the physiologic data reported was based on the timing of survey responses. Of the enrolled trainees, 7 of 9 (78%) completed survey data. Participants who were pregnant, breastfeeding, or within 12 weeks of delivery were excluded from physiologic data analysis.
The entire cohort of ACGME fellows (n = 6, postgraduate year [PGY] 6-7) was included in performance and burnout assessments. Performance and burnout data, but not physiologic data, were collected on 1 PGY 5 (4 + 3) resident. The remaining 2 residents enrolled but ultimately did not participate; 1 completed training and left the institution before data collection, and 1 was not compliant with wearing the device or answering the survey. The Institutional Review Board of Washington University reviewed our protocol before data collection and approved the study (201907040, January 31, 2022), and all study participants signed informed consent.
The Whoop strap is a commercially available device worn on the wrist or bicep that measures HR and HRV via photoplethysmography and movement via a 3-axis accelerometer 100 times per second on a continual basis. These measures are used to detect sleep and sleep stage. For each resident, HRV was quantified as the root mean square of successive beat to beat interval differences and was calculated during slow-wave, deep sleep.
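The HRV metric described above, the root mean square of successive beat-to-beat interval differences, can be illustrated with a minimal sketch. This is not the device's algorithm, which is proprietary and not described here; the function name `rmssd` and the example intervals are hypothetical, and the input is assumed to be a clean list of consecutive beat-to-beat intervals in milliseconds taken from a single period of slow-wave sleep.

```python
import math


def rmssd(rr_intervals_ms):
    """Root mean square of successive beat-to-beat interval differences (ms).

    rr_intervals_ms: consecutive beat-to-beat intervals in milliseconds.
    Assumes artifacts and ectopic beats were already removed (an assumption
    of this sketch, not a step described in the study).
    """
    if len(rr_intervals_ms) < 2:
        raise ValueError("at least two intervals are needed")
    # Difference between each interval and the one that follows it.
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    # Square the differences, average them, and take the square root.
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Illustrative (made-up) intervals, not study data.
example_intervals = [1000.0, 1040.0, 980.0, 1010.0]
example_hrv = rmssd(example_intervals)
```

A perfectly regular rhythm gives a value of 0, and larger beat-to-beat fluctuations give larger values, which matches the interpretation of higher HRV as greater parasympathetic tone.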
Residents were asked to wear the devices for the duration of rest periods to capture resting HRV. The devices could be worn continuously providing feedback through the app to the resident, but nonrest periods were not tracked by the study.
Normal sleep in a healthy adult consists of 2 primary sleep states: rapid eye movement (REM) sleep and non-REM sleep. Non-REM sleep comprises approximately 75% to 80% of total sleep time broken into 4 distinct stages with stages 1 and 2 labeled “light sleep” and stages 3 and 4 labeled slow-wave or deep sleep. Cyclic REM sleep occurs 4 to 6 times nightly accounting for approximately 20% to 25% of total sleep.
For this study, REM sleep was reported as a percentage of total sleep as detected by the Whoop strap.
Assessment of Burnout Using the Maslach Burnout Inventory
The MBI is a 22-item survey scored on a 7-point Likert scale designed to measure burnout on 3 dimensions: emotional exhaustion (9 questions, score range 0-54), depersonalization (5 questions, range 0-30), and personal accomplishment (8 questions, range 0-48). The MBI was offered to participants multiple times over the course of 5 months. Participants were given the MBI at enrollment and could record responses to the survey weekly thereafter. There was no minimum response rate required after the initial survey, and participants were sent notifications of opportunity via e-mail with a unique link to the survey. Physiologic data were averaged over a 7-day period that included the day the MBI was submitted for each occurrence. MBI scores for each of the 3 dimensions were categorized as low, moderate, or high based on the scoring and interpretation key provided with the survey using common cutoff points. Emotional exhaustion was categorized as low if less than 17 and high if greater than 26. Depersonalization was categorized as low if less than 7 and high if greater than 12. Personal accomplishment was categorized as low if less than 32 and high if greater than 38. Burnout is defined as reporting a high score for emotional exhaustion, high score for depersonalization, or low score for personal accomplishment.
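The cutoffs and the burnout definition above can be expressed as a short sketch. The function and dictionary names are hypothetical and chosen for illustration; the numeric thresholds are the ones stated in the text, and scores falling between the low and high cutoffs are treated as moderate.

```python
# (low if score < first value, high if score > second value), as stated in the text.
MBI_CUTOFFS = {
    "emotional_exhaustion": (17, 26),
    "depersonalization": (7, 12),
    "personal_accomplishment": (32, 38),
}


def categorize(dimension, score):
    """Return 'low', 'moderate', or 'high' for one MBI dimension score."""
    low_below, high_above = MBI_CUTOFFS[dimension]
    if score < low_below:
        return "low"
    if score > high_above:
        return "high"
    return "moderate"


def meets_burnout_definition(emotional_exhaustion, depersonalization,
                             personal_accomplishment):
    """Burnout: high exhaustion, high depersonalization, or low accomplishment."""
    return (
        categorize("emotional_exhaustion", emotional_exhaustion) == "high"
        or categorize("depersonalization", depersonalization) == "high"
        or categorize("personal_accomplishment", personal_accomplishment) == "low"
    )


# Cohort median scores reported in this study: 30, 16, and 43.
median_is_burnout = meets_burnout_definition(30, 16, 43)
```

Because the definition is a logical "or" across dimensions, a resident with a high sense of personal accomplishment still meets the definition when emotional exhaustion or depersonalization is high, as is the case for the cohort medians.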
Resident performance was quantified using ACGME Thoracic Surgery–Independent Milestones 2.0, a criterion-based competency assessment of residents that occurs every 6 months. Specialty-specific milestones were scored on a scale of 1-5 and normalized to a program-specific target (3 for residents at PGY 6 or lower and 4 for PGY 7 residents). Our 2-year program has evolved into one in which most residents track into general thoracic or adult cardiac surgery, so chief residents spend very little, if any, time in the nonspecialty rotations. As such, we analyzed only specialty-specific milestones to assess progress (Table E1).
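The normalization step can be sketched as follows. The text gives the targets but not the formula, so dividing the milestone score by the target is an assumption of this sketch, as are the function names; under that assumption, a value of 1.0 means the resident is exactly at the level expected for the postgraduate year.

```python
def milestone_target(pgy):
    """Program-designated target: 3 for PGY 6 or lower, 4 for PGY 7."""
    if pgy == 7:
        return 4
    if pgy <= 6:
        return 3
    raise ValueError("no target is stated for PGY levels above 7")


def normalized_milestone(score, pgy):
    """Milestone score (1-5) divided by the PGY target.

    Division by the target is an assumed normalization; the source states
    only that scores were normalized to program-designated targets.
    """
    if not 1 <= score <= 5:
        raise ValueError("milestone scores range from 1 to 5")
    return score / milestone_target(pgy)


# Hypothetical examples, not study data.
at_target_pgy6 = normalized_milestone(3, 6)    # exactly at target
above_target_pgy6 = normalized_milestone(4.5, 6)  # above target
```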
Burnout and physiologic data were analyzed using univariate linear regression analysis (for continuous variables) or chi-square test (for categorical variables). The 3 MBI dimension scores were treated as continuous variables for HRV analysis. For analysis of burnout based on demographic characteristics, MBI dimensions were treated as categorical variables categorized into low, moderate, and high based on the highest value reported by each subject during the study period. Statistical analysis was performed using Sigmaplot version 12.5 for Windows (Systat Software, Inc).
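For a univariate linear regression, the strength of the association is summarized by the Pearson correlation coefficient, reported in the Results as r(df), where df is the number of paired observations minus 2. The sketch below shows that calculation in plain Python. It is an illustration only and not the SigmaPlot procedure used in the study; the function name and the example values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Return (r, degrees_of_freedom) for paired observations xs and ys."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy), n - 2


# Made-up paired values for illustration, not study data:
# an MBI dimension score and a 7-day average HRV for each survey response.
scores = [18, 22, 27, 31, 36]
hrv_ms = [62.0, 55.0, 51.0, 44.0, 40.0]
r_value, dof = pearson_r(scores, hrv_ms)
```

With this notation, a result written as r(12) corresponds to 14 paired observations, which here would be 14 survey responses matched to 7-day physiologic averages.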
The cohort consisted of 71% female participants (5 of 7), with 86% of residents having 1 or more children. No demographic characteristic was associated with any dimension of burnout (Table 1). Table 2 summarizes MBI scores for each dimension. Residents most often reported high levels of emotional exhaustion and depersonalization; however, a positive sense of personal accomplishment persisted. Burnout was experienced by 86% of residents on at least 1 MBI measurement and at least 1 of the criteria was met on 93% of all MBI measurements during the study period.
Table 1. Association of thoracic surgery resident demographics with the 3 dimensions of the Maslach Burnout Inventory
The average resting HR was 61 ± 6 beats/min. Neither emotional exhaustion nor depersonalization correlated with resting HR (P >.5 for all). In contrast, personal accomplishment correlated with a decrease in resting HR (r(12) = 0.57, P = .04); as the resident's sense of personal accomplishment increased, resting HR decreased. HRV correlated significantly with emotional exhaustion (r(12) = 0.65, P = .01, Figure 1) but not with depersonalization (P = .28) or personal accomplishment (P = .24). Likewise, depersonalization and personal accomplishment did not correlate with resident performance on normalized ACGME milestones (P = .12 and P = .75, respectively); however, increased emotional exhaustion showed a significant correlation with higher resident performance on program-specific normalized ACGME milestones (r(6) = 0.76, P = .047, Figure 2).
On average, thoracic surgery residents got 5 hours and 52 minutes (±76 minutes) of sleep nightly. Total sleep did not correlate with any dimension of burnout (P >.4 for all). Residents averaged 24% ± 4% of total sleep time in REM sleep, 22% ± 5% in slow-wave sleep, and 54% ± 5% in light sleep. HRV correlated positively with REM sleep (r(11) = 0.71, P = .01) (Figure 3, A) but not with any other sleep stage (Video Abstract). Additionally, there was a significant negative correlation between emotional exhaustion and percent time spent in REM sleep (r(11) = 0.67, P = .02) (Figure 3, B).
HRV has been used as an objective measure of alterations in stress-related physiology across a multitude of professions, including physicians.
A prior review found that HRV reliably detected differences in mental stress in an operative setting. The review included 17 studies of surgeons (n = 1-20 surgeons) and showed that HRV could be used to pinpoint stressors within the operative environment with regard to time, role, and technique. None of the studies in this collection was designed to use HRV to measure long-term stress or burnout specifically. The Dresden burnout study, a large-scale longitudinal study of burnout in adults, measured burnout (MBI) and HRV (root mean square of successive beat-to-beat interval differences) in 410 subjects.
Within the thoracic surgery residents included in the current report, HRV decreased as emotional exhaustion increased.
When looking at resting HR, previous work found that subjects with significant burnout (n = 22), defined as requiring 2 weeks to 6 months away from work, had increased resting HR compared with healthy controls (n = 23).
In line with these data, we found that a decrease in personal accomplishment was correlated with an increase in resting HR, further supporting the hypothesis that burnout produces physiologic changes that can be measured.
In addition to HR and HRV, the Whoop device detects sleep stages and has been validated against polysomnography. The Swedish National Institute for Psychosocial Factors and Health studied sleep in subjects with severe burnout compared with healthy controls. They classified severe burnout as a Shirom Melamed Burnout Questionnaire score greater than 4.5 on a scale of 1 to 7 and full-time sick leave for 3 months. This approach offers a unique perspective given that subjects in most studies on burnout are recruited from the “working sick,” excluding the most severely affected who are no longer able to work. When compared with controls, subjects with high burnout had 6% less total REM sleep (19% vs 24% of total sleep).
The present study is limited in that sleep fragmentation cannot be accurately documented because frequent sleep disturbances are the hallmark of call shifts. However, REM sleep was found to correlate negatively with increased emotional exhaustion and positively with HRV, strengthening the hypothesis that HRV may offer an objective measure of burnout in thoracic surgical residents.
The ACGME Milestones project was designed to provide a framework for evaluating performance throughout training. Residents are expected to move from novice (a score of 1) toward expert (a score of 5) aiming to achieve a score of 4 at the completion of training.
Unexpectedly, the current report demonstrates that high levels of emotional exhaustion had a significant positive correlation with higher scores on ACGME Milestones performance. One explanation for this finding is that both performance and burnout reflect workload in thoracic surgery residents. The Areas of Worklife Scale is a measure aimed at analyzing the work environment as components that play a role in altering the continuum between engagement and burnout.
Workload has been shown to have a consistent relationship with emotional exhaustion. Within surgical training, workload, as quantified by operative volume, correlates with increased performance, as measured by ACGME milestones and simulated procedural tasks.
With workload leading to both increased performance and emotional exhaustion, it is something of a double-edged sword for surgical training programs. HRV may offer a mechanism by which we can allow the resident to push to the limits of their capacity and safely monitor for adequate recovery allowing for maximum efficiency and improved performance.
Applications and Future Directions
In athletes, the phenomenon of decreased performance despite increased or maintained levels of effort and training is referred to as “overtraining syndrome” (OTS). Also originally termed “burnout,” OTS is a continuum. The burnout continuum begins with undertraining, where demand is less than allostatic load and the athlete meets demands without a required recovery period. The consequence is maintenance of performance but no further improvement. The burnout continuum advances to OTS, where the allostatic load exceeds athletic capacity and the athlete is unable to recover, even with prolonged rest. The ultimate consequence is that performance continually declines. In successful athletic training programs, planned workloads beyond the athlete's current capacity, called “overtraining,” are often used as a means to stimulate adaptation resulting in improved performance. Planned overtraining is termed “over-reaching” and is followed by a period of relative rest or tapered effort to allow for compensation and increased capacity.
The European College of Sport Science and the American College of Sports Medicine consensus statement further supports this construct endorsing the use of the terms functional overreach (FOR) and nonfunctional overreach (NFOR). FOR is determined by the athlete's ability to recover in a period of days to weeks. This is in contrast to NFOR or continued intense training into the period of declined performance requiring weeks to months to recover to baseline performance.
A prior study investigated whether HRV could be used to distinguish athletes with NFOR and OTS (n = 43) from those in the same discipline with the same training load without it (n = 40) and compared all athletes with 35 sedentary subjects of the same age. Similar to what is seen in our cohort with respect to emotional exhaustion, the investigators found that athletes with NFOR/OTS had a lower HRV compared with athletes without NFOR/OTS. Despite obvious superior physical conditioning compared with sedentary controls, the group of NFOR/OTS athletes had HRV similar to sedentary controls. The similarities among the technical demands, performance measures, and required endurance of training in thoracic surgery and elite athletes are remarkable when viewed through the lens of the physiologic response to stress. This parallel offers an opportunity to further explore whether surgical training might benefit from being modeled on the construct of FOR and NFOR.
We hypothesize that adjusting training intensity based on a physiologic indicator of physical and psychological allostatic load (HRV in the case of our thoracic surgery residents) would allow us to maximize the efficiency of training (Figure 4, A). Having real-time data on resident capacity would allow faculty to optimize the timing of increases in responsibility and autonomy. Adjusting expectations on this level would maximize efficiency for both the resident and faculty given that faculty demonstrate more stress when supervising than when performing as the primary operator.
Having continuous, longitudinal HRV data could alert programs when a resident is unlikely to improve from continued maximal effort and might benefit from a pivot in focus similar to cross-training (Figure 4, B). At these points, residents might be encouraged to make progress in less intense requirements such as simulation, patient safety and quality improvement projects, or less physically demanding procedures such as endoscopy or endovascular interventions. These pivots would allow for a period of recovery while continuing to meet the educational goals of the program. We could optimize the targets to better match capacity, that is, work smarter, not harder. Novel contingency plans for flexing coverage will need to be developed to balance program needs against individual workload capacity.
HRV could be used not only to guide training but also to measure the impact of proposed interventions aimed at both the individual and systems levels. Having a physiologic measure of effectiveness would allow programs and hospital systems to focus investments (time and funding) on data-driven interventions. By eliminating perfunctory measures aimed at burnout that are found to be ineffective, we could free up additional resources for the more costly system-level changes. Further investigation into the utility and reliability of HRV as an objective measure of burnout is warranted and has the potential for stimulating positive changes in the efficacy of our training.
To accomplish data-driven interventions, a multi-institutional study would be required for several reasons. The small cohort and single-institution design of this pilot study limit the ability to report many factors known to influence HRV (gender, health conditions, and medications) because of the substantial risk of unintentionally reporting identifiable health information on female and underrepresented minority subjects. Given the disproportionately small number of these individuals currently in training, only a large sample would allow for meaningful analysis while keeping any results truly anonymous.
Additionally, the majority of residents had high levels of emotional exhaustion or depersonalization at each MBI assessment, which did not allow for comparisons between HRV in subjects with and without burnout. For this feasibility study, a control group was not identified. To offer a meaningful assessment, control subjects would need to be trainees with similar roles, responsibilities, and attending physicians. Using trainees from other fields or at junior levels would not assess device performance or study design feasibility in the target environment. Additionally, milestones are specialty-specific, limiting comparisons between subjects and nonthoracic surgery controls. For these reasons, we propose that a large multi-institutional study would provide enough data to select a control cohort of individuals or time points where burnout is not reported. Finally, we were unable to compare changes from baseline because a true baseline HRV could not be calculated: all subjects had been in training for 7 to 10 years at the time of this study. To address this limitation, a longitudinal study following residents over the course of the training period would be required and is recommended by the authors.
These data suggest that HRV may offer an objective measure of burnout in a homogeneous population (Figure 5). A large multi-institutional study design would be necessary for a representative control group to be identified and to report health information known to affect HRV while maintaining the anonymity of the subjects. Additional longitudinal studies are needed to determine whether HRV changes precede or result from changes in emotional exhaustion. Further characterization of HRV as a measure of recovery may allow for a more targeted and efficacious approach to maximizing the training period in thoracic surgery.
Conflict of Interest Statement
M.R.M. is a consultant/advisory board member for Medtronic. The other author reported no conflicts of interest.
The Journal policy requires editors and reviewers to disclose conflicts of interest and to decline handling or reviewing manuscripts for which they may have a conflict of interest. The editors and reviewers of this article have no conflicts of interest.