Associations of timing of physical activity with all-cause and cause-specific mortality in a prospective cohort study

Participants and accelerometer assessment

This large cohort study was based on the UK Biobank. The UK Biobank received ethical approval from the North West Multi-center Research Ethics Committee to collect and distribute samples and data from the participants (reference numbers 16/NW/0274 and 21/NW/0157), which covers the work in this study under approved Application 58082. All UK Biobank participants provided informed consent. In addition, we obtained approval from our institutions to analyze the UK Biobank data in this study. UK Biobank is a large population-based cohort that recruited over 500,000 participants aged 40–73 years between 2006 and 2010 [38]. Participants visited one of 22 assessment centers across England, Scotland, and Wales, and underwent detailed baseline assessments, including various sociodemographic, lifestyle, health, and physical assessments. Details of the rationale, design, and measurements for the UK Biobank are available online. Between February 2013 and December 2015 (on average, approximately 5.5 years after their initial baseline recruitment), 236,519 UK Biobank participants were invited to participate in an accelerometer study. Among them, 106,053 participants agreed to participate and were provided with a wrist-worn accelerometer (Axivity AX3) [39]. Participants who accepted accelerometry measurement showed baseline demographic and health-related characteristics similar to those of participants who declined the measurement [40]. The accelerometer was set up to start at 10 a.m. two working days after postal dispatch (to ensure that it would not start recording during delivery) and to capture triaxial acceleration data over 7 days at 100 Hz with a dynamic range of ±8 g. Participants were instructed to wear the device on their dominant wrist continuously for seven days while continuing with their usual activities. After the seven-day monitoring period, participants were asked to mail the device back to the coordinating centers in a pre-paid envelope.

From the raw accelerometer data (field ID 90001) of 103,682 participants, the UK Biobank accelerometer expert working group generated physical activity intensity data (average vector magnitude in milligravity units) in 5-s epochs (field ID 90004). The raw acceleration signals were calibrated to gravity. Non-wear time was defined as consecutive stationary episodes lasting at least one hour in which all three axes had a standard deviation of less than 13.0 milligravity [39]. Epochs representing non-wear time were imputed from all wear-time data at a similar time of day on different days for each participant. More details about the data processing and analysis have been published [39].

The exclusion criteria were as follows: (1) those who withdrew from UK Biobank; (2) those who had no physical activity (PA) data in any one hour of the 24-h cycle; (3) those who had high nocturnal activity (>10% of PA accumulated between 01:00 and 04:00), as in the previous study [20], because we focused on individuals with a diurnal lifestyle; and (4) those with unreliable or invalid accelerometry data. Accelerometry data were considered unreliable or invalid if they (a) had an unexpectedly small or large size (Field ID: 90002); (b) covered less than 72 h or did not provide data for all 1-h periods within a 24-h cycle during the 7-day data collection (Field ID: 90015); (c) were not well calibrated (Field ID: 90016); (d) were recalibrated using the previous accelerometer record from the same device worn by a different participant (Field ID: 90017); (e) had a non-zero count of interrupted recording periods (Field ID: 90180); or (f) had more than 768 (Q3 + 1.5 × IQR) data recording errors (Field ID: 90182). In total, 11,543 participants were excluded. Finally, 92,139 participants (88.87%) with valid data were included in the main analysis with imputation of missing data, while 89,141 participants with a complete set of data were included in the sensitivity analysis (Supplementary Fig. 4).
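The nocturnal-activity criterion can be illustrated with a minimal sketch. It assumes that each participant's activity has already been aggregated into 24 hourly totals (index 0 for 00:00–01:00, and so on); the function names and the hourly layout are hypothetical and are not taken from the UK Biobank processing pipeline.

```python
def nocturnal_fraction(hourly_activity):
    """Share of total activity accumulated between 01:00 and 04:00.

    hourly_activity: 24 non-negative totals, one per clock hour
    (hypothetical layout; index 0 is the hour starting at 00:00).
    """
    if len(hourly_activity) != 24:
        raise ValueError("expected 24 hourly values")
    total = sum(hourly_activity)
    if total == 0:
        return 0.0
    # 01:00-04:00 covers the hours starting at 01:00, 02:00 and 03:00.
    return sum(hourly_activity[1:4]) / total


def has_high_nocturnal_activity(hourly_activity, threshold=0.10):
    """True when more than `threshold` of activity falls in 01:00-04:00."""
    return nocturnal_fraction(hourly_activity) > threshold


# Illustrative profile (made-up numbers): 20% of activity at night.
night_owl = [0] * 24
night_owl[2] = 20
night_owl[12] = 80
print(has_high_nocturnal_activity(night_owl))
```

The strict inequality mirrors the ">10%" wording of the criterion; a participant at exactly 10% would be retained under this reading.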


The processed activity intensity data were further used to derive moderate-to-vigorous physical activity (MVPA). MVPA, often defined as activity requiring a moderate to large amount of effort and producing a notable to substantial acceleration in heart rate, is a well-validated surrogate for PA [40]. More importantly, as suggested by previous research [20], focusing on high-intensity levels of PA, such as MVPA, helps to determine a clear timing effect. Light-intensity PA was not included in this study because it occurs during walking and even sitting hours, thereby obscuring the temporal distribution of the more effective PA with higher intensity [20]. When we tried to generate PA timing groups using the averaged acceleration data, only 0.74% (n = 680), 10.1% (n = 9316), and 0.70% (n = 641) of the participants were assigned to the morning, midday-afternoon, and evening groups, respectively. The remaining 88.5% (n = 81,502) of the participants were assigned to the mixed group. Therefore, similar to previous research [20], we focused on PA at a relatively high intensity level (i.e., MVPA) to determine a robust timing phenotype.

Moderate-intensity physical activity was identified in sessions (5-min periods in which more than 80% of 5-s epochs had a mean acceleration of 100 to 400 milligravity) [40]. Vigorous-intensity physical activity was defined as 5-s epochs in which the mean acceleration was above 400 milligravity [41]. Because we included only individuals with a diurnal lifestyle, and because high nocturnal activity typically indicates sleep disturbance, we calculated the total minutes of MVPA by summing the minutes of moderate-intensity and vigorous-intensity physical activity between 05:00 and 24:00. For the 8354 (9.07%) participants who provided <1 week of accelerometry data, we extrapolated MVPA data to seven days.
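The intensity definitions above can be sketched as follows. Several details are assumptions made only for illustration: the 5-min sessions are taken as non-overlapping blocks of 60 consecutive 5-s epochs, the 100 to 400 milligravity range is treated as inclusive, vigorous epochs are counted independently of moderate sessions, and the extrapolation to seven days is a simple proportional scaling. The function names are hypothetical.

```python
EPOCH_SECONDS = 5
EPOCHS_PER_SESSION = 60  # 5 min of 5-s epochs


def moderate_minutes(epochs_mg):
    """Minutes in 5-min sessions where >80% of epochs are 100-400 mg.

    Assumption: sessions are consecutive, non-overlapping 5-min blocks.
    """
    minutes = 0
    last_start = len(epochs_mg) - EPOCHS_PER_SESSION
    for start in range(0, last_start + 1, EPOCHS_PER_SESSION):
        block = epochs_mg[start:start + EPOCHS_PER_SESSION]
        in_range = sum(1 for a in block if 100 <= a <= 400)
        if in_range / EPOCHS_PER_SESSION > 0.80:
            minutes += 5
    return minutes


def vigorous_minutes(epochs_mg):
    """Minutes spent in 5-s epochs with mean acceleration above 400 mg."""
    n_epochs = sum(1 for a in epochs_mg if a > 400)
    return n_epochs * EPOCH_SECONDS / 60


def extrapolate_to_week(observed_minutes, days_observed):
    """Scale minutes observed over fewer than 7 days to a 7-day total.

    Assumption: proportional scaling; the source does not state the rule.
    """
    if days_observed <= 0:
        raise ValueError("days_observed must be positive")
    return observed_minutes * 7 / days_observed


# Illustrative block: 50 of 60 epochs (83%) in the moderate range.
block = [150] * 50 + [20] * 10
print(moderate_minutes(block), vigorous_minutes([500] * 12))
```

Only epochs recorded between 05:00 and 24:00 would be passed to these functions to match the summation window described in the text.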

Then, we ran exploratory analyses to determine appropriate boundaries for the time windows used in the main analyses. Exposure-dependent methods, such as equally spaced intervals (which do not appear to be clinically driven), are generally arbitrary and may not be helpful in assessing a variable’s actual predictive value [27]. In contrast, outcome-based methods allow an “optimal” cutoff to be estimated [42,43]. Therefore, we identified the time window boundaries in an exploratory analysis using the outcome-based method. To balance sample size and accuracy, we chose 2-h time window intervals (3-h only for the 21:00–24:00 period) in the exploratory analyses. The 50% method of assigning timing groups was similar to that used previously [20], and it prevents participants from being assigned to multiple timing groups. If over 50% of total daily MVPA occurred during the same 2-h period, participants were assigned to the corresponding group. Those who accumulated less than 50% of their total daily MVPA in every 2-h time window were assigned to the mixed group. As shown in Supplementary Fig. 5, which mimics nonlinear exposure-outcome curves, two change points at 11:00 and 17:00 were consistently observed for all mortality outcomes. Compared with the mixed group, the 2-h timing groups of the morning (05:00–11:00) and evening (17:00–24:00) periods seemed to have higher mortality risks. The 2-h timing groups of the midday to afternoon period (11:00–17:00) presented mortality risks comparable to those of the mixed group (Supplementary Fig. 5). Finally, morning (05:00–11:00), midday to afternoon (11:00–17:00), and evening (17:00–24:00) time windows were used in subsequent MVPA timing grouping and statistical analyses.

Similar to the previous study [20], if ≥50% of total daily MVPA occurred during the same time window, participants were assigned to the corresponding MVPA timing group: morning (05:00–11:00), midday-afternoon (11:00–17:00), or evening (17:00–24:00). Those who accumulated <50% of their total daily MVPA in each of the three time windows were assigned to the mixed group.
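A minimal sketch of this assignment rule is given below. It assumes MVPA minutes are available as 24 hourly totals per participant; the function name, the hourly layout, and the handling of participants with no MVPA (returned as None) are illustrative choices that the text does not specify.

```python
# Time windows as [start_hour, end_hour) on a 24-h clock.
WINDOWS = {
    "morning": (5, 11),
    "midday-afternoon": (11, 17),
    "evening": (17, 24),
}


def assign_timing_group(hourly_mvpa, cutoff=0.50):
    """Assign an MVPA timing group from 24 hourly MVPA totals.

    A participant joins a window's group when at least `cutoff` of the
    MVPA accumulated between 05:00 and 24:00 falls in that window;
    otherwise the participant is placed in the mixed group.
    """
    if len(hourly_mvpa) != 24:
        raise ValueError("expected 24 hourly values")
    total = sum(hourly_mvpa[5:24])
    if total <= 0:
        return None  # assumption: no MVPA, no timing phenotype
    for name, (start, end) in WINDOWS.items():
        if sum(hourly_mvpa[start:end]) / total >= cutoff:
            return name
    return "mixed"


# Illustrative participant: 60% of MVPA at 07:00, 40% at 18:00.
hourly = [0] * 24
hourly[7] = 6
hourly[18] = 4
print(assign_timing_group(hourly))               # morning
print(assign_timing_group(hourly, cutoff=0.70))  # mixed
```

With the ≥50% rule, a participant whose MVPA is split exactly 50/50 between two windows would match both; this sketch returns the earlier window in that case, which is an arbitrary tie-break. The `cutoff` parameter also accommodates stricter fractions.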


The outcomes were all-cause, CVD, and cancer mortality. Cause-specific mortality was ascertained using the International Statistical Classification of Diseases and Related Health Problems, Tenth Revision (Supplementary Table 20). We measured specific mortality due to CVDs (codes I00–I99) and cancer (codes C00–C97) using the death registry. The date and cause of death were obtained from the death datasets of the National Health Service Information Center and the NHS Central Register. In the Cox regression analyses, follow-up was censored at the date of death or the last date of available mortality data (12 November 2021), whichever came first.


Data on possible covariates were obtained using self-reported questionnaires, accelerometers, and registry records. Age (continuous) was calculated from the date of birth and the date of wearing the accelerometer. Self-reported questionnaires were used to determine sex (female/male), ethnicity (white/others), recruitment center (England/Wales/Scotland), and Townsend deprivation index (continuous), which is based on the postcode of residence and uses aggregated data on unemployment, car and home ownership, and household overcrowding. The season of accelerometry wear (i.e., spring, March to May; summer, June to August; autumn, September to November; winter, December to February; UK Meteorological Office definitions) was obtained from the accelerometer data. Other covariate data, including educational attainment (degree or above/any other qualification/no qualification), smoking status (never/previous/current), frequency of alcohol intake (not current/less than three times a week/three or more times a week), and diet-related factors, were obtained from touchscreen questions. We calculated the healthy diet score using the following factors: vegetable intake of at least four tablespoons each day (median), fruit intake of at least three pieces each day (median), fish intake of at least twice per week (median), unprocessed red meat intake of no more than twice per week (median), and processed meat intake of no more than twice per week (median). Sleep duration (<7 h per day/7–8 h per day/>8 h per day) and sleep midpoint (<02:30/02:30–03:30/>03:30) were measured using the accelerometer. Obesity (body mass index ≥30 kg/m²) was obtained from touchscreen questions. Previous diagnoses of diabetes, longstanding illness, depression, CVDs, and cancer were obtained from the self-reported questionnaires, hospital records, and death registry.
The values of some covariates, including education level, smoking status, alcohol consumption, healthy diet score, obesity, diabetes history, longstanding illness, and cancer history, were obtained from touchscreen questionnaires at the time-point closest to the accelerometry (Supplementary Fig. 6). Detailed information sources, assessment timeline, and missing percentages are shown in Supplementary Fig. 6 and Supplementary Tables 20–21.
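Two of the derived covariates lend themselves to a short sketch: the season of accelerometer wear and the healthy diet score. The season mapping follows the month ranges stated above. For the diet score, the text lists the five criteria but not how they are combined, so the sketch assumes one point per criterion met (a score from 0 to 5); the function and parameter names are hypothetical.

```python
def season_of_wear(month):
    """Map a calendar month (1-12) to the meteorological season."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    if 3 <= month <= 5:
        return "spring"
    if 6 <= month <= 8:
        return "summer"
    if 9 <= month <= 11:
        return "autumn"
    return "winter"  # December, January, February


def healthy_diet_score(veg_tbsp_per_day, fruit_pieces_per_day,
                       fish_per_week, red_meat_per_week,
                       processed_meat_per_week):
    """Count how many of the five dietary criteria are met.

    Assumption: each criterion contributes one point to the score.
    """
    criteria = [
        veg_tbsp_per_day >= 4,
        fruit_pieces_per_day >= 3,
        fish_per_week >= 2,
        red_meat_per_week <= 2,
        processed_meat_per_week <= 2,
    ]
    return sum(criteria)


print(season_of_wear(12))                # winter
print(healthy_diet_score(4, 3, 2, 2, 2)) # 5
```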

Statistical analyses

The event numbers of all outcomes were sufficient according to the rule-of-thumb estimation [44], which requires at least ten events per variable. We used multiple imputation to assign any missing covariate values using the “mice” package (v3.13.0) in R [45]. The overall sample and the complete case sample showed similar baseline characteristics (Supplementary Table 22). Before investigating the timing effect of MVPA, we assessed the linear and nonlinear associations of total MVPA volume and MVPA within the three time windows with mortality risk using penalized cubic splines fitted in the fully adjusted Cox models. In addition, we assessed the linear and nonlinear associations between the proportions of MVPA accumulated within the three time windows and mortality risk. Based on the fully adjusted model, cumulative risk curves were generated to show the standardized risks of mortality outcomes according to MVPA timing groups. Collinearity between all covariates was examined via correlation matrix analysis, which revealed no problem of multicollinearity. Cox proportional hazards regression (using the “survival” package v3.2-11 in R) was used to examine the associations of the timing of MVPA with mortality. HRs and their 95% CIs were calculated. We fitted three sequentially adjusted models. Model 1 adjusted for age and sex. Model 2 additionally adjusted for ethnicity, Townsend deprivation index, recruitment center, education level, the season of accelerometer wear, smoking status, alcohol intake, and healthy diet score. To investigate the associations independent of sleep duration, sleep phase, and total MVPA volume, model 3 further adjusted for these factors.
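The events-per-variable rule of thumb mentioned at the start of this paragraph amounts to a simple ratio check. The sketch below is illustrative only: the function names are hypothetical, the event and parameter counts are made-up placeholders rather than figures from this study, and it counts model parameters (so a categorical covariate with k levels would contribute k − 1).

```python
def events_per_variable(n_events, n_parameters):
    """Ratio of outcome events to estimated model parameters."""
    if n_parameters <= 0:
        raise ValueError("n_parameters must be positive")
    return n_events / n_parameters


def meets_epv_rule(n_events, n_parameters, minimum=10):
    """True when there are at least `minimum` events per parameter."""
    return events_per_variable(n_events, n_parameters) >= minimum


# Placeholder numbers, not taken from the study.
print(meets_epv_rule(n_events=300, n_parameters=25))
```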

We performed a series of sensitivity analyses. First, we used different MVPA fraction cutoffs (55, 60, 65, and 70%) to assign timing groups. We did not conduct sensitivity analyses using cutoffs of >70% because of the small sample sizes of the timing groups other than the mixed group. Second, Fine-Gray subdistribution hazards (using the “cmprsk” package v2.2-10 in R) were calculated, incorporating other-cause death as a competing risk for cause-specific mortality [46]. Third, we further adjusted for health-related variables potentially on causal pathways [6], including obesity, diabetes history, longstanding illness, depression history, CVDs, and cancer. Fourth, we restricted the analyses to participants without shift work history and to those without any missing covariate data, respectively. Fifth, we excluded participants who wore accelerometers during daylight-saving time transitions. Sixth, we ran the analyses controlling for the month of accelerometer wear instead of the season of accelerometer wear. Seventh, we excluded events that occurred within one year of follow-up. In addition, we performed analyses censoring follow-up at 31 December 2019 (the start of the COVID-19 pandemic [47]). Finally, we repeated the analyses among those with ≥6 days of accelerometer wear.

Multiplicative and additive interaction (using the “interactionR” package v0.1.3.9000 in R) analyses and subgroup analyses were performed on age, sex, MVPA level (meeting the WHO recommendation [1,3] or not), CVDs, and obesity (BMI ≥30 kg/m²). All statistical tests were two-sided, and a P value of <0.05 was regarded as statistically significant. To account for multiple testing, P values in fully adjusted models were corrected using the false-discovery rate (FDR) [48]. All statistical analyses were performed using R v4.0.4 and SPSS v26. R codes are available with the online version of this article (Supplementary Code) and at
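The FDR correction can be illustrated with the Benjamini-Hochberg step-up procedure. This is an assumption: the text cites an FDR method without naming the procedure, and the study's analyses were run in R rather than with this code. The function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up P values for illustration only.
raw = [0.01, 0.04, 0.03, 0.20]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"p = {p:.2f}  adjusted = {q:.4f}")
```

An adjusted value below 0.05 would then be read as significant at a 5% false-discovery rate, in line with the threshold stated in the text.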

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

