Source Data
We use cross-sectional data from the second round of data collection (2010/2011) of the Understanding Society Survey [6] (n = 54,587). Understanding Society is a longitudinal household panel survey of approximately 40,000 households in the UK which began in 2009 [7]. Individual participants are interviewed annually on diverse topics such as health, work, education, income, family, and social life. Further information on the study design and sampling methodology is available elsewhere [8]. In the second round of data collection, a representative sub-set of the main sample participated in a nurse-led health assessment (n = 15,777) [9]. A total of 13,107 respondents had data on at least one biomarker. For this study, we further limited the sample to respondents who had valid measures for triglycerides, HDL cholesterol, and total cholesterol biomarkers (n = 12,867). The final restriction placed on our sample was that participants needed to have valid measures on socioeconomic status, physical activity, and demographic characteristics, reducing our sample to n = 4823.
We compared CVD biomarkers between the analysis sample and those who were excluded from the analysis because of missing SES variables; the two groups generated similar results. The only exceptions were the education variables (where a higher number of people had a less healthy triglyceride level among those with missing data) and access to a car (where a higher number of people had a healthier level of both triglycerides and cholesterol ratio); for both variables, the difference in the number of individuals between groups was less than 5 %. As the nurse assessment sample used for the analysis was chosen to be nationally representative, this suggests that our results should be fairly representative of the target population, as there is less than a 5 % difference between those who reported the SES variables and those for whom they were missing for whatever reason.
Ethical approval was not required for the secondary analysis of this anonymised data source. Respondents provided written consent for their blood to be taken and to be stored for future scientific and genetic analyses [10].
Outcomes and Key Variables
Biomarkers for CVD risk that were included as key outcome variables were cholesterol ratio [11] and triglyceride levels [12]. Different cholesterol levels were measured from blood serum using enzymatic methods with a Roche P module analyser calibrated to CDC guidelines [10]. Triglycerides were measured from blood serum using an enzymatic method on a Roche P module analyser [10]. Individual total cholesterol and HDL cholesterol levels were used to calculate the cholesterol ratio (as \( \frac{\mathrm{total}\ \mathrm{cholesterol}}{\mathrm{HDL}\ \mathrm{cholesterol}} \)), which was classified as a binary variable equal to 0 if the ratio of total to HDL cholesterol was less than or equal to 3.8 (a healthy HDL cholesterol ratio) and equal to 1 if the cholesterol ratio was greater than or equal to 3.9 (an unhealthy HDL cholesterol ratio) [13]. Triglycerides were classified as a binary variable equal to 0 (the base category) if triglycerides were between 0.3 and 1.9 mmol/L and equal to 1 if triglycerides were between 2 and 31.9 mmol/L [14].
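The two outcome codings can be sketched in a few lines of Python. This is an illustration only, not the code used in the study (the analysis was run in Stata). It assumes the ratio is total cholesterol divided by HDL cholesterol, the only direction in which a value near 3.8 can arise, and it assumes values are held to one decimal place, which would explain the gap between the 3.8 and 3.9 cut-offs. The function names are hypothetical.

```python
from typing import Optional


def cholesterol_ratio_flag(total_chol: Optional[float],
                           hdl_chol: Optional[float]) -> Optional[int]:
    """Return 0 (healthy) or 1 (unhealthy); inputs are in mmol/L."""
    if total_chol is None or hdl_chol is None or hdl_chol <= 0:
        return None  # missing or unusable measurement
    # Assumption: the ratio is total/HDL, held to one decimal place.
    ratio = round(total_chol / hdl_chol, 1)
    return 0 if ratio <= 3.8 else 1


def triglyceride_flag(trig: Optional[float]) -> Optional[int]:
    """Return 0 for 0.3-1.9 mmol/L, 1 for 2-31.9 mmol/L, None otherwise."""
    if trig is None:
        return None
    value = round(trig, 1)  # assumption: one decimal place, as above
    if 0.3 <= value <= 1.9:
        return 0
    if 2.0 <= value <= 31.9:
        return 1
    return None  # outside the ranges stated in the text
```

For example, a respondent with total cholesterol of 6.0 mmol/L and HDL cholesterol of 1.2 mmol/L has a ratio of 5.0 and is coded 1, whereas a triglyceride level of 1.2 mmol/L is coded 0.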
Three different measures of physical activity were used in the main analysis. Moderate intensity physical activity was defined based upon a positive response to engaging in 29 sports activities that would classify as moderate activity [7]. A binary variable was created that was equal to 0 if the respondent engaged in moderate activity less than three times a week and equal to 1 if the respondent engaged in moderate activity three or more times a week. The second measure was a self-assessed sports activity rating where individuals rated on a scale of 1 to 10 how active they were through leisure-based sport. This was classified as a binary variable for high activity which was equal to 0 if the respondent scored themselves 4 or less and equal to 1 if the respondent reported a score of between 5 and 10. The final physical activity variable captured individual walking activity. A binary variable was created that was equal to 0 if respondents walked for 30 min (or less) four times during the last 4 weeks and equal to 1 if respondents walked for more than 30 min at least four times in the last 4 weeks [15]. As a validity check on our findings, we used a measure of mild physical activity that should not be significantly associated with reducing CVD risk. Mild intensity physical activity was based upon individuals reporting that they engaged in a sporting activity that would require mild exertion. This was classified as a binary variable that was equal to 0 if the respondent engaged in mild activity less than three times a week and equal to 1 if the respondent participated in mild activity three or more times a week.
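The four activity indicators share the same threshold logic, which the following Python sketch makes explicit. It is illustrative only: the dictionary keys and output names are hypothetical, not Understanding Society variable names, and the walking indicator reflects one reading of the definition above (at least four walks of more than 30 min in the last 4 weeks).

```python
from typing import Dict, Optional


def at_least(value: Optional[float], threshold: float) -> Optional[int]:
    """Return 1 if value meets the threshold, 0 if not, None if missing."""
    if value is None:
        return None
    return 1 if value >= threshold else 0


def code_activity(resp: Dict[str, Optional[float]]) -> Dict[str, Optional[int]]:
    """Derive the four binary activity indicators for one respondent."""
    return {
        # moderate activity three or more times a week
        "moderate_active": at_least(resp.get("moderate_sessions_per_week"), 3),
        # self-assessed sport rating of 5 to 10 on the 1-10 scale
        "high_sport_rating": at_least(resp.get("sport_rating_1_to_10"), 5),
        # assumption: counts walks longer than 30 min in the last 4 weeks
        "walker": at_least(resp.get("walks_over_30_min_last_4_weeks"), 4),
        # validity-check measure: mild activity three or more times a week
        "mild_active": at_least(resp.get("mild_sessions_per_week"), 3),
    }
```

Missing responses are passed through as None rather than coded 0, so that such respondents can later be dropped from the analysis sample rather than being counted as inactive.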
We controlled for a number of other factors that may confound the relationship between the biomarkers for CVD risk and physical activity participation. The biomarkers used in this analysis, especially triglycerides [10], may have been affected by medications and consumption of food or drink. We therefore controlled for whether the individual was currently taking lipid-reducing medication and whether they had eaten in the 30 min before blood was taken. Demographic factors such as age, age squared, marital status, presence of children under the age of 12 in the household, and region [16] were included in the analysis. To determine if the relationship between biomarkers and physical activity was mediated by socioeconomic status and difficulty accessing sports facilities, some model specifications included measures of socioeconomic status: binary variables for having access to a car or van, owning one’s house or having a mortgage on it, and being employed, together with highest level of educational attainment achieved and the log of equivalised household income [17]. In addition, in some model specifications, we included a binary variable for whether the respondent reported difficulty in accessing sports facilities [14].
Statistical Analysis
Descriptive analysis was undertaken to gain a better understanding of the prevalence of unhealthy cholesterol ratio and triglyceride levels and to identify physical activity levels and patterning of the confounding variables in the study population. These findings were used to inform the multivariate analysis.
We also performed a number of multicollinearity checks among the physical activity variables and, separately, among the socioeconomic status (SES) variables. For the physical activity variables, the correlations between the different intensities of physical activity were less than 0.40, suggesting no evidence of strong correlation between the different physical activity measures. Therefore, separate consideration in different models was considered appropriate to describe the behaviours and associations with CVD risk among the sample population. Among the seven SES variables, most correlations were very small (less than 0.1); the exceptions were the correlation between income and being educated to degree level (0.23) and a slightly higher correlation between the different educational levels (just above 0.40), which is to be expected as one level of educational attainment is usually correlated with lower levels of educational attainment.
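A pairwise correlation screen of this kind can be sketched as follows. The text does not state which coefficient was used; the sketch assumes the Pearson correlation (equivalent to the phi coefficient for two binary variables) and uses 0.40 as the flagging threshold. The function names and the toy data are hypothetical and are not drawn from the survey.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / sqrt(sxx * syy)


def flag_correlated_pairs(columns: Dict[str, Sequence[float]],
                          threshold: float = 0.40) -> List[Tuple[str, str, float]]:
    """List variable pairs whose absolute correlation reaches the threshold."""
    flagged = []
    for (name_a, col_a), (name_b, col_b) in combinations(columns.items(), 2):
        r = pearson(col_a, col_b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 2)))
    return flagged


# Toy indicators for eight hypothetical respondents (illustrative only).
toy = {
    "moderate": [1, 0, 1, 0, 1, 0, 1, 0],
    "walking": [1, 1, 0, 0, 1, 1, 0, 0],
    "mild": [1, 0, 1, 0, 1, 0, 0, 1],
}
flagged_pairs = flag_correlated_pairs(toy)
```

An empty list from such a screen would correspond to the situation described for the physical activity measures, where all pairwise correlations fell below 0.40.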
The basic statistical analysis involved multivariate logistic regression models in which each of the two biomarkers for CVD is a function of one of the four physical activity variables as well as confounding variables, including demographic characteristics, whether the respondent was currently taking lipid-lowering medication, and whether the respondent had eaten in the half hour before blood was taken. To determine if socioeconomic status attenuates the relationship between physical activity and the biomarkers, variables related to socioeconomic status were added to the basic model. Finally, a variable controlling for difficulty in accessing sports facilities was added to the logistic regression. Physical activity differed significantly by gender for all types of physical activity except walking (χ² test, p < 0.001). Significant differences by gender were also found in the outcome variables of cholesterol level and triglyceride levels (t test, p < 0.001). All analyses were therefore stratified by gender. Survey respondents with missing responses to any of the outcome or explanatory variables required for the analysis were excluded. The analysis was undertaken in Stata v.13 [18].
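To make the modelling strategy concrete, the following Python sketch enumerates the model specifications implied by the description above. It is not the Stata code used for the analysis, and it fits no models. It assumes the three steps are cumulative (basic controls, then SES, then access to facilities) and that every combination of gender, outcome, and activity measure is estimated. All variable names are hypothetical labels.

```python
from itertools import product
from typing import Dict, List

OUTCOMES = ["unhealthy_cholesterol_ratio", "high_triglycerides"]
ACTIVITY = ["moderate_activity", "high_sport_rating", "walking", "mild_activity"]
BASE = ["age", "age_squared", "marital_status", "children_under_12",
        "region", "lipid_medication", "ate_before_blood_draw"]
SES = ["car_access", "home_owner", "employed", "education",
       "log_equivalised_income"]
ACCESS = ["difficulty_accessing_facilities"]


def build_specifications() -> List[Dict[str, object]]:
    """List every gender-stratified logistic model implied by the text."""
    # Assumption: each step keeps the controls of the previous one.
    steps = [
        ("basic", BASE),
        ("plus_ses", BASE + SES),
        ("plus_access", BASE + SES + ACCESS),
    ]
    specs = []
    for gender, outcome, activity, (label, controls) in product(
            ["men", "women"], OUTCOMES, ACTIVITY, steps):
        specs.append({
            "gender": gender,
            "outcome": outcome,
            "model": label,
            # one activity measure per model, followed by the controls
            "predictors": [activity] + controls,
        })
    return specs


specs = build_specifications()
```

Under these assumptions the design comprises 2 genders × 2 outcomes × 4 activity measures × 3 specifications, or 48 logistic regressions, each containing exactly one physical activity variable.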