Association of COL5A1 gene polymorphisms and risk of tendon-ligament injuries among Caucasians: a meta-analysis

Background Tendons and ligaments are common sites of musculoskeletal injuries, especially during physical activity. The multifactorial etiology of tendon-ligament injury (TLI) includes both genetic and environmental factors. The genetic component may either elevate or reduce TLI risk. Objective Inconsistency of reported associations of the collagen type V alpha 1 chain (COL5A1) polymorphisms, mainly rs12722 (BstUI) and rs13946 (DpnII), with TLI warrants a meta-analysis to determine more precise pooled associations. Methods A multi-database literature search yielded eight articles (11 studies) for inclusion. Pooled odds ratios (ORs) and 95% confidence intervals were used to estimate associations. Heterogeneity of outcomes warranted examining their sources with outlier treatment. Results All rs12722 effects indicated reduced risk (OR < 1.0). The significant outcomes (ORs 0.59–0.77, p = 0.0009–0.04) in the pre-outlier analysis were non-heterogeneous (p > 0.10). The non-significant and heterogeneous (ORs 0.63–0.98, p = 0.13–0.95; up to I² = 86%) pre-outlier rs12722 and rs13946 results became significant (ORs 0.32–0.78, p = 10⁻⁵–0.01), and heterogeneity was eliminated (I² = 0%), with outlier treatment. Significant associations (ORs 0.26–0.65, p = 0.002–0.03) were also observed in other COL5A1 polymorphisms (rs71746744 and rs16399). Sensitivity analysis deemed all significant outcomes to be robust. Conclusions In summary, COL5A1 polymorphisms reduce the risk of TLI among Caucasians. These findings are based on the evidence of significance, homogeneity, consistency, and robustness. Additional studies are warranted to draw more comprehensive conclusions. Electronic supplementary material The online version of this article (10.1186/s40798-018-0161-0) contains supplementary material, which is available to authorized users.

The significant, non-heterogeneous reduced risk of tendon injury observed for rs71746744 and rs16399 obviated the need for outlier treatment.

Background
Normal tendons and ligaments differ in function and are impacted under conditions of injury [1]. Tendon-ligament injury (TLI) includes Achilles tendon pathology (ATP), Achilles tendinopathy (AT), tennis elbow (TE), and anterior cruciate ligament rupture (ACLR). These common sites of musculoskeletal injuries are occupational and sports-related [2,3]. ATP is a broad term that refers to AT, which results from acute or repetitive mechanical loading during occupational and sporting activities [4]. TE (lateral epicondylitis) is a painful musculotendinous condition originating from the lateral epicondyle of the humerus; it is related to overuse [5] and involves highly repetitive movements [6]. ACLR involves a sprain or tear of the anterior cruciate ligament, which scaffolds the bones within the knee and helps to keep it stable. Over 70% of ACLR injuries are noncontact [7], which places athletes performing sudden decelerations or changes in direction at high risk (up to 10 times) [8]. ACL ruptures are considered one of the most severe injuries sustained in sports [9]. Etiology of TLI involves genes and protein structure changes [3]. Collagen composition and expression of the genes that encode these proteins have been shown to be altered in TLI [10]. Collagen is the main component of tendons and ligaments [11]; its fibrils comprise collagen types I, III, V, VI, XI, and XIV [12]. Type V collagen may be a structurally minor player in the collagen hierarchy but is functionally prominent, playing an important role in regulating fiber diameter as well as the assembly (fibrillogenesis) of collagen fibers [13]. Type V collagen protein is encoded by the collagen type V alpha 1 chain (COL5A1) gene, which is located on the long (q) arm of chromosome 9 [14] and is expressed in both tendons and ligaments [1].
Polymorphisms in the COL5A1 gene have been found to impact upon TLI, the most prominent being rs12722 (BstUI) and rs13946 (DpnII), which are found within the 3′-untranslated region (UTR) [15]. Other COL5A1 polymorphisms (rs71746744, rs16399, and rs3196378) have been considered as risk modifiers [16]. Primary studies have been conducted to investigate genetic risk factors involving COL5A1 polymorphisms in TLI, but results have been inconsistent. This inconsistency may be attributed to a lack of statistical power because of small sample sizes. Meta-analysis synthesizes primary study data, yielding an aggregate sample size that raises statistical power. Therefore, we performed a meta-analysis of all available data on COL5A1 polymorphisms and their relationship with TLI in order to obtain better estimates of associations.

Selection of studies
We searched MEDLINE using PubMed, Science Direct, and Google Scholar for association studies as of April 15, 2018. The terms used were "type V collagen," "COL5A1," "ligament injury," "tendon injury," and "polymorphism" as medical subject heading and text, restricted to English. References cited in the retrieved articles were also screened manually to identify additional eligible studies. Inclusion criteria were (i) case-control studies evaluating the association between COL5A1 polymorphisms and risk for TLI, (ii) sufficient genotype/allele frequency data presented to calculate the odds ratios (ORs) and 95% confidence intervals (CIs), and (iii) participants were either athletes or non-athletes. Exclusion criteria were (i) non-English articles and (ii) studies whose genotype or allele frequencies were unavailable or, when available, were combined with those of other polymorphisms, preventing proper data extraction.

Data extraction
Two investigators (NP and PT) independently extracted data and arrived at a consensus. Extracted data were tabulated; when needed, we contacted authors of the original articles to request additional information. The following information was obtained from each publication: first author's name, year of publication, country of origin, ethnicity, TLI and COL5A1 polymorphism type, and basis for matching (Table 1). Departures of genotypic frequencies from Hardy-Weinberg equilibrium (HWE) in control subjects were determined with the χ² test.
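The HWE check on control genotypes described above can be sketched as follows. This is a minimal illustration, assuming a standard one-degree-of-freedom goodness-of-fit test on biallelic genotype counts; the function name `hwe_chi2` and the example counts are hypothetical and are not taken from the included studies.

```python
import math

def hwe_chi2(n_wtwt, n_wtvar, n_varvar):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium."""
    n = n_wtwt + n_wtvar + n_varvar
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_wtwt + n_wtvar) / (2 * n)  # wild-type allele frequency
    q = 1 - p                              # variant allele frequency
    observed = (n_wtwt, n_wtvar, n_varvar)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip classes with zero expectation (monomorphic sample) to avoid 0/0.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical control genotype counts (wt-wt, wt-var, var-var).
chi2, p_value = hwe_chi2(60, 30, 10)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

A control group with p < 0.05 in such a test would be treated as departing from HWE.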

Modifier treatment and subgrouping
Additional file 1: Table S1 shows the sample sizes, number of cases and controls, and genotype frequencies, including the minor allele frequency (MAF) and p values for HWE. Confining the analyses to HWE-compliant studies constituted modifier treatment. Subgrouping was based on injury type (tendon or ligament).

Quality assessment of the studies
We used the Clark-Baudouin (CB) scale to evaluate methodological quality of the included studies [17]. We found this scale most appropriate because it uses criteria such as p values, power, corrections for multiplicity, comparative sample sizes between cases and controls, use of the HWE, and genotyping methods. CB scores range from 0 (worst) to 10 (best) where scoring is based on quality (low < 5, moderate 5-7, and high ≥ 8).

Meta-analysis
Examining five COL5A1 polymorphisms (rs12722, rs13946, rs71746744, rs16399, and rs3196378) warranted the use of a common notation indicating variant (var) and wild-type (wt) alleles. Additional file 1: Table S1, however, details the genotypes for each of the five COL5A1 polymorphisms. After estimating TLI risk (OR) for each study, pooled ORs with 95% CIs were calculated for the following genetic models: (i) homozygous (var-var versus wt-wt), (ii) recessive (var-var versus wt-var + wt-wt), (iii) dominant (wt-wt versus wt-var + var-var), and (iv) codominant (var versus wt). To compare effects on the same baseline, we used raw data for genotype frequencies to calculate pooled ORs, using either fixed [18] or random [19] effects models. Heterogeneity between studies was (i) estimated with the χ²-based Q test [20] and (ii) quantified with the I² statistic, which measures the degree of inconsistency among studies [21]; (iii) its sources (outliers) were detected with the Galbraith plot [22] and then subjected to outlier treatment, which involves elimination of the outliers followed by reanalysis. Sensitivity analysis, which involves omitting one study at a time and recalculating the pooled ORs, was used to test for robustness of the summary effects. Robustness indicates that the pooled effects are stable, remaining unaltered even when each study is removed. Publication bias was not examined because the qualitative and quantitative tests have low sensitivity when the number of studies is < 10 [23]. Data were analyzed using Review Manager 5.3 (Cochrane Collaboration, Oxford, England), SIGMASTAT 2.03 and SIGMAPLOT 11.0 (Systat Software, San Jose, CA, USA). Two-sided p values of < 0.05 were considered significant except in estimations of heterogeneity. Given the low power of the χ²-based Q test for heterogeneity, the p value was set at < 0.10 [24].
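The genetic-model contrasts and the pooling steps described above can be illustrated with a short sketch. Several choices here are assumptions rather than the authors' procedure: inverse-variance weighting is used for the fixed-effect estimate and the DerSimonian-Laird estimator for the random-effects estimate (Review Manager offers other estimators as well), the variant-containing group is placed in the numerator of every contrast, and all genotype counts are hypothetical. The Q-test p value is omitted to keep the sketch free of third-party dependencies.

```python
import math
from statistics import NormalDist

def contrast(case, ctrl, model):
    """Build a 2x2 table (a, b, c, d) from (wt-wt, wt-var, var-var) counts."""
    def split(genotypes):
        ww, wv, vv = genotypes
        if model == "homozygous":   # var-var versus wt-wt
            return vv, ww
        if model == "recessive":    # var-var versus wt-var + wt-wt
            return vv, wv + ww
        if model == "dominant":     # var carriers versus wt-wt (orientation assumed)
            return wv + vv, ww
        if model == "codominant":   # var allele versus wt allele
            return 2 * vv + wv, 2 * ww + wv
        raise ValueError(f"unknown model: {model}")
    return split(case) + split(ctrl)

def log_or(a, b, c, d):
    """Log odds ratio and its variance; adds 0.5 to every cell if any is zero."""
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def summarize(y, se):
    """Convert a pooled log OR and its standard error to OR, 95% CI, and p."""
    z_crit = NormalDist().inv_cdf(0.975)
    p = 2 * (1 - NormalDist().cdf(abs(y / se)))
    return {"or": math.exp(y),
            "ci": (math.exp(y - z_crit * se), math.exp(y + z_crit * se)),
            "p": p}

def pool(tables):
    """Fixed- and random-effects pooled ORs with Cochran's Q, I2, and tau2."""
    ys, vs = zip(*(log_or(*t) for t in tables))
    w = [1 / v for v in vs]
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, ys))
    df = len(tables) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # DerSimonian-Laird
    w_re = [1 / (v + tau2) for v in vs]
    y_random = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    return {"fixed": summarize(y_fixed, math.sqrt(1 / sw)),
            "random": summarize(y_random, math.sqrt(1 / sum(w_re))),
            "Q": q, "I2": i2, "tau2": tau2}

# Hypothetical (case, control) genotype counts for three studies.
studies = [((30, 50, 20), (20, 50, 30)),
           ((45, 40, 15), (35, 45, 20)),
           ((28, 52, 20), (25, 50, 25))]
result = pool([contrast(case, ctrl, "recessive") for case, ctrl in studies])
print(result)
```

Running `pool` once per genetic model would give one pooled estimate per contrast; a pooled OR below 1.0 in this orientation would indicate reduced risk for the variant-containing group.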
Figure 1 outlines the study selection process in a flowchart following PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. A total of 281 citations from the initial search were subjected to a series of omissions that eventually yielded eight articles for inclusion [25–32]. Table 1 shows that the included studies were published from 2006 to 2016. We confined our meta-analysis to Caucasians in order to reduce the likelihood of confounding by population stratification. Injury type subgroups were tendon (AT/ATP/TE) and ligament (ACLR). The mean (7.88 ± 0.99), median (8.0), and range (6–9) of the CB scores indicate that methodological quality of the component studies was high. Additional file 1: Table S1 shows the quantitative features of the component studies under each of the five polymorphisms. Independent data from three articles [25,29,31] (on account of geography) brought the number of studies to nine, five, three, two, and three for rs12722, rs13946, rs71746744, rs16399, and rs3196378, respectively. Respective aggregate sample sizes (case/control) for these polymorphisms are 1234/1667, 546/934, 191/299, 120/254, and 270/520. Assuming a small effect size (d = 0.20) and an α level of 0.05 (two-tailed), statistical power is adequate for rs12722 (99%) and rs13946 (96%). Two studies [26,28] reported data on the rs12722 A1 and A2 alleles, which are BstUI restriction fragment length polymorphism products arising from two single-nucleotide substitutions, the A2 allele referring to the variant C allele [33]. Extracted separately, these data yielded highly significant p values for HWE (up to 10⁻⁵). We minimized this departure by combining the A1 and A2 data. Three studies from two articles were HWE-non-compliant [26,31].
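The power figures quoted above can be approximated from the aggregate case/control totals. The sketch below assumes a two-sided, two-group comparison with a normal approximation; the source does not state which power calculation was used, so the formula and the function name `power_two_group` are assumptions. Sample sizes are those given in the text.

```python
from statistics import NormalDist

def power_two_group(n_cases, n_controls, d=0.20, alpha=0.05):
    """Approximate two-sided power to detect a standardized effect size d."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Noncentrality: effect size scaled by the effective per-group sample size.
    ncp = d * (n_cases * n_controls / (n_cases + n_controls)) ** 0.5
    return norm.cdf(ncp - z_crit) + norm.cdf(-ncp - z_crit)

# Aggregate case/control totals reported in the text.
samples = {
    "rs12722": (1234, 1667),
    "rs13946": (546, 934),
    "rs71746744": (191, 299),
    "rs16399": (120, 254),
}
for snp, (cases, controls) in samples.items():
    print(snp, f"{power_two_group(cases, controls):.3f}")
```

Under this approximation, power comes out above 99% for rs12722 and near 96% for rs13946, in line with the values stated above.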

Study characteristics
The PRISMA checklist provides detailed description of this meta-analysis (Additional file 2: Table S2).

rs12722 and rs13946 effects
Pooled effects indicating reduced (protective) risk (OR < 1.0) were observed in all significant outcomes (p < 0.05), in the non-significant outcomes that were altered by outlier treatment, and in all post-outlier outcomes (Table 2). All post-outlier outcomes were also non-heterogeneous (fixed-effects). Outlier treatment induced the following outcomes: (i) significance either acquired or elevated, (ii) heterogeneity either reduced or eliminated, and (iii) narrowing of CIs indicating increased precision.

Overall
Our hypothesis-driven approach indicates associations of COL5A1 with TLI. Table 2 shows significant pre-outlier effects in rs12722 but not rs13946. These associations were observed in the overall (ORs 0.69-0.72, p = 0.003-0.04) and modifier (OR 0.68, p = 0.004) analyses. Non-significant heterogeneous pooled effects in rs12722 and rs13946 (overall and HWE: ORs 0.76-0.90, p = 0.13-0.69) were altered to significance with outlier treatment, outcomes of which are summarized in Table 3.

Mechanism of outlier treatment in rs12722
Operation of outlier treatment is visualized in Figs. 2, 3, and 4 for rs12722 in the dominant model. In Fig. 2, the pooled effect is non-significant (OR 0.90, p = 0.49) and heterogeneous (I² = 69%). Sources of this heterogeneity were identified with the Galbraith plot visualized in Fig. 3, which shows the two outlying studies [29,31] located above the +2 confidence limit. Figure 4 shows the post-outlier value of acquired significance (OR 0.72, p = 0.0003) and eliminated heterogeneity (I² = 0%).
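The outlier treatment described above can be sketched in a few lines. A Galbraith plot places each study at x = 1/SE and z = log OR/SE, with a regression line through the origin whose slope is the fixed-effect pooled log OR; studies farther than 2 units from that line are treated as outliers. The data below are hypothetical, and the inverse-variance weighting is an assumption rather than the authors' stated estimator.

```python
def pooled_log_or(log_ors, ses):
    """Inverse-variance fixed-effect pooled log odds ratio."""
    w = [1 / se ** 2 for se in ses]
    return sum(wi * yi for wi, yi in zip(w, log_ors)) / sum(w)

def i_squared(log_ors, ses):
    """I2 (%) derived from Cochran's Q."""
    pooled = pooled_log_or(log_ors, ses)
    q = sum((y - pooled) ** 2 / se ** 2 for y, se in zip(log_ors, ses))
    df = len(log_ors) - 1
    return max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

def galbraith_outliers(log_ors, ses, limit=2.0):
    """Indices of studies outside the +/- `limit` band of a Galbraith plot."""
    slope = pooled_log_or(log_ors, ses)
    flagged = []
    for i, (y, se) in enumerate(zip(log_ors, ses)):
        x, z = 1 / se, y / se      # plot coordinates: precision and z score
        residual = z - slope * x   # vertical distance from the regression line
        if abs(residual) > limit:
            flagged.append(i)
    return flagged

# Hypothetical study-level log ORs and standard errors.
log_ors = [-0.30, -0.35, -0.25, -0.30, 0.60]
ses = [0.10, 0.10, 0.10, 0.10, 0.20]

flagged = galbraith_outliers(log_ors, ses)
before = i_squared(log_ors, ses)
kept = [i for i in range(len(log_ors)) if i not in flagged]
after = i_squared([log_ors[i] for i in kept], [ses[i] for i in kept])
print(flagged, round(before, 1), round(after, 1))
```

In this toy example the single discordant study is flagged, and removing it lowers I² from a high value to zero, mirroring the pattern reported for rs12722 in the dominant model.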

Sensitivity analysis
This treatment was applied in all comparisons and stratified by genetic model with emphasis on significant effects (*). Table 5 summarizes our sensitivity analysis findings. The total number of significant effects across comparisons and genetic models is indicated by (S).
Comparisons that were robust are labeled as such (robust), and those that were not are identified by the reference number of the study whose omission altered the pooled effect. Each identified reference number thus reduces the robustness of a comparison. The total number of robust comparisons is indicated by (B), and the total number of identified reference numbers is indicated by (A). This approach to sensitivity treatment allows identification of the most and least robust comparisons and genetic models. Aggregate robustness was based on the lowest count for A and the highest counts for B and S. Thus, the most robust comparisons were the rs12722 (pre- and post-outlier outcomes) and rs71746744 effects and, in terms of genetic model, the recessive and codominant effects.
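The leave-one-out procedure behind these robustness labels can be sketched as follows. This is an illustration under assumptions: pooling uses inverse-variance fixed-effect weights, a comparison is called robust when no single omission changes whether the pooled effect is significant at p < 0.05, and the study labels and effect sizes are hypothetical.

```python
from statistics import NormalDist

def pooled_p(log_ors, ses):
    """Fixed-effect pooled log OR and its two-sided p value."""
    w = [1 / se ** 2 for se in ses]
    y = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum(w)
    se = (1 / sum(w)) ** 0.5
    p = 2 * (1 - NormalDist().cdf(abs(y / se)))
    return y, p

def leave_one_out(log_ors, ses, labels, alpha=0.05):
    """Omit each study in turn and report which omissions alter significance."""
    log_ors, ses = list(log_ors), list(ses)
    _, p_all = pooled_p(log_ors, ses)
    baseline = p_all < alpha
    altered_by = []
    for i, label in enumerate(labels):
        _, p = pooled_p(log_ors[:i] + log_ors[i + 1:], ses[:i] + ses[i + 1:])
        if (p < alpha) != baseline:
            altered_by.append(label)  # this study's removal changes the outcome
    return {"significant": baseline, "robust": not altered_by,
            "altered_by": altered_by}

# Hypothetical comparison driven by one large study.
report = leave_one_out([-0.6, 0.0, 0.0], [0.1, 0.2, 0.2], ["A", "B", "C"])
print(report)
```

Counting the entries of `altered_by` across comparisons would correspond to the tally labeled (A) above, and counting comparisons with `robust` set to true would correspond to (B).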

Summary of findings
Applying meta-analytical techniques is methodologically complex. This complexity arises from interpreting outcomes resulting from the combined applications of genetic modeling, modifier, outlier, and subgroup analyses as well as sensitivity treatment. These considered, rs12722 was more significant (p < 0.05) than rs13946. For both polymorphisms, the homozygous and recessive models were more strongly associated with TLI than the dominant and codominant models. On the whole, this meta-analysis delineated which polymorphisms in the COL5A1 gene have associations with TLI (rs12722, rs71746744, rs16399) and those that do not (rs3196378) as well as those that were altered (rs12722 and rs13946) with meta-analytical treatment (outlier analysis) such that significance was either intensified or gained. The main finding of this study points to significant reduced risk effects (all ORs < 1.0), up to 41% and 55%, pre- and post-outlier, respectively, in rs12722 and up to 68% in rs13946, post-outlier only. Outlier treatment impacted upon heterogeneity, significance, and precision of outcomes, as seen in the overall, modifier, and subgroup analyses. Significant effects (up to 69%) of the other COL5A1 polymorphisms were consistent and homogeneous (most had I² = 0%), indicating associations of rs71746744 and rs16399 with tendon injury. The combination of significance, consistency, homogeneity, and increased precision of pooled effects in these comparisons strengthened our findings.

Comparison with another meta-analysis
We compare our findings with a recent (March 2018) meta-analysis [34] which examined rs12722 only, whereas ours examined five COL5A1 polymorphisms. This additional array enabled us to compare and contrast the effects of rs12722 with those of rs13946, not only in the overall analysis but also in the modifier and subgroup outcomes. Caucasian rs12722 comparisons in Lv et al. [34] and in our study were based on sample sizes of 1381 and 2677, respectively. Recessive outcomes from these two meta-analyses had contrasting effects. This led us to examine qualitative and quantitative differences at the data level of the primary studies in rs12722 among Caucasians. First, both meta-analyses had six studies in common [26–30,32]. The previous study [34] included Raleigh et al. [35], who examined interaction between matrix metalloproteinase 3 (MMP3) and COL5A1 but did not differentiate genotype data between these two genes, which was our reason for excluding it from our meta-analysis. Of note, MMP3 was genotyped but not COL5A1. Second, we had two added studies [25,31] which were not in the previous study [34]. Third, the recessive model was defined differently: TT versus TC+CC in the previous study [34] and var-var versus wt-var+wt-wt in ours. Given this contrast, the calculated ORs would inevitably diverge. Thus, the ORs indicated increased risk in the previous study [34] and reduced risk in ours. Other major differences were as follows: (i) we used outlier treatment, whereas the previous study [34] did not; (ii) they tested publication bias, whereas we did not; and (iii) we used standard genetic models, whereas Lv et al. [34] used a model-free approach.

Comparisons with primary studies
Comparing our reduced risk pooled findings with the component study-specific ORs shows the following for the rs12722 CC genotype: (i) decreased AT risk of up to 62% in different populations (Australia and South Africa) [31], (ii) reduced ACLR risk from three studies [29,30,32], and (iii) contrasting effects of the A2 and A1 alleles in rs12722, which elicited protective and increased risks in two studies, respectively [26,28]. In rs12722, individuals with the CC genotype had a significantly decreased risk of developing AT compared with those with a T allele in either TT or TC genotypes [31]. Genetic variations in the COL5A1 gene 3′-UTR region affect mRNA stability and its export from the nucleus after transcription, where regulatory sequences control gene expression at the posttranscriptional level [25]. Therefore, mutations or single-nucleotide variations within this region may alter mRNA secondary structure and thus protein characteristics [36]. Functionally, the rs12722 and rs71746744 variants are believed to alter stability of the COL5A1 mRNA [25]. Abrahams et al. [25] investigated other variants in the 3′-UTR of the COL5A1 gene, where rs71746744 and rs16399 were found to have significant association with AT. The rs71746744 variant del/del genotype was found to be associated with reduced risk of AT. Furthermore, they surmised linkage of rs12722 with rs71746744 and rs16399 [25], which suggests that the protective effects observed in this meta-analysis might be attributed to any or all of these three polymorphisms.
Reported findings on the role of genetic variants in TLI have differed between studies. Several methodological problems may explain the discrepancies, including limited statistical power, unrecognized confounding factors, misleading definition of phenotypes, and stratification of populations [17]. Study-specific effects of COL5A1 polymorphisms ranged from presence to absence of associations. Where associations were present, risk effects for the variant genotype were increased or reduced, significant or not. Meta-analysis, however, gives more information in reporting effects for COL5A1. This involves exploring magnitude and precision of effects, as well as consistency and stability, all in consideration of heterogeneous outcomes. These features raise the levels of evidence to support conclusions on the associations of COL5A1 polymorphisms with TLI.

Strengths and limitations
Our findings should be interpreted in light of the study's strengths and limitations. Limitations of our study include the following: (i) we did not examine gender effects due to insufficiency of data. Nevertheless, Posthumus et al. [30] examined gender differences of the CC genotype among ACLR participants; (ii) A1 and A2 alleles for BstUI were not examined separately; instead, we combined them, which may have sacrificed precision of outcomes, given the reported contrasting effects of these two alleles in the literature [26,28]; (iii) linkage disequilibrium was reported for the COL5A1 polymorphisms in the component studies [25,31,32], which may have introduced bias [37] by masking the identity of the true causal variant; and (iv) the low number of studies (n = 2-3) and underpowered status (58% and 44%) warrant caution in interpreting the significant effects of rs71746744 and rs16399.
On the other hand, the strengths of this meta-analysis are as follows: (i) confining our meta-analysis to Caucasians rendered epidemiological homogeneity to the study, which reduces potential confounding effects of population stratification; (ii) non-HWE-compliant studies were a minority, which minimizes the issue of genotyping errors and thus avoids methodological weaknesses in the summary outputs [38]. Besides, confining our analyses to HWE-compliant studies did not materially alter the outcomes in all genetic models; (iii) overall methodological quality (determined by CB) of the included studies was high; (iv) aggregate case/control totals for rs12722 and rs13946 show that the significant findings from these two polymorphisms have statistical powers of 99% and 96%, respectively; (v) all controls were defined as healthy; (vi) most (75%) tissue sources were blood; (vii) all controls were matched with cases, with 88% based on age; and (viii) sensitivity treatment deemed the significant outcomes to be robust.

Conclusions
The importance of our results is underpinned by the fact that each component study in this meta-analysis lacked adequate statistical power, but when combined using meta-analysis, clear reduced risk associations of COL5A1 polymorphisms with TLI are uncovered. The genetic structure of the homozygous and recessive models points to the variant allele as protectively associated with TLI. TLI is a complex condition involving interactions of several genetic and non-genetic risk factors. Gene-gene and gene-environment interactions have been reported to have roles in the associations of COL5A1 polymorphisms with TLI [3]. None of the eight included articles mentioned gene-environment interaction, but haplotype analysis has been addressed in the component studies [25,27,29,31,32]. However, it should be emphasized that phenotypic variations between tendon (AT, ATP, and TE) and ligament (ACLR) engender different aetiologies. Additional well-designed studies that explore other ethnic groups based on sample sizes commensurate with detection of small genotypic risks would allow more definitive conclusions about the association of COL5A1 polymorphisms and TLI. Injury-related issues in sports medicine call for comprehensive investigation of their risks. Our contribution is the synthesis approach offered by meta-analysis in elevating the level of evidence. Given the focus of this study, we hope to have clarified the genetic risks posed by COL5A1 polymorphisms to TLI. The mainly protective findings of COL5A1 rs12722 and rs13946 polymorphisms may be modest given our focus on just one gene. However, we hope that the evidence we present contributes to a better understanding of the genetic nature of TLI.