Introduction

Globally, an estimated 460 million individuals have type 2 diabetes, most of whom require long-term use of glucose-lowering medications to maintain glycaemic control [1]. Several different classes of oral glucose-lowering medications are used to manage this condition, including biguanides (e.g. metformin), sulfonylureas, thiazolidinediones, dipeptidyl peptidase-4 (DPP-4) inhibitors, sodium–glucose cotransporter 2 (SGLT2) inhibitors and glucagon-like peptide-1 receptor (GLP1R) agonists, with diverse mechanisms of action [2].

Preclinical studies have variably reported both carcinogenic and antineoplastic effects of glucose-lowering medications. For example, in vitro studies have suggested that metformin, an insulin sensitiser and first-line therapy for type 2 diabetes, can reduce cell proliferation, induce apoptosis and cause cell cycle arrest [3]. Thiazolidinediones, which are insulin sensitisers and selective peroxisome proliferator-activated receptor gamma (PPARG) agonists, have been suggested to increase cellular differentiation, reduce cellular proliferation and induce apoptosis in some cell lines but to promote metastatic prostate cancer in vivo [4,5,6]. There is also some evidence that sulfonylureas, secretagogues that lower blood glucose levels by stimulating pancreatic insulin secretion, may promote carcinogenesis, potentially via increasing circulating insulin levels [7, 8]. Finally, in vitro studies have reported potential antiproliferative effects of GLP1R agonists in various cancer cell types [9,10,11].

Epidemiological studies of glucose-lowering medication use have provided some support for findings from laboratory studies. For example, some observational studies have reported that metformin users have lower risk of several cancers while sulfonylurea use has been associated with an increased risk of site-specific (i.e. colorectal, metastatic prostate) and overall cancer [12,13,14,15,16,17]. In addition, some thiazolidinediones (i.e. pioglitazone) have been linked to an elevated risk of bladder, prostate and pancreatic cancer, though use of rosiglitazone has been associated with lower breast cancer risk [18, 19]. Finally, GLP1R agonist use has been associated with a decreased risk of prostate cancer when compared with sulfonylurea use [20].

The causal nature of associations reported between glucose-lowering medication use and cancer risk in conventional epidemiological studies is often unclear. This is because of the susceptibility of such studies to residual confounding (e.g. due to indication) and various forms of bias (e.g. immortal time, prevalent user), which can undermine robust causal inference [21]. While clinical trials of glucose-lowering medications have not consistently reported differences in rates of cancer among users of these medications, such studies are often underpowered to detect effects for individual cancer sites [22, 23]. Further, such studies often have limited follow-up periods, thus are not able to adequately capture outcomes with long induction periods, such as cancer.

Drug target Mendelian randomisation (MR) uses germline variants in genes encoding drug targets as instruments (‘proxies’) for these targets to estimate the effect of their pharmacological perturbation on disease endpoints [24]. Since germline genetic variants are randomly assorted at meiosis and fixed at conception, analyses using variants as instruments should be less prone to conventional issues of confounding and reverse causation. In addition, given the length of time required for solid tumour development, the use of germline genetic variants as instruments is advantageous as it permits estimation of the long-term effects of medication use on cancer risk [25].

Given the widespread use of glucose-lowering medications and reports of both adverse and protective associations of these medications with cancer risk in preclinical and epidemiological studies, there is a need to further evaluate the role of these medications in the risk of common adulthood cancers. Additionally, given the long induction period of cancers, using MR to examine target-mediated effects of medications that have been on the market for relatively short periods of time (e.g. SGLT2 inhibitors and GLP1R agonists) can be informative in predicting their long-term safety profiles. We thus aimed to develop genetic instruments for the targets of five approved type 2 diabetes medications with known mechanisms of action (sulfonylurea receptor 1 [ATP binding cassette subfamily C member 8 (ABCC8)], PPARG, SGLT2, DPP4 and GLP1R). We also aimed to evaluate associations of genetically proxied perturbation of three of these targets with reliable cis-acting instruments (ABCC8, PPARG and GLP1R) with risk of breast, colorectal and prostate cancer, common cancers with epidemiological evidence suggesting a link between glucose-lowering medication use and their onset, and overall (i.e. site-combined) cancer [5, 12,13,14, 18, 19, 26,27,28].

Methods

Summary genetic association data were obtained from three cancer-specific genome-wide association study (GWAS) consortia. Summary genetic association estimates for overall and oestrogen receptor (ER)-stratified breast cancer risk in up to 122,977 cases and 105,974 controls were obtained from the Breast Cancer Association Consortium (BCAC) [29]. Summary genetic association estimates for overall and site-specific (i.e. colon, rectal) colorectal cancer risk in up to 58,221 cases and 67,694 controls were obtained from an analysis of the Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO), Colorectal Transdisciplinary Study (CORECT), and Colon Cancer Family Registry (CCFR) [30]. Summary genetic association estimates for overall and advanced prostate cancer risk (i.e. metastatic disease, Gleason score ≥8, prostate-specific antigen >100 or prostate cancer-related death) in up to 79,148 cases and 61,106 controls were obtained from the Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome (PRACTICAL) consortium [31]. These analyses were restricted to participants of European ancestry.

Overall (i.e. site-combined) cancer risk data in 27,483 incident cases and 372,016 controls were also obtained from a GWAS performed in the UK Biobank cohort study [32]. Briefly, cancer cases were classified according to ICD-9 (http://www.icd9data.com/2007/Volume1/default.htm) and ICD-10 (http://apps.who.int/classifications/icd10/browse/2016/en) with data completed to April 2019 and controls were defined as individuals who did not have any cancer code (ICD-9 or ICD-10) and did not self-report a cancer diagnosis. GWAS were performed using a linear mixed model as implemented in BOLT-LMM (v2.3) (to account for relatedness and population stratification) and adjusted for age, sex and genotyping array [33]. Further information on imputation and quality control measures has been reported elsewhere [33].

Further information on statistical analysis, imputation, and quality control measures for summary genetic association data obtained from cancer consortia is available in the original publications. All studies contributing data to these analyses had the relevant institutional review board approval from each country, in accordance with the Declaration of Helsinki, and all participants provided informed consent.

Instrument construction

To generate genetic instruments to proxy glucose-lowering drug target perturbation, summary genetic association data were obtained from a GWAS of type 2 diabetes in the Million Veteran Program (148,726 cases; 965,732 controls of European ancestry) [34]. Analyses were adjusted for age, sex and ten principal components of genetic ancestry. Instruments were constructed in PLINK by obtaining SNPs associated with type 2 diabetes at genome-wide significance (p<5×10⁻⁸) that were in or within ±500 kb from the gene encoding each respective target (PPARG, Chr3: 12328867–12475855; ABCC8, Chr11: 17414432–17498449; GLP1R, Chr6: 39016574–39055519) using the 1000 Genomes Phase 3 reference panel [35, 36]. We were unable to identify genome-wide significant SNPs within 500 kb windows from SLC5A2 and DPP4 (i.e. instruments for SGLT2 and DPP-4 inhibitors, respectively) and therefore did not proceed with MR analyses for these targets. We also did not include putative metformin targets due to the unclear mechanism(s) of action of this medication [37]. For PPARG, ABCC8 and GLP1R, SNPs used as instruments were permitted to be in weak linkage disequilibrium (r²<0.20) with each other to increase the proportion of variance in each respective drug target explained by the instrument, maximising instrument strength. In total, nine SNPs that met these criteria were obtained for PPARG, six for ABCC8 and four for GLP1R.
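The selection rule described above (genome-wide significant SNPs within ±500 kb of the gene, retained only if in weak linkage disequilibrium with one another) can be illustrated with a minimal sketch. This is not the PLINK procedure used in the study: the `Snp` record, the function names and the `r2` lookup callable are hypothetical, and a simple greedy pruning ordered by p value stands in for clumping. Only the thresholds and gene coordinates are taken from the text.

```python
from dataclasses import dataclass

GWS_P = 5e-8       # genome-wide significance threshold quoted in the text
WINDOW = 500_000   # +/-500 kb flank around each gene
R2_MAX = 0.20      # weak-LD threshold quoted in the text

# Gene coordinates as quoted in the text: (chromosome, start, end)
GENES = {
    "PPARG": (3, 12328867, 12475855),
    "ABCC8": (11, 17414432, 17498449),
    "GLP1R": (6, 39016574, 39055519),
}

@dataclass
class Snp:  # hypothetical record for one row of GWAS summary statistics
    snp_id: str
    chrom: int
    pos: int
    pval: float

def candidate_instruments(snps, gene):
    """Genome-wide significant SNPs in or within +/-500 kb of the gene, best p first."""
    chrom, start, end = GENES[gene]
    lo, hi = start - WINDOW, end + WINDOW
    hits = [s for s in snps
            if s.chrom == chrom and lo <= s.pos <= hi and s.pval < GWS_P]
    return sorted(hits, key=lambda s: s.pval)

def ld_prune(hits, r2, threshold=R2_MAX):
    """Greedy pruning: keep a SNP only if r2 with every kept SNP is below threshold.

    `r2` is an assumed callable (snp_id_a, snp_id_b) -> r-squared, e.g. backed
    by a reference panel; it is not part of any real library.
    """
    kept = []
    for s in hits:  # hits are already ordered by ascending p value
        if all(r2(s.snp_id, k.snp_id) < threshold for k in kept):
            kept.append(s)
    return kept
```

In practice the `r2` lookup would be derived from the same reference panel used for the downstream LD adjustment, so that pruning and modelling rest on consistent correlation estimates.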

In a separate population (i.e. the UK Biobank cohort study), we then evaluated the association of type 2 diabetes SNPs in drug target regions with HbA1c levels, a marker of long-term blood glucose levels, in order to minimise winner’s curse bias. The UK Biobank is a prospective cohort study of ~500,000 individuals aged 40–69 years when recruited in 2006–2010 [38]. SNP summary statistics were re-scaled to represent a 1 mmol/mol (0.09%) reduction in HbA1c to provide more interpretable effect estimates in MR analyses. HbA1c values were obtained from a GWAS of 407,766 participants of the UK Biobank performed using a linear mixed model as implemented in BOLT-LMM and adjusted for age, sex, batch and ten principal components of genetic ancestry. For the purposes of this analysis, we sequentially removed participants according to the following exclusion criteria: withdrawn from the study (N=502,506 retained); non-European ancestry (N=462,898 retained); missing HbA1c data (N=442,529 retained); missing or ‘prefer not to answer’ response to self-reported diabetes status (N=442,268); self-reported diabetes diagnosis (N=418,574); ICD-10 diabetes diagnosis (N=409,812); missing data on glucose-lowering medication use (N=409,762); self-reported glucose-lowering medication use (N=409,614); HbA1c >48 mmol/mol (6.5%) (N=408,319); and HbA1c <21.88 mmol/mol (4.2%) (N=407,766). Further information on imputation and quality control measures has been reported elsewhere [39].
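The paired HbA1c units quoted above (mmol/mol and %) can be related with a short sketch. The conversion constants are not given in the source; the code assumes the commonly cited IFCC–NGSP master equation, and the function names are illustrative only.

```python
# Assumed IFCC (mmol/mol) to NGSP (%) master equation; not stated in the source.
SLOPE = 0.09148
INTERCEPT = 2.152

def ifcc_to_ngsp(mmol_per_mol):
    """Convert an absolute HbA1c level from mmol/mol to %."""
    return SLOPE * mmol_per_mol + INTERCEPT

def ifcc_change_to_ngsp(delta_mmol_per_mol):
    """Convert a difference in HbA1c from mmol/mol to percentage points.

    The intercept cancels for differences, so only the slope applies.
    """
    return SLOPE * delta_mmol_per_mol
```

Under this assumption, the 48 mmol/mol exclusion threshold corresponds to roughly 6.5%, and a 1 mmol/mol reduction to roughly 0.09 percentage points, in line with the values reported in the text.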

For the PPARG instrument, two SNPs where the effect on HbA1c was in the opposite direction to that of type 2 diabetes were removed from the instrument (rs17036160, rs11712085), as these associations likely represent pleiotropic mechanisms that would bias consequent MR analyses.

Instrument validation

Instruments were validated by examining the association of genetically proxied drug target perturbation with endpoints influenced by these medications in randomised controlled trials. For PPARG, alanine aminotransferase (ALT) and aspartate aminotransferase (AST) levels were used as positive controls (i.e. PPARG agonists lower levels of ALT and AST) and for ABCC8 and GLP1R, BMI was used (i.e. sulfonylureas cause weight gain and GLP1R agonists cause weight loss) [40,41,42,43]. Co-localisation was then performed to assess whether genetic liability to type 2 diabetes and traits representing positive controls share the same causal variant at each locus encoding a drug target (i.e. PPARG, ABCC8, GLP1R). Such an analysis can permit exploration of whether genetic liability to type 2 diabetes and positive control traits at each drug target locus are influenced by distinct causal variants that are in linkage disequilibrium with each other, indicative of horizontal pleiotropy (an instrument influencing an outcome through pathways independent of the exposure), a violation of the exclusion restriction criterion.

Co-localisation analysis was performed using the coloc (version 2.0) R package (https://cran.r-project.org/web/packages/coloc/index.html), which uses approximate Bayes factor computation to generate posterior probabilities that associations between two traits represent each of the following configurations: (1) neither trait has a genetic association in the region (H0); (2) only the first trait has a genetic association in the region (H1); (3) only the second trait has a genetic association in the region (H2); (4) both traits are associated but have different causal variants (H3); and (5) both traits are associated and share a single causal variant (H4) [44]. Co-localisation analysis was performed by generating ±500 kb windows around the gene encoding each respective drug target. We used a posterior probability of >50% to indicate support for a configuration tested. Where there was not support for H4, we then examined the possibility of co-localisation across other secondary conditionally independent signals for either genetic liability to type 2 diabetes or positive controls within drug target loci by performing pairwise conditional and co-localisation analysis on all conditionally independent association signals using GCTA-COJO and the coloc package as implemented in pwCoCo [45]. We employed default priors for p1 (i.e. prior probability that a SNP is associated with type 2 diabetes liability within a drug target locus, 1×10⁻⁴), p2 (i.e. prior probability that a SNP is associated with positive controls or cancer risk within a drug target locus, 1×10⁻⁴) and p12 (i.e. prior probability that a SNP is associated with both traits, 1×10⁻⁵). As sensitivity analyses, we re-performed co-localisation analysis employing two alternate priors for p12 (5×10⁻⁵, 5×10⁻⁶).
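To make the five configurations concrete, the following is a minimal Python sketch of an approximate Bayes factor co-localisation calculation under a single-causal-variant assumption. It is not the coloc R package itself: the function names are illustrative, and the prior standard deviation of the effect size (`prior_sd`) is an assumed value that is not reported in the text. The priors p1, p2 and p12 default to the values quoted above.

```python
import math

def log_sum(xs):
    """Numerically stable log(sum(exp(x))) over a list of log values."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def log_diff(x, y):
    """log(exp(x) - exp(y)); returns -inf if the difference is not positive."""
    if y >= x:
        return -math.inf
    return x + math.log1p(-math.exp(y - x))

def log_abf(beta, se, prior_sd=0.2):
    """Wakefield log approximate Bayes factor for one SNP (prior_sd is assumed)."""
    z = beta / se
    v, w = se ** 2, prior_sd ** 2
    r = w / (v + w)
    return 0.5 * (math.log(1 - r) + r * z * z)

def coloc_posteriors(l1, l2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0..H4 from per-SNP log ABFs for two traits.

    l1 and l2 must refer to the same SNPs in the same order.
    """
    both = [a + b for a, b in zip(l1, l2)]
    log_h = {
        "H0": 0.0,
        "H1": math.log(p1) + log_sum(l1),
        "H2": math.log(p2) + log_sum(l2),
        # all pairs of SNPs minus the pairs where both traits share one SNP
        "H3": math.log(p1) + math.log(p2)
              + log_diff(log_sum(l1) + log_sum(l2), log_sum(both)),
        "H4": math.log(p12) + log_sum(both),
    }
    denom = log_sum(list(log_h.values()))
    return {h: math.exp(v - denom) for h, v in log_h.items()}
```

With the >50% heuristic described above, a region would be read as supporting a shared causal variant when the returned value for `"H4"` exceeds 0.5. Re-running with `p12=5e-5` or `p12=5e-6` mirrors the sensitivity analyses.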

Statistical analysis

Causal estimates were generated using inverse-variance weighted (IVW) random-effects models (permitting overdispersion in models). These models were adjusted for weak linkage disequilibrium between SNPs with reference to the 1000 Genomes Phase 3 reference panel [46]. Where there was under-dispersion in causal estimates generated from individual genetic variants, the residual SE was set to 1 (i.e. equivalent to a fixed-effects model).
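A simplified sketch of the estimator may help fix ideas. It implements an IVW model with multiplicative random effects in which the residual SE is floored at 1, as described above, but it assumes uncorrelated variants. The LD adjustment used in the study (which weights by a SNP correlation matrix) is deliberately omitted, and the function and argument names are illustrative.

```python
import math

def ivw_random_effects(bx, by, sey):
    """IVW estimate with multiplicative random effects, uncorrelated variants only.

    bx:  SNP-exposure estimates (e.g. per unit reduction in HbA1c)
    by:  SNP-outcome estimates (e.g. log odds of cancer)
    sey: standard errors of the SNP-outcome estimates
    Returns (estimate, standard_error) on the scale of by per unit of bx.
    """
    weights = [(x / s) ** 2 for x, s in zip(bx, sey)]
    ratios = [y / x for x, y in zip(bx, by)]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    se_fixed = math.sqrt(1.0 / total)
    # Residual SE of the weighted regression: sqrt(Cochran's Q / (k - 1))
    q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    resid_se = math.sqrt(q / (len(bx) - 1))
    # Under-dispersion: floor the residual SE at 1 (equivalent to fixed effects)
    return estimate, se_fixed * max(1.0, resid_se)
```

An OR and 95% CI would then follow as `math.exp(estimate)` and `math.exp(estimate ± 1.96 * se)`.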

MR analysis makes the following assumptions: (1) that a genetic instrument is associated with a modifiable exposure or drug target (‘relevance’); (2) the instrument does not share a common cause with an outcome (‘exchangeability’); and (3) the instrument has no direct effect on the outcome (‘exclusion restriction’).

The ‘relevance’ MR assumption was evaluated by generating estimates of the proportion of variance of each drug target (in HbA1c units) explained by the instrument (r²) and F statistics. F statistics can be used to examine whether results are likely to be influenced by weak instrument bias (i.e. reduced statistical power and bias when an instrument explains a limited proportion of the variance in a drug target). As a convention, an F statistic of >10 is indicative of minimal weak instrument bias.
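The text does not state the formulas used for these quantities. As a hedged illustration, the sketch below uses two approximations that are common in summary-data MR: a per-SNP variance explained derived from the squared z statistic, and an instrument-level F statistic computed from the variance explained, the sample size and the number of SNPs. The function names are illustrative.

```python
def snp_r2(beta, se, n):
    """Approximate variance explained by one SNP from summary statistics."""
    f = (beta / se) ** 2
    return f / (f + n - 2)

def f_statistic(r2, n, k):
    """F statistic for an instrument of k SNPs explaining r2 of variance in n people."""
    return (r2 / (1 - r2)) * ((n - k - 1) / k)
```

For a single SNP the two expressions are consistent: `f_statistic(snp_r2(beta, se, n), n, 1)` recovers `(beta / se) ** 2`.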

We evaluated the ‘exclusion restriction’ MR assumption by performing co-localisation to examine whether drug targets and cancer endpoints showing nominal evidence of an association in MR analyses (p<0.05) share the same causal variant at a given locus. Iterative leave-one-out analysis was performed by removing one SNP at a time from instruments to examine whether findings showing nominal evidence of association were driven by a single influential SNP.
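The iterative leave-one-out procedure can be sketched as follows. For brevity the example uses a plain IVW point estimate for uncorrelated variants rather than the LD-adjusted random-effects model used in the study, and all names are illustrative.

```python
def ivw_estimate(bx, by, sey):
    """IVW point estimate for uncorrelated variants."""
    num = sum(x * y / s ** 2 for x, y, s in zip(bx, by, sey))
    den = sum(x ** 2 / s ** 2 for x, s in zip(bx, sey))
    return num / den

def leave_one_out(snp_ids, bx, by, sey):
    """Re-estimate the IVW effect after dropping each SNP in turn.

    Returns a dict mapping the dropped SNP to the estimate without it.
    """
    results = {}
    for i, snp in enumerate(snp_ids):
        keep = [j for j in range(len(snp_ids)) if j != i]
        results[snp] = ivw_estimate(
            [bx[j] for j in keep],
            [by[j] for j in keep],
            [sey[j] for j in keep],
        )
    return results
```

A single influential SNP shows up as one entry whose estimate departs markedly from the full-instrument estimate and from the other leave-one-out estimates.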

To account for multiple testing across analyses, a Bonferroni correction was used to establish a p value threshold of <0.0019 (false-positive rate = 0.05/27 statistical tests [three drug targets tested against nine primary cancer endpoints]), which we used as a heuristic to define ‘strong evidence’, with findings between p≥0.0019 and p<0.05 defined as ‘weak evidence’.
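The evidence heuristic above reduces to a small amount of arithmetic, sketched here. The ‘strong’ and ‘weak’ labels follow the definitions in the text; labelling results with p≥0.05 as ‘little evidence’ is an assumption based on the wording used elsewhere in the paper, and the function name is illustrative.

```python
ALPHA = 0.05
N_TESTS = 3 * 9  # three drug targets tested against nine primary cancer endpoints

def evidence_label(p, alpha=ALPHA, n_tests=N_TESTS):
    """Classify a p value against the Bonferroni-corrected and nominal thresholds."""
    threshold = alpha / n_tests  # 0.05 / 27, about 0.0019
    if p < threshold:
        return "strong evidence"
    if p < alpha:
        return "weak evidence"
    return "little evidence"  # assumed label for p >= 0.05
```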

There was no formal prespecified protocol for this study. All statistical analyses were performed using R version 3.3.1 (https://www.r-project.org/).

Results

Characteristics of genetic variants used to instrument glucose-lowering drug targets are presented in Table 1. Across all three drug targets, F statistics for their respective instruments ranged from 56.32 to 487.14, suggesting that weak instrument bias was unlikely to affect the conclusions (ESM Table 1). Power calculations suggested that we had 80% power to detect ORs ranging from 1.40 to 2.62 (in PPARG analyses), 2.03 to 8.34 (in ABCC8 analyses) and 2.22 to 8.78 (in GLP1R analyses) per mmol/mol reduction in target-mediated inverse rank normal transformed [IRNT] HbA1c across all cancer endpoints (α=0.05). Complete power estimates across all MR analyses are presented in ESM Table 2.

Table 1 Characteristics of SNPs used as instruments to proxy drug targets

Instrument validation

Genetically proxied PPARG perturbation was associated with lower levels of ALT (SD change in ALT per PPARG perturbation equivalent to 1 unit IRNT HbA1c reduction: −0.57 [95% CI −1.01, −0.13], p=0.01) and AST (−0.49 [95% CI −0.79, −0.19], p=1.53×10⁻³). Co-localisation analysis suggested that type 2 diabetes associations in the PPARG locus had a 92% and 84% probability of sharing a causal variant with ALT and AST, respectively (Figs 1, 2 and 3 and ESM Tables 3, 4).

Fig. 1 Regional Manhattan plot of associations of SNPs with type 2 diabetes ±500 kb from the PPARG locus. rs17036160 (purple dot) represents the sentinel SNP associated with genetic liability to type 2 diabetes in the PPARG locus

Fig. 2 Regional Manhattan plot of associations of SNPs with ALT concentrations ±500 kb from the PPARG locus. rs17036160 (purple dot) represents the sentinel SNP associated with genetic liability to type 2 diabetes in the PPARG locus. SNPs in unclear linkage disequilibrium with the sentinel SNP are in grey

Fig. 3 Regional Manhattan plot of associations of SNPs with AST concentrations ±500 kb from the PPARG locus. rs17036160 (purple dot) represents the sentinel SNP associated with genetic liability to type 2 diabetes in the PPARG locus

Genetically proxied ABCC8 perturbation was associated with elevated BMI (SD change in BMI per ABCC8 perturbation equivalent to 1 unit IRNT HbA1c reduction: 0.530 [95% CI 0.004, 0.172], p=3.75×10⁻³). Co-localisation analysis suggested that type 2 diabetes and BMI associations had a 94.0% posterior probability of sharing a causal variant in ABCC8 (Figs 4, 5 and ESM Table 5).

Fig. 4 Regional Manhattan plot of associations of SNPs with type 2 diabetes ±500 kb from the ABCC8 locus. rs5219 (purple dot) represents the sentinel SNP associated with genetic liability to type 2 diabetes in the ABCC8 locus

Fig. 5 Regional Manhattan plot of associations of SNPs with BMI ±500 kb from the ABCC8 locus. rs5219 (purple dot) represents the sentinel SNP associated with genetic liability to type 2 diabetes in the ABCC8 locus

There was little evidence to support an association of genetically proxied GLP1R perturbation with BMI (SD change in BMI equivalent to 1 unit IRNT HbA1c reduction: −0.08 [95% CI −0.30, 0.15], p=0.51). Co-localisation analysis applied to both marginal and conditionally independent associations for type 2 diabetes and BMI in the GLP1R locus did not support shared causal variants across these traits (posterior probability of shared causal variants across models: 0.22–0.49%) (Figs 6, 7 and ESM Table 6).

Fig. 6 Regional Manhattan plot of associations of SNPs with type 2 diabetes ±500 kb from the GLP1R locus. rs10305420 (purple dot) represents the sentinel SNP associated with genetic liability to type 2 diabetes in the GLP1R locus

Fig. 7 Regional Manhattan plot of associations of SNPs with BMI ±500 kb from the GLP1R locus. rs10305420 (purple dot) represents the sentinel SNP associated with genetic liability to type 2 diabetes in the GLP1R locus

Genetically proxied PPARG perturbation and cancer risk

There was weak evidence for an association of genetically proxied PPARG perturbation with an elevated risk of prostate cancer (OR 1.75 [95% CI 1.07, 2.85], p=0.02) but little evidence of association with other cancer endpoints (Table 2). Findings for prostate cancer risk were consistent in iterative leave-one-out analysis (ESM Table 7). Co-localisation using marginal and conditional associations for type 2 diabetes and prostate cancer in the PPARG locus suggested that type 2 diabetes was unlikely to share a causal variant with this cancer in this region (posterior probability of a shared causal variant across models: ≤0.09%, posterior probability of distinct causal variants: ≤25%) (Fig. 8 and ESM Table 8).

Table 2 MR estimates examining the association of genetically proxied perturbation of PPARG with site-specific and overall cancer risk

Fig. 8 Regional Manhattan plot of associations of SNPs with prostate cancer risk ±500 kb from the PPARG locus. rs17036160 (purple dot) represents the sentinel SNP associated with genetic liability to type 2 diabetes in the PPARG locus

In subtype-stratified analyses, genetically proxied PPARG perturbation was weakly associated with lower risk of ER+ breast cancer (OR 0.57 [95% CI 0.38, 0.85], p=6.45×10⁻³). This finding was consistent in iterative leave-one-out analysis (ESM Table 9). Co-localisation using marginal and conditional associations for type 2 diabetes and ER+ breast cancer in the PPARG locus reported a low posterior probability (H4<5%; posterior probability of distinct causal variants: ≤23%) of both traits sharing one or more causal variants within this region (Fig. 9 and ESM Table 10).

Fig. 9 Regional Manhattan plot of associations of SNPs with ER+ breast cancer risk ±500 kb from the PPARG locus. rs17036160 (purple dot) represents the sentinel SNP associated with genetic liability to type 2 diabetes in the PPARG locus

Genetically proxied ABCC8 and GLP1R perturbation and cancer risk

There was little MR evidence of association of genetically proxied ABCC8 or GLP1R perturbation with site-specific or overall cancer risk (Tables 3, 4).

Table 3 MR estimates examining the association of genetically proxied perturbation of ABCC8 with site-specific and overall cancer risk
Table 4 MR estimates examining the association of genetically proxied perturbation of GLP1R with site-specific and overall cancer risk

Sensitivity analyses altering priors for co-localisation

Across positive control traits and cancer outcomes, findings from co-localisation analyses remained robust to using two alternate priors for p12 (5×10⁻⁵, 5×10⁻⁶) (ESM Table 11).

Discussion

In this MR analysis of up to 287,829 cases and 606,790 controls, we found weak evidence for an association of genetically proxied PPARG perturbation with a higher risk of prostate cancer and lower risk of ER+ breast cancer. In co-localisation analysis, however, there was little evidence that genetic liability to type 2 diabetes and these cancer endpoints shared one or more causal variants within PPARG, though these analyses were likely underpowered given low posterior probabilities to support both H3 (i.e. distinct causal variants) and H4 (i.e. shared causal variants) across these analyses. We found little evidence of association of genetically proxied GLP1R or ABCC8 perturbation with cancer risk.

Despite in vivo studies suggesting an important role for PPARG in prostate tumour growth and conventional epidemiological studies suggesting a link between pioglitazone use and elevated prostate cancer risk, our combined MR and co-localisation analyses did not find consistent evidence for an association of genetically proxied PPARG perturbation with prostate cancer risk [6, 18]. Likewise, our findings are not consistent with some previous epidemiological studies that have reported links between rosiglitazone use and lower breast cancer risk and thiazolidinedione use and lower colorectal cancer risk [5, 19]. Though our analyses were powered to detect effect sizes comparable with those reported in some previous studies (e.g. ~60% increased prostate cancer risk among pioglitazone users and ~60% lower risk of colorectal cancer among thiazolidinedione users), they were likely less powered to detect other, more modest, effect sizes reported in the literature (e.g. ~10% lower risk of breast cancer in rosiglitazone users) [19, 26, 47]. Interpretation of the pharmacoepidemiological literature linking glucose-lowering medication use with cancer risk is challenging because of the likely susceptibility of many previous studies to residual confounding (e.g. by indication) due to the use of inappropriate comparator groups (i.e. non-medication users), the inclusion of ‘prevalent users’ of medications in analyses and the possibility of ‘immortal time’ bias arising due to misalignment of the start of follow-up, eligibility and treatment assignment of participants [21].

Among the strengths of our analysis is the strict instrument selection and validation process employed. By using cis-acting variants, in close proximity to the genes that code for the drug targets of interest, horizontal pleiotropy should be minimised. In addition, we used strict positive control analysis (i.e. testing drug targets against established secondary effects of medications) and co-localisation analyses (including co-localisation analyses permitting multiple causal variants) to validate the selected instruments. Our use of a summary-data MR approach permitted us to leverage large-scale genetic data from several GWAS consortia, enhancing statistical power and precision of causal estimates.

There were several limitations to this analysis. First, we had sufficient statistical power to detect large effect sizes only per SD decrease in HbA1c (~6.75 mmol/mol [~0.61%]) and therefore cannot rule out more modest effects of the drug targets examined on cancer risk. In clinical trials, monotherapy with sulfonylureas, thiazolidinediones (rosiglitazone, pioglitazone) and the GLP1R agonist liraglutide has been shown to reduce HbA1c by around 8–17 mmol/mol (0.7–1.5%), as compared with placebo [48,49,50]. Second, although co-localisation analyses of PPARG and cancer endpoints provided low posterior probabilities for shared causal variants, it should be noted that this may also reflect limited power. The low posterior probabilities supporting either shared or distinct causal variants across several co-localisation analyses suggest that many of these analyses may have been too underpowered to support either of the configurations evaluated. Third, the low posterior probability of shared causal variants in ‘positive control’ co-localisation analyses for GLP1R and BMI could reflect distinct signalling mechanisms influencing type 2 diabetes and BMI in GLP1R, the presence of which would not necessarily influence the validity of this as an instrument for GLP1R signalling perturbation’s effect on glycaemic control [51]. Fourth, we were unable to evaluate the role of some glucose-lowering drug targets (i.e. DPP-4 and SGLT2) due to the absence of reliable genetic instruments for these targets. Fifth, our analyses were restricted to the examination of target-mediated (i.e. ‘on-target’) effects of glucose-lowering medications on cancer endpoints. Sixth, our analyses assume no gene–environment or gene–gene interactions and linear and time-fixed effects of drug targets on cancer risk.
Seventh, though associations of genetically proxied PPARG perturbation and prostate and ER+ breast cancer risk attenuated towards the null in iterative leave-one-out analysis removing rs4135247 from the PPARG instrument, 95% CIs overlapped across models with and without this variant. Though this attenuation in association is consistent with sampling error, we cannot rule out the possibility that this attenuation was driven, in part, through horizontally pleiotropic mechanisms linking this variant to cancer risk. Eighth, though we found weak evidence for associations of genetically proxied PPARG perturbation with ER+ breast cancer and prostate cancer risk, these associations did not pass the Bonferroni-corrected threshold used to account for multiple testing and we cannot rule out the possibility that these findings represent false-positive results. Ninth, the MR estimates reported represent long-term effects of target modulation in non-diabetic populations, whereas the clinical effects of these medications may be more pronounced among individuals with type 2 diabetes and could depend on length of medication use. Tenth, we cannot rule out the possibility that controls in cancer GWAS included individuals with latent, undiagnosed cancer, the presence of which would bias associations towards or away from the null, depending on the site of undiagnosed cancer and the relationship between drug targets examined and this cancer. We also cannot rule out the possibility of survival bias influencing genetic association estimates from cancer GWAS consortia that employed case–control study designs. If, for example, genetic variants used to instrument glucose-lowering drug target perturbation increased cancer risk and subsequent mortality prior to enrolment in a case–control study, this could induce an artificial ‘protective’ association between perturbation of this drug target and cancer risk.
Finally, samples were restricted to individuals of European ancestry and therefore the generalisability of these findings to non-European populations is unclear.

In conclusion, we developed novel instruments for PPARG, ABCC8 and GLP1R using strict validation protocols and evaluated the association of genetically proxied perturbation of these targets with risk of cancer. In MR analysis we found weak evidence that genetically proxied PPARG perturbation was associated with a higher risk of prostate cancer and a lower risk of ER+ breast cancer. There was little evidence of co-localisation for these findings, a necessary precondition to infer causality between PPARG perturbation and these cancer endpoints, possibly reflecting either the absence of shared causal variants across type 2 diabetes liability and these cancer endpoints in PPARG or the low statistical power of these analyses. Further assessment of these drug targets using alternative molecular epidemiological approaches (e.g. using protein or expression quantitative trait loci or using direct circulating measures of these proteins) and/or studies using medical registry data (e.g. ‘target trial’ analyses) may help to further corroborate findings presented in this analysis. Finally, we found little evidence for an association of genetically proxied ABCC8 and GLP1R perturbation with risk of breast, colorectal, prostate or overall cancer.