Post by Nadica (She/Her) on Aug 5, 2024 20:51:58 GMT
Genetic risk factors for COVID-19 and influenza are largely distinct - Published Aug 5, 2024
Abstract
Coronavirus disease 2019 (COVID-19) and influenza are respiratory illnesses caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and influenza viruses, respectively. Both diseases share symptoms and clinical risk factors (ref. 1), but the extent to which these conditions have a common genetic etiology is unknown. This is partly because host genetic risk factors are well characterized for COVID-19 but not for influenza, with the largest published genome-wide association studies for these conditions including >2 million individuals (ref. 2) and about 1,000 individuals (refs. 3–6), respectively. Shared genetic risk factors could point to targets to prevent or treat both infections. Through a genetic study of 18,334 cases with a positive test for influenza and 276,295 controls, we show that published COVID-19 risk variants are not associated with influenza. Furthermore, we discovered and replicated an association between influenza infection and noncoding variants in B3GALT5 and ST6GAL1, neither of which was associated with COVID-19. In vitro small interfering RNA knockdown of ST6GAL1 (an enzyme that adds sialic acid to the cell surface, which is used for viral entry) reduced influenza infectivity by 57%. These results mirror the observation that variants that downregulate ACE2, the SARS-CoV-2 receptor, protect against COVID-19 (ref. 7). Collectively, these findings highlight downregulation of key cell surface receptors used for viral entry as a treatment opportunity to prevent COVID-19 and influenza.
Main
To understand the extent to which the same host genetic factors influence the risk of coronavirus disease 2019 (COVID-19) and influenza, we first performed a genome-wide association study (GWAS) of influenza infection based on survey data from 296,313 participants of the AncestryDNA COVID-19 study who consented to the research (ref. 8). Although the focus of that study was on risk factors for COVID-19, participants also indicated if they were tested for influenza in either the 2019–2020 or 2020–2021 flu seasons (Methods). Overall, 18,448 (6.2%) participants reported a positive test for influenza, and thus were considered cases for our analysis, while the remaining 277,865 participants (including 23,985 with a negative test) were considered population-level controls. We refer to this phenotype as ‘reported influenza infection’, but recognize that it does not represent true susceptibility to infection because the control group includes an undetermined number of individuals not exposed to influenza in either season or who were infected but not tested (for example, asymptomatic). As such, this phenotype may capture symptomatic influenza infection that required seeking (or being prescribed) a viral test.
Using these data from AncestryDNA, we tested the association between reported influenza infection and 10 million common (frequency >1%) imputed variants using REGENIE (ref. 9), separately in three ancestral groups (with >100 influenza cases) defined based on genetic similarity to three superpopulations studied by the 1000 Genomes Project (ref. 10; Methods): from Europe (EUR; n = 254,750, 86.0%), Africa (AFR; n = 12,951, 4.4%) and the Americas (AMR; n = 26,928, 9.1%), totaling 18,334 cases and 276,295 controls (Supplementary Table 1). Results were meta-analyzed across ancestries using an inverse-variance, fixed-effects meta-analysis (Extended Data Fig. 1), identifying two loci associated with reported influenza infection at P < 5 × 10^−8 (near ST6GAL1 and B3GALT5, on chromosomes 3q27.3 and 21q22.2, respectively; Table 1). We describe these loci in detail later, including sensitivity and replication analyses in independent cohorts that demonstrate the reproducibility of these associations.
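For readers who want to see what an inverse-variance, fixed-effects meta-analysis does in practice, here is a minimal Python sketch. It is not the authors' pipeline: the function name and the per-ancestry effect sizes and standard errors are made up purely for illustration, and the P value uses a simple two-sided normal approximation.

```python
import math

def fixed_effects_meta(betas, ses):
    """Pool log odds ratios with inverse-variance weights (fixed-effects model)."""
    weights = [1.0 / se ** 2 for se in ses]        # weight = 1 / variance
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)                    # standard error of the pooled estimate
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))           # two-sided normal P value
    return beta, se, z, p

# Hypothetical per-ancestry log(OR) and standard errors (NOT values from the paper).
betas = [math.log(0.86), math.log(0.90), math.log(0.84)]
ses = [0.025, 0.110, 0.075]

beta, se, z, p = fixed_effects_meta(betas, ses)
print(f"pooled OR = {math.exp(beta):.3f}, SE = {se:.4f}, z = {z:.2f}, P = {p:.2e}")
```

Because each weight is the inverse of the variance, the largest group (here the first one, with the smallest standard error) dominates the pooled estimate.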
Results from the AncestryDNA GWAS of reported influenza infection were then used to determine if severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and influenza infections have a shared genetic etiology. To address this question, we initially focused on 24 variants associated with COVID-19 identified by the Host Genetics Initiative (HGI; ref. 2; freeze 6; Supplementary Table 2). Of these, only one was associated with reported influenza infection (P < 0.05/24 = 0.002), despite adequate power for most (Supplementary Table 3): rs505922 in ABO (odds ratio (OR) = 1.05 for the T allele, 95% confidence interval (CI) = 1.02–1.07, P = 2.2 × 10^−4; heterogeneity test P = 0.13; Fig. 1). This variant increased the risk of reported influenza infection, while it decreased the risk of COVID-19 (OR = 0.92; 95% CI = 0.92–0.93, based on the HGI GWAS of reported infection; ref. 2), in line with previous reports (ref. 11). We explore the ABO locus in greater detail in the Supplementary Note, concluding that its association with influenza is (1) only partially attenuated after accounting for COVID-19 status and (2) probably tags an underlying causal variant shared with other diseases (for example, childhood ear infections, allergic disease) but not COVID-19. Overall, only 10 (42%) of 24 variants had a consistent direction of effect on both influenza and COVID-19 (Fig. 1).
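The two pieces of arithmetic in this paragraph can be reproduced with a short Python sketch. The Bonferroni threshold (0.05/24) is taken from the text. The exact sign test on the 10-of-24 directional consistency count is an illustrative addition and is not an analysis reported in the paper; both function names are hypothetical.

```python
from math import comb

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests

def two_sided_sign_test(k, n):
    """Exact two-sided binomial test of k 'consistent' signs out of n, null p = 0.5."""
    tail = min(k, n - k)                            # the null is symmetric, so use the smaller tail
    p_tail = sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, 2 * p_tail)                     # double the tail, cap at 1

threshold = bonferroni_threshold(0.05, 24)
print(f"Bonferroni threshold for 24 variants: {threshold:.4f}")

# Illustrative only: is 10 of 24 consistent directions different from chance (12 of 24)?
print(f"sign test P for 10/24: {two_sided_sign_test(10, 24):.2f}")
```

Under these assumptions, 10 of 24 consistent directions is what one would expect by chance alone, which fits the authors' reading that the COVID-19 variants show no systematic effect on influenza.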
The lack of significant and directionally consistent associations between reported influenza infection and COVID-19 loci suggests that the two diseases share few, if any, genetic risk factors. Consistent with these findings, the two risk variants for reported influenza identified in the AncestryDNA GWAS (in or near B3GALT5 and ST6GAL1) did not have a directionally consistent association with COVID-19 in the HGI analysis (Supplementary Table 4). Furthermore, the genetic correlation (rg; ref. 12) between reported influenza infection and both SARS-CoV-2 infection (rg = 0.30, P = 0.009) and COVID-19 hospitalization (rg = 0.34, P = 0.007) was modest (Supplementary Table 5). Collectively, these results suggest some sharing, but substantial divergence, in the genetic etiology underpinning influenza infection and COVID-19.
The AncestryDNA GWAS of reported influenza infection identified two associated loci (Table 1), with lead variants rs16861415 in ST6GAL1 (3q27.3; OR = 0.86 for C allele, 95% CI = 0.83–0.90, P = 1.4 × 10^−10) and rs2837112 in B3GALT5 (21q22.2; OR = 0.90 for A allele, 95% CI = 0.88–0.92, P = 1.3 × 10^−19). The effect allele ranged in frequency between 3% (AFR) and 8% (EUR) for rs16861415, and between 39% (AFR) and 49% (EUR) for rs2837112, with no evidence for heterogeneity of effect sizes across ancestries or cohorts (Supplementary Table 6). The reduction in influenza risk observed in homozygous carriers was 37% for ST6GAL1 and 20% for B3GALT5 (Table 1), with no evidence for epistasis between the two loci (Supplementary Note).
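The paragraph above reports no evidence of heterogeneity across ancestries but does not say which statistic was used. One common choice is Cochran's Q together with I², sketched below in Python as an assumption rather than as the authors' method. The function name and input values are hypothetical.

```python
def cochran_q(betas, ses):
    """Cochran's Q and I^2 for a set of per-group effect estimates (log odds ratios)."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    # Q sums the weighted squared deviations of each group from the pooled effect.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I^2 is the share of variation beyond what chance would produce (floored at 0).
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, df, i2

# Hypothetical log(OR) and standard errors for three ancestry groups (NOT from the paper).
q, df, i2 = cochran_q([-0.151, -0.105, -0.174], [0.025, 0.110, 0.075])
print(f"Q = {q:.3f} on {df} df, I^2 = {100 * i2:.1f}%")
```

Under the null of a single shared effect, Q follows a chi-square distribution with df degrees of freedom, so a Q close to or below df (and I² near 0%) indicates little heterogeneity.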
Next, we performed sensitivity and replication analyses to determine if the two influenza associations were robust to phenotype definition and reproducible. In the AncestryDNA cohort, excluding 253,880 individuals without influenza test results from the control group (resulting in 18,448 positive test cases versus 23,985 negative test controls) did not impact the effect size estimate for either locus: OR = 0.86 (versus 0.86) and P = 5.2 × 10^−6 for ST6GAL1, and OR = 0.89 (versus 0.90) and P = 4.9 × 10^−12 for B3GALT5 (Fig. 2). In contrast, defining influenza infection more loosely based on whether a participant reported having flu-like symptoms in the 2019–2020 or 2020–2021 flu seasons (43,956 cases versus 250,673 controls) led to attenuated effect sizes but still highly significant associations: OR = 0.93 and P = 1.7 × 10^−7 for ST6GAL1, and OR = 0.95 and P = 4.0 × 10^−11 for B3GALT5 (Fig. 2).
To determine if the associations were reproducible, we used data from medical records to define lifetime influenza infection status across 1,153,291 individuals from seven biobanks and five ancestral groups (Methods). Based on the presence of International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD-10) codes J09, J10 or J11 in hospital admissions, general practitioner records or death registries, we identified 22,022 (2%) individuals with (cases) and 1,131,269 without (controls) a lifetime medical record of influenza (Supplementary Table 1). As with the AncestryDNA GWAS, the control group in this replication analysis probably includes both individuals not exposed to influenza and individuals who had influenza but not an associated medical record. In a multiancestry meta-analysis of medical record influenza (Extended Data Fig. 2), we observed directionally consistent and genome-wide significant associations with both rs16861415 in ST6GAL1 (OR = 0.90, P = 3.0 × 10^−10) and rs2837112 in B3GALT5 (OR = 0.93, P = 2.5 × 10^−11; Table 1). Two measures of recent influenza infection also supported both associations. First, we found consistent and significant associations with a positive culture for influenza A (Methods), an indicator of current infection available in 82,348 individuals from the Geisinger Health Study (GHS) biobank: OR = 0.82 and P = 0.005 for ST6GAL1, OR = 0.86 and P = 3.02 × 10^−5 for B3GALT5 (Fig. 2). Second, both variants lowered the risk of a seropositive test for influenza A in a published study of 1,000 individuals (ref. 6), significantly so for B3GALT5 (OR = 0.70, P = 0.001; Fig. 2). Lastly, the B3GALT5 variant significantly lowered the risk of flu-related hospitalization among influenza cases (1,696 hospitalized cases versus 8,239 nonhospitalized cases, OR = 0.88, P = 0.005), with a similar, albeit nonsignificant, protective effect for the ST6GAL1 variant (OR = 0.89, P = 0.17; Fig. 2). Collectively, these findings establish both loci as reproducible genetic risk factors for influenza and indicate that the B3GALT5 variant also reduces disease severity.
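The medical-record phenotype described above (any ICD-10 code under J09, J10 or J11) could be derived roughly as in the Python sketch below. This is only an illustration: the function name, the record layout and the example participants are assumptions, and the actual biobank pipelines are described in the paper's Methods.

```python
# ICD-10 categories named in the text as defining a medical record of influenza.
INFLUENZA_PREFIXES = ("J09", "J10", "J11")

def has_influenza_record(icd10_codes):
    """Return True if any code falls under J09, J10 or J11 (subcodes included)."""
    for code in icd10_codes:
        normalized = code.strip().upper().replace(".", "")   # "j10.1" -> "J101"
        if normalized.startswith(INFLUENZA_PREFIXES):
            return True
    return False

# Hypothetical participants and their lifetime diagnosis codes (made-up data).
records = {
    "person_a": ["J10.1", "E11.9"],   # has an influenza code -> case
    "person_b": ["J12.9"],            # viral pneumonia, not influenza -> control
    "person_c": [],                   # no records at all -> control
}
status = {pid: has_influenza_record(codes) for pid, codes in records.items()}
print(status)
```

Note that person_c ends up as a control simply because no record exists, which is the misclassification caveat the authors raise about their control group.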
We did not find any additional associated loci in the meta-analysis of discovery (AncestryDNA) and replication (biobank) cohorts (40,356 cases versus 1,407,564 controls; Extended Data Fig. 3). As observed in the AncestryDNA GWAS, aside from ABO, published COVID-19 risk variants were not associated with influenza in this larger analysis (Extended Data Fig. 4).
Next, to help understand how each influenza locus contributes to disease pathophysiology, we identified the likely effector genes of the GWAS signal, concentrating on the lead variant at the 3q27.3 and 21q22.2 loci in the meta-analysis of the discovery and replication cohorts, that is, rs13322149 and rs2837113, respectively (Extended Data Fig. 3). Based on high linkage disequilibrium (LD, r² > 0.80) between each variant, and sentinel expression quantitative trait loci (eQTLs) and enhancer-overlapping variants (Supplementary Tables 7 and 8), four genes were prioritized: ST6GAL1 and ADIPOQ at the 3q27.3 locus; and B3GALT5 and IGSF5 at the 21q22.2 locus. Analysis of rare loss-of-function (LOF) and missense variants assayed via exome sequencing of 14,189 cases with influenza and 811,714 controls did not identify any significant genome-wide associations (Extended Data Fig. 5); however, when we focused on the four genes highlighted above, we found a missense variant in IGSF5 (frequency 0.01%) associated with a 9.2-fold higher risk of medical record influenza, which was significant after correcting for 631 rare variant tests performed across the four genes (P = 2.3 × 10^−5; Supplementary Table 9). This observation provides additional support for IGSF5 as one of the likely effector genes underlying the common variant association with flu at the 21q22.2 locus.
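The LD filter mentioned above (r² > 0.80 between a lead variant and a sentinel eQTL) can be approximated from unphased genotypes as the squared Pearson correlation of allele dosages. The Python sketch below shows that approximation; it is an assumption about how one might compute it, not a description of the authors' tooling, and the genotype vectors are invented.

```python
def genotype_r2(x, y):
    """Squared Pearson correlation between two allele-dosage vectors (0, 1 or 2 copies)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long vectors with at least 2 samples")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("a monomorphic variant has no defined r^2")
    return sxy ** 2 / (sxx * syy)

# Hypothetical dosages for a lead variant and a nearby eQTL in eight people (made-up data).
lead = [0, 1, 2, 0, 1, 0, 2, 1]
eqtl = [0, 1, 2, 0, 1, 0, 2, 0]
r2 = genotype_r2(lead, eqtl)
print(f"r^2 = {r2:.2f}, passes LD filter: {r2 > 0.80}")
```

Dosage-based r² is a convenient stand-in when phased haplotypes are unavailable; haplotype-based estimates from a reference panel may differ slightly.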
Of the four likely effector genes of the influenza loci, ST6GAL1 and B3GALT5 are strong biological candidates (ADIPOQ and IGSF5 are discussed in the Supplementary Note). ST6GAL1 codes for the enzyme β-galactoside α-2,6-sialyltransferase 1, which catalyzes the addition of sialic acid to galactose by an α-2,6 linkage (ref. 13); it is most highly expressed in the liver and in Epstein–Barr virus-transformed B cells in humans (Extended Data Fig. 6a; ref. 14). Critically, influenza virus infection is initiated when the viral hemagglutinin glycoprotein binds to an α-2,6-linked sialic acid found on human host cell surface glycoproteins and glycolipids in the upper respiratory tract, which are used by the virus as attachment factors that facilitate the subsequent engagement with a functional receptor required to enter the target cell (refs. 15–17). The lead variant at this locus (rs13322149) colocalized with a sentinel eQTL (rs73187789:A, r² = 0.95) that is associated with lower expression of ST6GAL1 in thyroid tissue from the Genotype-Tissue Expression (GTEx) project (ref. 14; P = 3.4 × 10^−12; Supplementary Table 7), with consistent directional effects in other tissues, including the lung (Extended Data Fig. 6b). B3GALT5 codes for β-1,3-galactosyltransferase 5 and is most highly expressed in the small intestine and salivary gland (Extended Data Fig. 6c; ref. 14). This enzyme catalyzes the addition of galactose in the β-1,3 conformation to an N-acetylglucosamine (GlcNAc) saccharide during the synthesis of glycan core structures (ref. 18). As noted above, ST6GAL1 adds sialic acid to a galactose. The lead variant at this locus (rs2837113) is a sentinel eQTL for B3GALT5 in skin and salivary gland tissue, with the rs2837113:A influenza-protective allele associating with higher gene expression (Supplementary Table 7).
Lastly, we performed in vitro experiments to study the impact of gene expression knockdown on influenza virus H1N1 (Puerto Rico 8 strain) infectivity. For these experiments, we focused on two likely effector genes of influenza-associated variants (ST6GAL1 and B3GALT5) because of their potential role in a critical step of influenza virus infectivity, that is, modulation of α-2,6-linked sialic acid abundance at the cell surface. We tested two small interfering RNAs (siRNAs) per gene in the A549 and Calu-3 cell lines, respectively, performing two independent experiments per siRNA. siRNAs against ST6GAL1 achieved approximately 90% expression knockdown and resulted in approximately 80% reduction in sialic acid abundance at the cell surface and approximately 50% reduction in influenza infectivity (Extended Data Figs. 7 and 8), which is consistent with previous findings (ref. 19). These results support the notion that lower ST6GAL1 enzymatic activity reduces the ability of influenza virus to infect host cells, a mechanism that probably explains the association between variants at the 3q27.3 locus and lower risk of influenza infection. In contrast, knockdown of B3GALT5 expression was not associated with a consistent effect on influenza infectivity (Extended Data Fig. 9). As such, despite being a good biological candidate, it is unclear if B3GALT5 underlies the association at the 21q22.2 locus.
Several important limitations should be considered when interpreting the results of this study (discussed in detail in the Supplementary Note), including (1) phenotype misclassification, (2) potential confounding effects of unmeasured risk factors for influenza infection, (3) the use of self-reported influenza information in the AncestryDNA cohort, and (4) an undetermined influenza strain infecting GWAS participants.
In conclusion, we demonstrated that the genetic architectures of COVID-19 and influenza are mostly distinct, with few shared common genetic risk factors. We identified and replicated the first genome-wide-significant loci for influenza and demonstrated that inhibition of ST6GAL1 reduces viral infectivity in vitro. Host genetic studies of infectious diseases commonly identify protective variants that putatively downregulate (or ablate) proteins required for viral entry (CCR5 in HIV (ref. 20), ACE2 in SARS-CoV-2 (ref. 7) and FUT2 in noroviruses (ref. 21)). Our findings add the latest vignette to this evolving narrative.
Please follow the link for tables and sources and to read the entire research letter!