Abstract
Transcriptomics is often used to investigate changes in an organism’s genetic response to environmental contamination. Data noise can mask the effects of contaminants, making it difficult to detect responding genes. Because the number of genes found to be differentially expressed in transcriptome data is often very large, algorithms are needed to reduce this number to a few robust discriminative genes. We present an algorithm for the aggregated analysis of transcriptome data that uses multiple fold-change thresholds (threshold screening) and p values from a Bayesian generalized linear model to assess the robustness of a gene as a potential indicator for the treatments tested. The algorithm provides a robustness indicator (ROBI) as well as a significance profile, which can be used to assess the statistical significance of a given gene at different fold-change thresholds. Using ROBI, eight discriminative genes that could be potential indicators for a given substance were identified from an exemplary dataset (Danio rerio FET treated with chlorpyrifos, methylmercury, and PCB). Significance profiles uncovered genetic effects and revealed appropriate fold-change thresholds for single genes or gene clusters. Fold-change threshold screening is a powerful tool for dimensionality reduction and feature selection in transcriptome data, as it effectively reduces the detected genes to a smaller set suitable for environmental monitoring. In addition, it is able to unmask patterns in altered gene expression hidden by data noise and reduces the chance of type II errors, e.g., in environmental screening.
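The idea of a significance profile over several fold-change thresholds can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the article: the input values are invented log2 fold changes for one gene, a two-sided Fisher exact test stands in for the Bayesian generalized linear model, and `robustness_score` (the fraction of thresholds with p < alpha) is a hypothetical stand-in, not the published ROBI definition.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x "exceeding" replicates in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at most as likely as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def significance_profile(treated, others, thresholds):
    """Map each fold-change threshold to a p value for one gene.

    A replicate counts as responding when |log2 fold change| >= threshold.
    """
    profile = {}
    for t in thresholds:
        a = sum(abs(x) >= t for x in treated)  # treated, responding
        c = sum(abs(x) >= t for x in others)   # other treatments, responding
        profile[t] = fisher_exact_two_sided(a, len(treated) - a,
                                            c, len(others) - c)
    return profile


def robustness_score(profile, alpha=0.05):
    """Hypothetical score: share of thresholds with p < alpha (not ROBI)."""
    return sum(p < alpha for p in profile.values()) / len(profile)


# Invented log2 fold changes of one gene (assumption, for illustration only).
treated = [2.1, 1.8, 2.5, 1.9, 2.2, 2.0]
others = [0.3, -0.4, 1.2, 0.1, -0.2, 0.6]
thresholds = [0.5, 1.0, 1.5, 2.0, 2.5]

profile = significance_profile(treated, others, thresholds)
score = robustness_score(profile)
for t, p in profile.items():
    print(f"threshold {t}: p = {p:.4f}")
print("robustness score:", score)
```

In this toy example the gene separates the treatment from the others only at intermediate thresholds (1.0 and 1.5): a low threshold lets noise from the other treatments through, and a high threshold discards most of the treated replicates. This is the kind of pattern a significance profile is meant to expose.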
Acknowledgments
The present study was part of the research funding project DanTox (DanTox—a novel joint research project using zebrafish (Danio rerio) to identify specific toxicity and molecular modes of action of sediment-bound pollutants). The authors acknowledge the financial support by the German Federal Ministry of Education and Research (BMBF grant 02WU1053). Also, the authors acknowledge the data provision from the GENDarT2 project (BMBF grant AZ:0315190 B).
Conflict of interest
The authors declare that they have no competing interests.
Additional information
Responsible editor: Philippe Garrigues
Cite this article
Hausen, J., Otte, J.C., Strähle, U. et al. Fold-change threshold screening: a robust algorithm to unmask hidden gene expression patterns in noisy aggregated transcriptome data. Environ Sci Pollut Res 22, 16384–16392 (2015). https://doi.org/10.1007/s11356-015-5019-0
DOI: https://doi.org/10.1007/s11356-015-5019-0