ALCHEMY - An automated, population genetic model driven SNP genotype calling method.


ALCHEMY is a method for automated calling of diploid genotypes from raw intensity data produced by various high-throughput multiplexed SNP genotyping methods. ALCHEMY has been developed for and tested on Affymetrix GeneChip Arrays, Illumina GoldenGate, and Illumina Infinium based assays. The primary motivation for ALCHEMY's development was the lack of available genotype calling methods that perform well in the absence of heterozygous samples (as happens when panels of inbred lines are genotyped) or that provide accurate calls with small sample batches. ALCHEMY differs from other genotype calling methods in two respects: genotype inference is based on a parametric Bayesian model of the raw intensity data rather than a generalized clustering approach, and the model incorporates population genetic principles such as Hardy-Weinberg equilibrium adjusted for inbreeding levels. ALCHEMY can simultaneously estimate individual sample inbreeding coefficients from the data and use them to improve statistical inference of diploid genotypes at individual SNPs. The main documentation for ALCHEMY is maintained on the SourceForge-hosted MediaWiki system.
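To illustrate what "Hardy-Weinberg equilibrium adjusted for inbreeding levels" means in practice, the following is a minimal sketch of the standard textbook genotype proportions for a biallelic SNP with allele frequency p and inbreeding coefficient F. It is not taken from the ALCHEMY source: the type genotype_prior and the function inbred_hwe_prior are hypothetical names chosen for this example, and ALCHEMY's actual model may parameterize its priors differently.

```c
#include <assert.h>

/* Prior probabilities for the three diploid genotypes at a biallelic SNP.
 * Hypothetical type for illustration only; not part of ALCHEMY. */
typedef struct {
    double aa;  /* homozygous for allele A */
    double ab;  /* heterozygous */
    double bb;  /* homozygous for allele B */
} genotype_prior;

/*
 * Hardy-Weinberg proportions adjusted for inbreeding (textbook form):
 *   P(AA) = p^2 + F*p*q
 *   P(AB) = 2*p*q*(1 - F)
 *   P(BB) = q^2 + F*p*q
 * where p is the frequency of allele A, q = 1 - p, and F is the
 * inbreeding coefficient. Both p and F must lie in [0, 1].
 * The three values always sum to 1.
 */
genotype_prior inbred_hwe_prior(double p, double f)
{
    assert(p >= 0.0 && p <= 1.0);
    assert(f >= 0.0 && f <= 1.0);

    double q = 1.0 - p;
    genotype_prior g;
    g.aa = p * p + f * p * q;
    g.ab = 2.0 * p * q * (1.0 - f);
    g.bb = q * q + f * p * q;
    return g;
}
```

With F = 0 this reduces to the classic Hardy-Weinberg proportions, and with F = 1 (fully inbred lines) the heterozygote prior is zero. This shows why a model that accounts for inbreeding does not need to expect a heterozygous cluster in panels of inbred samples.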

ALCHEMY was initially developed by Dr. Mark Wright at Cornell University in the laboratories of Dr. Susan McCouch and Dr. Carlos Bustamante as part of the NSF-funded Transgressive Variation in Rice Project (NSF #0606461) and is the recommended genotype calling software for SNP genotyping products developed by the project. For more information on genetic diversity in rice and ongoing projects to discover variation and genotype large collections of rice germplasm, see the Rice Diversity Project.

Citing ALCHEMY

An academic peer-reviewed paper describing ALCHEMY was published in Bioinformatics in 2010. Academic users who publish results based on ALCHEMY analyses should cite:

Wright MH, et al. ALCHEMY: a reliable method for automated SNP genotype calling for small batch sizes and highly homozygous populations. Bioinformatics 2010;26(23):2952-60

Current Status

ALCHEMY is free software, licensed under the GPL and hosted by SourceForge.net. An official source code release is available on the release files page, including a plugin for Illumina GenomeStudio to extract the data needed for input from Illumina projects.

ALCHEMY is under further development to improve accuracy, call rates, execution speed, and memory consumption. The most recent source can be obtained from the development branch of the Subversion repository; this code is not guaranteed to work. Beta releases and/or updates to the main version are also expected, but none are currently tagged in Subversion.

Requirements

ALCHEMY is written in C and developed on the GNU/Linux platform. It should compile on any current GNU/Linux distribution that has the development packages for the GNU Scientific Library (GSL) and for the standard system libraries installed. It may also compile and run on Mac OS X if GSL is installed.