SNPs: Single-Nucleotide Polymorphisms

Source: http://en.wikipedia.org/wiki/Single-nucleotide_polymorphism.

A single-nucleotide polymorphism (SNP, pronounced "snip"; plural "snips") is a DNA sequence variation occurring when a single nucleotide — A, T, C or G — in the genome (or other shared sequence) differs between members of a biological species or paired chromosomes in a human. In short, one nucleotide is replaced by another.

For example, two sequenced DNA fragments from different individuals, AAGCCTA and AAGCTTA, contain a difference in a single nucleotide. Replacement of C with T creates two alleles (versions). Almost all common SNPs have only two alleles. The genomic distribution of SNPs is not homogeneous; SNPs occur more frequently in non-coding regions than in coding regions.
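
The comparison described above can be sketched in a few lines of Python. This is only an illustration: the function name find_snps and the two short fragments are assumptions made for the example, and real sequence data would first have to be aligned.

```python
def find_snps(seq_a, seq_b):
    """Return (position, base_a, base_b) for each mismatch between two
    aligned, equal-length fragments. Positions are 1-based."""
    if len(seq_a) != len(seq_b):
        raise ValueError("fragments must be aligned to the same length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


# Illustrative fragments differing by a single C -> T replacement.
print(find_snps("AAGCCTA", "AAGCTTA"))  # [(5, 'C', 'T')]
```

Each tuple reports where the two fragments disagree and which nucleotide each one carries at that position.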

Put simply, a SNP is a point mutation: one nucleotide in the sequence is substituted for another.

Use in genetic genealogy

In genetic genealogy, certain SNPs define haplogroups and their subclades and are relied upon to understand branching of the human phylogenetic tree and fix one's place on it.

Especially in the Y chromosome, SNPs are additive down a paternal lineage. A SNP, once acquired, does not disappear from the lineage, but other SNPs may be added elsewhere in the chromosome.
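
One way to picture this additive behavior is as a set that only ever grows. The Python sketch below is a simplified model, not a description of any testing company's method; the function name inherited_snps and the placeholder "SNP-X" are hypothetical, while M269 and U106 are real SNP names mentioned elsewhere in this article.

```python
def inherited_snps(lineage):
    """Given (name, new_snps) pairs ordered from earliest ancestor to
    latest descendant, return the full SNP set carried by each man:
    everything inherited plus the SNPs that first appeared in him."""
    carried = set()
    result = {}
    for name, new_snps in lineage:
        carried = carried | set(new_snps)  # SNPs accumulate, none are lost
        result[name] = carried
    return result


# Hypothetical lineage; "SNP-X" is a made-up placeholder name.
lineage = [
    ("ancestor", ["M269"]),
    ("descendant", ["U106"]),
    ("later descendant", ["SNP-X"]),
]
for name, snps in inherited_snps(lineage).items():
    print(name, sorted(snps))
```

Every man in the list carries all the SNPs of the men before him, which is why a downstream SNP implies the upstream ones.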

SNP Names

There is no uniform naming system for SNPs, which causes confusion. Several systems are in use, so the same SNP can have multiple names.

One frequently used system consists of one to three capital letters followed by one or more digits, for example "M170". The "M" is a code for the research group that found, catalogued and published the SNP, and "170" means it was the 170th found by that group. In the NCBI system, this same SNP is known as "rs2032597".

However, the same SNP may be discovered independently by several groups, each of which attaches its own code and number. Only later is it found that the discoveries are identical. For example, DF29 and S438 are alternate names for the same SNP; it is common to write this as "DF29/S438".
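
When working with SNP lists from several sources, it can help to keep a small alias table so that different names for the same SNP are treated as equal. The Python sketch below is a minimal illustration; the names ALIAS_GROUPS, same_snp and display_name are assumptions made for this example, and the table holds only the aliases mentioned in this section.

```python
# Groups of names that refer to the same SNP. The two groups below come from
# this article; a real table would be far larger.
ALIAS_GROUPS = [
    ["DF29", "S438"],
    ["M170", "rs2032597"],
]

# Map every known name to the full group it belongs to.
NAME_TO_GROUP = {name: group for group in ALIAS_GROUPS for name in group}


def same_snp(name_a, name_b):
    """True if both names are known aliases of one SNP (or are identical)."""
    if name_a == name_b:
        return True
    group = NAME_TO_GROUP.get(name_a)
    return group is not None and name_b in group


def display_name(name):
    """Join all known aliases with '/'; unknown names are returned as-is."""
    return "/".join(NAME_TO_GROUP.get(name, [name]))


print(same_snp("DF29", "S438"))  # True
print(display_name("S438"))      # DF29/S438
```

A name that is missing from the table is simply treated as its own SNP, which matches the situation where two discoveries have not yet been recognized as identical.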

The NCBI (National Center for Biotechnology Information) system is the closest to a standard. However, not all SNPs have been registered with NCBI.

Upstream, Downstream

The terms upstream and downstream refer to SNP positions on the phylogenetic tree. Closer to the root is considered "upstream" and farther out on a branch is considered "downstream". For example, M269 is upstream of U106 (also known as M405/S21), which defines one of M269's subclades.
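
The upstream/downstream relationship can be sketched as a walk toward the root of the tree. The Python example below uses a tiny, hypothetical slice of the tree: only the fact that U106 lies below M269 is taken from this article, intermediate branches are omitted, and "SNP-X" as well as the names PARENT, is_upstream and is_downstream are made up for illustration.

```python
# Child -> parent links in a tiny, simplified slice of the tree.
# "SNP-X" is a made-up placeholder, not a real SNP name.
PARENT = {
    "U106": "M269",
    "SNP-X": "U106",
}


def is_upstream(candidate, snp):
    """True if `candidate` lies on the path from `snp` back to the root."""
    current = PARENT.get(snp)
    while current is not None:
        if current == candidate:
            return True
        current = PARENT.get(current)
    return False


def is_downstream(candidate, snp):
    """True if `candidate` sits farther out on a branch below `snp`."""
    return is_upstream(snp, candidate)


print(is_upstream("M269", "U106"))     # True
print(is_downstream("SNP-X", "M269"))  # True
print(is_upstream("U106", "M269"))     # False
```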

Terminal SNP

The "terminal SNP" is the furthest downstream SNP for which a person has tested positive. It is not necessarily the furthest downstream SNP the person actually carries: there may be SNPs further downstream for which the person would test positive but has not yet been tested.
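
The definition above can be sketched in Python as "the deepest SNP with a positive result". This is a simplified illustration that assumes all positive results lie on a single line of descent; the tree is hypothetical apart from U106 lying below M269, and "SNP-X" and the names PARENT, depth and terminal_snp are made up for the example.

```python
# Hypothetical child -> parent links; "SNP-X" is a made-up placeholder.
PARENT = {"U106": "M269", "SNP-X": "U106"}


def depth(snp):
    """Number of steps from `snp` up to the root of this small tree."""
    steps = 0
    while snp in PARENT:
        snp = PARENT[snp]
        steps += 1
    return steps


def terminal_snp(results):
    """Return the deepest SNP with a positive result, or None.

    `results` maps SNP name -> True (positive) or False (negative).
    SNPs that were never tested are simply absent from the mapping."""
    positives = [snp for snp, positive in results.items() if positive]
    return max(positives, key=depth, default=None)


# Tested for M269 and U106 only; SNP-X was never tested.
print(terminal_snp({"M269": True, "U106": True}))  # U106
```

If the same person later tested positive for the hypothetical SNP-X, the function would return SNP-X instead, even though nothing about the person's DNA had changed.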

In our experience, "terminal SNP" is determined more by testing limitations than by a person's DNA status. It might be better to avoid the word "terminal" altogether.

SNP vs. base pair

Now that "Next Generation Sequencing" is more widely available, some people confuse "SNP" with "base pair"; they are not the same thing. "SNP" implies that the base pair at a particular chromosome position can exist in more than one form, e.g., A-T in some people and G-C in others. "Base pair" carries no such implication; it merely describes the specific pair of nucleotides at that position on the chromosome.