How are SNPs Named?

The names for SNPs are a source of great confusion. Many SNPs have multiple names and some of the names have no relationship to the nature of the SNP. We can't go into all the complexities here, but we can describe a general picture and give examples. 

"SNP" stands for single-nucleotide polymorphism, referring to the fact that a base pair at a particular position may mutate from one nucleotide base to another. A cytosine (C) may be replaced with adenine (A) or guanine (G) with thymine (T). This will also change the complementary nucleotide on the other half of the double-helix.

SNP Name Systems

The basic systems in place for naming a SNP include:

  1. By who discovered it and when
  2. By chromosome position and allele form
  3. By NCBI reference number

All of these systems are in current use.

Who found it

It's long been the custom in science that a person discovering a phenomenon is entitled to name it. So, too, with SNPs. If you discover a new scientific principle (Boyle's Law), comet (Halley's), spider (Aphonopelma johnnycashi) or quantum mechanics particle (Higgs boson), you get to name it. (You aren't supposed to name it after yourself, but others can attribute it to you.)

With SNPs, a convention was established that names would consist of leading letters, followed by a number representing the order in which that person (or group) found it. For example, when Dr. Michael Hammer at the U. of Arizona found a mutation from C (cytosine) to A (adenine) at position 22157311 (build GRCh37) on the Y chromosome, he named it "P312" because he'd found 311 other mutations previously.
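The "letters, then a number" convention can be sketched in a few lines of Python. This is only an illustration: NAME_PATTERN and split_finder_name are hypothetical names, and the pattern is an assumption that will not cover every name in real-world use.

```python
import re

# Leading letters followed by a number (an assumed pattern for illustration;
# real-world names have exceptions this does not handle).
NAME_PATTERN = re.compile(r"([A-Za-z]+)(\d+)")


def split_finder_name(name: str) -> tuple[str, int]:
    """Split a finder-style SNP name into its letter prefix and its number."""
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"not a finder-style SNP name: {name!r}")
    return match.group(1), int(match.group(2))


print(split_finder_name("P312"))    # ('P', 312)
print(split_finder_name("PF6547"))  # ('PF', 6547)
```

The prefix only points to who assigned the name; it says nothing about the mutation itself.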

P312 (S116, PF6547, 22157311A>C, rs34276300) turned out to be an important SNP, marking a hereditary boundary between two very large groups of European R1b men who are M269+ (AKA PF6517, 22739367T>C & rs9786153). The other large group is U106+ (AKA S21 & M405), with the NCBI name rs16981293 and position name 8796078C>T.

This "name by finder" system works fairly well when the pace of discovery isn't too fast for scientists to check with each other and agree on a common name. It breaks down when (A) discoveries come faster than publication and (B) scientists forget the "don't use your own name" rule.

The problem is that multiple people have discovered (and continue to discover) the same things, giving them multiple names. Dr. James Wilson at Edinburgh University also discovered the mutation from C to A at position 22157311 and named it S116 (his 116th). And Dr. Paolo Francalacci, at Università di Sassari in Italy, found the same thing and named it PF6547.

We have created a situation of mutual incomprehension; we don't understand each other. Discussion groups are filled with debates about the "proper name" for the same SNP. We've seen it said, "I can't be P312+; I'm S116+."
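One way to cut through the "P312 versus S116" confusion is to treat the names as synonyms. The Python sketch below uses only the synonyms quoted on this page; the table layout and the function name same_snp are hypothetical, not taken from any published database.

```python
# Synonym groups taken from the examples on this page.
# The data structure itself is just an illustration.
SYNONYMS = [
    {"P312", "S116", "PF6547", "rs34276300"},
    {"U106", "S21", "M405", "rs16981293"},
    {"M269", "PF6517", "rs9786153"},
]


def same_snp(name_a: str, name_b: str) -> bool:
    """True if both names appear in the same synonym group."""
    return any(name_a in group and name_b in group for group in SYNONYMS)


print(same_snp("P312", "S116"))  # True
print(same_snp("P312", "U106"))  # False
```

So a man who is S116+ is, by definition, also P312+.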

This, however, is not an intentional conspiracy to confuse and mislead. It just "grew like Topsy".

Chromosome position

Another way of identifying a SNP is by where it occurs on the chromosome and its allele form. For example, P312 (AKA S116, AKA PF6547) would be "22157311A>C".
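A position name like this can be taken apart mechanically. The following Python sketch is an assumption about the format based on the examples on this page (digits, one base, ">", one base); PositionName and parse_position_name are hypothetical names.

```python
import re
from typing import NamedTuple


class PositionName(NamedTuple):
    position: int       # coordinate on the Y chromosome
    first_allele: str   # base written before the ">"
    second_allele: str  # base written after the ">"


# Assumed format: digits, one base, ">", one base.
POSITION_PATTERN = re.compile(r"(\d+)([ACGT])>([ACGT])")


def parse_position_name(name: str) -> PositionName:
    """Split a position-style SNP name into its three parts."""
    match = POSITION_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"not a position-style SNP name: {name!r}")
    position, first, second = match.groups()
    return PositionName(int(position), first, second)


print(parse_position_name("22157311A>C"))
# PositionName(position=22157311, first_allele='A', second_allele='C')
```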

In this system, each SNP has only one name -- one at a time, that is. (See below.) Many researchers (especially FTDNA) use this system internally to identify and catalogue SNPs.

Position numbers change

Because scientists are still learning more of the intricacies of the Y-chromosome and building revised models to better map the chromosome, position numbers can change with each successive model (known as a "build"). To be clear which SNP is meant, it's best to cite the build number for the position ID.
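The Python sketch below shows why the build matters: a position only identifies a SNP when paired with its build. The GRCh37 position for P312 is quoted earlier on this page. The GRCh38 position is an assumption, read from the NC_000024.10 entry of the NCBI record reproduced further down. The table KNOWN_POSITIONS and the function lookup are hypothetical.

```python
# A tiny lookup keyed by (build, position). Only P312 is included.
# GRCh37 value: quoted on this page.
# GRCh38 value: assumed from the NC_000024.10 entry in the NCBI record.
KNOWN_POSITIONS = {
    ("GRCh37", 22157311): "P312",
    ("GRCh38", 19995425): "P312",
}


def lookup(build: str, position: int):
    """Return the SNP name for a build-qualified position, or None."""
    return KNOWN_POSITIONS.get((build, position))


print(lookup("GRCh37", 22157311))  # P312
print(lookup("GRCh38", 19995425))  # P312
print(lookup("GRCh38", 22157311))  # None (right number, wrong build)
```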

NCBI

Some (but not all) SNPs have been registered with the National Center for Biotechnology Information (NCBI), which assigns them a name consisting of the letters "rs" followed by a unique number. Submission for registration is a formal process with extensive documentation requirements. Further, NCBI registration results in publication; some researchers may not want the SNP published.

Example: P312/S116/PF6547 is registered with NCBI as rs34276300 [Homo sapiens]. Its NCBI page gives this information:

TCCTGCTAATGTATCTGCTGCACTG[A/C]CTTTCACTTTAGCCCCAACTCCACC
Chromosome: Y:19995425
Validated: by 1000G, by cluster, by frequency
Global MAF: A=0.1427/176
HGVS: NC_000024.10:g.19995425A>C, NC_000024.9:g.22157311A>C
PubMed article
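The HGVS line in the record above packs several pieces of information into each entry: a sequence accession, its version, a position and the two alleles. As a rough Python sketch, the entries can be split as follows. The pattern only handles simple substitutions written in this exact form, and the names HGVS_PATTERN and parse_hgvs are hypothetical.

```python
import re

# Matches entries such as "NC_000024.10:g.19995425A>C".
# Only simple single-base substitutions in this form are handled.
HGVS_PATTERN = re.compile(r"(NC_\d+)\.(\d+):g\.(\d+)([ACGT])>([ACGT])")


def parse_hgvs(expression: str) -> dict:
    """Split one HGVS-style entry into its parts."""
    match = HGVS_PATTERN.fullmatch(expression.strip())
    if match is None:
        raise ValueError(f"unsupported HGVS expression: {expression!r}")
    accession, version, position, ref, alt = match.groups()
    return {
        "accession": accession,
        "version": int(version),
        "position": int(position),
        "ref": ref,
        "alt": alt,
    }


hgvs_line = "NC_000024.10:g.19995425A>C, NC_000024.9:g.22157311A>C"
for entry in hgvs_line.split(","):
    print(parse_hgvs(entry))
```

The two entries describe the same SNP; only the accession version and the position differ.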

Problem common to all the above

None of these systems relates, in any way, to where a SNP falls on the phylogenetic tree. In practice, SNPs are first discovered, then their phylogenetic meanings are teased out.

"Who found" List

From "SNP Naming Convention", Sun Oct 16, 2016 9:37 pm (PDT). Posted by: alan.kane133

Resources

We recommend these references:

Revised: 18 Oct 2016