To find a specific term, use the "Find" function (control-F) of your browser. 

Definitions of Genetic Genealogy Terms

A list of terms that we use and what they mean. We present it in order to be sure we're all talking about the same things and using terms in the same ways.


DNA
Abbreviation for DeoxyriboNucleic Acid, the fundamental biochemical building block of life. DNA has the chemical structure of a double helix, like a ladder twisted into a spiral and then coiled on itself. Almost all living things have DNA in all their cells. The kinds of DNA typically tested for genetic genealogy are Y-DNA, mtDNA and autosomal DNA.
DNA is an information-storage device of remarkable capacity. It records what our bodies are to look like and how they are supposed to function.
RNA stands for RiboNucleic Acid. Think of it as the complement of DNA. It has several roles, including acting as a messenger that carries genetic instructions within the cell and catalyzing biochemical reactions.
Genetic Genealogy
The study & use of DNA for genealogical purposes. DNA, because it's inherited from parents, may help to identify ancestors. Here, we restrict genetic genealogy to uses of DNA which can lead to the identification, by name, relationship, dates, places and other characteristics, of specific individuals as one's ancestors.
Genetic ancestry
The commercial application of genetics to deep ancestry. It paints with a broader brush than genetic genealogy. It does not specify relationships and names only the famous. This field borrows from academic findings in "Population Genetics".
Genetic archeology
The application of genetics to very deep ancestry, typically prehistoric.  DNA from ancient remains is extracted, analyzed and perhaps compared to other (e.g., modern) populations. It is then correlated with archeological artifacts to infer information about ancient peoples.
Population genetics
The study of genetic variation within and between populations, involving the examination and modeling of changes in the frequencies of genes and alleles in populations over space and time. ... In natural populations, however, the genetic composition of a population's gene pool may change over time. See Wikipedia.
Forensic genetics
The branch of genetics that applies genetic knowledge to legal problems and legal proceedings. Forensic genetics is also a branch of forensic medicine which deals more broadly with the application of medical knowledge to legal matters. See Medicine Net.
Gene
A section of DNA made up of nucleotides and the molecular unit of heredity. (Source) A gene is a part of a chromosome. The existence of genes was discovered by Gregor Mendel in the late 1850s to 1860s; the finding that they are contained in chromosomes came in the 1880s to 1900s.
Nucleotide
An organic molecule (a compound consisting of a nucleoside linked to a phosphate group) that serves as the monomer, or subunit, of nucleic acids like DNA and RNA (ribonucleic acid). The building blocks of nucleic acids, nucleotides are composed of a nitrogenous base, a five-carbon sugar (ribose or deoxyribose), and at least one phosphate group.

Chemical schematic of an adenine nucleotide (Sugar not shown)

DNA has  four kinds of nucleotides: Adenine (A), Thymine (T), Cytosine (C) and Guanine (G).
Base pair
A pair of nucleotides bound together.  A & T bind with each other across the double-helix, as do C & G. C and G  do not bind with either A or T. (Mnemonic: Straight letters go together & curvy letters go together.)

Diagram of base pair binding.

Base pair may be abbreviated "bp", or with millions of base pairs "Mbp" or "Mb".
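The pairing rule above can be sketched in a few lines of Python. This is only an illustration: the function name `complement` and the sample strand are made up for the example, and it returns the partner base at each position without reversing the strand.

```python
# Pairing rule from the definition above: A binds with T, C binds with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Return the partner base on the opposite strand for each position."""
    return "".join(PAIRS[base] for base in strand.upper())

print(complement("AGAT"))  # TCTA
```

Applying the function twice returns the original strand, which is one way to see that the pairing is symmetric.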
SNP
SNP -- pronounced "snip" -- stands for single-nucleotide polymorphism. It refers to the fact that a base pair at a particular position may change, say from an A-T to a C-G. When changed, the new form is inherited, usually without further mutation.
Chromosome
From Wikipedia: "A chromosome is an organized structure of DNA and protein found in cells, ... a single piece of coiled DNA containing many genes, regulatory elements and other nucleotide sequences. Chromosomes also contain DNA-bound proteins, which serve to package the DNA and control its functions."
Sex chromosome
The DNA chromosomes which determine gender. Females have two X chromosomes; males have one X & one Y. One X chromosome is inherited from the mother; either an X or Y is inherited from the father. (See yDNA and xDNA below.)
Autosome
Any of the other 22 pairs of chromosomes which do not determine gender. They are numbered 1 to 22 and determine physical characteristics.
Nuclear DNA
The DNA inside a cell's nucleus. It consists of chromosomes, the number of which will vary from one species to another. Humans have 23 pairs of chromosomes. One of these pairs, either an XX or XY, is gender-determining; the 22 other pairs are designated by number.
yDNA
DNA from the Y-chromosome, one of two sex chromosomes in males, inherited from fathers. The academically-preferred form is yDNA. See also this page.
xDNA
DNA from the X-chromosome, one of two sex chromosomes. Males have one X (from the mother) and one Y (from the father). Females have an X from the mother and another X from the father. Because a father never passes his X chromosome to a son, xDNA matching can help narrow down the ancestral lines through which a match may run, but it is less often used.
mtDNA
DNA from the mitochondria, the "energy-makers" for cells; they exist within the cell, but outside the nucleus. Both sexes have mtDNA, inherited solely from their mothers. Mitochondria from male sperm are destroyed when a female's egg is fertilized. See also this page.
Autosomal DNA
DNA from the 22 pairs of chromosomes which are neither X nor Y sex chromosomes. Autosomal DNA is inherited from both mothers and fathers, in ways which are complex and less than perfectly understood. Mendel's rule of averages (crossing a red flower with a white one gives pink offspring, and the offspring of two such pink flowers will be 1/2 pink, 1/4 red and 1/4 white) goes only so far. Autosomal DNA matches identify descendants of all one's ancestors, but are generally limited to fifth cousins. See also this page.
Autosomal DNA is currently the most popular genetic genealogy test.
Surname
An inherited family name. In European culture, it is passed down by fathers to their children. (In women, it may be a "maiden name".) Inheritance is a key test of a surname; a byname which is not inherited is not a surname. Surnames are a relatively recent cultural practice; they were unknown in most of Europe before the second millennium. More common was a patronymic system in which a person's second name was his or her father's name with a prefix or suffix.
Match
A similarity of a pair or set of DNA haplotypes. The pair or set may be identical or display a small number of minor differences. A yDNA match may indicate that the sample donors share a common direct paternal male ancestor, a CMA. Taylor Family Genes classifies yDNA matches as follows:
  • Reported Match -- A match reported by FTDNA; it meets FTDNA thresholds for reporting and indicates a possibility of a common ancestor. This category may not include all significant matches and it may include some matches which are not significant.
  • Significant Match -- A significant match is similarity sufficient to predict, to a high degree of probability, a common ancestor within the genealogic time frame; indicates a high probability of a common ancestor since about 1350 AD. Significant matches are usually included within reported matches, except that they do not include 12-marker matches. Other matches may be significant but not meet the FTDNA reporting threshold.
  • Exact Match -- A match where all markers are in complete agreement. If it covers 25 or more markers, it is a special case of a significant match and relatively infrequent.
  • Coincidental Match -- A match which does not necessarily reflect a common ancestor within genealogic time. For some very common haplotypes, a match may be due to "convergent evolution". SNP testing is recommended to eliminate these false positives.
Match Quality
A quantification of the degree to which a pair of Y-DNA results match (are similar):
  • By markers agreeing: One method is as a fraction, with the number of markers whose alleles agree completely as the numerator and the number tested in common as the denominator, e.g., "35/37". An exact match is 25/25, 37/37 or 67/67.
  • By markers disagreeing: An alternative is to state the fraction in reverse -- i.e., the number of disagreeing or mismatching markers is stated first, then the number compared. If 2 of 37 markers disagree (35 agree), the match is labeled 2/37. It's relatively easy to tell this method from the one above: for a quality match, the number of mismatching markers should be small, so the first number is small.
  • By GD: A third quantification method is "genetic distance" (GD). When using GD, one should always specify the number of markers compared. For example, "2:37" indicates a GD of 2 across 37 markers. Before using the method, we recommend viewing our page with FTDNA's statements on the subject.
  • By TiP: A fourth method is the TiP method. This method accounts for differences in mutation rates and applicable models, so it most accurately measures the similarity of a pair of haplotypes.
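The first two notations in the list above can be illustrated with a short Python sketch. The function name `match_fractions` and the five marker values are hypothetical, chosen only for the example; real comparisons use 12 to 67 or more markers, and multi-copy markers need special handling that this sketch ignores.

```python
def match_fractions(a: dict, b: dict) -> tuple:
    """Return (agreeing/compared, disagreeing/compared) for two ySTR results.

    Only markers present in both results are compared.
    """
    shared = [marker for marker in a if marker in b]
    agree = sum(1 for marker in shared if a[marker] == b[marker])
    compared = len(shared)
    return f"{agree}/{compared}", f"{compared - agree}/{compared}"

# Hypothetical allele values for two donors (illustration only).
donor_1 = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11, "DYS388": 12}
donor_2 = {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 11, "DYS388": 12}

print(match_fractions(donor_1, donor_2))  # ('4/5', '1/5')
```

Because the two notations share a denominator, stating which convention is in use avoids reading "1/5" as a very poor match.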
"Hidden Match", "false negatives"
We should recognize the limitations of match-searching algorithms based on mismatching ySTR markers; they do not necessarily find every match of genealogical significance. Some fast-mutating markers may, in a few generations, produce differences exceeding the algorithms' limits. In such instances, hidden matches exist and may be found only with special tools.
Infinite Alleles Model
A method for comparing markers in a pair of yDNA (STR) results; it assumes that markers are free to change by any number during each mutation. This method counts each differing marker as one. (Most of the initial matching work in Taylor Family Genes was done using this simpler model and later refined with the TiP method.)
Stepwise Model
A method of comparing markers in a pair of yDNA (STR) results, which assumes that markers can change only by one step in each mutation. This method counts the absolute sum of differences across all markers.
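The difference between the two counting models can be sketched as follows. This is a minimal illustration: the function names and the allele values are made up, and both haplotypes are assumed to list the same single-copy markers in the same order.

```python
def infinite_alleles_count(a: list, b: list) -> int:
    """Count each differing marker as one, however large the difference."""
    return sum(1 for x, y in zip(a, b) if x != y)

def stepwise_count(a: list, b: list) -> int:
    """Sum the absolute differences in allele values across all markers."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical allele values for five markers (illustration only).
hap_1 = [13, 24, 14, 11, 11]
hap_2 = [13, 26, 14, 10, 11]

print(infinite_alleles_count(hap_1, hap_2))  # 2
print(stepwise_count(hap_1, hap_2))          # 3
```

The two models give the same count only when every differing marker differs by exactly one step.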
Genetic Distance
Abbreviated "GD", this term was redefined by Family Tree DNA, effective 12 December 2012. Previously, GD was a simple sum of all differences in allele counts across all markers compared. The new definition is more complex; see this page.
However, genetic distance does not stand on its own; it is only relevant in the context of the number of markers being compared. A genetic distance of 2 may rule out a common ancestor for 12 markers, but rule one in for 37 markers.
Genetic family, Cluster or Group
These terms are often used as synonyms. A group of two or more significant matches is said to form a "cluster" of matching DNA, indicating a genetic family. Minimum cluster size is two, but there is no maximum size to a cluster. See also "Triangulation".
Related
The term "related" has a special meaning in yDNA genetic genealogy. It does not refer to all kinds of relationships but only to the sharing of a common, direct paternal ancestor. (See "patriline" below.) For example, you would not be "related" to an uncle who is your mother's brother; you would be "related" (through your father's father) to your father's brother.
An analogous meaning also applies in mtDNA genetic genealogy, but the relationship is through the direct maternal lineage or matriline.
In autosomal DNA genealogy, the usual meaning applies.
Patriline
A yDNA genetic family, including the founding patriarch and all direct filial descendants, i.e., those who are related through a Y chromosome. Usually, only living and tested descendants in a patriline can be identified by means of DNA; the ancestral DNA is normally unavailable.
Common ancestor
An ancestor shared by (common to) two or more persons. A common male ancestor (CMA) refers to a shared direct biological paternal ancestor. The common ancestor is often not known before testing; the purpose of the testing is to assist in discovering his or her identity.
CMA
Abbreviation for common male ancestor, a direct biological paternal ancestor shared by two or more men with matching Y-DNA.
MRCA
Most recent common ancestor, representing the person where lineages converge. The MRCA is important because, if two persons share one ancestor, they will also share the ancestor's preceding ancestors.
MDKA
Most distant (earliest) known ancestor. Taylor Family Genes asks for the name and the dates and places of birth and death because this information helps to "zero in" on the MRCA.
EKA
Earliest known ancestor. Same as MDKA above. Within Taylor Family Genes, we prefer EKA.
TMRCA
Abbreviation for time to most recent common ancestor, expressed in generations. Technically, the TMRCA is in "DNA transmission events" (sum of generations for both lines minus 1); a transmission event occurs when a man fathers a son. However, some TMRCA calculators assume that transmission events are equal in both lines of descent, halve the number, round off to an integer and report this as "generations". See our page on TMRCA.
TiP Method
TiP (for "Time Predictor") is the FTDNA proprietary TMRCA calculator. It uses a hybrid combination of the infinite alleles and stepwise models for the "mismatches" reported & used in its calculations. Some markers are computed on the infinite alleles model, others on the stepwise model. In general,
  • The first 37 markers tested (in the 12-, 25- & 37-marker panels) use the stepwise model for all but those markers prone to multi-step mutations, for which the infinite alleles model is used.
  • Markers numbered 38 through 67 (in the 67-marker panel) use only the infinite alleles model.
We believe TiP to be the most sophisticated and precise TMRCA calculator currently available. It produces percentage probabilities that a pair of men share a MRCA within various numbers of generations from 1 to 24.
Genealogical Significance
A measure of a genetic match's genealogical meaning. For simplicity, we divide into three categories:
  • Significant: The match indicates a high probability that two persons share a common ancestor who lived within the genealogical time-frame.
  • Not Significant: The match indicates that two persons do NOT share a common ancestor or that, if one exists, he lived outside the genealogical time-frame.
  • Borderline/Ambiguous: The match is not clearly classifiable as either significant or not significant.
Generation length
There is no consensus answer to the question "How long is a generation?" or to "How many generations do X years represent?" It depends; it differs between paternal and maternal lineages, with maternal shorter. It also varies with place, time, population group and birth order. Expert estimates as to averages range from a low of 20 years to a high of 35 years. As a compromise, Taylor Family Genes uses a general average of 30 years for paternal lineages.
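How much the assumed generation length matters can be seen with a small Python sketch. The function name `generations_in` and the 660-year span are illustrative assumptions, not project figures; the 20, 30 and 35 year lengths are the ones quoted above.

```python
def generations_in(years: float, years_per_generation: float = 30) -> float:
    """Convert a span of years to generations for an assumed generation length."""
    return years / years_per_generation

# A hypothetical span of 660 years under the low, project and high estimates.
for length in (20, 30, 35):
    print(length, round(generations_in(660, length), 1))
```

Over the same span, the low and high estimates differ by more than a dozen generations, which is why the assumed length should be stated alongside any generation count.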
NPE
NPE is short for "non-paternal event(s)" or "not the parent expected"; it might be more accurate to call it "incorrectly assigned paternity". NPE refers to situations in which a child does not bear the surname of his biological father, for any of a number of reasons (undocumented adoption, name change, illegitimacy, etc.). These can also be called "surname discontinuities".
NPEs are reasonably common in DNA surname projects, but often undetected -- even unsuspected -- until DNA tests are done and a descendant's DNA does not match other descendants of the presumed CMA. NPEs are thought to occur in a population at a rate of 1.3% to 3% per generation; the descendants carry the new surname forward so the effect is cumulative over generations.
A recently-proposed term -- as though another was needed -- is "exogenous ancestry".
NPE have been classified as "ingressions" (a Y chromosome enters the surname) and "egressions" (a Y chromosome leaves the surname and enters another name). Which applies depends on perspective.
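The cumulative effect of the per-generation rates cited above can be estimated with a short Python sketch. It assumes, for illustration only, that the rate is constant and that generations are independent of one another; the function name `npe_probability` is made up for the example.

```python
def npe_probability(rate_per_generation: float, generations: int) -> float:
    """Chance of at least one NPE somewhere in a line of the given length.

    Assumes a constant rate and independent generations (a simplification).
    """
    return 1 - (1 - rate_per_generation) ** generations

# Ten generations at the low (1.3%) and high (3%) rates quoted above.
print(round(npe_probability(0.013, 10), 2))
print(round(npe_probability(0.03, 10), 2))
```

Under these assumptions, a line ten generations long has roughly a one-in-eight to one-in-four chance of containing at least one NPE.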
non-Taylor paternity (NTP)
A direct paternal lineage (father's father's father, etc.) not of the Taylor surname. Subtracting out NTP members has become necessary to determine a realistic match rate for the project. (Those whose paternity is not of the surname are unlikely to match Taylors in yDNA.) NTP member numbers have recently increased rapidly; they now represent ~15% of ySTR members.
NTP is not NPE. NTP has no indication of direct Taylor paternity in the pedigree.
DYS number
A name for a marker or specific STR pattern on the Y-chromosome, also "marker". A typical marker name is "DYS393". Frequently, the "DYS" will be omitted and only the number following it cited, e.g., "393". (Some markers have names which do not include "DYS".)
Locus, loci
A location on a chromosome. Locus is singular, loci is plural. See "DYS" above; other loci names are "Y-GATA-A10" &  "DYF406S1".
Some markers occur at more than one location. These are known as multi-copy or palindromic markers.
Marker
This term has more than one meaning in DNA:
  1. For ySTR testing, it means a particular pattern, or "flavor", of repetitions on the Y-chromosome, also called a "locus" or "microsatellite". Some STR markers have only a single copy and are associated with a particular place (locus) on the chromosome. Others have more than one copy and occur in multiple places; these are also called "palindromic markers".
    Markers -- with names like "DYS393" -- are identified by their patterns. For example,
    • The DYS393 pattern is "AGAT", repeated up to 17 times.
    • The DYS464 pattern (for all copies of the marker) is "CCTT", repeated up to 20 times.
  2. For ySNP testing, marker means a specific, identified & catalogued mutation characterizing a particular haplogroup, with a name like "P297" indicating the research group which found it and the order found. Often, SNPs have more than one name due to independent discoveries by more than one group.
  3. It can also mean a gene or other particular characteristic of DNA.
Because of marker's multiple meanings, it is best to identify the context in which one uses the term.
Alleles, allele value
Generally, allele means one form (of several or many) in which a part of DNA can be found. In ySTR testing, it means the number of counted repetitions of a DNA pattern (e.g., AGAT) for a particular marker.
STR
Abbreviation for "Short Tandem Repeats", and used as a synonym for marker. In STR testing, repeating DNA patterns are counted and the values reported as integers. These patterns are related to positions on the chromosome (hence the terms "loci" & "microsatellite") but some patterns occur in multiple places. The abbreviation for yDNA STR is "ySTR".

With sufficient markers tested, STR testing specifically identifies an individual haplotype and an ancestral paternal line. It can predict the haplogroup into which the haplotype falls.
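The idea of an STR value as a count of repeated patterns can be sketched in Python. This is a toy illustration, not how a testing laboratory calls alleles: the function name `count_repeats` and the sample sequence are made up, and the sketch simply reports the longest unbroken run of the pattern.

```python
import re

def count_repeats(sequence: str, motif: str) -> int:
    """Return the length, in repeats, of the longest unbroken run of motif."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)

# Hypothetical sequence: 13 back-to-back copies of AGAT with flanking bases.
sample = "GG" + "AGAT" * 13 + "CC"
print(count_repeats(sample, "AGAT"))  # 13
```

The integer returned plays the role of the allele value reported for a marker.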
SNP
Abbreviation for "Single Nucleotide Polymorphism", pronounced "snip". In this type of testing, specific previously-catalogued mutations of DNA are looked for & reported as to presence or absence. SNP testing definitively establishes a haplogroup and subclade of the haplogroup. The abbreviation for yDNA SNP is "ySNP".
"Terminal" SNP
Refers to the furthest downstream SNP for which a person has tested positive. It is not necessarily the furthest downstream SNP for which a person would be positive if tested. This term is presently deprecated as saying more about the individual's testing status than the reality of his DNA.
Haplotype
A specific & unique pattern of Y-DNA. When two men have the same (or nearly the same) Y-chromosome haplotype, they have a match -- indicating a common paternal ancestor, probably within a genealogical time frame. A haplotype is described as a pattern of marker/STR values; the more markers, the more complete the haplotype description is.
Haplogroup
A category of DNA types with similar characteristics, a group of haplotypes. A common ancestor for all within a haplogroup exists, but often in the very distant past. From https://en.wikipedia.org/wiki/Haplogroup:
In molecular evolution, a haplogroup (from the Greek: ἁπλούς, haploûs, "onefold, single, simple") is a group of similar haplotypes that share a common ancestor having the same single nucleotide polymorphism (SNP) mutation in all haplotypes. Because a haplogroup consists of similar haplotypes, it is possible to predict a haplogroup from haplotypes. An SNP test confirms a haplogroup.
A haplogroup is defined by specific SNPs, but can (to a limited degree) be estimated or "predicted" by STR values.
Phylogenetic tree, phylogeny
A depiction of the evolutionary &/or genetic history of an organism (e.g., a person) or a related group of organisms.
Mutation
A change in the DNA pattern before, during or after transmission from father to son or mtDNA from mother to children. In STR testing, a mutation is revealed by difference in allele values between father and son. SNP testing looks for specific, known & catalogued mutations.
A mutation is not the same as a disagreement between present haplotypes; the proper term is a "mismatch", as we don't yet know whether the haplotypes derive from the same source, nor do we know the source haplotype.
Mutation frequency, mutation rate
The estimated rate of observable change in DNA. Mutations happen infrequently -- on the order of once in every 250 to 400 transmission events -- and (it's believed) randomly. The mutation frequency gives a probability for a mutation occurring in a specific marker for each transmission event.
Note: Although we know the average frequency of mutations for many STR markers, we do not know the variances in frequencies, which appear to be wide. This has important implications for probability predictions. For SNP mutations, an average of one per 130 years (3-4 transmission events) is often cited.
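To see what these frequencies imply for a multi-marker comparison, the following Python sketch estimates the chance of at least one STR mutation across a set of markers. It assumes every marker shares one average rate and mutates independently, which the note above warns is a simplification; the function name and the 37-marker, 8-event example are illustrative choices.

```python
def chance_of_any_mutation(rate: float, markers: int, events: int) -> float:
    """Chance that at least one of the markers mutates over the given events.

    Assumes one shared average rate per marker per transmission event and
    independence between markers and events (simplifying assumptions).
    """
    return 1 - (1 - rate) ** (markers * events)

# 37 markers over 8 transmission events at a rate of 1 in 400.
print(round(chance_of_any_mutation(1 / 400, 37, 8), 2))
```

Under these assumptions, one or more mismatches at 37 markers is about as likely as not between men eight transmission events apart, so a small number of mismatches does not by itself rule out a relationship.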
RecLOH
Abbreviation for "Recombinant Loss of Heterozygosity", a type of mutation in which one segment of DNA "writes over" others, losing the difference between the segments. More generally, RecLOH refers to unreciprocated exchange of genetic information between two segments. In yDNA genetic genealogy, it refers to changes within the Y-Chromosome and the effect is to produce oddities in STR values. Click here for more.
Multi-copy marker, AKA palindromic marker
In STR testing, some markers (for example, 385a & 385b, 464a to 464d) are copies of each other -- though not necessarily perfect copies. (Palindromic refers to the STR pattern reading the same backward or forward.) They are found on loops within the chromosome; one or more copies lead into the loop and other copies lead out. These markers tend to be more volatile -- i.e., have high mutation frequencies. For interpretation in matches, see this page.
Transmission Event
The passing of DNA from parents to a child, during which a mutation may occur. (A mutation may occur within the father's or mother's DNA, but not be observed until passed down to a child.)  The number of transmission events in one line of descent is one less than the number of generations. The number of transmission events for more than one line of descent from a common ancestor is the sum of the transmission events from the CMA to the DNA donors for each line.
Relation                   Generations   Trans. Ev.
Father/son                 2             1
Brothers                   1             2
1st Cousins                2             4
2nd Cousins                3             6
2nd Cousins, 1x removed    3, 4          7
3rd Cousins                4             8
5th Cousins                6             12
7th Cousins                8             16
11th Cousins               12            24
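The transmission-event column can be reproduced by summing the steps in each line of descent, as in this Python sketch. The function name is made up, and each argument is taken to be the number of father-to-son steps from the CMA down to a donor, which is a convention assumed for this illustration.

```python
def transmission_events(steps_line_a: int, steps_line_b: int) -> int:
    """Sum the father-to-son steps from the CMA down to each DNA donor."""
    return steps_line_a + steps_line_b

print(transmission_events(0, 1))    # father/son: the father is the CMA -> 1
print(transmission_events(1, 1))    # brothers -> 2
print(transmission_events(3, 4))    # 2nd cousins, 1x removed -> 7
print(transmission_events(12, 12))  # 11th cousins -> 24
```

For nth cousins with no removal, each line has n + 1 steps, so the total is 2 × (n + 1).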
Genealogical Time Frame
(Abbreviated GTF.) The period of time, from a date in the past to the present, for which it is possible to identify specific individuals as one's ancestors. The beginning of this period varies with place & other factors; it is dependent on the existence of written records, use of surnames, etc. For western Europe, a consensus estimate is that the genealogical time frame extends from no earlier than the mid-1300s. Also see this page.
Exceptions: If one's ancestors were sufficiently noteworthy to have been recorded in writing -- or had the resources and motivation to have a genealogy prepared long ago -- the genealogical time frame may extend further into the past.
Surname
A family name that is passed down from parent to children, also the social practice of using surnames. A surname is not a byname that may change or a name which one child inherits, but not all. Surnames began in England about 1000 AD and became universal between 1350 and 1400.
Triangulation
In yDNA, triangulation is a method for estimating the presumed haplotype of the CMA for a cluster (genetic family) of matching DNA and identifying genetic branches of the family.
It is done by calculating the modal (most frequent) values of the cluster for each marker. Triangulation typically requires a minimum cluster size of three; it becomes more reliable as the cluster becomes larger and represents more branches among the CMA's descendants -- and as the resolution (number of markers compared) increases.
Once the modal values are established, each member can be compared to the estimated CMA haplotype and individual marker differences noted. These differences may point to specific branches of the family lines.
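The modal calculation and the comparison against it can be sketched in Python. This is an illustration under stated assumptions: the function names and the three four-marker haplotypes are made up, all members are assumed to have the same markers in the same order, and ties are broken arbitrarily in favour of the value seen first.

```python
from collections import Counter

def modal_haplotype(cluster: list) -> list:
    """Take the most frequent allele value at each marker position."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*cluster)]

def differences_from_modal(haplotype: list, modal: list) -> dict:
    """Map marker position -> (member value, modal value) where they differ."""
    return {i: (v, m) for i, (v, m) in enumerate(zip(haplotype, modal)) if v != m}

# Hypothetical cluster of three members, four markers each.
cluster = [
    [13, 24, 14, 11],
    [13, 24, 15, 11],
    [13, 25, 14, 11],
]
modal = modal_haplotype(cluster)
print(modal)                                      # [13, 24, 14, 11]
print(differences_from_modal(cluster[1], modal))  # {2: (15, 14)}
```

A difference shared by several members, and absent from the rest, is the kind of pattern that may point to a branch of the family.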
In autosomal DNA, triangulation is a method for determining the nature of relationships among a group of people who share autosomal matches.
Chromosome mapping
An analytical method in autosomal &/or X-chromosome DNA, closely related to triangulation. According to Tim Jantzen: "The process of determining which portions of your DNA came from which ancestors and/or which geographic region or population." Another, similar definition might be: "The attribution of phased autosomal or X-chromosome DNA segments to specific ancestors and/or geographic regions or populations."
Signature
Signature has at least three meanings:
  1. To refer to unique marker values which are found only in a particular genetic family.
  2. To refer to a defined haplotype described by 12 or more STR markers. For example, descendants of "Niall of the Nine Hostages" are said to have a signature of 13-25-14-11-11-12-12-12-13-14-29-16 for the first 12 FTDNA markers.
  3. To refer to one or more branch-defining differences from a cluster's CMA haplotype (See Triangulation.) which is common among the members of that branch but not other branches. A signature may help to pinpoint from which of a CMA's sons the cluster member descends.
Taylor Family Genes does not recommend focusing on signatures of only a few selected markers outside an already-defined genetic family. While useful when found in a specific genetic family, they are not sufficiently common.
Phylogeny
The "family tree" of a genetically related group of organisms as distinguished from the development of a single organism.
Cladistics
A method of biological classification in which organisms are categorized based on shared derived characteristics that can be traced to a group's most recent common ancestor and are not present in more distant ancestors. Cladistics can also be used in genetic genealogy and illustrated in cladograms or "network diagrams".
Convergent Evolution
A term describing the process by which organisms of different phylogenies (ancestries) evolve toward similar forms. For example, a shark (fish) and a dolphin (mammal) have evolved similar body shapes though they are not closely related. Seeing similar ySTR haplotypes for men of clearly different phylogenies suggests that convergent evolution also exists in the human Y-chromosome.
Probability
A number between zero and one, often expressed as a percentage, as to how likely it is that a certain event will occur or that a hypothesis is true. The term also refers to probability theory and the branch of mathematics dealing with likelihood.
We should stress "between" zero and one. In practice, the outcomes we deal with in genetic genealogy are neither certain (1) nor impossible (0).
Bayesian probability
A branch of probability theory which includes "expert knowledge" as well as experimental data to produce combined probabilities. The expert knowledge is represented by some (subjective) prior probability distribution and the other data is incorporated in a likelihood function. The product of the prior and the likelihood results in a posterior probability distribution that incorporates all the information known to date.
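The product of prior and likelihood described above can be sketched for a simple two-hypothesis case in Python. All numbers are hypothetical and chosen only to show the arithmetic; the function name `posterior` and the hypothesis labels are made up for the example.

```python
def posterior(priors: dict, likelihoods: dict) -> dict:
    """Multiply prior by likelihood for each hypothesis, then normalize."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}

# Hypothetical prior belief and likelihood of the observed DNA result.
priors = {"related": 0.1, "unrelated": 0.9}
likelihoods = {"related": 0.8, "unrelated": 0.2}

result = posterior(priors, likelihoods)
print(round(result["related"], 2))  # 0.31
```

Here the same DNA evidence that would give 0.8 under an even prior gives only about 0.31 when the prior is sceptical, which shows how much the "expert knowledge" term can matter.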
Confidence
The amount of trust one can place in a statement (e.g., a conclusion). In statistics, confidence is a key concept and is stated quantitatively.
Gamma distribution
A class of probability distributions often used to model genetic (and other) phenomena and to make predictions. For more, see this page.


Revised: 13 Dec 2016