Some men have very common yDNA haplotypes, as reflected by STR markers.
They may have hundreds to thousands of "close matches", even at 67 markers
and hundreds at 111. They have too many matches to be able to follow up on.
This page is for those who suffer convergence.
What is Convergence?
Convergence describes a phenomenon in which organisms of different phylogenies
(ancestries) evolve toward common forms. This is seen in the outward
forms of many different plants and animals, including humans, and may be due to evolutionary forces.
[Figure: plants that look very much alike though only distantly related.]
[Figure: marine animals with similar body types.]
Evidence is growing that convergence also occurs with ySTR haplotypes. We are
seeing men reported with close matches who -- based on their SNPs -- cannot
be related within two thousand or more years. One expert has termed these
Some have suggested that the phenomenon we see is at least partly due to lack
of divergence. That is, a few very ancient haplotypes (from before surnames were
adopted) have not mutated to different forms. We'll also include this
possibility in "convergence".
For this discussion, we'll operationally define convergence as sufficient similarity
between haplotypes of different paternal origins to be displayed on an FTDNA match list:
- 12 markers: Less than a genetic distance of two, <=1;
- 25 markers: Less than a genetic distance of three, <=2;
- 37 markers: Less than a genetic distance of five, <=4;
- 67 markers: Less than a genetic distance of eight, <=7;
- 111 markers: Less than a genetic distance of eleven, <=10.
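The thresholds above can be sketched as a small lookup. This is only an illustration: the function name and the dictionary are made up here, and the genetic distance is assumed to have been computed already (for example, read off a match list).

```python
# Largest genetic distance that still appears on a match list at each
# marker level, copied from the list above.
MAX_DISTANCE = {12: 1, 25: 2, 37: 4, 67: 7, 111: 10}


def is_listed_match(markers_compared: int, genetic_distance: int) -> bool:
    """Return True if a pair would be displayed as a match at this level.

    `markers_compared` must be one of the levels in MAX_DISTANCE;
    `genetic_distance` is assumed to be supplied by the caller.
    """
    if markers_compared not in MAX_DISTANCE:
        raise ValueError(f"unsupported marker level: {markers_compared}")
    if genetic_distance < 0:
        raise ValueError("genetic distance cannot be negative")
    return genetic_distance <= MAX_DISTANCE[markers_compared]


if __name__ == "__main__":
    # A distance of 4 at 37 markers is listed; a distance of 5 is not.
    print(is_listed_match(37, 4), is_listed_match(37, 5))
```

Under the operational definition, a pair that passes this check but comes from different paternal origins counts as convergent.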
Some may invoke ySNPs in the definition. While a conflict in ySNPs (meaning
men cannot share the same patriline within genealogic time) is the "acid
test" for convergence, it does not necessarily reflect all
Is it real? Does it actually happen?
Yes. The phenomenon has been seen but not thoroughly investigated.
Strong evidence comes from the Clan Irwin project where exact matches exist
between 30 participants at 37 markers and 12 at 67 markers. Yet, the genealogies
show no common ancestors for at least eight generations. There is only a minuscule
probability (p << 0.0001) that an ancestral haplotype could remain unchanged to
yield exact matches so many DNA transmissions later. It is more likely that they
mutated toward common forms from different sources.
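A rough feel for why an unchanged haplotype is so unlikely can be had from a back-of-envelope calculation. The sketch below is not the project's own calculation. It assumes a single average mutation rate per marker per generation (the value 0.004 is purely illustrative), independent markers, and 30 separate lines of at least eight father-to-son transmissions each, all starting from the same ancestral haplotype.

```python
def prob_unchanged(markers: int, transmissions: int, rate: float) -> float:
    """Probability that no marker mutates over the given transmissions.

    Assumes every marker mutates independently at the same `rate`
    per father-to-son transmission (a simplification).
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    if markers < 0 or transmissions < 0:
        raise ValueError("counts cannot be negative")
    return (1.0 - rate) ** (markers * transmissions)


# Illustrative assumption, not a figure from this page.
ASSUMED_RATE = 0.004

# 30 men, each at least 8 transmissions from any shared ancestor,
# all still matching exactly at 37 markers.
group_probability = prob_unchanged(37, 30 * 8, ASSUMED_RATE)

if __name__ == "__main__":
    print(f"{group_probability:.3g}")
```

With these assumed inputs the result falls far below 0.0001. A different mutation rate or a different family tree shape would change the number, so treat it as an order-of-magnitude illustration only.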
Within the Taylor Family Genes project we also see highly similar haplotypes between men of different subclades
of R-M269. Because the subclades separated thousands of years ago, we
know they cannot share the same paternal ancestor within genealogic time and
How common is convergence?
A single instance of convergence is enough to establish its existence but
says little about prevalence. We try to answer this question (though perhaps crudely) using data
from Taylor Family Genes. The graph to the right and the table below show the distribution of project members by
number of FTDNA-listed
& other-surname matches at 37 markers.
[Table: "Extra-project" matches at 37 markers]
The above data (all haplogroups) were taken from an ongoing
record of matches, divided into in-project
matches (including non-participants with the project surname) and
extra-project matches (non-participants with other surnames). Participants
assessed to be of non-Taylor paternity (non-Taylor surname plus no
in-project matches) have been excluded.
Possible signs of convergence
are seen in ~80% of project members' haplotypes. However, convergence is
high (>20 extra-project matches) for only 20% and very high (>50) for
Convergence is more commonly seen at lower resolutions than higher and at greater genetic distances than lesser. Common haplotypes (see
our study) may be the result of
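The cut-offs used in this section (more than 20 extra-project matches for "high", more than 50 for "very high") can be written as a simple classifier. The labels and function name are illustrative. Treating any count of one or more as a "possible" sign is an assumption made for this sketch, not something stated on this page.

```python
def convergence_level(extra_project_matches: int) -> str:
    """Label a member by number of extra-project matches at 37 markers."""
    if extra_project_matches < 0:
        raise ValueError("match count cannot be negative")
    if extra_project_matches > 50:
        return "very high"
    if extra_project_matches > 20:
        return "high"
    if extra_project_matches > 0:
        # Assumed cut-off: any extra-project match counts as a possible sign.
        return "possible"
    return "none"


if __name__ == "__main__":
    for count in (0, 5, 21, 51):
        print(count, convergence_level(count))
```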
How do I tell if this applies to me?
The number of matches on your match list at 37 or more markers is the
first hint. More than 50 is a good clue; few men have that many genetic
cousins who've tested DNA. If you have an unusually high number of matches,
you may see a recommendation above your match list to upgrade to more
markers; this may be a good idea.
Another clue is the variety of surnames. If the names are all different and
there's no apparent pattern to them, this could be you. (Though, if one
surname dominates the list, this could signal an NPE, a non-paternity event.)
- On the other hand, some Scots and Irish clans show a multiplicity of surnames
for the same biological paternal lineage. This is especially true of Clan
McGregor, whose surname was banned on pain of death from 1603 to 1774.
Do you have a WAMH or Niall badge on your results? Do you come close to
matching either 12-marker haplotype? These haplotypes within R1b are extremely
common and men who have them or come close tend to have many matches.
The last, most important, clue is SNP results and haplogroups or subclades. If these
conflict, the match does not indicate a common ancestor within genealogic time.
- R1 (R-M173) vs. I1 (I-M253) conflict -- These haplogroups began
separating many thousands of years ago, before writing was invented.
- Comment: We see very few matches reported between these haplogroups.
- I1 (I-M253) vs. I1a (I-DF29) do not necessarily conflict -- I1a
is a more specific classification within I1. Likely, the I1 man has not
tested for DF29.
- R1b1a2 (R-M269) vs. R1b1a2a1a1 (R-U106) do not necessarily conflict --
As above, the latter is a more specific classification within the former.
- R1b1a2a1a1 (R-U106) vs. R1b1a2a1a2 (R-P312) do conflict -- These
are separate subclades of R-M269.
See also match validity below.
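The examples above follow one rule: two haplogroups conflict only if neither is a more specific classification within the other. With longhand names, that can be approximated by a prefix check. This is a minimal sketch with a hypothetical function name. It assumes both names come from the same version of the same tree, since longhand names change between tree versions.

```python
def haplogroups_conflict(a: str, b: str) -> bool:
    """Return True if neither longhand name is nested within the other.

    Assumes both names use the same nomenclature and tree version.
    """
    a, b = a.strip(), b.strip()
    if not a or not b:
        raise ValueError("both haplogroup names are required")
    # If one name begins with the other, the longer one is downstream.
    return not (a.startswith(b) or b.startswith(a))


if __name__ == "__main__":
    pairs = [
        ("R1", "I1"),
        ("I1", "I1a"),
        ("R1b1a2", "R1b1a2a1a1"),
        ("R1b1a2a1a1", "R1b1a2a1a2"),
    ]
    for a, b in pairs:
        print(a, b, haplogroups_conflict(a, b))
```

A result of False does not prove a shared ancestor within genealogic time; it only means the tested SNPs do not rule one out.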
The reasons for this phenomenon are not well understood, but may include:
What to do?
The best strategy is to eliminate coincidental, irrelevant matches from
consideration without -- at the same time -- eliminating relevant and meaningful
matches.
SNP testing, to identify matches of a different subclade, is the recommended
strategy. Unfortunately, both parties to the match will need to test SNPs as far
downstream as possible for this method to work.
Several SNP testing options exist but we regard few of them as completely
satisfactory. They are either too expensive or incomplete. Exceptions are the recent FTDNA offerings of "SNP Packs" and "Backbone Panels";
they are fairly complete, testing 100+ SNPs per bundle, and reasonably
affordable, in the neighborhood of $100 per bundle, or about $1 per SNP. See
our page on them.
Test more STR markers
It has been a common strategy for men with many matches to test up to 111 markers and
ignore matches at lower levels. This does reduce the match list and, to some
extent, focuses the list more closely.
The problem here is that it also eliminates possibly significant and
meaningful matches at lower levels (e.g., at 37 markers). If your match partner
hasn't tested above 37, you won't see the match at 67 or 111.
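The trade-off can be illustrated with a toy match list. The names and data below are hypothetical. The only rule taken from the text is that a match can be compared at a given level only if the match partner has tested at least that many markers.

```python
from dataclasses import dataclass


@dataclass
class Match:
    name: str            # hypothetical label for a match partner
    markers_tested: int  # highest level the partner has tested


def comparable_at(matches: list[Match], level: int) -> list[Match]:
    """Keep only matches whose partner has tested at least `level` markers."""
    return [m for m in matches if m.markers_tested >= level]


if __name__ == "__main__":
    matches = [Match("A", 37), Match("B", 67), Match("C", 111)]
    for level in (37, 67, 111):
        names = [m.name for m in comparable_at(matches, level)]
        print(level, names)
```

Match "A" drops out at 67 and 111 markers whether or not it is meaningful, which is the risk of ignoring the lower levels.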
Is a ySTR match valid?
More than two-thirds of Taylor Family Genes project members are in subclades of R1b1a2a1a
(R-M269, R-P310; see note below). About 10% of these have
many coincidental matches, which do not indicate common
paternal ancestry within genealogic time.
This is mostly a matter of elimination. To see if an STR match is invalid, check to see if tested SNPs
conflict. Different "terminal SNPs" are not necessarily conflicting; one
may be downstream of (included within) the other.
- Check SNPs of each member of the close-matching pair.
Both must have tested positive for SNPs listed in a column of the
table below; if not, this tool
cannot be used.
- If one member has an SNP in the pink column and
the other has a SNP in the green column, the match
- If both have SNPs in the same column, both must be in the same subclade for
the match to be valid:
- L20 and L21 or M228.2 are not inconsistent. L21 is in a subclade of L20
and M228.2 is in a subclade of L21.
- M126 and M160 are in conflict; they are in different subclades, ..1b3a
A match is not necessarily valid if this tool fails to identify
it as invalid. However, the chances that it is coincidental are reduced.
|YCC (FTDNA) Subclades of R1b1a2a1a|
| SNP | YCC |
| L47 | ..1a4a |
| L46 | ..1a4a1a |
Note: Be aware of recent changes in the R1b1a2a1a phylogenetic tree,
due to continuing scientific discoveries.
The more recently updated phylogeny is the ISOGG tree.
FTDNA uses the 2010 Y-Chromosome Consortium (YCC) tree, with some additions. The two nomenclatures
are not entirely consistent with each other.
In the FTDNA tree (used above), subclades of R-U106 are
R1b1a2a1a1a.. and subclades of
R-P312 are R1b1a2a1a1b.. In the more up-to-date ISOGG tree (below), R-U106 subclades are R1b1a2a1a1..
and R-P312 subclades are R1b1a2a1a2..
ISOGG Tree of R1b1a2a1a, as of April 2014
| L45/S353, L164/S502, L237, L477, L493 ||
| S268/Z9, S379, S504/Z28, Z22, Z24, Z25, S517/Z26, Z29, S516/Z351 ||
| L21/M529/S145, L459, S461 ||
| P66_1, P66_2, P66_3 ||
| Z326, S380/Z329, Z337 ||
| CTS2509/S1734, S1741/Z319, Z325 ||
| S424, S426, S3025, S3026, S3031, S3057 (See Notes regarding S423) ||
| S301, S308, S309, S427, S3027, S3028, S3034 (See Notes regarding S3028) ||
| L744/S388, L745/S463, L746/S310 ||