
Y-DNA Match Determination

We keep encountering new and different ways of determining whether a match exists between two sets of Y-DNA results (i.e., two men); some strike us as bizarre. This page attempts to catalogue the methods we've found and to evaluate their effectiveness and efficiency.

What is a "Match"?

Perhaps the best first step is to define what we mean by the term "match".

A similarity between two or more sets of DNA results which indicates, to a high degree of probability, a shared ancestor within a period of time when documentary research may be able to identify him or her by name, dates, places and/or other characteristics.

Note that this definition is not specific to Y-DNA STR testing, though that will be our focus here. It allows us to reframe the question as "Does the similarity of the sets of results allow us to say with confidence that these men share a common male ancestor within genealogical time?"

Best Method

The best method for determining a Y-DNA match proceeds directly from the definition and uses what is known about mutation frequencies of STR markers. It relies on direct comparison of the two haplotypes as measured by allele values of markers tested in common.
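As a minimal sketch of what "markers tested in common" means in practice, the Python fragment below pairs up the allele values of two sets of results and ignores any marker that only one man was tested on. The function name `compare_common_markers`, the dictionary layout and the allele values are all made up for illustration; they do not come from any testing company's report format.

```python
def compare_common_markers(haplotype_a, haplotype_b):
    """Return {marker: (allele_a, allele_b)} for markers present in both results."""
    common = sorted(set(haplotype_a) & set(haplotype_b))
    return {marker: (haplotype_a[marker], haplotype_b[marker]) for marker in common}


# Illustrative (invented) results; the second man was tested on fewer markers.
man_1 = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11, "DYS385a": 11}
man_2 = {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 10}

pairs = compare_common_markers(man_1, man_2)
differing = [marker for marker, (a, b) in pairs.items() if a != b]

print(len(pairs), "markers in common;", len(differing), "differ:", differing)
```

Only the markers in `pairs` should enter any later calculation; a marker tested for one man but not the other says nothing about whether they match.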

  1. Define "high probability" quantitatively. Is it 80%? 90%? Or some other number?
  2. Determine a time period amenable to research and estimate its length in generations.
  3. Using marker-mutation frequencies, calculate the time to the most recent common ancestor (TMRCA) for the match and see whether it fits the probability and time windows. One may use either of these calculation models:
    1. Infinite alleles -- assumes that markers are free to change by any amount in either direction. Therefore, a particular marker either agrees between two men or it doesn't.
    2. Stepwise -- assumes that the amount of any difference is also significant. A difference of one is one mutation, a difference of two is two mutations, etc.
    3. Combination of the two -- most markers stepwise, some infinite alleles
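The three steps above can be sketched in Python for the infinite alleles model. This is an illustration under stated assumptions, not a definitive implementation: the mutation rate `MU`, the 90% threshold, the 12-generation window, the equal prior weight given to every candidate generation and the cut-off `horizon` are all placeholder choices, and a single average rate is applied to every marker. The two counting helpers show how the same pairs of allele values are scored under each model; the probability itself is computed only for infinite alleles.

```python
from math import comb


def infinite_alleles_count(pairs):
    """Infinite alleles: a differing marker counts once, whatever the size of the gap."""
    return sum(1 for a, b in pairs if a != b)


def stepwise_count(pairs):
    """Stepwise: a difference of d repeats counts as d mutations."""
    return sum(abs(a - b) for a, b in pairs)


def mismatch_likelihood(n_markers, n_mismatches, generations, mu):
    """Binomial probability of n_mismatches among n_markers if the common
    ancestor lived `generations` ago (infinite alleles model)."""
    # A marker still agrees only if it never mutated on either man's line.
    p_same = (1.0 - mu) ** (2 * generations)
    p_diff = 1.0 - p_same
    return (comb(n_markers, n_mismatches)
            * p_same ** (n_markers - n_mismatches)
            * p_diff ** n_mismatches)


def prob_mrca_within(n_markers, n_mismatches, window, mu, horizon=2000):
    """Probability that the common ancestor lived within `window` generations,
    weighting every generation from 1 to `horizon` equally beforehand (an assumption)."""
    weights = [mismatch_likelihood(n_markers, n_mismatches, g, mu)
               for g in range(1, horizon + 1)]
    return sum(weights[:window]) / sum(weights)


MU = 0.004        # ASSUMED average mutation rate per marker per generation
THRESHOLD = 0.90  # step 1: what "high probability" means here (assumed)
WINDOW = 12       # step 2: researchable period in generations (assumed)

for k in range(4):
    p = prob_mrca_within(37, k, WINDOW, MU)
    verdict = "match" if p >= THRESHOLD else "no match"
    print(f"{k} mismatches in 37 markers: P = {p:.3f} -> {verdict}")
```

Changing `MU`, `THRESHOLD` or `WINDOW` changes which mismatch counts qualify, which is why steps 1 and 2 need to be settled before any comparison is made.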

The calculations use the binomial probability formula, which is tedious to evaluate by hand. Alternatives to manually calculating the probabilities include:



Acceptable Methods

  1. Pre-set rules

    The method above can be used to establish standards for qualifying. For example, two mismatches across 37 markers qualify, but three do not.



  2. Comparison of haplotype to haplogroup modals




Unacceptable Methods

  1. Percentage of matching or mismatching