Estimating a Surname's Patrilines
This page describes a method for estimating the number of paternal lineages
for a surname from data contained in a DNA surname project. It is assumed
that the actual number cannot be determined by other means.
There are minimum requirements to be met for using this method.
The project must have attained a minimum Y-DNA penetration, or tests per
100,000 males with the surname.
- Experience of the Taylor Family Genes project suggests this minimum
penetration rate may be about 90:10^5, that is, 90 per 100,000.
(Including females, ~45:10^5.)
- Below the minimum penetration rate, the estimate may be too low.
- The estimate is only applicable to populations for which the minimum
penetration has been attained.
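As a rough illustration of the penetration check described above, the following sketch computes tests per 100,000 males and compares the result with a threshold. The function names and the sample figures are hypothetical, and the default threshold of 90 is only the tentative value suggested above, not an established constant.

```python
def penetration_per_100k(tested_males: int, males_with_surname: int) -> float:
    """Return Y-DNA tests per 100,000 males bearing the surname."""
    if males_with_surname <= 0:
        raise ValueError("males_with_surname must be positive")
    return tested_males * 100_000 / males_with_surname


def meets_minimum(rate: float, minimum: float = 90.0) -> bool:
    """Compare a penetration rate with an assumed minimum (default 90 per 100,000)."""
    return rate >= minimum


# Hypothetical figures, for illustration only.
rate = penetration_per_100k(tested_males=450, males_with_surname=500_000)
print(rate, meets_minimum(rate))
```

The population figure has to come from an outside source, such as a census count of males with the surname; the project data alone cannot supply it.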
Groups, genetic families
The number of genetic families identified in the project must be known. This
should not be a problem for project administrators paying attention.
- The project must be current in its task of grouping members who match
each other. Matching members left ungrouped will inflate the patrilines
estimate.
The number of unmatched "singleton" members must be known.
- Determine this number by subtracting the number of matched/grouped project
members from the total with Y-DNA results.
- As a refinement, subtract the number whose patrilines are
definitively determined not to be of the surname.
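The subtraction described above can be sketched as follows. The function and parameter names are hypothetical, and the counts in the example are invented for illustration.

```python
def count_singletons(total_with_ydna: int,
                     grouped: int,
                     confirmed_other_surname: int = 0) -> int:
    """Unmatched members: total tested, minus grouped members, minus those
    whose patrilines are definitively known not to be of the surname."""
    singletons = total_with_ydna - grouped - confirmed_other_surname
    if singletons < 0:
        raise ValueError("counts are inconsistent: result is negative")
    return singletons


# Hypothetical project counts, for illustration only.
print(count_singletons(total_with_ydna=400, grouped=250,
                       confirmed_other_surname=10))
```

The third argument is the optional refinement; leaving it at zero gives the unrefined count.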
An overall average rate must be estimated for non-paternal events (NPE; also
known as "not the parent expected" or surname discontinuity). This rate acts
as an adjustment to the singletons number because at least some of the
singletons will not be reflective of the surname's patrilines.
- The rate refers to the estimated proportion of those with the surname whose
patriline traces to another surname -- in James Irvine's term, iNPE.
- There may also be a number of eNPE within the project: those whose
patrilines are from the project surname but who now have a different
surname. These will be reflected in either the groups or the singletons
number.
- NPE tend to be cumulative over generations and unsuspected.
- Suggested rates are between 20% and 40%; Taylor Family Genes uses
Each group (genetic family found) represents one patriline.
Add the number of groups to the adjusted singletons number. The formula is
P = G + S*(1-NPE)
- P = Patrilines
- G = Genetic families
- S = Singletons
- NPE = rate expressed as a decimal
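A minimal sketch of the formula above, P = G + S*(1-NPE), is shown here. The function name and the sample inputs are hypothetical, and the NPE rate of 0.25 is simply one value inside the 20% to 40% range suggested above.

```python
def estimate_patrilines(groups: int, singletons: int, npe_rate: float) -> float:
    """Estimate patrilines as P = G + S * (1 - NPE)."""
    if not 0.0 <= npe_rate <= 1.0:
        raise ValueError("npe_rate must be a decimal between 0 and 1")
    return groups + singletons * (1.0 - npe_rate)


# Hypothetical inputs, for illustration only.
p = estimate_patrilines(groups=60, singletons=140, npe_rate=0.25)
print(p)  # 60 + 140 * 0.75 = 165.0
```

Because each group counts as a full patriline while each singleton is discounted, the estimate is more sensitive to the NPE rate in projects where singletons outnumber groups.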
If tracked over time, the estimate may fluctuate wildly, especially for
projects with low penetration rates. A moving average over several periods
will smooth out the fluctuations and more clearly display trends.
Note that, in the above graph, the patrilines moving average tends to parallel (though on different
scales) the match rate until a match rate of 50% is attained.
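The smoothing mentioned above could be done with a simple trailing moving average, sketched below. The text does not say which kind of moving average or how many periods are used, so the window of 3 and the series of estimates are assumptions made up for illustration.

```python
def moving_average(values: list[float], window: int) -> list[float]:
    """Trailing simple moving average; one output per full window."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        averages.append(sum(chunk) / window)
    return averages


# Hypothetical patriline estimates from five successive periods.
estimates = [150, 170, 145, 180, 160]
print(moving_average(estimates, window=3))
```

A longer window smooths more but reacts more slowly to a genuine change in trend.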