# Estimating a Surname's Patrilines

This page describes a method for estimating the number of paternal lineages (patrilines) for a surname from the data contained in a DNA surname project. It is assumed that the actual number cannot be determined by other means.

There are minimum requirements to be met for using this method.

### Penetration

The project must have attained a minimum Y-DNA penetration, measured as tests per 100,000 males with the surname.

- Experience of the Taylor Family Genes project suggests this minimum penetration rate may be about 90:10^5. (Including females, ~45:10^5.)
- Below the minimum penetration rate, the estimate may be too low.
- The estimate is only applicable to populations for which the minimum penetration has been attained.
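
As a rough illustration of the penetration check, the sketch below computes tests per 100,000 males and compares the result with the tentative threshold of about 90 mentioned above. The function name and the input figures are hypothetical, chosen only for the example; the count of males with the surname would have to come from an outside source such as census data.

```python
# Tentative threshold from Taylor Family Genes experience (tests per 100,000 males).
MIN_PENETRATION = 90


def penetration_per_100k(ydna_tests: int, males_with_surname: int) -> float:
    """Return Y-DNA tests per 100,000 males bearing the surname."""
    if males_with_surname <= 0:
        raise ValueError("males_with_surname must be positive")
    return ydna_tests * 100_000 / males_with_surname


# Hypothetical figures, for illustration only.
rate = penetration_per_100k(ydna_tests=450, males_with_surname=400_000)
meets_minimum = rate >= MIN_PENETRATION
print(rate, meets_minimum)
```

With these made-up inputs the project sits above the suggested minimum, so the patrilines estimate would be considered usable for that population.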

### Groups, genetic families

The number of genetic families identified in the project must be known. This should not be a problem for project administrators paying attention.

- The project must be current in its task of grouping members who match each other. Such members left ungrouped will inflate the patrilines estimate.

### Singletons

The number of unmatched "singleton" members must be known.

- Determine this number by subtracting the number of matched/grouped project members from the total with Y-DNA results.
- As a refinement, subtract the number whose patrilines are definitively determined **not** to be of the surname.
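
The singleton count described above is simple subtraction; a minimal sketch follows. The function and parameter names are hypothetical, and it assumes that members confirmed as belonging to another surname's patriline are counted among the ungrouped members.

```python
def count_singletons(total_with_ydna: int, grouped: int,
                     confirmed_other_surname: int = 0) -> int:
    """Unmatched members, optionally excluding those known not to be of the surname."""
    singletons = total_with_ydna - grouped - confirmed_other_surname
    if singletons < 0:
        raise ValueError("inputs are inconsistent: singleton count is negative")
    return singletons


# Hypothetical project figures, for illustration only.
singletons = count_singletons(total_with_ydna=600, grouped=380,
                              confirmed_other_surname=20)
print(singletons)  # 200
```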

#### NPE Rate

An overall average rate must be estimated for non-paternal events (NPE; also known as "not the parent expected" or surname discontinuity). This rate acts as an adjustment to the singletons number, because at least some of the singletons will not be reflective of the surname's patrilines.

- The rate refers to the estimated share of those with the surname whose patriline traces to another surname (iNPE, in James Irvine's term).
- There may also be a number of eNPE within the project: those whose patrilines are from the project surname but who now have a different surname. These will be reflected in either the groups or the singletons numbers.
- NPE tend to be cumulative over generations and unsuspected.
- Suggested rates are between 20% and 40%; Taylor Family Genes uses 30%.

Each group (genetic family found) represents one patriline.

Add the number of groups to the adjusted singletons number. The formula is

P = G + S × (1 − NPE)

where

- P = Patrilines
- G = Genetic families
- S = Singletons
- NPE = rate expressed as a decimal
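
A worked example may make the formula easier to apply. The sketch below uses a hypothetical function name and made-up counts; the default rate of 0.30 is the figure Taylor Family Genes is said to use, and any other value in the suggested 20% to 40% range could be substituted.

```python
def estimate_patrilines(groups: int, singletons: int, npe_rate: float = 0.30) -> float:
    """P = G + S * (1 - NPE), with NPE given as a decimal between 0 and 1."""
    if not 0.0 <= npe_rate <= 1.0:
        raise ValueError("npe_rate must be a decimal between 0 and 1")
    return groups + singletons * (1.0 - npe_rate)


# Hypothetical counts: 120 genetic families and 200 singletons.
patrilines = estimate_patrilines(groups=120, singletons=200)
print(round(patrilines))  # 120 + 200 * 0.7, i.e. about 260
```

Because the NPE rate is itself an estimate, it can be informative to run the calculation at both ends of the suggested range (0.20 and 0.40) and report the result as a band rather than a single figure.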

If tracked over time, the number may fluctuate wildly, especially for projects with low penetration rates. A moving average over several periods will smooth out the fluctuations and more clearly display trends.

Note that, in the above graph, the patrilines moving average tends to parallel (though on different scales) the match rate until a match rate of 50% is attained.
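
The smoothing mentioned above can be sketched as a simple trailing moving average. The window of three periods and the sample estimates are assumptions made for the example; the source does not specify how many periods to average or which kind of moving average was used.

```python
def moving_average(values, window=3):
    """Trailing simple moving average; yields one value per full window."""
    if window < 1:
        raise ValueError("window must be at least 1")
    averages = []
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        averages.append(sum(chunk) / window)
    return averages


# Hypothetical patrilines estimates recorded over five periods.
estimates = [240, 310, 255, 290, 270]
smoothed = moving_average(estimates, window=3)
print([round(v, 1) for v in smoothed])
```

A longer window smooths more aggressively but reacts more slowly to a genuine change in trend, so the choice depends on how often the project records its numbers.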