
The TiP (Time Predictor) Tool

This page is about the Family Tree DNA TiP (Time Predictor) tool. TiP is available to the company's customers and project administrators, but its inner workings are proprietary to FTDNA and little documentation is provided.

We have made no attempt to "reverse engineer" TiP or to peer inside its "black box". We have, though, examined the inputs and outputs, and we try to explain them here.

What is TiP?

TiP is an online algorithm for calculating the time to the most recent common ancestor (TMRCA) for ySTR matches. We believe it is the most reliable of the TMRCA calculators available. In the set of tools that FTDNA provides for genetic genealogists, we feel it is one of the most useful and valuable.

Presumably, TiP runs as an ASP.NET (.aspx) page on the FTDNA servers.

TMRCA calculators, including TiP, have two purposes:

  1. To assess the quality of a match; the higher the probability for a given number of generations, the better the match
    and
  2. To predict the probable time (in generations) of a MRCA.

TiP will calculate cumulative probabilities out to 24 generations. We feel this limit is adequate because it's roughly consistent with universal surname adoption.

We distinguish between first use of surnames (~1000 AD in much of Europe) and universality of surnames for all people, nobles and commoners alike. Surname universality seems to have occurred between 1350 and 1400 in England. Before 1350, few English commoners had surnames; by 1400 all did.
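As a rough check on the 24-generation limit, generations can be converted to calendar years. The sketch below is only illustrative: the generation lengths (25 and 30 years) and the tester's birth year (1950) are our assumptions for the example, not values used by TiP.

```python
def mrca_birth_year(generations, years_per_generation, tester_birth_year):
    """Approximate birth year of an ancestor the given number of generations back."""
    return tester_birth_year - generations * years_per_generation


# Assumed values for illustration only (not taken from TiP).
TESTER_BIRTH_YEAR = 1950
for years_per_generation in (25, 30):
    year = mrca_birth_year(24, years_per_generation, TESTER_BIRTH_YEAR)
    print(f"24 generations at {years_per_generation} years each: about {year} AD")
```

Under these assumptions, 25-year generations put the 24-generation limit near the period of surname universality in England; longer generations push it earlier.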

How is TiP different?

Unlike other TMRCA calculators, TiP uses specific mutation rates of ySTR markers. (Other TMRCA calculators use average mutation rates for marker sets or no mutation rates.) A step of genetic difference on a "fast" marker is given less weight by TiP than a step on a "slow" marker.

TiP results

TiP calculations result in reports (in the more detailed form) like this:

In comparing Y-DNA67 markers, which show 3 mismatches, the probability that (XXXXXX) __ Taylor and (YYYYYY) __ Taylor shared a common ancestor within the last...
...1 generation is 4.82%.
...2 generations is 15.93%.
...3 generations is 30.34%.
...4 generations is 45.10%.
...5 generations is 58.40%.
...6 generations is 69.44%.
...7 generations is 78.11%.
...8 generations is 84.63%.
...9 generations is 89.39%.
..10 generations is 92.79%.
..11 generations is 95.15%.
..12 generations is 96.78%.
..13 generations is 97.88%.
..14 generations is 98.61%.
..15 generations is 99.10%.
..16 generations is 99.42%.
..17 generations is 99.63%.
..18 generations is 99.76%.
..19 generations is 99.85%.
..20 generations is 99.90%.
..21 generations is 99.94%.
..22 generations is 99.96%.
..23 generations is 99.98%.
..24 generations is 99.99%.
* Assuming (XXXXXX) __ Taylor and (YYYYYY) __ Taylor do not share a common ancestor in the last 1 generation.

The default form of the report gives probabilities only for 4, 8, 12, 16, 20 & 24 generations.

Comment: Follow-up research by this pair of men revealed that the actual MRCA (Abraham Taylor, 1685-1751) lived eight generations ago, for which TiP calculated a cumulative probability of ~85%.

Caveat: The 4 significant digits (two decimal places after the percent) are misleading; in small-number cases (e.g., a pair) probabilities should not be trusted to this precision.

The default interval between generations is 4. Probabilities can also be calculated in generation intervals of 2 and 1 for a finer-grained picture. Though it's an extra step to recalculate, we recommend single-generation intervals.


Figure 1: Four typical cumulative p score curves are shown
for generations numbering up to 24.

The legend for Figure 1 labels the four curves by their 24-generation cumulative probabilities. These probabilities track genetic distance only loosely; they depend more on the mutation rates of the markers on which the differences occur.

Some characteristics

How do I invoke TiP?

TiP calculations may be performed in any of three ways:

  1. From a member's match list: Click on the orange icon to the right of the matching name.
  2. From the Y-DNA Genetic Distance report: After selecting the member to match against, click the TiP report link by another member's name.
    or
  3. From the GAP Home page, select Y-DNA TiP. Then select (from pick lists) a pair of men to compare.

Initial reports will be in 4-generation intervals. We recommend then selecting the "every generation" option and recalculating to get the detailed form shown above.

Paper trail adjustment

It is possible to apply a Bayesian adjustment to the TiP calculation; its effect is to shift the probability curve to the right. For example, if you know with certainty that the MRCA must be at least five (5) generations in the past, you can enter that number to eliminate the most recent five generations.

However, the calculation cutoff will still be 24 generations; TiP will not calculate beyond 24 generations. We therefore recommend applying this adjustment later, while working with the TiP scores. Simply add 5 to each generation: 1 becomes 6, 2 becomes 7, etc.
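The manual shift described above, and for comparison a textbook conditional-probability adjustment, can be sketched as follows. Neither function is TiP's own code. The function names are ours, and the conditioning formula is the standard P(T <= g | T > k) = (F(g) - F(k)) / (1 - F(k)), which may or may not be what TiP applies internally.

```python
def shift_generations(cumulative, known_minimum):
    """Relabel generations by adding the known minimum, as described above.

    `cumulative` maps generation number -> cumulative probability (percent).
    """
    return {gen + known_minimum: p for gen, p in cumulative.items()}


def condition_on_minimum(cumulative, known_minimum):
    """Textbook conditioning: P(MRCA <= g | MRCA > known_minimum), in percent.

    This is an assumption for comparison, not a description of TiP's internals.
    """
    p_excluded = cumulative[known_minimum]
    return {
        gen: 100.0 * (p - p_excluded) / (100.0 - p_excluded)
        for gen, p in cumulative.items()
        if gen > known_minimum
    }


# First eight cumulative values from the Taylor report above.
report = {1: 4.82, 2: 15.93, 3: 30.34, 4: 45.10,
          5: 58.40, 6: 69.44, 7: 78.11, 8: 84.63}

shifted = shift_generations(report, 5)
conditioned = condition_on_minimum(report, 5)
print("shifted, generation 6:", shifted[6])
print("conditioned, generation 8:", round(conditioned[8], 2))
```

The two approaches give different numbers, so it is worth noting which one was used when recording adjusted scores.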

Does the number of markers compared matter?

Yes and no. One should always use the highest available resolution for TiP calculations instead of lower resolutions. More markers yield greater confidence that the calculated probabilities reflect reality and they add somewhat to the precision of the probabilities.

Further, resolutions of 12 and 25 markers (for complex, technical reasons) are not adequate to yield sufficient confidence in the TiP-calculated probabilities. We do not recommend using resolutions of fewer than 37 markers.

However, the generated probability curves do not significantly differ in shape or other characteristics.

How reliable is TiP?

We believe it is very reliable, based on a fairly thorough examination of the algorithm's results. 

In summer 2012, we did a study in which we compared the cumulative 37-marker, 24-generation TiP scores for all 238 R1b project members who had at that time tested at least 37 markers. This resulted in a matrix of 57,120 pairs of scores in this form:

TiP Scores:
37 markers, 24 generations
ID    A      B      C      D
A     __     99%    5%     1%
B     99%    __     0.5%   2%
C     5%     0.5%   __     20%
D     1%     2%     20%    __

 
Figure 2: Entire distribution                          Figure 3: Right tail  

We found that these scores discriminated well between genealogically significant and insignificant matches, with a sharp delineation between high scores and low: more than 99% of scores were below p=80%, and only 6% were above p=30%. The distinction is visible in Figures 2 and 3.

Figure 2 shows the entire frequency distribution for all pairs of scores; notice that the frequency distribution peaks at p<=1% and has a long right tail. Figure 3 expands the view of the tail for p>=30%.
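The kind of threshold check behind those percentages can be sketched on the small sample matrix shown earlier. This is only an illustration of the bookkeeping, not the script used in the study; the function names are ours, and the six values are the illustrative ones from the sample matrix, not study data.

```python
# Illustrative 24-generation scores (percent) from the sample matrix above.
# Each unordered pair appears once because the matrix is symmetric.
scores = {
    ("A", "B"): 99.0, ("A", "C"): 5.0, ("A", "D"): 1.0,
    ("B", "C"): 0.5, ("B", "D"): 2.0, ("C", "D"): 20.0,
}


def pairs_at_or_above(scores, threshold):
    """Pairs whose score is at least the threshold (percent)."""
    return sorted(pair for pair, p in scores.items() if p >= threshold)


def share_above(scores, threshold):
    """Fraction of pairs whose score is strictly above the threshold."""
    return sum(1 for p in scores.values() if p > threshold) / len(scores)


print("pairs at p >= 80%:", pairs_at_or_above(scores, 80.0))
print("share above p = 30%:", round(share_above(scores, 30.0), 3))
```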

At the time of the study, there was an insufficient number of members with resolutions >37 for full assessment. However, limited sampling suggested frequency distributions would be similar for 67 & 111 markers.

We concluded that TiP is as reliable (or more) as any TMRCA calculator based solely on ySTR markers.

Since our study, TiP has undergone another revision, in Spring 2016. This latest revision has us questioning its reliability and discriminatory capability.

The Mathematical Model

TiP is based on a family of probability distributions called "gamma distributions". The number and type of mismatches are modeled by assigning two parameters which govern the shape and scale of the probability curves.
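To show what a cumulative curve from this family looks like, here is a minimal sketch that assumes an integer shape parameter (the Erlang special case of the gamma distribution), so that the cumulative probability has a closed form. The shape and scale values are invented for illustration; the values TiP actually assigns are not published.

```python
import math


def erlang_cdf(t, shape, scale):
    """Cumulative gamma probability at time t for an integer shape parameter.

    Uses the closed form 1 - exp(-x) * sum(x**n / n!) for n < shape,
    where x = t / scale.
    """
    if t <= 0:
        return 0.0
    x = t / scale
    partial_sum = sum(x**n / math.factorial(n) for n in range(shape))
    return 1.0 - math.exp(-x) * partial_sum


# Hypothetical parameters, chosen only to show the shape of the curve.
SHAPE, SCALE = 2, 2.5
for generations in (4, 8, 12, 16, 20, 24):
    print(generations, f"{100 * erlang_cdf(generations, SHAPE, SCALE):.2f}%")
```

A larger shape or scale moves the curve to the right, so the same cumulative probability is reached at a greater number of generations.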

Limitations

TiP, as with all probability estimates, only approximates the "truth". In the small numbers with which we typically deal, the approximations are not exact. Do not interpret the probabilities as being as precise as four significant digits imply.

The foundations of TiP (marker mutation rate data) are less robust than we'd prefer. There are variances (possibly large) in mutation rates between and within patrilines. While TiP can give us most likely estimates (MLEs), it cannot tell us how much confidence to put in an MLE or when and how it should be adjusted.

TiP does not account for haplotype convergence, AKA "coincidental matches". With more and better ySNP testing since 2013, we've observed that some matches (<=5%?) are between men of different R1b subclades; these pairs of men cannot have a MRCA within thousands of years. Therefore, the matches are not due to shared paternal ancestry but to something we call "convergence". (Presently, we have insufficient evidence as to whether, or to what extent, convergent matches are found in haplogroups other than R1b.)

We recommend ySNP testing as a validity check on ySTR matches. Should the testing show that a pair of men cannot share a common ancestor within genealogical time, the match is not valid.

What can I do with TiP?

The first use of TiP is to evaluate match quality. A TiP score of 95% at X generations indicates a better match than one of 70% at the same number of generations.

TiP can help focus research (by which we mean traditional, documentary genealogy) to identify the MRCA. Such research requires substantial "bets" of time, effort and sometimes money to investigate -- often intensively -- particular times and places. It helps to know which bets are most likely to pay off in positive findings.

Example

Taking the report from above, we can calculate the probability that the MRCA lived in a particular generation, as follows:


Figure 4: Example of Cumulative Probability


Figure 5: Example of per-gen Probability

Generations | Cumulative Probability | Per Gen | Per 3 Gen
1 4.82% 4.82% 4.82%
2 15.93% 11.11% 15.93%
3 30.34% 14.41% 30.34%
4 45.10% 14.76% 40.28%
5 58.40% 13.30% 42.47%
6 69.44% 11.04% 39.10%
7 78.11% 8.67% 33.01%
8 84.63% 6.52% 26.23%
9 89.39% 4.76% 19.95%
10 92.79% 3.40% 14.68%
11 95.15% 2.36% 10.52%
12 96.78% 1.63% 7.39%
13 97.88% 1.10% 5.09%
14 98.61% 0.73% 3.46%
15 99.10% 0.49% 2.32%
16 99.42% 0.32% 1.54%
17 99.63% 0.21% 1.02%
18 99.76% 0.13% 0.66%
19 99.85% 0.09% 0.43%
20 99.90% 0.05% 0.27%
21 99.94% 0.04% 0.18%
22 99.96% 0.02% 0.11%
23 99.98% 0.02% 0.08%
24 99.99% 0.01% 0.05%

What we've done above is to use the cumulative probabilities to establish a most likely estimate (MLE) for the generation of the MRCA. For each generation, the per-generation probability is its cumulative probability minus the cumulative probability for the previous generation. The "Per 3 Gen" column is the same difference taken over the three-generation window ending at that generation. These numbers are only a rough guide.
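The subtraction can be sketched in a few lines. The cumulative values are those from the Taylor report; the function names are ours, and the three-generation window is computed on the assumption that it is the cumulative probability minus the cumulative probability three generations earlier.

```python
# Cumulative probabilities (percent) for generations 1..24, from the report above.
cumulative = [
    4.82, 15.93, 30.34, 45.10, 58.40, 69.44, 78.11, 84.63,
    89.39, 92.79, 95.15, 96.78, 97.88, 98.61, 99.10, 99.42,
    99.63, 99.76, 99.85, 99.90, 99.94, 99.96, 99.98, 99.99,
]


def per_window(cumulative, width=1):
    """Probability that the MRCA falls in the `width` generations ending at each one."""
    result = []
    for i, p in enumerate(cumulative):
        # Cumulative probability `width` generations earlier (0 before generation 1).
        earlier = cumulative[i - width] if i >= width else 0.0
        result.append(round(p - earlier, 2))
    return result


per_gen = per_window(cumulative, width=1)
per_3_gen = per_window(cumulative, width=3)

# Generation numbers are 1-based; list indices are 0-based.
most_likely_generation = per_gen.index(max(per_gen)) + 1
print("most likely generation:", most_likely_generation)
print("per-generation probability there:", max(per_gen))
```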

You may notice that TiP gives a most likely estimate of 4 generations for the MRCA but the actual number is 8. Such a difference is fairly typical and points up the need to take these numbers with a grain of salt. The difference suggests that this patriline has experienced fewer than the average (expected) number of mutations.

TiP History

The TiP algorithm has not remained static over the years. We know of at least three editions, each producing somewhat different results, depending on the differences between the pair of haplotypes compared. We've named them TiP1 (before December 2012), TiP2 (Dec. 2012 to Feb. 2013) and TiP3 (since Feb. 2013); TiP3 is, we think, still the current version.

The change from TiP1 to TiP2 was intended to better handle mutations in multi-copy markers and null values. (These changes were not reflected in FTDNA's counting of genetic distance until Summer 2016.) However, anomalous results were seen in some instances, caused by bugs. The change from TiP2 to TiP3 was to fix the bugs.