DNA: the Next Step in Family History

Mayo Genealogy Group

2:00 p.m. Saturday 13 May 2017

National Museum of Country Life, Turlough Park, Castlebar, County Mayo

by Paddy Waldron

WWW version:

http://pwaldron.info/Mayo/

Facebook event:

https://www.facebook.com/events/707451316082908/

YouTube version:

As there is no 3G mobile broadband coverage in Turlough Park, I could not use my own laptop for this presentation and therefore could not record it. The Minister for museums and rural broadband has promised many times to rectify this situation at some unspecified future date.

What is Genetic Genealogy?

Where does our DNA come from?

        male offspring           female offspring
sperm   Y chromosome             X chromosome
        22 paternal autosomes    22 paternal autosomes
egg     X chromosome             X chromosome
        22 maternal autosomes    22 maternal autosomes
        mitochondria             mitochondria

Inheritance paths

Y chromosome
Only males have a Y chromosome.
The Y chromosome comes down the patrilineal line - from father, father's father, father's father's father, etc.
This is the same inheritance path as followed by surnames, grants of arms, peerages, etc.
X chromosome
Males have one X chromosome, females have two.
X DNA may come through any ancestral line that does not contain two consecutive males.
Blaine Bettinger's nice colour-coded blank fan-style pedigree charts show the ancestors from whom men and women can potentially inherit X-DNA.
Autosomes
Exactly 50% of autosomal DNA comes from the father and exactly 50% comes from the mother.
On average 25% comes from each grandparent, on average 12.5% comes from each great-grandparent, and so on.
Due to recombination (see below), one might inherit, for example, 27% from the paternal grandfather and 23% from the paternal grandmother.
Each sibling inherits 50% of each parent's autosomal DNA, but not the same 50% (except for identical twins).
Mitochondria
Everyone has mitochondrial DNA.
Mitochondrial DNA comes down the matrilineal line - from mother, mother's mother, mother's mother's mother, etc.
The surname typically changes with every generation in this line.
For genetic genealogy, beginners should start with autosomal DNA, or with Y DNA for one name studies or surname projects.
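The X chromosome rule above (no two consecutive males in the line) can be checked mechanically. The following is a minimal Python sketch rather than anything used in the talk; the function name x_lines and the encoding of each line as a string of M (father) and F (mother) steps are illustrative assumptions. The tested person counts as the first link in the line, which is why a man's father is excluded.

```python
from itertools import product

def x_lines(sex, generations):
    """List ancestral lines that can carry X-DNA to a person of the given sex.

    Each line is a string of 'M' (father) and 'F' (mother) steps, read from
    the tested person backwards. A line is kept only if the full chain,
    including the tested person, has no two consecutive males.
    """
    lines = []
    for steps in product("FM", repeat=generations):
        chain = sex + "".join(steps)
        if "MM" not in chain:
            lines.append("".join(steps))
    return lines

# Show the possible X-DNA lines one, two and three generations back.
for sex in ("M", "F"):
    for g in (1, 2, 3):
        print(sex, g, len(x_lines(sex, g)), x_lines(sex, g))
```

Under this encoding a man has 1, 2 and 3 possible X-DNA lines at one, two and three generations back, and a woman has 2, 3 and 5.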

How much DNA do we have?

Billions of letters:
                 Male                                       Female
                 Length         Width  Total                Length         Width  Total
Autosomal        2,881,033,286  2      5,762,066,572        2,881,033,286  2      5,762,066,572
X                  155,270,560  1        155,270,560          155,270,560  2        310,541,120
Y                   59,373,566  1         59,373,566                                           0
Mitochondrial           16,569  1             16,569               16,569  1             16,569
GRAND TOTAL      3,095,693,981         5,976,727,267        3,036,320,415         6,072,624,261
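The grand totals follow from multiplying each length by its width (the number of copies carried) and adding up. As a rough check, here is a short Python sketch using only the figures in the table above; the names LENGTHS, COPIES and totals are illustrative, and treating the female Y width as 0 is an assumption read off the single 0 in that row.

```python
# Lengths in letters (base pairs), copied from the table above.
LENGTHS = {
    "Autosomal": 2_881_033_286,
    "X": 155_270_560,
    "Y": 59_373_566,
    "Mitochondrial": 16_569,
}

# Number of copies ("width") carried by each sex.
# Assumption: a width of 0 for the female Y chromosome.
COPIES = {
    "Male": {"Autosomal": 2, "X": 1, "Y": 1, "Mitochondrial": 1},
    "Female": {"Autosomal": 2, "X": 2, "Y": 0, "Mitochondrial": 1},
}

def totals(sex):
    """Return (sum of lengths present, sum of length x width) for one sex."""
    copies = COPIES[sex]
    length = sum(LENGTHS[k] for k in LENGTHS if copies[k] > 0)
    total = sum(LENGTHS[k] * copies[k] for k in LENGTHS)
    return length, total

for sex in ("Male", "Female"):
    length, total = totals(sex)
    print(f"{sex}: length {length:,}  total {total:,}")
```

The results agree with the GRAND TOTAL row: a woman carries slightly more DNA in total than a man because the second X chromosome is longer than the Y chromosome.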

How much DNA do we observe?

The three major DNA companies (see below) sample different locations on the autosomes.
The locations sampled are about 0.02% of the total, but are those known to vary between individuals.
The vast majority of DNA is identical for all humans, and even for many of the apes.
The overlap between the FamilyTreeDNA set and the original AncestryDNA set was 652,462 locations (based on my personal results).

The random component of DNA inheritance

Most DNA is copied exactly from the relevant parent.

Two sources of randomness mean that one cannot always exactly infer the child's DNA from the parents' or vice versa:
Mutations are copying errors at single locations, e.g. a single A in the parent may be replaced by a C in the child.

Some locations mutate very frequently (every couple of generations), and can be used to identify individuals beyond reasonable doubt, e.g. in criminal cases.

Some locations mutate less frequently (only once in many generations or once in the history of mankind), and can be used to identify closely or distantly related individuals.

Special types of mutations:

Y-DNA Mutations

The entry-level Y-DNA37 product looks at the number of repeats for each of 37 STR markers on the Y chromosome, e.g. in the Durkin/Durkan/Durcan Surname Project.
Note that surname spellings also mutate, independently of DNA mutations.
Some SNPs on the Y chromosome are once-in-the-history-of-mankind events and can be used to build a Y-DNA Haplogroup Tree.
STRs can only predict the Y haplogroup; a SNP product must then be purchased to confirm the prediction.
Surname-specific SNPs are now being discovered.

mtDNA Mutations

There is also an mtDNA Haplogroup Tree.

Autosomal DNA Mutations

The 0.02% or so locations observed on the autosomes are known SNPs.
All observed SNPs are still weighted equally in relationship calculations, but two individuals who share a mutation observed in only 1% of the population are likely to be far more closely related than two individuals who share a mutation observed in 50% of the population.
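The point about rare versus common mutations can be put in numbers. This is a simplified sketch, not how any company actually scores matches: it assumes two unrelated people drawn independently from the population, so that the chance both carry a mutation seen in a fraction q of people is q times q. The function name is illustrative.

```python
def chance_both_carry(fraction_of_population):
    """Chance that two unrelated, independently chosen people both carry
    a mutation seen in the given fraction of the population."""
    return fraction_of_population ** 2

for q in (0.01, 0.50):
    print(f"mutation seen in {q:.0%} of people: "
          f"chance sharing by coincidence {chance_both_carry(q):.4%}")
```

Under these assumptions, two unrelated people both carry the 1% mutation only about once in 10,000 pairs, while both carry the 50% mutation in one pair out of four, so sharing the rare mutation is much stronger evidence of a real relationship.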

Recombination

The other source of randomness is recombination, which is how, e.g., the father's paternal and maternal autosomes cross over to produce the child's paternal autosomes.
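A toy simulation may help to show why the shares inherited from the two paternal grandparents are rarely exactly equal. The sketch below is heavily simplified and every number in it is hypothetical: it models a single chromosome of 1,000 locations with two crossover points, whereas real recombination acts separately on each of the 22 autosomes.

```python
import random

def recombine(paternal, maternal, crossovers, rng):
    """Build one chromosome to pass on from a parent's own two copies.

    paternal, maternal: equal-length lists (the parent's copy from their own
    father and from their own mother).
    crossovers: number of crossover points (a hypothetical parameter).
    """
    n = len(paternal)
    points = sorted(rng.sample(range(1, n), crossovers))
    sources = [paternal, maternal]
    current = rng.randrange(2)  # which copy the new chromosome starts on
    child = []
    start = 0
    for point in points + [n]:
        child.extend(sources[current][start:point])
        current = 1 - current  # switch to the other copy at each crossover
        start = point
    return child

rng = random.Random(2017)  # fixed seed so the run is repeatable
grandfather = ["GF"] * 1000
grandmother = ["GM"] * 1000
passed_on = recombine(grandfather, grandmother, crossovers=2, rng=rng)
share_gf = passed_on.count("GF") / len(passed_on)
print(f"from paternal grandfather: {share_gf:.1%}")
print(f"from paternal grandmother: {1 - share_gf:.1%}")
```

The two shares always add up to 100%, but where the crossover points fall decides how the total is split between the grandfather and the grandmother.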

The double negative approach: opposite homozygous locations and half-identical regions

False positives can occur - a long half-identical region can by chance consist of short overlapping (or zig-zagging) paternal/paternal, paternal/maternal, maternal/paternal and maternal/maternal segments.
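The double negative approach can be sketched in a few lines of Python. Two people are opposite homozygous at a location if each has two identical letters and the letters differ (for example AA against GG); a half-identical region is a run of consecutive locations with no such conflict. The function names, the tiny made-up genotypes and the min_snps threshold of 3 are all hypothetical; real tools use far longer runs and measure them in centiMorgans.

```python
def is_opposite_homozygous(g1, g2):
    """True if both genotypes are homozygous and differ, e.g. AA vs GG."""
    return g1[0] == g1[1] and g2[0] == g2[1] and g1[0] != g2[0]

def half_identical_regions(person1, person2, min_snps=3):
    """Return (start, end) index pairs of runs with no opposite homozygous location.

    person1, person2: lists of two-letter genotypes at the same ordered locations.
    min_snps: hypothetical minimum number of locations for a run to be reported.
    """
    regions = []
    start = None
    for i, (g1, g2) in enumerate(zip(person1, person2)):
        if is_opposite_homozygous(g1, g2):
            # A conflict ends the current run, if there is one.
            if start is not None and i - start >= min_snps:
                regions.append((start, i - 1))
            start = None
        elif start is None:
            start = i
    n = min(len(person1), len(person2))
    if start is not None and n - start >= min_snps:
        regions.append((start, n - 1))
    return regions

# Made-up genotypes for illustration only.
a = ["AA", "AG", "GG", "CT", "CC", "AA", "TT", "AG", "GG"]
b = ["AG", "AA", "GG", "CC", "TT", "AG", "TT", "GG", "AG"]
print(half_identical_regions(a, b))  # prints [(0, 3), (5, 8)]
```

Because the test only rules out conflicts and never confirms which parental copy matches, a run found this way can still be a false positive of the kind described above.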

Phasing and triangulation are like opposite sides of the same coin:
Rule of thumb for lengths of the longest half-identical region:
The aggregate length of all the half-identical regions above some arbitrary threshold is used to estimate the relationship:
Average autosomal DNA shared by pairs of relatives:
Rules of thumb:

What can DNA tell us?

The big 3 DNA companies

New competitors are emerging: Living DNA, MyHeritage
AncestryDNA
Part of ancestry.com
Autosomal DNA only
Very limited analysis tools
Represents only 30 countries as of October 2016 and historically has either overcharged or not accepted non-U.S. customers
Full access requires an ongoing annual subscription
Internal messaging system
Reached 1 million samples in July 2015; 2 million samples in June 2016; 3 million samples in January 2017; 4 million samples in April 2017
Most people use pseudonyms or initials and conceal their real surnames
My results
23andMe
Concentrates on medical aspects of DNA
Autosomal DNA plus predicted Y-DNA and mtDNA haplogroups
Overcharges non-U.S. customers
One-off payment
Optional internal messaging system
About 1 million samples
Most people anonymous
Analysis tools for non-anonymous matches
"We will soon transition your account to the new 23andMe experience" and "Preparation for the transition to the new 23andMe experience" (including doubling of prices) started in October 2015 and still ongoing as of May 2017
Results
Surname View
FamilyTreeDNA (FTDNA)
Dedicated to genetic genealogy
Autosomal DNA (Family Finder) plus various Y-DNA and mtDNA products
Good analysis tools
Single worldwide price (sale for U.S. Mother's Day)
One-off payment
No U.S. bias
Simple e-mail communications
873,827 records as of 12 May 2017
Most people use real names, but married women are advised to use their maiden surnames
Projects - e.g. Clare Roots and surname projects
Time for someone to start a Mayo Roots project (not to be confused with the Mayo Surname project)
You can transfer your existing National Geographic Genographic Project Y-DNA and mtDNA results to Family Tree DNA for free and upgrade to Family Finder for just USD39
My results

What do you get for your money?

The third-party sites

GEDmatch.com
DNAgedcom.com

Levels of involvement

At a minimum, please consider my advice on how to get the most out of your DNA purchase.
Others can then help you to analyse your results.
One's involvement can be at different levels:
Lists of names
A black box algorithm can be used to list the names of those in a database whose autosomal DNA is closest to yours. 
You can look at your matches' own names, their ancestral surnames, ancestral placenames and family trees, if they have made these available.
Anybody can do this.
Lengths of half-identical regions
To get full value from one's investment in DNA analysis, one should move on from the purely qualitative approach of looking at names and take a more quantitative approach.
The first step is to look at the centiMorgan lengths of the regions on which one is half-identical with a potential relative.
The higher the centiMorgan numbers, the closer the relationship is likely to be.
Some basic arithmetic skills are required for this.
Locations of half-identical regions (phasing and triangulation)
This may exercise your brain cells a little more than the first two approaches.
Raw data
Sooner or later, the only answer to a particular DNA puzzle will be to look at one's raw data, in the form of long sequences of pairs of As, Cs, Gs and Ts, in order to work out exactly how and why something happened.
This is for the specialist. There are many of us around, willing to help.
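The arithmetic behind the second level (lengths of half-identical regions) amounts to adding up the regions above a chosen threshold and comparing matches by that total. The sketch below is a minimal illustration with invented numbers: the match names, the segment lengths and the 7 cM threshold are all assumptions, and each company applies its own threshold and rules.

```python
def total_shared_cm(segment_lengths_cm, threshold_cm=7.0):
    """Return (total, longest) over the regions at or above threshold_cm.

    threshold_cm is a hypothetical cut-off chosen for illustration.
    """
    kept = [length for length in segment_lengths_cm if length >= threshold_cm]
    return sum(kept), max(kept, default=0.0)

# Hypothetical matches, not real results.
matches = {
    "Match A": [45.2, 30.1, 12.4, 8.3, 3.9],
    "Match B": [11.6, 7.9, 2.2],
}

# Rank matches from the largest total downwards.
ranked = sorted(matches, key=lambda name: total_shared_cm(matches[name])[0],
                reverse=True)
for name in ranked:
    total, longest = total_shared_cm(matches[name])
    print(f"{name}: total {total:.1f} cM, longest {longest:.1f} cM")
```

The match with the larger total is likely to be the closer relative, and the longest single region is worth noting alongside the total.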

Example I: O'Neills of Shammer, Kilmovee

Example II: A 1938 adoption: Baby Ann

Example III: Marriage Dispensation - Stenson/Durcan

Example IV: Waldron Y-DNA

Example V: My Lynch ancestor was Irish - what townland was he from?

Conclusion: Why you should submit your DNA

Further reading