DNA Basics

From FamilySearch Wiki

Revision as of 14:03, 8 July 2022

DNA


What is DNA?

Deoxyribonucleic acid (DNA) is a molecule found in nearly all human cells. It contains the information for the development and function of all living organisms. A human DNA molecule is a double helix shaped like a twisted ladder. The human genome is the complete set of human genetic information, found within each person's 23 pairs of chromosomes. In each pair, one chromosome comes from the father and one from the mother. These 23 pairs of chromosomes reside within the nucleus of the cell, and a small amount of DNA is also found in the mitochondria.

Nitrogenous Bases (ATCG)

Chromosomes look like long, twisty ladders. The longest chromosome (1) has about 249 million rungs and the smallest (21) has about 48 million. In total there are over 3 billion of these rungs in human DNA. Each rung in the ladder is a pair of nitrogenous bases drawn from four types: Adenine (A), Thymine (T), Cytosine (C), and Guanine (G). A is always paired with T, and C is always paired with G. Although two bases always sit together at each spot, only one of them needs to be recorded, because the other is simply its complement and can be worked out from the pairing rule. One side of the ladder is called the + strand and the other side is called the - strand. In a given A T pair, the A may sit on the + strand and the T on the - strand, or the reverse, and the same is true for C and G pairs. For simplicity, DNA companies generally record only the value on a person's + strand at each spot they test.
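The pairing rule above (A with T, C with G) means that the bases on one side of the ladder determine the bases on the other. The following is a minimal Python sketch of that idea. The function name and the example sequence are made up for illustration, and the sketch ignores strand direction, which real tools also account for.

```python
# Illustrative only: names and the example sequence are hypothetical.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement_strand(strand: str) -> str:
    """Return the base sitting opposite each base of the given strand."""
    return "".join(PAIRS[base] for base in strand.upper())


print(complement_strand("ATCG"))  # TAGC
```

Applying the function twice returns the original sequence, which is why recording only one side of each rung loses no information.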

Single-nucleotide polymorphisms (SNPs)

All human beings are about 99.9% identical in their genetic makeup, meaning that out of the roughly 3 billion base pairs we all have, about 99.9% are the same in all humans. The places where a common variation can occur are called SNPs, which stands for single-nucleotide polymorphisms. SNPs are what give DNA its genealogical value. When two individuals have enough matching SNPs in a row, this becomes a matching segment. The more matching SNPs there are, the bigger the segment is. If a segment is big enough (bigger than 15 centimorgans, or cM), it is almost certainly identical by descent (IBD), which means the two individuals share that segment because they both descend from a common ancestor who passed that segment of DNA on to both of them. The more matching segments there are and the bigger they are, the more closely two test takers are probably related. By testing a sample of a person's SNPs and then comparing them to everyone else in the database, it is possible to identify a person's genetic relatives. Most major companies test 500,000 to 600,000 SNPs.

In theory each SNP can be any one of the four nitrogenous bases (A, C, G, or T), but in practice only two are found at each specific spot the vast majority of the time. These two variants are called alleles. There is usually a major allele and a minor allele, with the minor allele present in at least 5% of test takers. Each person has two alleles at each spot, one inherited from their mother and one inherited from their father. This means that at each SNP tested a person can have one of three combinations: two copies of the major allele (homozygous), one copy of the major allele and one copy of the minor allele (heterozygous), or two copies of the minor allele (also homozygous). When two people share at least one allele at a spot, it is a half match. If both alleles match, it is a full match, and if neither allele matches, it is no match. Since there are only three possible combinations at each spot, many people will be either a full or half match at any given SNP by coincidence even though they are not related. In fact, somebody who is heterozygous will be at least a half match to everybody on earth at that spot. This is why hundreds to thousands of SNPs in a row must match before we can be confident that a matching segment is identical by descent and not just a coincidence.
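The half match, full match, and no match rules can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: genotypes are written as two-letter strings such as "AG", the function names and sample kits are made up, and real matching tools also use SNP counts, centimorgan thresholds, and error tolerance that are not modeled here.

```python
# Illustrative only: function names and sample data are hypothetical.
def classify_snp(genotype_a: str, genotype_b: str) -> str:
    """Compare two genotypes at one SNP: 'full', 'half', or 'none'."""
    a, b = sorted(genotype_a), sorted(genotype_b)
    if a == b:
        return "full"   # both values match (order in the file is arbitrary)
    if set(a) & set(b):
        return "half"   # at least one value is shared
    return "none"       # opposite homozygous, e.g. CC versus TT


def longest_matching_run(kit_a, kit_b) -> int:
    """Longest run of consecutive SNPs that are at least a half match."""
    best = current = 0
    for genotype_a, genotype_b in zip(kit_a, kit_b):
        if classify_snp(genotype_a, genotype_b) == "none":
            current = 0
        else:
            current += 1
            best = max(best, current)
    return best


kit_a = ["AA", "AG", "CC", "CT", "GG"]
kit_b = ["AG", "AG", "TT", "TT", "GG"]
print([classify_snp(a, b) for a, b in zip(kit_a, kit_b)])
print(longest_matching_run(kit_a, kit_b))
```

In this toy data the third SNP (CC versus TT) is the only "none", so it breaks the run. A heterozygous "AG" never produces "none" against "AA", "AG", or "GG", which is the coincidental matching the paragraph warns about.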

If you download your DNA raw data file and open it, you can see exactly which two values you have at each of the tested SNPs. The order in which the two values are listed is arbitrary, and it is impossible to know from the file alone which one came from each parent. If both values are the same (A A for example), then one A came from mom and one from dad. If your DNA is heterozygous at a certain SNP (C G for example), the only way to know which parent gave you the C and which gave you the G is by comparing against other relatives. Sorting out the paternal and maternal values is called phasing. For example, if you compare your DNA against your mom's and, at the spot where you have C G, your mom has G G, then you must have inherited the C from your dad and the G from your mom. If you are C G and your mom is also C G, then it is still unclear which value came from which parent, and comparing against your dad or another relative would be necessary to figure it out. Programs such as GedMatch.com offer the ability to phase your DNA by comparing it against one or both parents. Using phased kits reduces the number of false segments identified between you and a match and is a valuable tool for people interested in small DNA segments.
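The phasing logic in the C G example can be written out as a short Python sketch. It assumes only the mother's kit is available and that genotypes are two-letter strings. The function name is hypothetical, and this is not a description of how GedMatch or any other tool implements phasing.

```python
# Illustrative only: a one-parent phasing rule for a single SNP.
def phase_with_mother(child: str, mother: str):
    """Return (maternal_value, paternal_value), or None if undetermined."""
    first, second = child
    if first == second:
        # Homozygous child: one copy came from each parent.
        return (first, second)
    # Heterozygous child: which of the two values could mom have given?
    possible = [value for value in child if value in mother]
    if len(possible) == 1:
        maternal = possible[0]
        paternal = second if maternal == first else first
        return (maternal, paternal)
    # Mom could have given either value (or the data is inconsistent).
    return None


print(phase_with_mother("CG", "GG"))  # ('G', 'C')
print(phase_with_mother("CG", "CG"))  # None
```

The second call returns None because a C G mother could have passed on either value, so a second relative would be needed to resolve that spot.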

Autosomal DNA

Autosomal Recombination

Autosomal DNA (atDNA): This makes up about 95% of human DNA. It consists of 22 pairs of chromosomes, and in each pair one is from mom and one is from dad. The pairs are simply named 1, 2, 3, and so on up to 22. They are mostly named in order of size, with chromosome 1 the largest, chromosome 2 the second largest, and so on. The order is not perfect: chromosome 22 is actually bigger than chromosome 21, and chromosome 20 is slightly bigger than chromosome 19, for example. This is because, as our understanding of DNA improved, some chromosomes turned out to be a little bigger or smaller than initially thought.

Autosomal DNA is the most important and most commonly tested type of DNA. A child has exactly half of their autosomal DNA from each parent. However, a child will usually not have exactly 25% from each grandparent. A child could have 20% from one grandparent and 30% from another, for example. This is because each child is randomly given half of each parent's DNA. When a new sperm or egg is formed, the paired chromosomes in the parent line up and exchange information, meaning segments are randomly cut out and switch places with each other to form new chromosomes that are a mixture of the previous ones. The child is then given one chromosome from each pair and the other is discarded. This happens in both parents, so the child receives one full set of chromosomes from each parent.
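Because each generation passes on half of its autosomal DNA, the average share inherited from an ancestor halves with every generation. The small Python sketch below shows only these averages. The function name is made up, and actual shares beyond the parents vary from person to person because of the random recombination described above.

```python
# Illustrative only: average (not actual) autosomal share per ancestor.
def average_share(generations_back: int) -> float:
    """Average fraction of autosomal DNA from one ancestor that far back."""
    return 0.5 ** generations_back


labels = ["parent", "grandparent", "great-grandparent"]
for generations_back, label in enumerate(labels, start=1):
    print(f"{label}: {average_share(generations_back):.1%}")
```

This prints 50.0%, 25.0%, and 12.5%. By five generations back the average is only about 3%, which helps explain why autosomal matches become harder to interpret for distant ancestors.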

Sex Chromosomes (X and Y)

Sex Chromosomes: There are 22 pairs of autosomal chromosomes. The 23rd and final pair is the sex chromosomes, which come in two variants, X and Y. Females have two X chromosomes, and males have an X and a Y. The X chromosome is about as big as chromosome 7 and the Y is about as big as chromosome 21. In other words, the X chromosome is about 3 times bigger than the Y chromosome. Because they are so different in size, during recombination in males the two try to line up but can only recombine at the tips (in females the two X chromosomes recombine without any problems). The father gives the child either all of his X or all of his Y. If he gives his X, the child is a biological female, and if he gives his Y, the child is a biological male. Because of this rule, the X and Y chromosomes each have a unique inheritance pattern.

X-DNA: X-DNA is similar to autosomal DNA except that a man never inherits it from his father. This allows some lines to be eliminated when two people share X-DNA. For example, if a man shares X-DNA with another person, that person must be related on the man's mother's side. A female inherits X-DNA from her father, but all of that DNA comes from her father's mother (her paternal grandmother). A female will never inherit any X-DNA from her paternal grandfather.
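The X-DNA rule (everyone receives an X from their mother, but only females receive one from their father) can be turned into a short Python sketch that lists which ancestral lines could have contributed X-DNA. The function name and the path notation are made up for illustration: "M" means mother and "F" means father, so "FM" is the father's mother.

```python
# Illustrative only: enumerates possible X-DNA lines, not actual inheritance.
def x_ancestors(sex: str, generations: int) -> list[str]:
    """Paths to ancestors who could have passed X-DNA to the test taker."""
    paths = [("", sex)]
    for _ in range(generations):
        next_paths = []
        for path, person_sex in paths:
            # Everyone inherits an X chromosome from their mother.
            next_paths.append((path + "M", "female"))
            # Only females inherit an X chromosome from their father.
            if person_sex == "female":
                next_paths.append((path + "F", "male"))
        paths = next_paths
    return [path for path, _ in paths]


print(x_ancestors("male", 2))    # ['MM', 'MF']
print(x_ancestors("female", 2))  # ['MM', 'MF', 'FM']
```

For a female, the paternal grandfather ("FF") is missing from the list, matching the rule in the paragraph above. A path appearing in the list only means X-DNA from that ancestor is possible, not that any was actually inherited.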

Y-DNA: The Y chromosome is only passed down father to son and only males have a Y chromosome. The Y chromosome does not go through the recombination process so the only way it can change is through spontaneous mutations. These mutations are usually harmless and the only way you can know if you have one is by taking a Y-DNA test.

Y-DNA has several advantages for genetic genealogists. It mutates at a faster rate than mtDNA, a rate that is more useful for genealogists. By comparing the Y-DNA of two individuals it is possible to estimate how closely the two are related on the direct male line. Anyone who matches your Y-DNA test is related through both your direct paternal line and theirs. The common ancestor will always be a man who had at least two sons, one of whom you descend from and the other of whom the match descends from. Because Y-DNA changes only through mutations, it can be used to find relatives on the direct male line up to 25 generations back, whereas autosomal DNA is usually only helpful up to about 5 generations back. Because Y-DNA and surnames are usually both passed down from father to son, Y-DNA surname projects exist that try to discover how everyone with the same surname is related. Such projects can be less effective with common names like Smith, but can be extremely effective with unique surnames like Tolman.

The biggest disadvantage of Y-DNA research is that far fewer people have taken Y-DNA tests than autosomal DNA tests. The only way to overcome this obstacle is for more people to test. The other disadvantage is that Y-DNA can only reveal information about your direct male line. If you wanted to use Y-DNA to research your mother's father's surname, for example, your own Y-DNA would not work. You would need to test your maternal grandfather or another close relative on his direct male line.

Mitochondrial DNA


Mitochondrial DNA (mtDNA): Mitochondrial DNA is unique. The rest of human DNA is located in the nucleus of the cell and is divided into chromosomes. Mitochondrial DNA, on the other hand, resides in the cell's mitochondria (the part that provides most of the cell's energy), and it is connected in a circle just like bacterial DNA. It is also the smallest portion of human DNA, being only about 16,000 base pairs long. In other words, it is a tiny fraction of the size of even the smallest chromosome. These unique properties give mitochondrial DNA a special place among genetic genealogists.

One advantage of mtDNA is that, because it resides outside the nucleus, it is inherited only from the mother. In practice, the father's sperm contributes only nuclear DNA to the child, and everything outside the nucleus comes from the mother's egg. This means that mtDNA does not go through recombination, and testing your mtDNA can reveal information about your ancestors on your direct maternal line. A match on an mtDNA test must be related through your mother's direct maternal line and their mother's direct maternal line, until eventually you find a female common ancestor who had at least two daughters, one of whom is your ancestress and the other of whom is theirs.

The other major advantage of mtDNA is that it tends to survive longer than the rest of your DNA in old or degraded remains, so it is more likely to be usable for solving cold cases when the person of interest is no longer alive to test. mtDNA was key to identifying the remains of Richard III, for example.

One major disadvantage of mtDNA is that it mutates at a slow rate. A person whose mtDNA perfectly matches yours could be related through a common ancestor who lived at any time within the past 500 years. Also, the direct maternal line usually has a different surname in every generation, so surname projects are of little use.

Why use it in family history research?

DNA is a powerful tool for genealogists and can be used to solve problems and verify conclusions that cannot be resolved any other way. Autosomal DNA is likely to solve problems up to five generations back, but the fewer generations the test takers are removed from the research problem, the greater the odds of success will be. Y-DNA can be used to solve problems up to 25 generations back. Adoptees and people who don't know their parentage can also use DNA to identify their biological family. All adoptees have the right to know who their biological family is, but the adoptee only has the right to a relationship with those family members if they consent to it.

DNA Testing Risks

DNA testing can reveal information that you did not expect and that can be painful. You may learn, for example, that the man who raised you was not your biological father, that your grandfather fathered a child outside of his marriage, or that you were adopted. The view of this author (Tanner Tolman) is that DNA can only add to your family. All blood is family, but not all family is blood.

DNA companies

The companies listed below provide ethnicity results and match lists:

  • FamilyTree DNA - began in 1999
    • Website
    • Autosomal (called their Family Finder test)
    • Y-DNA (37 markers through 700 markers with paternal Haplogroup)
    • mtDNA (with maternal Haplogroup)
  • 23andMe - began in 2006
    • Website
    • Autosomal (with paternal and maternal Haplogroups)
  • GedMatch - began in 2010
    • Website
    • Autosomal. Unlike the other companies, GedMatch does not sell DNA kits. However, you can upload your DNA results from any other company
  • Ancestry DNA - began in 2012
  • MyHeritage - began in 2016
  • Living DNA - began in 2017
    • Website
    • Autosomal (with paternal and maternal Haplogroups)
  • For more information, see the Hiring a DNA Testing Company Wiki article.

Online Classes

Additional information

Facebook Groups

  • 23andMe Newbies - This is a group for those who are beginning genetic testing with 23andMe and discussing genealogy and health.
  • Africana Genetic Genealogy Consortium - This is a group for advanced-level exploration and discussion of African DNA research.
  • Ancestry DNA for Dummies - This group is to help understand Family research using DNA.
  • Autosomal DNA - This group talks about geographical and historical origins of your genetic markers.
  • DNA Detectives - From their Facebook group "The DNA Detectives group is focused on bringing together volunteers with genetic genealogy and searching experience with those seeking biological family -- adoptees, foundlings, donor-conceived individuals, unknown paternity and all other types of unknown parentage cases, near or far. This group is for members helping members and self-education."
  • DNA Tools - This group is for those interested in genetic genealogy DNA tools. Main sites discussed are DNAGedcom, Gedmatch, AncestryDNA Chrome Extension and 23++, as well as new ideas for tools.
  • International Society of Genetic Genealogy (ISOGG) - From their Facebook group "The mission of the International Society of Genetic Genealogy is to advocate for and educate about the use of genetics as a tool for genealogical research, and promote a supportive network for genetic genealogists."
  • Borland Genetics Users Group - Borland Genetics is a website you can use to help reconstruct your ancestor's DNA. This website is a place where users can collaborate with each other and ask questions to the creator.