Molecular Evolution - Final Exam

The data set you will analyze in this micro-project consists of genes encoding the cytochrome c oxidase subunit 1 protein from a range of eukaryotic species: Cytochrome c dataset

Specifically, the data set contains the following sequences (the taxonomic class of each sequence is indicated):

  • Mammals: Human, Bovine (cattle), Mouse, Rat, Seal, Whale
  • Ray-finned fishes: Carp
  • Birds: Chicken
  • Amphibians: Xenopus


Electron transport chain. Complex IV is on the right.
The enzyme cytochrome c oxidase, or Complex IV, is a large transmembrane protein complex found in bacteria and in mitochondria.

It is the last enzyme in the respiratory electron transport chain of mitochondria (or bacteria) and is located in the inner mitochondrial (or bacterial) membrane. It receives one electron from each of four cytochrome c molecules and transfers them to one oxygen molecule, converting molecular oxygen to two molecules of water. In the process, it binds four protons from the inner aqueous phase to make water. In addition, it translocates four protons across the membrane, which helps establish the transmembrane difference in proton electrochemical potential that ATP synthase then uses to synthesize ATP.
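
The stoichiometry described above can be summarized as a single overall reaction. This is a sketch of the usual textbook form, not something stated in the assignment itself. The subscripts "in" and "out" are labels introduced here for the inner aqueous phase and the opposite side of the membrane, and `\text` assumes the amsmath package:

```latex
\[
4\,\text{cyt}\,c^{2+} + 8\,\mathrm{H}^{+}_{\text{in}} + \mathrm{O}_2
\;\longrightarrow\;
4\,\text{cyt}\,c^{3+} + 2\,\mathrm{H_2O} + 4\,\mathrm{H}^{+}_{\text{out}}
\]
```

Of the eight protons taken up from the inner side, four end up in water and four are pumped across the membrane.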

The crystal structure of bovine cytochrome c oxidase in a phospholipid bilayer. The intermembrane space lies at the top of the image.

Project description

Using the methods and tools you learned in class, complete the tasks below. Report all results (including plots of trees) in a micro-report (Word/RTF/PDF document or similar), which you will hand in electronically on CampusNet. Please include your data set (in FASTA format) at the end of the micro-report. For each step, describe how you solved the problem, including the exact commands used, and justify why you did it that way.

  1. Align the sequences. Explain why you chose the particular method you did.
  2. Convert the alignment to suitable file format(s).
  3. Select an outgroup. Justify your choice.
  4. Construct rooted phylogenetic trees (rooted on your chosen outgroup) using the following five methods. If a method results in more than one optimal tree, construct a consensus tree. Include a plot of each of the five rooted trees in your report.
    • Parsimony
    • Distance-based method with an optimality criterion, using the JC (Jukes-Cantor) correction
    • Distance-based clustering method, using the JC correction
    • Maximum likelihood (determine the best model before constructing the tree)
    • Bayesian (bonus points if you can figure out how to use the same model as for maximum likelihood)
  5. Analyze selection in the data set. Which model fits the data better? (Explain how you determine this.) According to the chosen model, what fraction of sites is under positive selection? Under negative selection? What fraction evolves neutrally?
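
Two of the calculations named in the task list can be sketched in a few lines of Python. This is an illustrative sketch, not the course's prescribed workflow. The function names `jc_distance` and `lrt_df2` are hypothetical, and so are the log-likelihood values in the usage lines. The first function applies the Jukes-Cantor correction d = -(3/4) ln(1 - (4/3) p) to the proportion p of differing sites. The second performs a likelihood ratio test, assuming the two nested models differ by exactly two free parameters, so that the chi-square tail probability with 2 degrees of freedom reduces to exp(-x/2). The assignment does not state which selection models to compare, so check the parameter count for your own pair of models.

```python
import math
from typing import Tuple


def jc_distance(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor corrected distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: columns with gaps or ambiguity codes are skipped.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    if mismatches == 0:
        return 0.0
    p = mismatches / compared
    if p >= 0.75:
        # The correction is undefined at or beyond saturation.
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def lrt_df2(lnl_null: float, lnl_alt: float) -> Tuple[float, float]:
    """Likelihood ratio test for nested models differing by 2 parameters.

    Returns (test statistic, p-value). For 2 degrees of freedom the
    chi-square survival function is exactly exp(-x / 2).
    """
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return stat, math.exp(-stat / 2.0)


# Usage with made-up inputs (hypothetical values, not from the data set).
print(jc_distance("ACGTACGTAC", "ACGTACGAAC"))
print(lrt_df2(lnl_null=-1000.0, lnl_alt=-997.0))
```

A small p-value indicates that the more complex model fits significantly better than the simpler one. If your models differ by a number of parameters other than two, use a general chi-square routine instead of the closed form.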

When you are done, hand in your micro-report through the electronic system on CampusNet: on the course CampusNet page, go to assignments and, under the header "Assessments", choose "Final exam". Click "hand in" and select the file(s) you want to submit.
