We possess no pedigrees or armorial bearings; and we have to discover and trace the many diverging lines of descent in our natural genealogies, by characters of any kind which have long been inherited.
- Charles Darwin (1872)

The matrices:


As I have spent a significant portion of my career in palaeontology harvesting the literature for primarily morphological cladistic matrices of fossil taxa, I am trying to avoid duplicated effort by making them available here, e.g., for teaching or simply as a literature resource showing what is available for each group. However, the sampling is markedly skewed towards tetrapods (as this is where most of my research is focussed), so this should not be considered an unbiased or comprehensive sample.

There are, of course, huge gaps, and I always appreciate a heads-up about anything I have missed, whether a paper or a file: just email me. Similarly, if you are interested in producing a metatree of a particular group, do get in touch, as I have a pretty mature pipeline at this point that can massively reduce the effort involved compared to, for example, formal supertree approaches. There are also peculiarities to the database that mean that, without the knowledge of the compiler (i.e., me), major assumption violations are hard to avoid.


Aside from the numerous authors who have given me copies of their NEXUS/TNT files, I am indebted to Dan Cashmore, Mario Coiro, David Grossnickle, Mickey Mortimer, Ross Mounce, Spencer Hellert, Winston Wilson, and Anna Wisniewski for pointing out studies I had missed. I have also raided pre-existing online resources for some of the matrices, including TreeBASE, Cladestore and the now defunct personal page of Pete Wagner.

Key to files:

  • NEXUS - Matrix in #NEXUS format
  • TNT - Matrix in TNT/Hennig86 format
  • MPT(s) - Most parsimonious trees (in Newick format)
  • (1) - First most parsimonious tree (in Newick format)
  • SC - Strict consensus tree of the above (in Newick format)
  • (HTML) - Strict consensus tree of the above visualised in HTML using Phy2HTML
  • MRP - Matrix representation with parsimony of tree(s)
  • XML - Metadata in XML format

NB: All tree files >1 MB in size were zipped.
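
As an illustration of what the MRP files in the key above encode, here is a minimal Python sketch of matrix representation with parsimony: each clade in a tree becomes one binary character, scored "1" for taxa inside the clade, "0" for taxa elsewhere in that tree, and "?" for taxa absent from that tree. The function names are hypothetical, the parser assumes plain Newick strings with no branch lengths or internal node labels, and the all-zero outgroup row that is usually added to root the analysis is omitted for brevity. It is not the code used to build the files on this page.

```python
import re


def clades_from_newick(newick):
    """Return (taxa, clades) for a plain Newick string.

    Assumes no branch lengths and no internal node labels.
    """
    tokens = re.findall(r"[(),;]|[^(),;\s]+", newick)
    stack = [[]]  # bottom list collects every taxon in the tree
    clades = []
    for tok in tokens:
        if tok == "(":
            stack.append([])          # open a new clade
        elif tok == ")":
            done = stack.pop()        # close the clade and record it
            clades.append(frozenset(done))
            stack[-1].extend(done)    # its taxa also belong to the parent
        elif tok in ",;":
            continue
        else:
            stack[-1].append(tok)     # a taxon name
    return stack[0], clades


def mrp_matrix(newicks):
    """Encode one or more trees as an MRP matrix (dict of taxon -> string)."""
    parsed = [clades_from_newick(n) for n in newicks]
    all_taxa = sorted({t for taxa, _ in parsed for t in taxa})
    rows = {t: "" for t in all_taxa}
    for taxa, clades in parsed:
        present = set(taxa)
        for clade in clades:
            # Skip the root clade and any single-taxon clade:
            # neither carries grouping information.
            if len(clade) < 2 or clade == present:
                continue
            for t in all_taxa:
                if t in clade:
                    rows[t] += "1"
                elif t in present:
                    rows[t] += "0"
                else:
                    rows[t] += "?"   # taxon not sampled in this tree
    return rows


if __name__ == "__main__":
    matrix = mrp_matrix(["((A,B),(C,D));", "((A,B),C);"])
    for taxon, codes in matrix.items():
        print(taxon, codes)
```

Here taxon D is missing from the second tree, so it is scored "?" for the character derived from that tree.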


Why are there no character lists?

Two reasons. First, the purposes I compiled these files for didn't require them. Second, they are a lot of extra work.

Are these the exact same files used by the authors of the original paper?

No. Most ought to be very similar, but I individually check and process files, sometimes making amendments based on my personal needs. Caveat emptor. Always.

Why host these on your personal web site instead of a more formal repository like TreeBASE?

Primarily because my goals differ from those of TreeBASE and other online repositories. For example, most other repositories want to record the published tree(s) "as is", whereas I reanalyse matrices to generate the trees myself.

Where do the trees come from?

Trees are from TNT searches. For matrices of 24 or fewer taxa "ienum;" was used; for 25 or more taxa a series of 20 replicates of new technology searches was followed by "bbreak=tbr;" to find all MPTs. If more than 100,000 MPTs were found I only report the first 100,000. As I use TNT, all polymorphisms are treated as uncertainties, although it should be noted that this may not always have been the author(s)' original intent.
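
The decision rule described above can be sketched in Python roughly as follows. This is only an illustration of the logic, not the actual pipeline: the function and constant names are hypothetical, and the new technology search step is left as a descriptive placeholder because the exact TNT command is not given here. Only "ienum;" and "bbreak=tbr;" are taken from the text.

```python
EXACT_SEARCH_MAX_TAXA = 24     # "ienum;" is used up to and including this size
MAX_REPORTED_MPTS = 100_000    # only the first 100,000 MPTs are reported


def search_plan(n_taxa):
    """Return the sequence of search steps for a matrix with n_taxa taxa."""
    if n_taxa <= EXACT_SEARCH_MAX_TAXA:
        # Small matrices: exact search via implicit enumeration.
        return ["ienum;"]
    # Larger matrices: heuristic search, then branch breaking to find all MPTs.
    return [
        "<20 replicates of new technology searches>",  # placeholder, not a real command
        "bbreak=tbr;",
    ]


def reported_trees(mpts):
    """Keep only the first MAX_REPORTED_MPTS trees from a list of MPTs."""
    return mpts[:MAX_REPORTED_MPTS]


if __name__ == "__main__":
    for n in (10, 24, 25, 120):
        print(n, search_plan(n))
```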

Why are there no Bayesian trees?

I would like to add these, but I haven't gotten around to it yet. You can check out my paper with April Wright here for some of them though.

Last updated 6th April 2011.