I don't like to say bad things about paleontologists, but they're not very good scientists. They're more like stamp collectors.
- Luis Alvarez

The matrices:


Having spent a significant portion of my career in palaeontology harvesting the literature for (primarily morphological) cladistic matrices of fossil taxa, I am making them available here so that others can avoid duplicating the effort. The majority concern Mesozoic dinosaurs, as the initial list stemmed from the dinosaur supertree project. However, I am trying to broaden the phylogenetic scope and still have a large number of matrices on my hard drive that I hope will steadily appear here.

There are, of course, huge gaps. If you would like to donate a file or notify me of a paper I have missed, just email me. Similarly, if you want all of the files for a meta-analytical project, please email me and I can give you access to a Dropbox account, which will make your life easier and save my bandwidth.


Aside from the numerous authors who have given me copies of their NEXUS/TNT files, I am indebted to Dan Cashmore, Armin Elsler, Mickey Mortimer, and Ross Mounce for pointing out studies I had missed. I have also raided pre-existing online resources for some of the matrices, including TreeBASE, Cladestore and the now-defunct personal page of Pete Wagner.

Key to files:

  • NEXUS - Matrix in #NEXUS format
  • TNT - Matrix in TNT/Hennig86 format
  • MPT(s) - Most parsimonious trees (in Newick format)
  • (1) - First most parsimonious tree (in Newick format)
  • SC - Strict consensus tree of the above (in Newick format)
  • (TV) - Tree visualisation using TreeVector (Pethica et al. 2010)
  • MRP - Matrix representation with parsimony of tree(s)
  • XML - Metadata in XML format
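For readers unfamiliar with the MRP files, the following is a minimal Python sketch of the standard matrix representation with parsimony coding: each clade in a source tree becomes one binary character, scored 1 for taxa inside the clade, 0 for taxa in that tree but outside the clade, and ? for taxa absent from the tree. It is an illustration only, not the code used to build the files here. The function names are hypothetical, the parser assumes plain Newick strings without branch lengths, quotes, or internal node labels, and the all-zero outgroup row that many MRP analyses add is left out.

```python
def clades_from_newick(newick):
    """Return (taxa, clades) for a simple Newick string.

    Assumes plain taxon names: no branch lengths, quotes, comments,
    or internal node labels.
    """
    stack, clades, taxa, name = [], [], [], ""
    for ch in newick.strip().rstrip(";"):
        if ch in "(),":
            label = name.strip()
            if label:
                taxa.append(label)
                # A tip belongs to every clade that is currently open.
                for open_clade in stack:
                    open_clade.add(label)
            name = ""
            if ch == "(":
                stack.append(set())
            elif ch == ")":
                clades.append(frozenset(stack.pop()))
        else:
            name += ch
    return taxa, clades


def mrp_matrix(trees):
    """Code a list of Newick trees as an MRP matrix (taxon -> character string)."""
    parsed = [clades_from_newick(tree) for tree in trees]
    all_taxa = sorted({taxon for taxa, _ in parsed for taxon in taxa})
    rows = {taxon: "" for taxon in all_taxa}
    for taxa, clades in parsed:
        in_tree = set(taxa)
        for clade in clades:
            # Skip the root clade and any single-taxon clade: neither
            # carries grouping information.
            if len(clade) < 2 or len(clade) == len(in_tree):
                continue
            for taxon in all_taxa:
                if taxon not in in_tree:
                    rows[taxon] += "?"
                elif taxon in clade:
                    rows[taxon] += "1"
                else:
                    rows[taxon] += "0"
    return rows


if __name__ == "__main__":
    for taxon, coding in mrp_matrix(["((A,B),C);", "((A,C),D);"]).items():
        print(f"{taxon}\t{coding}")
```

Each source tree contributes one column per informative clade, so the two three-taxon trees in the example yield a two-character matrix.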

NB: All tree files >1 MB in size are zipped.


Why are there no character lists?

Two reasons. First, the purposes for which I compiled these files did not require them. Second, character lists are a lot of extra work.

Are these the exact same files used by the authors of the original paper?

In most cases, no. In theory they ought to be the same, because they are, for the most part, compiled from the information given in the paper. Inevitably, though, human error creeps in (on both sides) and differences arise.

Why host these on your personal web site instead of a more formal repository like TreeBASE?

Primarily because my goals and those of TreeBASE differ. However, I have made my files available to them and the dinosaur matrices at least are slowly being added there.

Where do the trees come from?

Trees are from TNT searches. For matrices with 24 or fewer taxa, "ienum;" was used. For matrices with 25 or more taxa, a series of 20 replicates of new technology searches was followed by "bbreak=tbr;" to find all MPTs. If more than 100,000 MPTs were found, I only report the first 100,000. As I use TNT, all polymorphisms are treated as uncertainties, although it should be noted that this may not always have been the original intent of the author(s).
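The decision rule described above can be summarised in a short Python sketch. It is a paraphrase for illustration, not the script actually used. The function and constant names are hypothetical, and the new technology search step is kept as a descriptive placeholder because its exact TNT settings are not given here. Only "ienum;" and "bbreak=tbr;" are taken from the text.

```python
EXACT_SEARCH_MAX_TAXA = 24      # "ienum;" is used at or below this taxon count
MAX_REPORTED_MPTS = 100_000     # only the first 100,000 MPTs are reported


def search_plan(n_taxa):
    """Return the ordered search steps for a matrix with n_taxa taxa."""
    if n_taxa <= EXACT_SEARCH_MAX_TAXA:
        return ["ienum;"]
    return [
        # Placeholder: the precise new technology settings are not stated.
        "<20 replicates of new technology searches>",
        "bbreak=tbr;",
    ]


def trees_to_report(mpts):
    """Keep at most the first MAX_REPORTED_MPTS most parsimonious trees."""
    return mpts[:MAX_REPORTED_MPTS]


if __name__ == "__main__":
    for n in (10, 24, 25, 80):
        print(n, search_plan(n))
```

Any tree file that contains exactly 100,000 trees should therefore be read as possibly truncated rather than as the complete set of MPTs.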

