How (and how not) to survey a systematic invertebrate paleontology collection for locality data

The problem: How to conduct a quick survey of a large, systematically arranged invertebrate paleontology collection at the Field Museum to learn how many specimens in the collection are from the Michigan Basin.

Methods: The collection is not in a database and is only partially cataloged. The only way to access data for the entire collection is through the handwritten labels stored with the specimens. An initial survey was conducted on the Devonian brachiopod systematic collection by randomly selecting three drawers from each cabinet (~15% of this collection) and recording the locality data for every specimen in each selected drawer. Although the drawers were arranged alphabetically by genus and species names, my assumption was that the locality data would be random and not correlated with these names. A second survey was then conducted of this entire collection (7069 specimen lots), and the results were compared.

Results: 45% of the specimen lots surveyed are from the Appalachian Basin, 15% are from the Michigan Basin, 15% are from the Illinois Basin, 15% are from the Iowa Basin, and the remaining 10% are from elsewhere around the world. The error in the initial partial survey was greater than anticipated. Additional surveys were then conducted virtually using a computer spreadsheet. In comparisons of surveys of equal size, randomly selecting individual specimen lots yielded results with significantly less error than sampling all specimen lots in randomly selected drawers.
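
The comparison of equal-sized surveys can be illustrated with a small simulation. The sketch below is not the spreadsheet analysis itself: it builds a purely synthetic collection in which the drawer count, lots per drawer, target share, and the `clustering` parameter are all assumed values, and it ignores the cabinet level. It then compares the mean absolute error of whole-drawer sampling against sampling the same number of individual lots.

```python
import random
from statistics import mean


def make_collection(n_drawers=200, lots_per_drawer=35, p_target=0.15,
                    clustering=0.8, seed=0):
    """Build a synthetic collection (all parameters are assumptions).

    Each drawer is a list of booleans: True means the lot comes from the
    basin of interest. With probability `clustering` a lot copies its
    drawer's dominant value, which mimics lots of one species and
    locality being stored together.
    """
    rng = random.Random(seed)
    drawers = []
    for _ in range(n_drawers):
        dominant = rng.random() < p_target
        drawer = [dominant if rng.random() < clustering
                  else rng.random() < p_target
                  for _ in range(lots_per_drawer)]
        drawers.append(drawer)
    return drawers


def share(lots):
    """Fraction of lots that come from the basin of interest."""
    return sum(lots) / len(lots)


def mean_abs_error(estimates, truth):
    """Average absolute difference between the estimates and the true share."""
    return mean(abs(e - truth) for e in estimates)


drawers = make_collection()
all_lots = [lot for drawer in drawers for lot in drawer]
truth = share(all_lots)

rng = random.Random(1)
n_drawers_sampled = 30                    # 15% of the 200 synthetic drawers
n_lots_sampled = n_drawers_sampled * 35   # same number of lots for both designs
trials = 300

drawer_estimates = []
lot_estimates = []
for _ in range(trials):
    # Design 1: record every lot in randomly chosen drawers.
    chosen = rng.sample(drawers, n_drawers_sampled)
    drawer_estimates.append(share([lot for d in chosen for lot in d]))
    # Design 2: record randomly chosen individual lots.
    lot_estimates.append(share(rng.sample(all_lots, n_lots_sampled)))

drawer_err = mean_abs_error(drawer_estimates, truth)
lot_err = mean_abs_error(lot_estimates, truth)
print(f"true share of target basin: {truth:.3f}")
print(f"mean abs. error, whole drawers:   {drawer_err:.4f}")
print(f"mean abs. error, individual lots: {lot_err:.4f}")
```

In this synthetic setup the gap between the two designs grows as `clustering` approaches 1 and should shrink toward zero as it approaches 0, which is the case the original assumption implied.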


Explanation: My assumption was false. Specimens of the same species tend to be arranged in the collection by locality, and some species are restricted to a single locality or appear that way because of collection biases. Therefore, specimens in a drawer are more likely to be from the same locality than randomly selected specimens are. A better way to conduct this survey is to randomly sample 15% of the specimen lots in every drawer.
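
The recommended design, sampling a fixed fraction of the lots in every drawer, might be sketched as follows. The data are synthetic and for illustration only: the basin weights, the drawer sizes, the 0.8 dominance probability, and the helper names `make_drawers`, `sample_every_drawer`, and `proportions` are all assumptions, not part of the survey described here.

```python
import random
from collections import Counter

BASINS = ["Appalachian", "Michigan", "Illinois", "Iowa", "Other"]
WEIGHTS = [45, 15, 15, 15, 10]  # illustrative weights for the synthetic data


def make_drawers(n_drawers=200, seed=1):
    """Synthetic drawers of unequal size, each dominated by one basin."""
    rng = random.Random(seed)
    drawers = []
    for _ in range(n_drawers):
        dominant = rng.choices(BASINS, WEIGHTS)[0]
        size = rng.randint(20, 50)
        drawer = [dominant if rng.random() < 0.8
                  else rng.choices(BASINS, WEIGHTS)[0]
                  for _ in range(size)]
        drawers.append(drawer)
    return drawers


def sample_every_drawer(drawers, fraction, rng):
    """Randomly pick about `fraction` of the lots in each drawer."""
    sampled = []
    for drawer in drawers:
        # At least one lot per drawer; rounding makes the fraction approximate.
        k = max(1, round(fraction * len(drawer)))
        sampled.extend(rng.sample(drawer, k))
    return sampled


def proportions(lots):
    """Share of lots from each basin."""
    counts = Counter(lots)
    return {basin: counts[basin] / len(lots) for basin in BASINS}


drawers = make_drawers()
all_lots = [lot for drawer in drawers for lot in drawer]
sample = sample_every_drawer(drawers, 0.15, random.Random(2))

true_share = proportions(all_lots)
est_share = proportions(sample)
for basin in BASINS:
    print(f"{basin:12s} true {true_share[basin]:.3f}  estimated {est_share[basin]:.3f}")
```

Because every drawer contributes to the sample, a drawer filled from a single locality can no longer dominate the estimate. The per-drawer rounding means small drawers are sampled at slightly different rates, so a real survey may want to weight each drawer's result by its size.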