Getting Smarter

In the years since Stucky first explored Landsat’s potential, new computer processing techniques have added to the paleontologist’s toolbox. As Anemone and Emerson examine the Great Divide Basin, they don’t just look at a few satellite images. They have taught their computers to analyze hundreds of images.

Inspired by nerve cells (neurons) in the human brain, artificial neural networks tackle complex problems by dividing them into smaller pieces. The networks consist of nodes—which receive inputs, process them, and produce outputs—and connections, which pass the results along to subsequent nodes. Not all nodes perform equally important functions, so the connections between them carry weights that assign varying priorities to the information they channel.
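A single node can be pictured as a small calculation. This sketch assumes a standard weighted-sum-plus-sigmoid design (the article does not specify the exact formula); all the numbers are illustrative.

```python
# One artificial "node": inputs arrive over weighted connections,
# the node processes them, and it passes a single output forward.

import math

def node_output(inputs, weights, bias):
    # Each connection's weight sets how much priority its input gets.
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    # A sigmoid squashes the weighted sum into an output between 0 and 1.
    return 1.0 / (1.0 + math.exp(-total))

# Illustrative numbers only.
print(node_output([0.2, 0.9], [0.5, -1.0], 0.1))
```

A network is nothing more than many of these nodes wired together, with each node's output feeding the inputs of the next layer.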

Map of potential fossil locations.
Using Landsat images analyzed by a neural network, Anemone and his colleagues created a land cover map of the Great Divide Basin. Potential fossil locations are light red, and likely locations are dark red. (Map adapted from Anemone, et al., 2011.)

The neural network assembled by Anemone and Emerson consists of input nodes, hidden nodes, and output nodes. The input nodes are the different spectral bands from Landsat ETM+; the hidden nodes handle the inner workings of the network; and the output nodes are the different types of land cover, including forests, wetlands, sand dunes, and known fossil sites.
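The layout described above can be written down as a quick sketch. Only the roles of the input and output layers come from the text; the hidden-layer size and the exact set of ETM+ bands are assumptions for illustration (the six reflective bands are a common choice).

```python
# Layer layout as described: spectral bands in, land cover classes out.
# The specific bands and the hidden-node count are assumed.
input_nodes = ["ETM+ band 1", "ETM+ band 2", "ETM+ band 3",
               "ETM+ band 4", "ETM+ band 5", "ETM+ band 7"]
hidden_nodes = 8  # the "inner workings"; the size is a modeling choice
output_nodes = ["forest", "wetland", "sand dunes", "known fossil sites"]

print(f"{len(input_nodes)} inputs -> {hidden_nodes} hidden -> {len(output_nodes)} outputs")
```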

But Anemone and Emerson don’t just want their computer network to calculate values. They want it to “learn” how to distinguish sites that could hold fossils from sites that could not. To achieve this, they used a training technique known as back-propagation.

Neural nodes typically compute results and send them forward through the network. “Training” a network means giving it examples of the outputs it should produce. The discrepancies between what the network produces and what the trainer wants are considered errors. So while computed values travel forward through the system, errors travel backward, showing the computer its mistakes. Computer “trainers” repeat this process until the network learns to compute the right results.
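The forward-then-backward cycle can be sketched as a toy training loop: values flow forward, errors flow backward, and connection weights are nudged until the outputs match the examples. The tiny data set (a stand-in for “fossil-poor” versus “fossil-rich” pixels), the layer sizes, and the learning rate are all invented for this sketch, not the team's actual setup.

```python
import math
import random

random.seed(0)

N_IN, N_HID, N_OUT = 2, 3, 1   # assumed layer sizes for this sketch

# Connection weights; the extra "+ 1" slot is a bias input fixed at 1.0.
w_ih = [[random.uniform(-1, 1) for _ in range(N_IN + 1)] for _ in range(N_HID)]
w_ho = [[random.uniform(-1, 1) for _ in range(N_HID + 1)] for _ in range(N_OUT)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x):
    # Values travel forward: inputs -> hidden nodes -> output nodes.
    xb = list(x) + [1.0]
    hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_ih]
    hb = hidden + [1.0]
    output = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in w_ho]
    return hidden, output

# Toy examples: 0 stands in for "fossil-poor", 1 for "fossil-rich".
data = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
        ([1.0, 0.0], [1.0]), ([1.0, 1.0], [0.0])]

def mean_error():
    return sum((t[0] - forward(x)[1][0]) ** 2 for x, t in data) / len(data)

rate = 0.5
before = mean_error()
for _ in range(5000):                       # repeat until the errors shrink
    for x, target in data:
        hidden, output = forward(x)
        xb = list(x) + [1.0]
        hb = hidden + [1.0]
        # Error = desired output minus produced output.
        out_delta = [(t - o) * o * (1 - o) for t, o in zip(target, output)]
        # Errors travel backward to the hidden layer.
        hid_delta = [h * (1 - h) * sum(d * w_ho[k][j] for k, d in enumerate(out_delta))
                     for j, h in enumerate(hidden)]
        # Nudge the connection weights toward the right answers.
        for k, d in enumerate(out_delta):
            for j, v in enumerate(hb):
                w_ho[k][j] += rate * d * v
        for j, d in enumerate(hid_delta):
            for i, v in enumerate(xb):
                w_ih[j][i] += rate * d * v
after = mean_error()
print(f"mean squared error: {before:.3f} before training, {after:.3f} after")
```

After enough passes, the error between produced and desired outputs shrinks—the loop-level picture behind “showing the computer its mistakes.”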

So Anemone and Emerson, working with image-analysis software, had to teach the computer to do what they wanted. The lesson plan included satellite images of the Great Divide Basin.

Like the digital family snapshots you can view on your computer, satellite images are composed of pixels arranged in columns and rows like tiny tiles. As Anemone and Emerson analyzed satellite images, they “tagged” pixels that corresponded with sites they knew to be fossil rich from earlier work in the basin. Through a process of iteration, Anemone and Emerson fed the computer fossil-rich and fossil-poor pixels and showed the computer its errors. The process required more than 200 iterations to teach the computer to separate land cover types into classes: barren, scrubland, forest, wetland, and promising fossil beds.
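The tagging step amounts to pairing each labeled pixel's band values with a class so it can serve as a training example. Everything below—coordinates, reflectance values, and which pixels are tagged—is made up for illustration; only the five class names come from the text.

```python
# Five land cover classes from the text.
CLASSES = ["barren", "scrubland", "forest", "wetland", "fossil bed"]

# A stand-in image: (row, col) -> six-band reflectance tuple (invented).
image = {(10, 12): (0.31, 0.28, 0.25, 0.40, 0.35, 0.22),
         (45, 80): (0.05, 0.07, 0.04, 0.30, 0.12, 0.06)}

# Field knowledge from earlier seasons "tags" certain pixels.
tags = {(10, 12): "fossil bed", (45, 80): "wetland"}

# Each tagged pixel becomes one training example: bands in, class out.
training_set = [(image[rc], CLASSES.index(label)) for rc, label in tags.items()]
print(training_set)
```

Feed examples like these through the network, show it its errors, and repeat—the article notes it took more than 200 such iterations before the classes separated cleanly.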

Map of likely fossil locations.

The best fossil sites are on steep slopes, where erosion exposes multiple layers of rock. (Map adapted from Anemone, et al., 2011.)

Then the moment of truth arrived: they wanted the computer network to predict which pixels were likely to contain fossils. They fed it satellite images of the Great Divide Basin that it had not analyzed before. The computer accurately identified nearly 80 percent of the pixels known to contain fossil sites.
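The “nearly 80 percent” figure is a hit rate over pixels already known to contain fossils. The ten made-up predictions below show the arithmetic behind such a figure, not the team's actual data.

```python
# Ten pixels known to contain fossil sites (invented hold-out set).
known = [1] * 10
# The network's calls on those same pixels (illustrative).
predicted = [1, 1, 1, 1, 0, 1, 1, 1, 0, 1]

hits = sum(p == k for p, k in zip(predicted, known))
print(f"identified {hits} of {len(known)} known fossil pixels "
      f"({100 * hits // len(known)}%)")
```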

Next, they fed the computer some imagery from the nearby Bison Basin to see if it could identify fossil-rich sites in a different landscape. They got a list of targets from Chris Beard of the Carnegie Museum of Natural History. The network found all three fossil sites that Beard had indicated.

But to their surprise, the network also identified a fourth site. It turned out to be a fossil bed that Beard knew from his own field work but hadn’t told Anemone and Emerson about.

Training a neural network is time consuming and sometimes tedious, so why bother? Why not just analyze the satellite images with human eyes? As Emerson points out, their research team would have to examine images containing more than 52 million pixels, each in multiple spectral bands. Doing that by eye would be about as practical as surveying the entire basin on foot.