Monkey Species Classification with Deep Learning in MATLAB

Video thumbnail Watch simulation overview
matlab projects illustration

Introduction

MATLABSolutions demonstrate classification of monkey species using deep learning in MATLAB. In this video we attempt to build a classifier that can identify the species of monkeys from its physical measurements. Six facial characteristics of monkeys are considered: frontal lip, face length, face width and face depth. The problem on hand is to identify the specie of monkey given the observed values for each of these 6 physical characteristics. The six physical characteristics will act as inputs to a neural network and the specie of monkey will be the target. Given an input, which constitutes the six observed values for the physical characteristics of a monkey, the neural network is expected to identify the specie of the monkey.

Methodology

Video recording is now ubiquitous in the study of animal behavior, but its analysis on a large scale is prohibited by the time and resources needed to manually process large volumes of data. We present a deep convolutional neural network (CNN) approach that provides a fully automated pipeline for face detection, tracking, and recognition of wild chimpanzees from long-term video records. In a 14-year dataset yielding 10 million face images from 23 individuals over 50 hours of footage, we obtained an overall accuracy of 92.5% for identity recognition and 96.2% for sex recognition. Using the identified faces, we generated co-occurrence matrices to trace changes in the social network structure of an aging population. The tools we developed enable easy processing and annotation of video datasets, including those from other species. Such automated analysis unveils the future potential of large-scale longitudinal video archives to address fundamental questions in behavior and conservation.Deep learningThe field of machine learning uses algorithms that enable computer systems to solve tasks without being hand programmed, relying instead on learning from examples. With increasing computational power and the availability of large datasets, “deep learning” techniques have been developed that have brought breakthroughs in a number of different fields, including speech recognition, natural language processing, and computer vision. Deep learning involves training computational models composed of multiple processing layers that learn representations of data with many levels of abstraction, enabling the performance of complex tasks. A particularly effective technique in computer vision is the training of deep convolutional neural network (henceforth CNN) architectures to perform fine-grained recognition of different categories of objects and animals, including automated image and video processing techniques for facial recognition in humans, outperforming humans in both speed and accuracy.