Computer Vision to Enhance Behavioral Research on Insects

Nicholas C. Manoukis and Travis C. Collier

Abstract

New or improved technologies can enable entomologists to address previously intractable questions, especially in the area of insect behavior. In this review, we describe the basic elements of applied computer vision for entomologists: image capture, data extraction, and analysis. We describe some of the currently available options in imaging hardware and cameras, lighting, and software, as well as some basic data collection scenarios, and give detailed examples from our own experience. We suggest that the study of insect behavior is increasingly based on quantification of behavioral phenomena, that use of computer vision techniques for quantification will increase, and that the application of these tools and approaches will bring new insight and answers to questions in entomology. We hope this review can serve as a starting point for those interested in delving deeper into how computer vision can be applied to their research.

Entomologists, particularly those investigating insect behavior, regularly struggle to observe the object of their study. In the laboratory, behavior will almost certainly differ from behavior in nature, while in the field it is difficult to locate individuals and to observe behaviors throughout the life cycle. These difficulties are compounded when a researcher aims to measure behaviors for quantitative analysis. In recent years, there has been increased use of cameras, computers, and image analysis to mitigate these issues and further reveal the essential behavioral patterns of insects. The goal of this article is to provide a starting point for researchers wishing to use computer vision to understand the behavior of insects quantitatively.

The field of computer vision is concerned with 'the construction of explicit, meaningful descriptions of physical objects from images' (Ballard and Brown 1982); for our purposes, we add that it is also concerned with the motion of physical objects. Computer vision has historically been linked with work in artificial intelligence and neurobiology (Biederman 1987, Huang 1996, Gupta et al. 2010, Lin et al. 2018) and shares techniques with image processing and machine vision. Image processing is the algorithmic analysis of 2D images and the transformation of one image into another (e.g., a rotation), whereas machine vision normally refers to industrial and robot-guidance applications (Steger et al. 2018). Because many of the techniques are shared, these and other related fields are sometimes synonymized. However, computer vision's goal of creating meaningful descriptions of the real world from images fits well with statistical analysis and eventual quantitative understanding of insect behavior.

Quantification has been enabled by increases in the quality and quantity of image data that can be captured with digital systems (Nakamura 2017), plus increases in the computing power available to process these data. These tools are available to entomologists today, but unfamiliarity with the technologies can restrict their application. Another important enabling trend has been the increasing use of 'commercial off-the-shelf' (COTS) systems for scientific research, which lowers the barriers to entry posed by designing and testing a custom image capture and processing pipeline (Akkaynak et al. 2014).
Finally, advances in wireless communications and low-power systems have made automated real-time or near real-time observations in the field possible in an increasing number of situations.

This review presents a largely idiosyncratic view of computer vision as applied to insect behavior research, based on our experience and examples from the literature. For a more general treatment, we refer the reader to recent reviews on how computer vision is changing the field of animal behavior (Anderson and Perona 2014). We strive to highlight the enormous potential these tools and techniques hold for entomological science and to provide useful starting points for readers selecting hardware, software, and analysis approaches. Though we do not aim to be comprehensive or exhaustive, this article should serve as a helpful guide to the principal components of computer vision in insect behavior research.

Application

The computer vision process of constructing measurements and descriptions from images can be broken down into a few steps. The first step is image capture. Fundamentally, this is a camera (video or still) converting light to images (numerical representations of scenes) to be processed by a computer. Measurements, observations, classifications, and other data are then extracted from the images or videos using a variety of techniques. Software can be used for multiple quantification tasks, including object detection and counting, location and distance measurements, motion tracking, species identification, and classification of behaviors. Finally, the data need to be analyzed in the context of a predetermined hypothesis being tested and the results interpreted.

Image Capture

Image capture choices will be driven by whether the recording of behavior is to be conducted in the laboratory or in the field (see Fig. 1 for examples). Data collected from field observations will generally be more reflective of ecologically relevant behaviors. However, field conditions can impose strong limitations on the setting and the equipment. In the lab, conditions such as lighting, temperature, and background can be controlled, and the setting provides protection from the elements, electricity to run equipment, and the convenience of having materials and tools on hand. In any setting, standard photographic considerations such as foreground/background, focus, framing, contrast, and depth of field must be taken into account. For computer vision analyses, consistency between images taken within the same session and across sessions is important.

Fig. 1. Sample camera setups for insect behavior research. (A) Dr. R. da Silva Gonçalves setting up a system in the laboratory using networked IP cameras to record the oviposition behavior of the braconid Fopius arisanus; (B) multi-camera setup to record trap catch of tephritids in semifield conditions (from Manoukis 2016); (C) NCM preparing to use stereo machine vision cameras to record mating behavior of Anopheles gambiae Giles (Diptera: Culicidae) in the field (see Butail et al. 2012 for details).

Because images and video are literally sensing light, how the subjects being recorded are lit is critically important. In some cases, natural light may suffice and is preferable because it will not unnaturally alter the behaviors being observed, but often artificial light will be required. The consistent illumination of artificial light is especially useful for computer vision systems, which commonly include steps comparing images for changes.
In addition, lights of specific spectra/colors can be used to minimize the impact on the subjects or to enhance the contrast between features of interest and the background, especially when combined with filters on the lens. In some cases, it may even be possible to use lighting that stimulates fluorescence in the subjects, matched to a lens filter. For nocturnal studies, infrared (IR) light is often employed to observe subjects while minimizing disturbance, as most insects cannot detect light in the IR spectrum (however, see Gibson 1995). Most CCD sensors can detect IR, but higher-end cameras often include an 'IR cut' filter, which may need to be removed. In a laboratory setting, lighting can be chosen to mimic specific conditions, but care must be taken to shield subjects from uncontrolled lights (e.g., daylight) and other stimuli. However, any lighting setup, even IR, may have significant effects on the behaviors being observed, so small trials testing different lighting conditions are advisable.

In addition to lighting, there are a number of other study-specific considerations when employing computer vision. The first is the length of recording required. In the case of capturing the mating behavior of a lepidopteran, relevant behaviors may occur at sunset (which might mean a relatively short recording window) or at an unknown time overnight (hours of recording may be needed). Second, the lenses and positioning of the camera or cameras must be selected to ensure that the individual insects being observed are of a resolvable size; even small insects have been successfully studied when equipment is carefully chosen (Manoukis et al. 2009, de Bruijn et al. 2018). If orientation information is needed, each insect should cover an area of dozens of pixels in the final images, while observing grooming or feeding behavior may require much higher resolution. Often trial and error with a variety of sensors and lenses is needed to establish a usable combination. Third, there is a throughput trade-off in camera hardware: resolution, bit depth (colors or grays; not all studies will require color images, which generally take more storage space), and frame rate are codependent, so an increase in one will reduce the available values of the others. Camera throughput can also be affected by the image format chosen for output. It is best for images to be 'raw' or compressed with a lossless method, because lossy compression, such as JPEG, introduces artifacts that may interfere with data extraction. However, lossy compression is sometimes unavoidable to achieve required throughput, record for long durations, or satisfy other equipment limitations. This touches on a final consideration: storage. The amount of space taken by a digital recording will be affected by both the length of time required and the amount of image data captured per unit time. Relevant capture rates might be 30 frames per second for video footage, thousands of frames per second for slow-motion capture, or as few as a few frames per hour for long-duration surveys or time lapse. A camera system that allows these to be set will be most flexible for accommodating research on varying insects and behaviors. Classes of systems and their capabilities and trade-offs are given in Table 1.

Table 1. Classes of cameras that can be used for computer vision research

| System class | Example model | Resolution (px) | Frame rate (fps) | Selectable lens? | Raw output possible? | Recording length | Field capable? | Cost |
|---|---|---|---|---|---|---|---|---|
| USB camera | Logitech HD Pro C920 | 1,080 × 1,920 | 30 | N | Y^a | +++ | N | $ |
| Security camera | Cisco 6000 series | 1,080 × 1,920 | 30 | Y | N | +++ | Y | $$ |
| Web/IP cam | YI model 87025 | 1,080 × 1,920 | 15 | N | N | +++ | N | $ |
| Action cam | GoPro Hero 6 Black | 4,096 × 2,160 | 60 | N | Y | + | Y | $ |
| Camcorder | Panasonic HC-WXF991K | 4,096 × 2,160 | 30 | N | N | ++ | Y | $$ |
| Digital SLR | Canon EOS 5D Mark IV | 4,096 × 2,160 | 30 | Y | Y | + | Y | $$$ |
| Machine vision camera | FLIR Flea3 USB3 | 1,600 × 1,200 (selectable sensors) | 15–150 | Y | Y | + | N | $$ |

These can capture video or still images, though specifics vary significantly. For the 'recording length' column, '+' indicates minutes, '++' indicates hours, and '+++' indicates possibly days for the given settings. ^a Raw output requires third-party drivers and additional effort.

In a laboratory, a COTS USB camera might be sufficient to capture usable images for analysis through a desktop or notebook computer. A more costly but flexible and higher-quality option would be a machine vision camera connected over Camera Link or GigE interfaces to a computer with a fast storage system. The sample camera options given in Table 1 illustrate trade-offs and differences between camera types.
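To make the storage consideration above concrete, the size of an uncompressed recording can be estimated directly from resolution, bit depth, and frame rate. The sketch below is illustrative only; the settings are invented, not taken from any study cited here.

```python
# Back-of-the-envelope storage estimate for an uncompressed recording.
def raw_storage_gb(width_px, height_px, bit_depth, fps, hours):
    """Approximate uncompressed recording size in gigabytes."""
    bytes_per_frame = width_px * height_px * bit_depth / 8  # one frame
    return bytes_per_frame * fps * hours * 3600 / 1e9

# Example: 1,920 x 1,080 px, 8-bit mono, 20 fps, one hour of recording.
print(f"{raw_storage_gb(1920, 1080, 8, 20, 1):.0f} GB")  # prints 149 GB
```

Lossless compression typically reduces this by perhaps half, while lossy codecs can cut it by an order of magnitude or more at the cost of artifacts.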
The USB camera is a COTS device with relatively little configurability, but it is cost-effective and can be sufficient for some problems; the FLIR camera allows selectable lenses, pixel-level filters, uncompressed output, sensor selection, and many image capture options that enable a greater set of questions to be addressed, but it is costly and complex to operate. In a field setting, options may be constrained by lack of electricity or by weather conditions. A rugged action cam can capture time-lapse images over many hours to internal storage and might be adequate, for example, to analyze the number of insects coming to a particular lure. Alternatively, for short recordings, a digital SLR could be preferable, as it allows for a wide range of lenses and higher-quality images.

If individual tracking data or quantification of positions in real-world coordinates are needed to address a research question, then another consideration is camera calibration. Briefly, the geometry of any image capture system should be measured to allow translation of image (pixel) data into measurements in the environment (e.g., mm). This can be accomplished via a set of calibration images (usually images of a checkerboard pattern taken before each data collection session) and by recording camera orientation data (position, inclination, etc.). We refer the interested reader to the excellent book by Hartley and Zisserman (2000) for a practical introduction to these issues.

Data Extraction

Once images are captured, the next logical step is to process them to obtain measurements that can be analyzed, with the key initial step being separation of subjects from the background in each image. In most situations involving entomological research, capture and processing occur as two separate steps. An important part of the processing in these cases involves some manual observation and measuring ('scoring') by humans. Having researchers look at each image or video to manually record measurements, ethograms, and so on is not what one would generally think of as computer vision. However, manual scoring is essential for gathering data to develop, train, and validate automated processing. This is especially true in research applications, which usually represent unique situations. For small-scale experiments, human processing may be entirely sufficient, but the amount of work required grows quickly with the scale of the experiment. Manual scoring also introduces issues such as variation between observers, the observer-expectancy effect, and other factors that make human observations less reliable than generally supposed (Rosenthal 1966). Human processing of images and videos in entomology is often particularly difficult because the animals tend to be small and hard to see, and there are often long periods where animals are inactive or out of frame followed by brief action occurring on a faster-than-human timescale. An automated image processing system can be scaled to deal with large amounts of data and can greatly enhance reliability.

Today, there are many automated image processing systems available, with extremely active development of new and improved systems by commercial companies, academics, and open-source communities. Because computer vision is such an active area of research, interdisciplinary collaboration to address questions in entomology via novel computer methods is often very successful (Balch et al. 2001, Spampinato et al. 2015). Often such cross-disciplinary collaboration with computer vision specialists results in custom software.
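Such custom software is typically built on vision libraries. As an example, the checkerboard calibration step described earlier in this section can be sketched with OpenCV; the pattern dimensions, square size, and image folder below are assumptions for illustration, not values from our studies.

```python
import glob
import cv2
import numpy as np

PATTERN = (9, 6)    # inner-corner count of the printed board (assumed)
SQUARE_MM = 25.0    # side length of one square in mm (assumed)

# 3D positions of the corners on the flat board (z = 0), in mm.
objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE_MM

obj_points, img_points = [], []
for path in glob.glob("calibration/*.png"):  # hypothetical image folder
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, PATTERN)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# Recover focal length, principal point, and lens distortion.
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("Reprojection RMS error (px):", rms)
```

With the camera matrix and distortion coefficients in hand, pixel coordinates can be undistorted and, given the camera pose, converted to real-world units.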
There are several excellent libraries of functions and frameworks, such as OpenCV (Bradski 2000) or MATLAB's Computer Vision System Toolbox (Mathworks 2017), that facilitate the development of custom software. For this article, we focus on the use of applications such as those listed in Table 2 rather than on custom software. Many of these applications are specifically designed for observing insect behavior and provide easy access to sophisticated methods. There are also companies offering complete systems, such as Noldus Information Technology's Track3D system.

Table 2. A sample of useful software for computer vision analysis

| Software | Free? | Features | URL |
|---|---|---|---|
| 3D Slicer | Y | Image registration and segmentation | https://www.slicer.org/ |
| Amira | N | 3D visualization and analysis | https://www.fei.com/software/amira-avizo/ |
| Aphelion | N | Image acquisition, enhancement and classification | http://www.adcis.net/en/Products/Aphelion-Imaging-Software-Suite.html |
| Bonsai | Y | Real-time video analysis | http://www.open-ephys.org/bonsai/ |
| Buridan | Y | Drosophila tracking | http://buridan.sourceforge.net/ |
| Ctrax | Y | Drosophila tracking | http://ctrax.sourceforge.net/install.html |
| Ethovision XT | N | Associated products and complete turnkey systems | https://www.noldus.com/animal-behavior-research |
| ImageNets | Y | Image processing, workflow design | http://imagenets.sourceforge.net/ |
| idTracker | Y | Insect tracking | http://www.idtracker.es/home |
| ImageJ | Y | Image processing and analysis in Java | http://imagej.nih.gov/ij/ |
| ITK | Y | Image segmentation and registration | https://itk.org/ |
| JAABA | Y | Machine learning-based system to automatically compute statistics describing video of behaving animals | http://jaaba.sourceforge.net/ |
| MeVisLab | N | Image processing and visualization | https://www.mevislab.de/ |
| ToxTrac | Y | Animal tracking program | https://sourceforge.net/projects/toxtrac/ |
| VirtualDub | Y | Video capture/processing utility | http://virtualdub.org/ |
| VTK | Y | Visualization of 3D and other computer graphics | https://www.vtk.org/ |

Figure 2 illustrates the steps involved in data extraction for the analysis of images of a tephritid fruit fly trap (a Jackson trap) from a recent study (Manoukis 2016). The goal of the study was to determine whether the insecticide in a trap acted as a repellent to Mediterranean fruit fly, Ceratitis capitata Wiedemann (Diptera: Tephritidae), by comparing resting places in baited traps with and without insecticide. Relevant to this discussion, the image capture system consisted of IP cameras capturing six images per minute of various surfaces of the traps (one camera each for internal sides 1 and 2, a third camera for the two top surfaces, and a fourth capturing the bottom). Images were then processed via background subtraction (in its simplest form, subtracting one image from the next leaves only the portions that have changed, presumably the flies).
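The study itself used ImageJ, but the same subtraction, thresholding, and size-filtering chain (the latter two steps are described next) can be sketched in a few lines of OpenCV; the file names, threshold cutoff, and minimum fly area below are hypothetical.

```python
import cv2

# Two consecutive trap images (hypothetical file names).
prev = cv2.imread("trap_t0.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("trap_t1.png", cv2.IMREAD_GRAYSCALE)

# Background subtraction: keep only what changed between frames.
diff = cv2.absdiff(curr, prev)

# Thresholding: pixels above the cutoff become white, all others black.
_, binary = cv2.threshold(diff, 40, 255, cv2.THRESH_BINARY)  # cutoff is a guess

# Count connected blobs, discarding any too small to be a fly
# (OpenCV 4 return signature for findContours).
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
MIN_AREA_PX = 30  # assumed minimum fly size, in pixels
flies = [c for c in contours if cv2.contourArea(c) >= MIN_AREA_PX]
print(f"Detected {len(flies)} flies")
```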
Next, the subtracted images were processed in ImageJ (see Table 2), first through a process known as 'thresholding': setting a level of brightness above which pixels are set to 100% brightness, while all below are dropped. This leaves a binary image with only the pixels representing flies marked as white (see Fig. 2F and G). Thresholding is the simplest form of a general process called segmentation, in which images are simplified to make them easier to analyze. Finally, minimum size and shape parameters are applied to allow the computer to 'count' the number of flies and their positions in the image (Fig. 2H).

Fig. 2. Sample raw and processed images from Manoukis (2016). Raw images of a Jackson trap baited with a TML plug and no insecticide: (A) internal, East; (B) internal, West; (C) top, West and East; (D) bottom. Processed images: (E) image from (A) after background subtraction; (F) image from (B) after background subtraction and thresholding (region of interest marked with polygon); (G) image from (C) after background subtraction and thresholding; (H) image from (D) after background subtraction, thresholding, and automated detection (outlines of detected flies are marked and numbered).

Analysis

Analysis of the captured and processed image data is the last stage in what amounts to a massive filtration of an initial dataset of pixel data. Manoukis and Jang (2013) provides a concrete illustration. Images amounted to about 5.9 × 10⁹ pixels across all replicates, which then underwent data extraction in the form of background subtraction, segmentation, and automated detection of individual insects, resulting in about 1.92 × 10⁵ insects detected. This was then further reduced to the number of insects per image, a dataset of about 184,000 points. This dataset was entered into statistical analysis software (R Core Team 2016), which was used for normalization and calculation of the mean numbers of insects for different treatments at three times of day. This final set of a few dozen numbers was analyzed via Wilcoxon rank-sum tests for differences and other statistical tests.

One important point from our own experience is that the analysis of image data can be as time-consuming and complex as acquisition and data extraction combined. This often arises from difficulty designating an experimental replicate (sequences capturing behavior are often not paired, or not independent). Tracking data, especially in 3D, can also be intractable with standard statistical approaches, requiring sophisticated or custom-designed tests (Manoukis et al. 2014) such as behavioral spectrograms (Berman et al. 2014). It is important for entomologists embarking on computer vision projects to consider exactly what sort of data they require to address their research hypothesis and to have an analysis plan ready at the outset, even if it must later be modified when it encounters real data. Computer vision systems make errors, but often the system can be designed or tuned in such a way that the bias of those errors is known and works with the experimental design. For example, a system to count the number of flies visiting a lure may undercount when there are many individuals, but this would only strengthen confidence in a result showing that one lure is more attractive than another. Statistical computing programs such as R, SAS (SAS Institute 2010), SPSS (IBM Corp 2017), and many others may be used for common methods, and more specialized methods are often available as packages for R or add-ons to commercial products.
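The final comparison step is available in most statistical environments. The studies cited here used R, but as a minimal sketch, a Wilcoxon rank-sum test on made-up per-image counts looks like this in Python:

```python
from scipy.stats import ranksums

# Hypothetical per-image fly counts for two trap treatments.
counts_treatment_a = [4, 6, 5, 7, 3, 6, 5]
counts_treatment_b = [2, 3, 1, 4, 2, 3, 2]

stat, p = ranksums(counts_treatment_a, counts_treatment_b)
print(f"Wilcoxon rank-sum statistic = {stat:.2f}, p = {p:.4f}")
```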
The BehavioralMicroarray toolbox (Dell et al. 2014) for MATLAB (Mathworks 2017) is particularly relevant. This toolbox calculates a suite of measurements to characterize the walking behavior of insects, including per-frame speed, distance to the closest other fly, turning rates, and more. Many other specialized software packages exist for particular data types; examples include 'Clocklab' (http://www.actimetrics.com/products/clocklab/) and cost-free alternatives (e.g., the circadian.org suite, https://www.circadian.org/softwar.html) for the analysis of circadian data.

A natural development is the application of machine learning (ML) methods to the analysis of image data. In fact, most 'deep learning' systems, currently applied to diverse ML tasks and believed to hold great promise, are based on convolutional neural networks developed for computer vision tasks (LeCun et al. 1998) and inspired by Hubel and Wiesel's (1962) work on the architecture of the mammalian visual cortex. ML methods generally require very large amounts of training data, which image capture is well suited to provide. The Janelia Automatic Animal Behavior Annotator (JAABA; Kabra et al. 2013) is the first tool we know of specifically designed to use ML methods to classify nonhuman animal behavior, but we expect further developments in this field and expect automatic behavior classification to become a common feature in behavior/ethogram coding software (Dell et al. 2014).

Examples of Computer Vision in Entomology

In this section, we give examples of hardware and analysis approaches that might be applied to diverse questions in entomology, from simplest to increasingly complex. The goal is to show the range of methods that might be considered computer vision in the context of entomology research. The references in this section are model studies that will be of interest to those considering the approach outlined for a particular study.

A staple of circadian behavior studies in Drosophila, the Locomotor Activity Monitor, can be considered a very low-resolution computer vision system with capture and processing steps occurring in real time. These devices, e.g., the DAM2 (TriKinetics Inc., Waltham, MA), are made up of arrays of holding tubes with IR beam-break sensors around each tube. A fly crossing the beam triggers a signal, which is recorded by the computer and interpreted as activity. Each beam-break sensor is essentially a single-pixel camera, but the very low resolution and the spatial discontinuity caused by the beams monitoring only specific portions of the tube result in an underestimation of activity compared with what is achievable using automated processing of high-resolution video (Zimmerman et al. 2008). Given the low cost of suitable cameras, readily available software (e.g., Tracker by Donelson et al. 2012), and the ability to detect briefer and smaller movements, the use of improved activity monitoring systems based on computer vision techniques is becoming more common (e.g., Dankert et al. 2009, Inan et al. 2011, Gilestro 2012). Activity monitoring is associated with circadian behavior and sleep research using Drosophila, but it can be useful in other settings, such as quality control of mass-reared tephritids used for sterile insect control releases (Dominiak et al. 2014) or assaying pesticide resistance. A novel application of computer vision-based activity monitoring is eclosion monitoring via a COTS webcam for optogenetic studies (Ruf et al. 2017).
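To illustrate the kind of processing such video-based activity monitors perform, the sketch below scores a track of per-frame positions as 'active' whenever displacement exceeds a noise floor; the track, frame rate, and threshold are invented for illustration.

```python
import numpy as np

# Hypothetical track: x, y position (px) of one fly per frame at 20 fps,
# simulated here as a random walk over one minute of video.
rng = np.random.default_rng(0)
track = np.cumsum(rng.normal(0, 1.0, size=(20 * 60, 2)), axis=0)

# Per-frame displacement; movement above a small threshold counts as activity.
step = np.linalg.norm(np.diff(track, axis=0), axis=1)
MOVE_PX = 0.5  # assumed noise floor, in pixels
active_frames = int((step > MOVE_PX).sum())
print(f"{active_frames} active frames out of {len(step)}")
```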
In addition to detecting movement as a classic activity monitor does, the higher spatiotemporal resolution of a camera/video-based system also enables measuring position, speed, and direction. For analysis restricted to a 2D plane, this requires slightly more sophisticated processing, but it is available via several of the programs mentioned above, such as Buridan and Ctrax.

Another common application of computer vision is counting, and possibly classifying, individuals. An example is detecting and counting the number of insects coming to a lure or trap. Identification and automated counting can be accomplished with COTS devices (Manoukis and Jang 2013, Manoukis 2016), such as networked IP cameras and a computer to automatically 'grab' still images at a preset interval (a minimal sketch of such interval grabbing appears at the end of this section). Alternatively, a subset of still frames can be extracted from a video sequence to perform the data extraction.

A further application of counting and classification is 'smart traps', in which insect traps in the field are equipped with sensors that allow them to automatically count and classify what they have caught in near real time. There has been significant research activity in this area, followed by rapid commercial development. More sophisticated smart traps employ an optical sensor to record the wing-beat frequency of insects entering the trap, combined with other information such as the time and prior/expected probabilities, to classify by species (Chen et al. 2014). A more traditional computer vision approach integrates a camera, processor, and communications into an otherwise standard trap, normally a sticky-panel trap. The system periodically takes an image, detects any insects caught, and classifies them. The system then transmits the information to a central processor where statistics and summary information are computed. Operators can be notified immediately if an unexpected pest is detected, and the specific images can be reviewed to check for potential false positives. This approach has been the subject of many research papers (e.g., Tirelli et al. 2011, López et al. 2012, Selby et al. 2014) and several commercial endeavors (e.g., Trapview, SnapTrap).

Using computer vision to analyze insect behavior in 3D further increases complexity. These systems generally require multiple cameras capturing images simultaneously and sophisticated algorithms for analysis. Some of the earliest 3D tracking work in entomology was by Okubo et al. (1981), who used a single-camera system and the angle of the sun to calculate 3D positions and track midges in mating swarms. Shortly thereafter, video systems with computer interfaces were introduced (e.g., Shinn and Long 1986) for 3D position analysis. Since that time, there have been many studies, especially after the turn of the 21st century, covering aspects from environmental cues (Chiron et al. 2013) to joint kinematics of walking insects (Bender et al. 2010). Comprehensive review of the many studies in this area is beyond our scope here. Our own work involved 3D tracking of Anopheles mosquitoes in mating swarms in the field (Manoukis et al. 2009, Butail et al. 2013). Subsequent studies on mosquito swarming behavior have provided valuable insight, but these have predominantly been conducted in the laboratory (Kelley and Ouellette 2013, Wilkinson et al. 2014, Jackson et al. 2015).
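Returning to the still-image counting scenario mentioned above, grabbing frames at a preset interval from a networked camera can be sketched as follows; the stream address, interval, and frame count are placeholders, not details from the cited studies.

```python
import time
import cv2

# Placeholder RTSP address for a networked IP camera.
cap = cv2.VideoCapture("rtsp://192.0.2.10/stream")
INTERVAL_S = 10  # six frames per minute, as in the trap example above

for i in range(6):
    ok, frame = cap.read()  # note: RTSP buffering may return slightly stale frames
    if ok:
        cv2.imwrite(f"trap_{i:04d}.png", frame)
    time.sleep(INTERVAL_S)
cap.release()
```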
Sample Capture, Processing, and Analysis Pipeline

In this section, we give an example from our research (submitted) to illustrate the hardware and software components and the capture, extraction, and analysis steps in the context of a complete insect behavior study. This work, conducted in 2016, focused on measuring the fine-scale behavior of Zeugodacus cucurbitae Coquillett (Diptera: Tephritidae) ('melon fly') in response to two male lures. Melon fly is an invasive pest of cucurbits, and an important tool for its detection and control is male lures such as cuelure (Beroza et al. 1960). The computer vision component of the study was designed to assess whether there were differences in the behavior of males approaching the two lures: a more direct approach and increased touching/feeding on one lure compared with the other might mean higher trapping effectiveness, as the insect would be more likely to enter the trap and be caught, compared with a lure that merely attracts flies to the general area or causes arrestment.

To quantify the approach of individual wild melon fly males to the lures, we set up two adjacent stages in the field, each with one of the lures placed in the center and a camera capturing video (Fig. 3A). We used FLIR (formerly Point Grey) 1.3 MP 'Flea' cameras (Point Grey, Richmond, BC, Canada), each connected and powered via USB3 to a single notebook computer with fast (SSD) storage. For each recording day, we recorded the location, start and end times, temperature and RH measurements at the start and end of the observation, and camera settings. For camera settings, we recorded lens aperture (f-stop) and various settings related to image capture in the software (Point Grey FlyCapture 2, ver. 2.9.3.43), including brightness, shutter/exposure time, sharpness filter, gain, frames per second, image bit depth, and compression or lack thereof. In order, our software settings on August 3rd were as follows: +0, 0.809 ms, +1,530, 0 dB, 20 fps, 8-bit mono, and MJPEG at 100% quality. We note that compression should generally be avoided, as it degrades image quality, but it was necessary in this case due to bandwidth limitations of the camera-computer interface.

Fig. 3. Image capture, data extraction, and analysis samples. (A) Two 'Flea' cameras mounted over stages with lures to attract Zeugodacus cucurbitae; (B) still from a video sequence that has been used for tracking individual flies by ctrax; (C) positional heat-map and turn histogram (inset) for the approach of a single fly to the lure at the center of the stage.

We recorded for approximately 30 min, then reversed the positions of the two stages to minimize positional effects, and recorded for another 30 min. These two sets of recordings comprised a single day of fieldwork, and 10 d of recordings were captured for the season.

Data extraction occurred in the laboratory using desktop computers. We again kept notes on the entire processing pipeline, as it contains multiple steps, and it is important to have these recorded for replicability. In the case of this experiment, we started by creating an uncompressed AVI-containerized video from the source using VirtualDub (see Table 2). A filter (a temporal smoother) was applied before extraction to reduce changes between frames. Subsequent steps relied on the ctrax fly tracking suite (Table 2; Branson and Bender 2008). With ctrax, we created smaller ufmf files using any2ufmf (part of ctrax) for ease of handling and speed.
We then ran initial tracking using the main ctrax suite, following the 'wizard'. The settings used were recorded, especially those relating to background subtraction, segmentation, and object shape parameters. The results could then be visualized in an 'annotated' video file, created with ctrax, that superimposes the tracking results on the video (Fig. 3B; see also Supp Video [online only]). After many tries, the tracking was considered as good as possible, and all the videos from the day were processed overnight using the batch facility in ctrax. Following this initial tracking, the FixErrors package from ctrax was used to manually rectify errors in the tracking. The software package documentation provides details of the many kinds of errors that can be detected, reviewed, and manually corrected with this tool. For example, one common error is the tendency of tracking software to swap the identities of individuals when they come together and then separate. The software presents segments where errors such as identity swaps might have occurred, and the user can manually correct them.

The entire image capture, data extraction, and analysis pipeline described above had to be optimized, which meant a significant amount of trial and error. Short recordings in the laboratory with dead and then living colony-reared insects were made to test resolution, frame rate, calibration, and other aspects. Because portions of the process are highly interrelated, it is important to include trial runs involving all the steps, not just collecting video. This was an iterative process, with the image capture variables the most important, as poor images cannot be made into good ones once captured. Beyond optimization, testing the accuracy of any automatically measured data is essential. In many of our studies, this takes the form of selecting a subset of images for manual processing and systematically comparing the results with the automated approach. For the study described here, we visually compared tracks for a subsample of each day's captures and quantified the number of occlusion or track-swap errors.

Once the positions/tracks of each individual had been collected, checked for errors, and tested for accuracy, the pipeline shifted to the analysis portion (Fig. 3C). In the case of this experiment, we conducted the analysis in R. This included visualizing tracks, as in the position heat-map shown in Fig. 3C, and then scaling the pixel measurements in the image (vertical and horizontal) to real-world measurements by using the calibration marks. We then calculated the distance of each individual track from the lure (image center) and the time to approach, which could be compared between the lures with a Wilcoxon rank-sum test.

Conclusion

The study of insect behavior is increasingly based on the quantification of behavioral phenomena (Anderson and Perona 2014). Qualitative descriptions are useful, but the availability of high-throughput and automated behavioral phenotyping tools, including those that might be classified as employing computer vision, makes numerical quantification possible. In this review we have presented a particular view of the use of computer vision in insect behavior research, provided details on some of the methods that might be employed, and described examples from our own work that illustrate the utility of these tools for entomological research. There are several challenges for an average entomologist who wants to use these techniques in her or his research, but none are insurmountable barriers.
First, the hardware, image capture, data extraction, and analysis must all be selected according to the problem at hand: there are currently no generic systems that can handle a wide range of computer vision tasks for research, though there are increasingly common methods, tools, and even whole systems available for particular situations. Second, the methods and language around computer vision tools and research are foreign to many entomologists, though most of the concepts are not difficult once the jargon is translated. Finally, unexpected problems might arise (e.g., the amount of time and sophistication needed for analysis) that might impede project completion. One effective way to bridge these gaps is through interdisciplinary collaboration with specialists in the fields of computer vision and image processing. We have benefited greatly from close collaborations with engineers and computer scientists, including an extremely successful research collaboration with members of an aerospace engineering department, which yielded a trip to the field by an engineer and important entomological insight from state-of-the-art tracking and analysis (Butail et al. 2012, Shishika et al. 2014).

The availability and application of computer vision to problems in insect behavior is very likely to increase in the coming years. It is our expectation that this intersection will bring new insight and answers to questions in entomology, but to accelerate this process, entomologists studying behavior must be aware of the general approaches and tools currently available. We hope that this review can serve as encouragement and a starting point for those interested in delving further into how computer vision can help address their research problems.

Acknowledgments

We thank Jana Lee for organizing the symposium and inviting one of us to present our perspective on the use of computer vision for insect behavior research. Many collaborators were essential to the work described here; our thanks to them. Opinions, findings, conclusions, or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the USDA. USDA is an equal opportunity provider and employer.

References Cited

Akkaynak, D., T. Treibitz, B. Xiao, U. A. Gürkan, J. J. Allen, U. Demirci, and R. T. Hanlon. 2014. Use of commercial off-the-shelf digital cameras for scientific data acquisition and scene-specific color calibration. J. Opt. Soc. Am. A 31: 312–321.

Anderson, D. J., and P. Perona. 2014. Toward a science of computational ethology. Neuron 84: 18–31.

Balch, T., Z. Khan, and M. Veloso. 2001. Automatically tracking and analyzing the behavior of live insect colonies, pp. 521–528. In Proceedings of the Fifth International Conference on Autonomous Agents. ACM, New York, NY.

Ballard, D. H., and C. M. Brown. 1982. Computer vision. Prentice Hall, Englewood Cliffs, NJ.

Bender, J. A., E. M. Simpson, and R. E. Ritzmann. 2010. Computer-assisted 3D kinematic analysis of all leg joints in walking insects. PLoS One 5: e13617.

Berman, G. J., D. M. Choi, W. Bialek, and J. W. Shaevitz. 2014. Mapping the stereotyped behaviour of freely moving fruit flies. J. R. Soc. Interface 11: 20140672.
Beroza, M., B. H. Alexander, L. F. Steiner, W. C. Mitchell, and D. H. Miyashita. 1960. New synthetic lures for the male melon fly. Science 131: 1044–1045.

Biederman, I. 1987. Recognition-by-components: a theory of human image understanding. Psychol. Rev. 94: 115–147.

Bradski, G. 2000. The OpenCV library. Dr. Dobb's Journal of Software Tools, San Francisco, CA.

Branson, K., and J. Bender. 2008. Ctrax: the Caltech multiple walking fly tracker. http://ctrax.sourceforge.net/

Butail, S., N. Manoukis, M. Diallo, J. M. Ribeiro, T. Lehmann, and D. A. Paley. 2012. Reconstructing the flight kinematics of swarming and mating in wild mosquitoes. J. R. Soc. Interface 9: 2624–2638.

Butail, S., N. C. Manoukis, M. Diallo, J. M. Ribeiro, and D. A. Paley. 2013. The dance of male Anopheles gambiae in wild mating swarms. J. Med. Entomol. 50: 552–559.

Chen, Y., A. Why, G. Batista, A. Mafra-Neto, and E. Keogh. 2014. Flying insect classification with inexpensive sensors. J. Insect Behav. 27: 657–677.

Chiron, G., P. Gomez-Krämer, M. Ménard, and F. Requier. 2013. 3D tracking of honeybees enhanced by environmental context, pp. 702–711. In A. Petrosino (ed.), Image Analysis and Processing – ICIAP 2013, Lecture Notes in Computer Science. Springer-Verlag, Berlin and Heidelberg, Germany.

Dankert, H., L. Wang, E. D. Hoopfer, D. J. Anderson, and P. Perona. 2009. Automated monitoring and analysis of social behavior in Drosophila. Nat. Methods 6: 297–303.

de Bruijn, J. A. C., L. E. M. Vet, M. A. Jongsma, and H. M. Smid. 2018. Automated high-throughput individual tracking system for insect behavior: applications on memory retention in parasitic wasps. J. Neurosci. Methods 309: 208–217.

Dell, A. I., J. A. Bender, K. Branson, I. D. Couzin, G. G. de Polavieja, L. P. Noldus, A. Pérez-Escudero, P. Perona, A. D. Straw, M. Wikelski, et al. 2014. Automated image-based tracking and its application in ecology. Trends Ecol. Evol. 29: 417–428.

Dominiak, B. C., B. G. Fanson, S. R. Collins, and P. W. Taylor. 2014. Automated locomotor activity monitoring as a quality control assay for mass-reared tephritid flies. Pest Manag. Sci. 70: 304–309.

Donelson, N. C., N. Donelson, E. Z. Kim, J. B. Slawson, C. G. Vecsey, R. Huber, and L. C. Griffith. 2012. High-resolution positional tracking for long-term analysis of Drosophila sleep and locomotion using the "tracker" program. PLoS One 7: e37250.

Gibson, G. 1995. A behavioural test of the sensitivity of a nocturnal mosquito, Anopheles gambiae, to dim white, red and infra-red light. Physiol. Entomol. 20: 224–228.

Gilestro, G. F. 2012. Video tracking and analysis of sleep in Drosophila melanogaster. Nat. Protoc. 7: 995–1007.
Gupta, A., A. A. Efros, and M. Hebert. 2010. Blocks world revisited: image understanding using qualitative geometry and mechanics, pp. 482–496. In K. Daniilidis, P. Maragos, and N. Paragios (eds.), Lecture Notes in Computer Science. Presented at the European Conference on Computer Vision. Springer, Berlin, Germany.

Hartley, R., and A. Zisserman. 2000. Multiple view geometry in computer vision. Cambridge University Press, Cambridge, United Kingdom.

Huang, T. 1996. Computer vision: evolution and promise (19). CERN School of Computing, Geneva, Switzerland.

Hubel, D. H., and T. N. Wiesel. 1962. Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. J. Physiol. 160: 106–154.

IBM Corp. 2017. IBM SPSS for Windows version 25.0. IBM Corp., Armonk, NY.

Inan, O. T., O. Marcu, M. E. Sanchez, S. Bhattacharya, and G. T. Kovacs. 2011. A portable system for monitoring the behavioral activity of Drosophila. J. Neurosci. Methods 202: 45–52.

Jackson, B. T., C. M. Stone, B. Ebrahimi, O. J. Briët, and W. A. Foster. 2015. A low-cost mesocosm for the study of behaviour and reproductive potential in Afrotropical mosquito (Diptera: Culicidae) vectors of malaria. Med. Vet. Entomol. 29: 104–109.

Kabra, M., A. A. Robie, M. Rivera-Alba, S. Branson, and K. Branson. 2013. JAABA: interactive machine learning for automatic annotation of animal behavior. Nat. Methods 10: 64–67.

Kelley, D. H., and N. T. Ouellette. 2013. Emergent dynamics of laboratory insect swarms. Sci. Rep. 3: 1073.

LeCun, Y., L. Bottou, Y. Bengio, and P. Haffner. 1998. Gradient-based learning applied to document recognition. Proc. IEEE 86: 2278–2324.

Lin, X., Y. Rivenson, N. T. Yardimci, M. Veli, Y. Luo, M. Jarrahi, and A. Ozcan. 2018. All-optical machine learning using diffractive deep neural networks. Science 361: 1004–1008.

López, O., M. Rach, H. Migallon, M. Malumbres, A. Bonastre, and J. Serrano. 2012. Monitoring pest insect traps by means of low-power image sensor technologies. Sensors 12: 15801–15819.

Manoukis, N. C. 2016. To catch a fly: landing and capture of Ceratitis capitata in a Jackson trap with and without an insecticide. PLoS One 11: e0149869.

Manoukis, N. C., and E. B. Jang. 2013. The diurnal rhythmicity of Bactrocera cucurbitae (Diptera: Tephritidae) attraction to cuelure: insights from an interruptable lure and computer vision. Ann. Entomol. Soc. Am. 106: 136–142.

Manoukis, N. C., A. Diabate, A. Abdoulaye, M. Diallo, A. Dao, A. S. Yaro, J. M. Ribeiro, and T. Lehmann. 2009. Structure and dynamics of male swarms of Anopheles gambiae. J. Med. Entomol. 46: 227–235.

Manoukis, N. C., S. Butail, M. Diallo, J. M. Ribeiro, and D. A. Paley. 2014. Stereoscopic video analysis of Anopheles gambiae behavior in the field: challenges and opportunities. Acta Trop. 132 (Suppl.): S80–S85.
Mathworks. 2017. MATLAB version 9.3.0.713579 (R2017b). The Mathworks, Inc., Natick, MA.

Nakamura, J. 2017. Image sensors and signal processing for digital still cameras. CRC Press, Boca Raton, FL.

Okubo, A., D. J. Bray, and H. C. Chiang. 1981. Use of shadows for studying the three-dimensional structure of insect swarms. Ann. Entomol. Soc. Am. 74: 48–50.

R Core Team. 2016. R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.

Rosenthal, R. 1966. Experimenter effects in behavioral research. Appleton-Century-Crofts, New York, NY.

Ruf, F., M. Fraunholz, K. Öchsner, J. Kaderschabek, and C. Wegener. 2017. WEclMon – a simple and robust camera-based system to monitor Drosophila eclosion under optogenetic manipulation and natural conditions. PLoS One 12: e0180238.

SAS Institute. 2010. SAS OnlineDoc 9.3. SAS Institute, Cary, NC.

Selby, R. D., S. H. Gage, and M. E. Whalon. 2014. Precise and low-cost monitoring of plum curculio (Coleoptera: Curculionidae) pest activity in pyramid traps with cameras. Environ. Entomol. 43: 421–431.

Shinn, E. A., and G. E. Long. 1986. Technique for 3-D analysis of Cheumatopsyche pettiti (Trichoptera: Hydropsychidae) swarms. Environ. Entomol. 15: 355–359.

Shishika, D., N. C. Manoukis, S. Butail, and D. A. Paley. 2014. Male motion coordination in anopheline mating swarms. Sci. Rep. 4: 6318.

Spampinato, C., G. M. Farinella, B. Boom, V. Mezaris, M. Betke, and R. B. Fisher. 2015. Special issue on animal and insect behaviour understanding in image sequences. EURASIP J. Image Video Process. 2015: 1.

Steger, C., M. Ulrich, and C. Wiedemann. 2018. Machine vision algorithms and applications. Wiley-VCH, Weinheim, Germany.

Tirelli, P., N. A. Borghese, F. Pedersini, G. Galassi, and R. Oberti. 2011. Automatic monitoring of pest insects traps by Zigbee-based wireless networking of image sensors, pp. 1–5. In Instrumentation and Measurement Technology Conference (I2MTC) 2011. IEEE, Piscataway, NJ.

Wilkinson, D. A., C. Lebon, T. Wood, G. Rosser, and L. C. Gouagna. 2014. Straightforward multi-object video tracking for quantification of mosquito flight activity. J. Insect Physiol. 71: 114–121.

Zimmerman, J. E., D. M. Raizen, M. H. Maycock, G. Maislin, and A. I. Pack. 2008. A video method to study Drosophila sleep. Sleep 31: 1587–1598.
This work is written by (a) US Government employee(s) and is in the public domain in the US. Published by Oxford University Press on behalf of the Entomological Society of America, 2019.

Annals of the Entomological Society of America 112(3): 227–235. doi:10.1093/aesa/say062