Evolving Feature Extraction Algorithms for Image Analysis
GENIE is a software system for rapidly evolving feature extraction algorithms for image analysis. With current sensor platforms collecting a flood of high-quality data, automatic feature extraction (AFE) has become a key to enabling human analysts to keep up with the flow. GENIE uses a genetic programming approach to produce AFE tools for broad-area features in multispectral, hyperspectral, panchromatic, and multi-instrument imagery. Both spectral and spatial signatures of features are discovered and exploited.
Genie Pro is a complete rewrite of the GENIE software for Windows workstations, incorporating faster algorithms, support for many more standard image formats ( including image file sizes > gigabyte), and vectorization tools (including export of results to vector ESRI Shape files). Genie Pro is being released under a commercial license and test and evaluation copies of the software are available.
GENIE was invented for the Rapid Feature Identification Project (RFIP), a project of Los Alamos National Laboratory's ISR Division.
Background
Extraction of features from large and possibly multi-instrument
imagery data sets is a crucial task facing many communities of
researchers and analysts. With new distribution technologies and data
formats making storage and dissemination of huge amounts of data
cheaper and easier, the bottle-neck to successful exploitation of this
flood of raw information rests on the availability of analysis tools.
From change detection for broad-area environmental monitoring, to
terrain catergorization for cartographers, development of
image-processing tools for novel datasets is an expensive business,
often requiring a significant investment of time by highly skilled
scientists, analysts, and programmers. With the arrival of
multi-spectral sensors platforms such as Landsat and high-resolution
imaging sysensors such as IKONOS, the analyst can now search for
spectral, spatial, and possibly hybrid spatio-spectral signatures,
requiring development of whole new tool-kits. Our own work in the
field of remote sensing science has led us to seek an accelerated
toolmaker. Since creating and developing individual algorithms is so
important and yet so expensive, we have begun investigating machine
learning approaches to this problem.
Over the last three decades, ideas taken from the theory of evolution in natural systems have inspired the development of a group of powerful yet flexible optimization methods known collectively as evolutionary computation (EC). The modern synthesis derives from work performed in the 60s and 70s by researchers such as Holland [1], Rechenberg [2], and Fogel et al [3]. While the various schools founded by these pioneers have differences, their approaches share the common themes of optimization performed by a competing population of individuals in which a process of selection and reproduction with modification is occurring. The beauty of EC is its flexibility: if we can derive a fitness measure for a problem, then the problem might be solved using EC. Many different problems from different domains have been successfully tackled using EC, including: optimization of dynamic routing in telecommunications networks[4]; designing finite-impulse-response digital filters [5]; designing protein sequences with desired structures [6]; and many others.
A crucial issue when using EC is how to represent candidate solutions so that they can be manipulated by EC effectively. We wish to evolve individuals that represent possible image processing algorithms, and so we use a system based upon genetic programming [7]. Genetic programming (GP) is essentially a framework for developing executable programs using EC methods. GP has been the subject of a huge amount of research this decade and has been applied to a wide range of applications, from circuit design [7], to share price prediction [8]. With particular relevance to the current project, GP has also been applied to image-processing problems, including: edge detection [9]; face recognition [10]; image segmentation [11]; image compression [12]; and feature extraction in remote sensing images [13-21].
GENIE
GENIE [14-21] is an evolutionary computation (EC) software system,
using a genetic algorithm (GA) to assemble image-processing tools from
a collection of low-level image operators (e.g., edge detectors,
texture measures, spectral operations, various morphological
filters). Each candidate tool generates a number of feature planes,
which are then combined using a supervised classifier (Fisher linear
discriminant) to generate a final boolean feature mask. A population
of candidate tools is generated, ranked according to a fitness metric
measuring their performance on some user-provided training data, and
fit members of the population permitted to reproduce. Several standard
fitness metrics have been implemented, including Euclidean distance
and Hamming distance. The process cycles until the population
converges to a solution, or the user decides to accept the current
best solution. The user is also able to modify the training data as
Genie reports its initial results, to help refine the search. The
burden of low-level programming is thus shifted to the genetic
algorithm, leaving the analyst free to concentrate on the critical
task of making judgements. GENIE may choose ignore the spatial
information in the image and rely wholly on spectral operations and
the supervised classifier module, but in practice GENIE will construct
integrated spatio-spectral algorithms. A comparison of GENIE to
standard supervized classifier algorithms of remote sensing has
appeared in [17].
As shown in the above Figure, training data replaces detailed programming in a machine learning system. GENIE requires a limited amount of analyst-supplied training data, provided via a point-and-click interface. For the water-finding task shown (top left), example true pixels are painted green, and example false pixels are painted red. GENIE used this to evolve the mask shown (top right: red/green boolean mask overlaid on a greyscale reference image). The user is able to influence the evolution of algorithms by providing additional information, and by interactively providing additional training data. Previous results can be reused and built upon. The lower panels show a beach-finding task, which used the evolved waterfinder's result as part of its algorithm. Output from existing algorithms (e.g., road or building finders) can also be incorporated.
GENIE can derive multiple features for the same scene to produce terrain classifications [18]. GENIE has been applied to DOE/NNSA Multispectral Thermal Imager, Landsat 7 ETM+, Daedalus, AVIRIS, Hyperion, IKONOS,standard color/infrared aerial photography, Digital Elevation Model, and other datasets. GENIE is currently being tested for use with high-resolution panchromatic imagery from NASA planetary missions [19], and with hyperspectral imagery from modern biomedical instruments.
GENIE's system architecture has been designed to provide a flexible and powerful computing paradigm. GENIE can search a rich and complex feature space using its gene pool of standard primitive image processing operators and the results of additional analyst-selected algorithms. The system employs both spectral and spatial image analysis techniques in combination, and can in principal simultaneously exploit data from different sensors (e.g., optical imagery plus multi-spectral imagery plus altimeter data or digital elevation models). The ability to combine diverse datasets requires that the data be co-registered, which requires use of some other package (e.g., commercial software packages such as Erdas Imagine or ITT ENVI/IDL). The code is written in a combination of Perl (for the GA), Java (for the graphical user interface, or GUI), and IDL, and augmented by our own C libraries for fitness evaluation of candidate tools. The code was developed in an Intel/Linux environment.
Genie Pro is a complete rewrite of the GENIE software for Windows workstations, written entirely in C++, using the Qt toolkit for its GUI and GDAL library for file i/o. Genie Pro incorporates faster algorithms, support for many more standard image formats ( including image file sizes > gigabyte), and vectorization tools (including export of results to vector ESRI Shape files). Genie Pro is being released under a commercial license and test and evaluation copies of the software are available.
The prototype system typically requires a few hours to evolve a high-fitness image-processing algorithm running on a single, fast Linux/Intel workstation. The GENIE system is parallelizeable and scalable, and we are developing a version of our system for a cluster of 10's to 100's of commercial off-the-shelf Linux workstations.
References
1. J. H. Holland, Adaptation in Natural and Artificial Systems,
University of Michigan, Ann Arbor, 1975.
2. I. Rechenberg, Evolutionsstrategie: Optimierung technischer Systeme
nach Prinzipien der biologischen Evolution, Fromman-Holzboog, Stuttgart,
1973.
3. L. Fogel, A. Owens and M. Walsh, Artificial Intelligence through
Simulated Evolution, Wiley, New York, 1966.
4. L. A. Cox Jr., L. Davis, and Y. Qiu,, "Dynamic anticipatory routing
in circuit-switched telecommunications networks," in Handbook of Genetic
Algorithms, L. Davis, ed., pp. 124-143, Van Nostrand Reinhold, New York,
1991.
5. J. D. Schaffer and L. J. Eshelman, "Designing multiplierless digital
filters using genetic algorithms," in Proceedings of the Fifth
International Conference on Genetic Algorithms, S. Forrest, ed., pp.
439-444, Morgan Kaufmann, San Mateo, 1993.
6. T. Dandekar and P. Argos, "Potential of genetic algorithms in
protein folding and protein engineering simulations," Protein Engineering
5(7), pp. 637-645, 1992.
7. J. R. Koza, Genetic Programming: On the Programming of Computers by
Natural Selection, MIT, Cambridge, 1992.
8. G. Robinson and P. McIlroy, "Exploring some commercial applications of
genetic programming," in Evolutionary Computing, Volume 993 of Lecture
Notes in Computer Science, T.C. Fogarty, ed., Springer-Verlag, Berlin,
1995.
9. C. Harris and B. Buxton, "Evolving edge detectors", Research Note
RN/96/3, University College London, Dept. of Computer Science, London,
1996.
10. A. Teller and M. Veloso, "A controlled experiment: Evolution for
learning difficult image classification" in 7th Portuguese Conference on
Artificial Intelligence, Volume 990 of Lecture Notes in Computer Science,
Springer-Verlag, Berlin, 1995.
11. R. Poli and S. Cagoni, "Genetic programming with user-driven
selection: Experiments on the evolution of algorithms for image
enhancement," in Genetic Programming 1997: Proceedings of the 2nd Annual
Conference, J. R. Koza, et al., editors, Morgan Kaufmann, San Francisco
1997.
12. P. Nordin, and W. Banzhaf, "Programmatic compression of images and
sound," in Genetic Programming 1997: Proceedings of the 2nd Annual
Conference, J. R. Koza, et al., editors,, Morgan Kaufmann, San Francisco,
1996.
13. J. M. Daida, J. D. Hommes, T. F. Bersano-Begey, S. J. Ross, and J. F.
Vesecky, "Algorithm discovery using the genetic programming paradigm:
Extracting low-contrast curvilinear features from SAR images of arctic
ice," in Advances in Genetic Programming 2, P. J. Angeline and K. E.
Kinnear, Jr., editors, chap. 21, MIT, Cambridge, 1996.
14. J. Theiler, N. R. Harvey, S. P. Brumby, J. J. Szymanski, S. Alferink,
S. Perkins, R. Porter, and J. J. Bloch, "Evolving Retrieval Algorithms
with a Genetic Programming Scheme", 1999, Proc. SPIE 3753, pp. 416-425, 1999.
15. S. P. Brumby, J. Theiler, S. J. Perkins, N. R. Harvey, J.J.
Szymanski, J. J. Bloch, and M. Mitchell, "Investigation of Feature
Extraction by a Genetic Algorithm", Proc. SPIE 3812, pp 24-31, 1999.
16. S. Perkins, J. Theiler, S. P. Brumby, N. R. Harvey, R. B. Porter,
J. J. Szymanski, and J. J. Bloch, "GENIE - A Hybrid Genetic Algorithm
for Feature Classification in Multi-Spectral Images", 2000, Proc. SPIE
4120, pp.52-62, 2000.
17. N. R. Harvey, J. Theiler, S. P. Brumby, S. Perkins, J. J.
Szymanski, J. J. Bloch, R. B. Porter, M. Galassi, and A. C. Young,
"Comparison of GENIE and conventional supervised classifiers for
multispectral image feature extraction," IEEE Transactions on
Geoscience and Remote Sensing, vol. 40, pp. 393-404, Feb. 2002.
18. J. J. Szymanski, S. P. Brumby, P. Pope, D. Eads,
D. Esch-Mosher, M. Galassi, N. R. Harvey, H. D. W. McCulloch,
S. J. Perkins, R. Porter, J. Theiler, A. C. Young, J. J. Bloch and
N. David, "Feature Extraction from Multiple Data Sources Using Genetic
Programming", Proc. SPIE 4725, 2002, in press.
19. C. S. Plesko, S. P. Brumby and C. Leovy, "Automatic Feature
Extraction for Panchromatic Mars Global Surveyor Mars Orbiter Camera
Imagery," Proc. SPIE 4480, pp. 139-146, 2002.
20. K. Lewis Hirsch, S. P. Brumby, N. R. Harvey, and A. B. Davis,
"The MTI Dense-Cloud Mask Algorithm Compared to a Cloud Mask Evolved
by a Genetic Algorithm and to the MODIS Cloud Mask", 2000, Proc. SPIE
4132, pp. 230-237, 2000.
21. S. P. Brumby and A. E. Galbraith, "Evolving spatio-spectral
feature extraction algorithms for hyperspectral imagery", Proc. SPIE
4816, 2002, in press.
![]() |
Operated by Los Alamos National Security, LLC for
the U.S. Department of Energy's NNSA. © Copyright 2011 Los Alamos National Security, LLC All rights reserved | Disclaimer/Privacy |