A new generation of airborne sensors allows very large images (60 megapixels to 1 gigapixel) to be captured at 2 frames per second. At these resolutions, an image covering an entire city can be captured at once from an unmanned aerial vehicle (UAV), while still permitting the detection of vehicles and, for sensors under development, people. This capability, coupled with the availability of increased computational power, has led to the growth of Wide Area Airborne Surveillance (WAAS). This type of imagery enables applications in several domains, such as urban planning, security, and geospatial digital libraries.

Making sense of this torrent of image data requires a new paradigm to replace simple display for human observers. This talk addresses the steps involved: a suite of algorithms that automatically processes the imagery and turns it into a more useful, informative form. The algorithms operate at increasing semantic levels, from low to high.

WAAS data is captured by an array of cameras, so at the lowest level the individual camera images must be combined into a high-quality mosaic. Due to vibrations and other mechanical issues, this registration must be redone at every frame.

The next level of processing is tracking, i.e., estimating the trajectories of all moving objects. We propose a tracking algorithm that optimally infers short tracks, or tracklets, using Bayesian networks. These tracklets are then integrated into a multi-object tracking algorithm that achieves good performance.

WAAS imagery is often collected over urban areas, where buildings and other structures produce parallax effects. These effects can be exploited to generate 3D models from the collected imagery. With the 3D information, occluded tracks can be detected and repaired.

Enabling large-scale semantic analysis of WAAS data requires higher-level algorithms that determine at least some of the scene's semantics. Expecting full semantic descriptions is unrealistic, but the task can be made easier for a human operator by automatically detecting common, or primitive, events and activities. We propose a framework based on the Entity Relationship Model that detects a large variety of activities in real data as well as in GPS tracks.

Host: Frank Alexander, fja@lanl.gov, 665-4518. Information Science and Technology Center (ISTC)
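The abstract outlines a multi-stage pipeline; the sketches below illustrate the flavor of each stage and are not the speaker's actual methods, which the abstract does not specify in detail. First, the mosaicking stage: a minimal sketch assuming pairwise image registration with ORB features and a RANSAC-estimated homography via OpenCV (both are assumptions; a real camera-array system would likely also exploit calibrated inter-camera geometry).

```python
import cv2
import numpy as np

def mosaic_pair(base, tile):
    """Warp `tile` into `base`'s coordinate frame with a RANSAC homography.

    Both images are assumed to be 8-bit grayscale numpy arrays with
    overlapping fields of view (a hypothetical adjacent camera pair).
    """
    orb = cv2.ORB_create(4000)
    kp_b, des_b = orb.detectAndCompute(base, None)
    kp_t, des_t = orb.detectAndCompute(tile, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_t, des_b), key=lambda m: m.distance)

    src = np.float32([kp_t[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_b[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, ransacReprojThreshold=5.0)

    h, w = base.shape
    canvas = cv2.warpPerspective(tile, H, (2 * w, 2 * h))
    canvas[:h, :w] = base  # reference image keeps its own pixels
    return canvas
```

Because the platform vibrates, the homographies are not stable over time, which is why the abstract notes that registration must be redone at every frame.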
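For the tracklet-integration stage, the abstract's Bayesian-network inference is not spelled out. As a stand-in, this sketch links tracklets with a standard Hungarian assignment over a constant-velocity motion cost; the tracklet fields (t0, t1, p0, p1, v) and both thresholds are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

BIG = 1e9  # sentinel cost for forbidden links

def link_tracklets(tracklets, max_gap=5, max_cost=30.0):
    """One-to-one linking of tracklets into longer tracks.

    Each tracklet is a dict with frame stamps 't0', 't1' and 2-vectors
    'p0', 'p1', 'v' (start position, end position, end velocity).
    Returns index pairs (i, j) meaning tracklet j continues tracklet i.
    """
    n = len(tracklets)
    cost = np.full((n, n), BIG)
    for i, a in enumerate(tracklets):
        for j, b in enumerate(tracklets):
            gap = b['t0'] - a['t1']
            if i == j or gap <= 0 or gap > max_gap:
                continue  # links must move forward in time, within a short gap
            # constant-velocity extrapolation of a's endpoint across the gap
            predicted = a['p1'] + gap * a['v']
            cost[i, j] = np.linalg.norm(predicted - b['p0'])
    rows, cols = linear_sum_assignment(cost)
    return [(i, j) for i, j in zip(rows, cols) if cost[i, j] < max_cost]
```

The assignment step is global: it picks the set of links with the lowest total cost rather than greedily matching each tracklet to its nearest successor, which matters when many vehicles move in close formation.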
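With 3D structure available, the abstract states that occluded tracks can be detected and repaired. The following is a minimal sketch of the repair step only, assuming an occluder mask already derived from the 3D models; the mask, the coordinate convention, and the use of linear interpolation are all assumptions.

```python
import numpy as np

def repair_track(track, occluder_mask):
    """Fill track gaps that a 3D-derived occluder map can explain.

    `track` maps frame index -> (x, y) in mosaic coordinates;
    `occluder_mask[y, x]` is True where a structure hides the ground.
    A gap is filled only if its interpolated path stays occluded.
    """
    frames = sorted(track)
    for f0, f1 in zip(frames, frames[1:]):
        if f1 - f0 == 1:
            continue  # consecutive detections: nothing to repair
        p0 = np.asarray(track[f0], dtype=float)
        p1 = np.asarray(track[f1], dtype=float)
        filler = {f: p0 + (p1 - p0) * (f - f0) / (f1 - f0)
                  for f in range(f0 + 1, f1)}
        # accept the fill only if occlusion explains every missed frame
        if all(occluder_mask[int(p[1]), int(p[0])] for p in filler.values()):
            track.update({f: tuple(p) for f, p in filler.items()})
    return track
```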
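Finally, the abstract's Entity Relationship Model framework for activity detection is not described; this last sketch only shows what a primitive-event detector looks like, flagging "stop" events on a single track. The thresholds and units are hypothetical.

```python
import numpy as np

def detect_stops(times, points, speed_thresh=0.5, min_duration=10):
    """Return (t_start, t_end) intervals where the object is nearly still.

    `times` is a 1-D array of timestamps and `points` an (N, 2) array of
    positions; a 'stop' is a sustained run of below-threshold speed.
    """
    speeds = np.linalg.norm(np.diff(points, axis=0), axis=1) / np.diff(times)
    slow = speeds < speed_thresh
    events, start = [], None
    for i, s in enumerate(slow):
        if s and start is None:
            start = i                      # a slow run begins
        elif not s and start is not None:
            if times[i] - times[start] >= min_duration:
                events.append((times[start], times[i]))
            start = None                   # the run ends
    if start is not None and times[-1] - times[start] >= min_duration:
        events.append((times[start], times[-1]))
    return events
```

Primitives like "stop", "turn", or "meet" can then be composed into higher-level activity queries, which is the kind of semantic layer the abstract argues a human operator needs; the detector above works equally well on WAAS-derived tracks and on GPS tracks.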