Center for Nonlinear Studies

Wednesday, January 09, 2013
3:00 PM - 4:00 PM
CNLS Conference Room (TA-3, Bldg 1690)
Seminar
The Knowledge Gradient Policy for Optimal Learning
Warren B. Powell
Department of Operations Research and Financial Engineering, Princeton University
Many applications require collecting information when the time or cost of making a measurement is high. A measurement may involve running an expensive simulation, testing a molecular compound in a lab, estimating the presence of a disease in a population, or field testing a market price or business policy in the marketplace. Elegant, optimal techniques exist for very specialized problems such as multi-armed bandit problems, alongside a host of heuristics and techniques developed for specialized problem classes. The knowledge gradient is a technique that guides measurement decisions using what might be described as classical steepest ascent, which requires finding the expected value of a single measurement. The technique is myopically optimal and asymptotically optimal, with strong evidence supporting its use on problems with finite budgets. Its appeal is its generality, which allows it to address problems previously viewed as belonging to completely separate communities. However, it introduces a specific computational challenge that has to be overcome before it can be used for a particular application. The idea will be illustrated on discrete choice problems with correlated beliefs, scalar problems (e.g. optimizing prices), continuous multidimensional problems (e.g. finding the best set of parameters to optimize a simulation), and drug discovery. This work is joint with Peter Frazier and Ilya Ryzhov.
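
As a rough illustration of "the expected value of a single measurement", the sketch below computes the knowledge gradient in the simplest textbook setting: a finite set of alternatives with independent normal beliefs and normally distributed measurement noise of known variance. This setting, the function name knowledge_gradient, and its parameters are assumptions made for the example; the correlated and continuous cases mentioned in the abstract need more machinery and are not covered here.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def knowledge_gradient(means, variances, noise_variance):
    """Expected one-step gain in the best posterior mean, per alternative.

    Assumes independent normal beliefs (means[x], variances[x]) and
    measurement noise with known variance noise_variance.
    Requires at least two alternatives.
    """
    values = []
    for x, (mu, var) in enumerate(zip(means, variances)):
        # Best competing belief, excluding the alternative being measured.
        best_other = max(m for i, m in enumerate(means) if i != x)
        # Standard deviation of the change in the mean of x after one measurement.
        sigma_tilde = var / math.sqrt(var + noise_variance) if var > 0 else 0.0
        if sigma_tilde == 0.0:
            values.append(0.0)  # nothing left to learn about x
            continue
        zeta = -abs(mu - best_other) / sigma_tilde
        values.append(sigma_tilde * (zeta * normal_cdf(zeta) + normal_pdf(zeta)))
    return values


# Hypothetical beliefs about three alternatives.
kg = knowledge_gradient(means=[1.0, 1.2, 0.8],
                        variances=[1.0, 0.2, 4.0],
                        noise_variance=1.0)
next_to_measure = max(range(len(kg)), key=kg.__getitem__)
```

Under these assumptions, the policy measures the alternative with the largest value, which trades off how uncertain an alternative is against how close its estimate is to the current best.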

Host: Frank Alexander, fja@lanl.gov, 665-4518