Department of Operations Research and Financial Engineering, Princeton University, Princeton, New Jersey 08544
We consider a Bayesian ranking and selection problem with independent normal rewards and a correlated multivariate normal belief on the mean values of these rewards. Because this formulation of the ranking and selection problem models dependence between alternatives' mean values, algorithms may use this dependence to perform efficiently even when the number of alternatives is very large. We propose a fully sequential sampling policy called the knowledge-gradient policy, which is provably optimal in some special cases and has bounded suboptimality in all others. We then demonstrate how this policy may be applied to efficiently maximize a continuous function on a continuous domain while constrained to a fixed number of noisy measurements.
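As a rough illustration of the setting described above, the sketch below shows how a correlated multivariate normal belief might be updated after one noisy measurement, and how a one-step knowledge-gradient value could be estimated. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`update_belief`, `kg_factor_mc`, `kg_choose`) are hypothetical, the measurement noise variance is assumed known and common to all alternatives, and the expected improvement in the best posterior mean is approximated by Monte Carlo with antithetic samples, which is a simple substitute for an exact computation.

```python
import numpy as np


def update_belief(mu, Sigma, x, y, noise_var):
    """Condition the belief N(mu, Sigma) on observing y at alternative x.

    Standard multivariate normal conditioning; noise_var is the (assumed
    known) variance of the independent normal measurement noise.
    """
    col = Sigma[:, x]                      # covariance of every mean with alternative x
    denom = noise_var + Sigma[x, x]        # predictive variance of the observation
    mu_new = mu + (y - mu[x]) / denom * col
    Sigma_new = Sigma - np.outer(col, col) / denom
    return mu_new, Sigma_new


def kg_factor_mc(mu, Sigma, x, noise_var, n_pairs=5000, seed=0):
    """Monte Carlo estimate of E[max_i mu'_i] - max_i mu_i when measuring x.

    Before the measurement, the posterior mean is distributed as
    mu + sigma_tilde * Z with Z standard normal.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_pairs)
    z = np.concatenate([z, -z])            # antithetic pairs: sample mean is exactly 0
    sigma_tilde = Sigma[:, x] / np.sqrt(noise_var + Sigma[x, x])
    post_max = np.max(mu[None, :] + z[:, None] * sigma_tilde[None, :], axis=1)
    return post_max.mean() - mu.max()


def kg_choose(mu, Sigma, noise_var):
    """Pick the alternative with the largest estimated one-step value."""
    values = [kg_factor_mc(mu, Sigma, x, noise_var) for x in range(len(mu))]
    return int(np.argmax(values))


if __name__ == "__main__":
    # Hypothetical two-alternative example with positively correlated beliefs.
    mu0 = np.array([0.0, 0.0])
    Sigma0 = np.array([[1.0, 0.5], [0.5, 1.0]])
    x = kg_choose(mu0, Sigma0, noise_var=1.0)
    mu1, Sigma1 = update_belief(mu0, Sigma0, x, y=2.0, noise_var=1.0)
    print(x, mu1, Sigma1)
```

Because the belief is correlated, a single measurement shifts the posterior means of every alternative whose covariance with the measured one is nonzero, which is what allows a sequential policy to learn about many alternatives from few measurements.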
pfrazier{at}princeton.edu
powell{at}princeton.edu
sdayanik{at}princeton.edu
Key words: simulation; design of experiments; decision analysis; sequential; statistics; Bayesian
History: received January 2008; revised October 2008; accepted November 2008.