Heatmap from Scientific Research

Open access visualization of Heatmap, GP-UCB model, Normalized reward, GP regression, Reward generalization
CC-BY

Illustration of the GP-UCB model based on observations at the eighth trial (showing normalized reward). We use GP regression as a psychological model of reward generalization (ref. 38), making Bayesian estimates of the expected reward and uncertainty for each option. The free parameter lambda (equation (2)) controls the extent to which past observations generalize to new options. The expected rewards m(x) and uncertainty estimates v(x) are combined using UCB sampling (equation (3)) to produce a valuation for each option. The exploration bonus beta governs the value of exploring uncertain options relative to exploiting options with high expected reward. Lastly, the UCB values are entered into a softmax function (equation (4)) to make probabilistic predictions about where the participant will search next. The decision temperature parameter tau governs the amount of random (undirected) exploration.
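The three stages described in the caption can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: equations (2) to (4) are not reproduced on this page, so the code assumes an RBF kernel of the form exp(-d²/(2·lambda²)), a zero-mean GP prior with unit prior variance, a UCB value of m(x) + beta·sqrt(v(x)), and a standard softmax with temperature tau. The function names, the `noise_var` parameter, the 8×8 grid, and the three observations are all hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, lam):
    # Assumed kernel form; the page does not show equation (2).
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2.0 * lam ** 2))

def gp_posterior(x_obs, y_obs, x_all, lam, noise_var=1e-4):
    # Zero-mean GP regression: posterior mean m(x) and variance v(x)
    # for every option in x_all, given the observed options and rewards.
    k_oo = rbf_kernel(x_obs, x_obs, lam) + noise_var * np.eye(len(x_obs))
    k_ao = rbf_kernel(x_all, x_obs, lam)
    m = k_ao @ np.linalg.solve(k_oo, y_obs)
    # Prior variance is 1 for this kernel; subtract the explained part.
    v = 1.0 - np.einsum("ij,ji->i", k_ao, np.linalg.solve(k_oo, k_ao.T))
    return m, np.clip(v, 0.0, None)

def ucb(m, v, beta):
    # Assumed UCB form: expected reward plus beta-weighted uncertainty.
    return m + beta * np.sqrt(v)

def softmax(values, tau):
    # Subtract the maximum before exponentiating for numerical stability.
    z = (values - values.max()) / tau
    p = np.exp(z)
    return p / p.sum()

def choice_probabilities(x_obs, y_obs, x_all, lam, beta, tau):
    m, v = gp_posterior(x_obs, y_obs, x_all, lam)
    return softmax(ucb(m, v, beta), tau)

# Hypothetical example: an 8x8 grid of options and three observed rewards.
grid = np.array([(i, j) for i in range(8) for j in range(8)], dtype=float)
x_obs = grid[[0, 9, 27]]
y_obs = np.array([0.2, 0.5, 0.9])
probs = choice_probabilities(x_obs, y_obs, grid, lam=1.0, beta=0.5, tau=0.1)
```

Under these assumptions, a larger lambda spreads each observation's influence over more distant options, a larger beta shifts probability toward unobserved options, and a larger tau flattens the choice distribution toward uniform.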
