Left: schematic of the SARSA model. Right: average learning rate of the value iteration of the 3 kHz tone. **P < 10⁻², ***P < 10⁻³. Data analyzed by (c, n = 5 mice) two-sided one-way repeated measures ANOVA with post-hoc Dunnett’s comparisons (comparing to Stable session), (d, n = 493 neurons from 5 mice) two-sided Friedman test with post-hoc Bonferroni comparisons, (e, n = 5 mice) two-sided paired t-test, (h, k: n = 493 neurons; i, n = 167 neurons; l, n = 281 neurons, from 5 mice) two-sided Wilcoxon signed-rank test, or (o: n = 5 mice) two-sided one-way repeated measures ANOVA with post-hoc Tukey’s comparisons. Data are presented as (c, e, o) mean ± s.e.m. or (d, h, i, k, l) box plots (center line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range). Statistical details are presented in Supplementary Table 1. Source data are provided as a Source data file.