Human online inference in the presence of temporal structure

The variability in subjects’ responses varies over the course of inference

Having examined average learning rates, we now turn to their variability. Although all subjects were presented with identical series of signals, $x_t$, their responses at each trial were not the same (Fig. 4A). This variability appears in both HI and HD conditions. The distribution of responses around their averages at each trial has a width comparable to that of the likelihood distribution, $g(x_t \mid s_t)$ (Fig. 4B). More importantly, the variability in the responses (as measured by the standard deviation) is not constant, but decreases for successive trials following a change point, at short run-lengths (Fig. 4C). Comparing the HI and HD conditions, we observe that for run-lengths shorter than 7, the standard deviation in the HD condition is significantly lower than that in the HI condition. At larger run-lengths, the two curves cross and the HD variability becomes significantly higher than in the HI case. The HD curve adopts, again, a ‘smile shape’ (Fig. 4C). What is the origin of this variability? Because it changes with the run-length and with the HI vs. HD condition, it cannot be explained only by a source of noise independent of the inference process, such as motor noise. We posit that the variability exhibited by subjects is related to the uncertainty they carry in their inference process. A simple probabilistic model, to which we turn next, will allow us to capture the experimental results and to explore the way in which posterior sampling may explain the observed variability.

Human repetition propensity

In the previous section, we showed that the subjects’ average response and its variability were related to the first two moments of a Bayesian density on the states. These were statements about averages. To seek further validation of our model, we now focus on a local property of the distribution of responses: its value for $\hat{s}_{t+1} = \hat{s}_t$, i.e., the probability of repeating a response. For subjects, it means clicking twice consecutively on the same pixel. We call ‘repetition propensity’ the proportion of such trials.
The repetition propensity as computed in the sampling model increases with the run-length, $\tau$, in both HI and HD conditions, and drops slightly for high run-lengths in the HD condition. We find that the trend in the data agrees qualitatively with that in the model (Fig. 6A, solid and dotted lines). But there is an appreciable quantitative discrepancy: the repetition propensity of subjects varies from 10% to 25%, whereas the model’s repetition propensity lies between 1.5% and 4%. Moreover, the distribution of the subjects’ corrections exhibits a distinct peak at zero (Fig. 6B). What may explain the subjects’ high repetition propensity? The simplest explanation is that, after receiving a new signal, a subject may consider that the updated best estimate of the state lands on the same pixel as in the previous trial. The width of one pixel in arbitrary units of our state space is 0.28. As a comparison, the triangular likelihood, $g$, has a width of 40, and thus a standard deviation, $\sigma_g$, of 8.165. An optimal observer estimating the center of a Gaussian density of standard deviation $\sigma_g$, using 10 samples from this density, comes up with a posterior density with standard deviation $\sigma_g/\sqrt{10} \approx 2.6$. Therefore, even after observing 10 successive signals, the subjects’ resolution is not as fine as a pixel (it is, in fact, about 10 times coarser). Another explanation is that even though the new estimate falls on a nearby location, a motor cost prohibits a move if it is not sufficiently extended to be ‘worth it’ [96, 97, 98, 99]. A third explanation is that subjects may have a ‘repetition bias’ by which they display a given probability of repeating, independently of their estimate of the state.
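The resolution argument above can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the text (pixel width 0.28, likelihood width 40, 10 signals); the variable names are illustrative, and the relation "standard deviation = width / sqrt(24)" is the standard result for a symmetric triangular density, assumed here to be the shape of the likelihood.

```python
import math

PIXEL_WIDTH = 0.28       # width of one pixel, in state-space units (from the text)
LIKELIHOOD_WIDTH = 40.0  # full width of the triangular likelihood (from the text)
N_SIGNALS = 10           # number of successive signals considered in the text

# A symmetric triangular density of full width w has standard deviation w / sqrt(24).
sigma_g = LIKELIHOOD_WIDTH / math.sqrt(24)

# Standard deviation of the posterior on the center after n samples,
# under the Gaussian approximation used in the text.
sigma_post = sigma_g / math.sqrt(N_SIGNALS)

# Posterior width expressed in pixels.
pixels = sigma_post / PIXEL_WIDTH

print(f"sigma_g    = {sigma_g:.3f}")
print(f"sigma_post = {sigma_post:.2f}")
print(f"in pixels  = {pixels:.1f}")
```

The posterior width comes out at roughly nine pixels, which is the order of magnitude behind the statement that the resolution is about 10 times coarser than a pixel.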
To bridge the discrepancy in repetition propensity between our model and the data, we supplement our model with a motor-cost effect modeled as a probability of remaining on the same pixel following the posterior update. This probability depends on the distance, $\Delta = |\hat{s}_{t+1} - \hat{s}_t|$, between the previous response, $\hat{s}_t$, and the potential updated response, $\hat{s}_{t+1}$, given by sampling the posterior. We assume that the probability of repeating decays with the distance as a Gaussian function, $r_0 \exp(-\Delta^2/\sigma_R^2)$, where $r_0$ and $\sigma_R$ are numerical parameters characterizing the effect of the motor cost. As a function of the run-length, the repetition propensity obtained from the model is similar to that of the subjects, both quantitatively and qualitatively (Fig. 6A, dashed lines). The motor-cost mechanism introduces a delta function in the density of responses, at $\hat{s}_{t+1} = \hat{s}_t$, thus reproducing the peak at zero in the distribution of subjects’ corrections (Fig. 6C). The parameters of the model, $r_0 = 0.41$ and $\sigma_R = 3.41$, are fitted to the observed repetition propensity. The value of $\sigma_R$ corresponds to 12.3 pixels (3.1 mm on the screen). The following example provides a sense of the meaning of these values: if a potential response, $\hat{s}_{t+1}$, falls within a window of width 12.3 pixels around the previous response, $\hat{s}_t$, there is on average a 38% chance that the previous response is repeated.
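The 38% figure can be reproduced numerically. The sketch below is not the authors' code: it assumes that the Gaussian decay has the form peak × exp(−distance² / width²), with the two fitted values 0.41 and 3.41, and that the potential response is uniformly distributed over a window whose full width equals the fitted width, centered on the previous response. Function and variable names are illustrative.

```python
import math

R0 = 0.41       # fitted peak repetition probability (from the text)
SIGMA_R = 3.41  # fitted width of the decay, in state-space units (from the text)


def repeat_probability(distance):
    """Probability of repeating the previous response, given the distance
    between it and the potential new response (assumed functional form)."""
    return R0 * math.exp(-((distance / SIGMA_R) ** 2))


def mean_repeat_probability(window_width, n_steps=10_000):
    """Average repeat probability over a window centered on the previous
    response, assuming a uniformly distributed potential response
    (midpoint rule)."""
    step = window_width / n_steps
    total = 0.0
    for i in range(n_steps):
        distance = -window_width / 2 + (i + 0.5) * step
        total += repeat_probability(distance)
    return total / n_steps


average = mean_repeat_probability(SIGMA_R)
# Closed form of the same average: R0 * sqrt(pi) * erf(1/2).
closed_form = R0 * math.sqrt(math.pi) * math.erf(0.5)

print(f"numerical average: {average:.3f}")
print(f"closed form:       {closed_form:.3f}")
```

Under these assumptions the average is close to 0.38, in line with the value quoted in the text. A decay of the form exp(−distance² / (2 × width²)) would instead give about 0.39.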

HI and HD signals in the real world

In order to make appropriate decisions in relation to their environment, humans and animals must infer the state of the surrounding world on the basis of the sensory signals they receive. If these signals are noisy and if the environment is dynamic, their inference task can be difficult as a new incoming signal may reflect either noise or a change in the underlying state. However, if events in the world present some kind of temporal structure, such as in our HD signal, it is possible to use that structure to refine one’s inference.
Conversely, if events follow a Poisson process, as in the HI signal, their occurrences present no particular temporal structure, and what just happened conveys no information on what is likely to happen next. Hence, there is a fundamental difference between the HI and HD conditions, which impacts the inference of an optimal observer. Many natural events are not Poisson-distributed in time, and exhibit strong regularities.
The studies in Refs. [75, 76, 77, 78] recorded the motor activity of both rodents and human subjects over the course of several days. In both species, they found that the time intervals between motion events were distributed as a power law, a distribution characterized by a long tail, leading to bursts, or clusters, of events followed by long waiting epochs. The durations of motion episodes also exhibited heavy tails. These kinds of distribution are incompatible with Poisson processes, which yield exponentially distributed inter-event epochs. Moreover, both rodent and human activity exhibited long-range correlations, another feature that cannot be explained by a Poisson process, nor even by an inhomogeneous Poisson process (i.e., a process with a time-varying rate). A particular form of autocorrelation is periodicity, which occurs in a wide range of phenomena. In the context of human motor behavior, we note that walking is a highly rhythmical activity [79, 80]. Circadian and seasonal cycles are other examples of periodicity. More complex patterns exist (neither clustered nor periodic), such as in human speech, which presents a variety of strong temporal structures, whether at the level of syllables, stresses, or pauses [81, 82, 83]. In all these examples, natural mechanisms produce series of temporally structured events. The ubiquity of history-dependent statistics of events in nature calls for explorations of inference mechanisms in their presence.

Behavioral and neural responses in the presence of different temporal statistics

In studies of perception and decision making, in both humans and animals, history-dependent signals have been used widely. In a number of experiments [84, 85, 86, 46, 47], a first event (a sensory cue, or a motor action such as a lever press) is followed by a second event, such as the delivery of a reward, or a ‘go’ signal triggering the next behavior. The time elapsed between these two events (the ‘reward delay’ or the ‘waiting time’) is randomized and sampled from distributions that, depending on the studies, vary in mean, variance, or shape. For instance, in both Refs. [84] and [85], unimodal and bimodal temporal distributions are used. Because of the stochasticity of this waiting time, the probability of occurrence of the second event varies with time, similarly to the probability of a change point in our HD condition; these studies explore whether variations of this probability are captured by human and animal subjects. In Ref. [84], recordings from the V4 cortical area in rhesus monkeys indicate that, for both unimodal and bimodal waiting-time distributions, the attentional modulation of sensory neurons varies consistently with this event probability. In Ref. [85], the reaction times of macaques are inversely related to the event probability, for both unimodal and bimodal distributions, and the activity of neurons in the lateral intraparietal (LIP) area is correlated with the evolution of this probability over time. Reference [86] manipulates another attribute of the distribution of reward delays: between blocks of trials, the standard deviation of this distribution is changed, while the mean is left unchanged. Mice, in this situation, are shown to adapt their waiting times to this variability of reward delays, consistently with a probabilistic inference model of reward timing.
Akin to the tasks just outlined are ‘ready-set-go time-reproduction tasks’, in which subjects are asked to estimate the random time interval between ‘ready’ and ‘set’ cues, and to reproduce it immediately afterwards. References [46, 47] show that human subjects optimally combine the cue (consisting of the perceived ready-set interval) with their prior on the interval length. Different priors are learnt in training runs: in Ref. [46] they differ by the variances of the interval distributions, while in Ref. [47] they differ by their means. In both cases, subjects integrate the prior in a fashion consistent with Bayesian inference. Adopting a different approach, Ref. [87] shows that attentional resources can be dynamically allocated to points in time at which input is expected: when asked to detect auditory stimuli (beeps) of low intensity embedded in continuous white noise, human subjects perform better when detecting periodic beeps rather than random beeps, suggesting that they are able to identify the temporal regularity and use it in their detection process.
In all these studies, the event of interest has a probability of occurrence that varies with time. The resulting temporal structure in the signal appears to be captured by human and animal subjects, and reflected in behavior and in its neural correlates. Various probability distributions used in the reported tasks can be directly compared to our HD sigmoid-shaped change probability, with adjusted parameters. In line with these studies, our results confirm that human subjects adapt their behavior depending on the temporal structure of stimuli. Additionally, we provide a direct comparison between two very different conditions, an HD condition akin to a ‘jittered periodic’ process, and the Poisson, HI condition; the latter produces a memoryless process. Importantly, it plays the role of a benchmark from the point of view of probability theory: in discrete time it yields a geometric distribution of inter-event intervals, while in continuous time it yields an exponential distribution; both distributions maximize the entropy, subject to the constraint of a fixed event rate. In this study, we compared a specific, temporally structured HD condition to this benchmark HI condition.
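The contrast between the two conditions can be made concrete by deriving the distribution of intervals between change points from the change probability at each run-length. The sketch below is illustrative only: the constant change probability of 0.1 is the HI value quoted in this text, whereas the sigmoid's midpoint and slope are hypothetical placeholders, not the parameters of the actual HD task, and the run-length convention (starting at 1) is an assumption.

```python
import math


def interval_distribution(change_probability, max_length):
    """P(interval = n) for n = 1..max_length, given the probability of a
    change point at each run-length n."""
    survival = 1.0  # probability that no change point has occurred so far
    probabilities = []
    for n in range(1, max_length + 1):
        q = change_probability(n)
        probabilities.append(survival * q)
        survival *= 1.0 - q
    return probabilities


def mean_and_std(probabilities):
    """Mean and standard deviation of the interval length."""
    mean = sum(n * p for n, p in enumerate(probabilities, start=1))
    variance = sum((n - mean) ** 2 * p
                   for n, p in enumerate(probabilities, start=1))
    return mean, math.sqrt(variance)


def hi_change_probability(run_length):
    """History-independent: constant change probability (0.1 in the text)."""
    return 0.1


def hd_change_probability(run_length, midpoint=10.0, slope=1.0):
    """History-dependent: sigmoid in the run-length (hypothetical parameters)."""
    return 1.0 / (1.0 + math.exp(-slope * (run_length - midpoint)))


hi_probs = interval_distribution(hi_change_probability, 400)
hd_probs = interval_distribution(hd_change_probability, 400)
hi_mean, hi_std = mean_and_std(hi_probs)
hd_mean, hd_std = mean_and_std(hd_probs)

print(f"HI: mean {hi_mean:.2f}, std {hi_std:.2f}")
print(f"HD: mean {hd_mean:.2f}, std {hd_std:.2f}")
```

With a constant change probability the intervals follow a geometric distribution, whose standard deviation is nearly as large as its mean; with a sigmoid-shaped change probability the intervals cluster around the sigmoid's midpoint, which is what makes the process resemble a jittered periodic one.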

Inference with HI and HD change points

We compared HI and HD conditions using change-point signals. Contrary to the studies on inference and decision-making mentioned above, the literature on change points is focused mainly on history-independent, memoryless processes, where events have a constant probability (but see Ref. [74]). Change-point signals are unidimensional random signals with a mean that abruptly changes from time to time; inference with such stimuli has been explored extensively [73, 74, 71, 52, 53, 72]. A simple approach to the inference of the true state is to consider the so-called ‘Delta rule’ that uses the inference error to update the current estimate, as $\hat{s}_{t+1} = \hat{s}_t + \alpha (x_{t+1} - \hat{s}_t)$, where $x_{t+1}$ is the signal, $\hat{s}_t$ is the estimate at time $t$, and $\alpha$ is a fixed learning rate (the fitting parameter in this algorithm). Although this reinforcement learning method yields an efficient solution for a range of problems [100, 101], it is as such ill-adapted to change-point problems: either the learning rate is too small and the model fails to adjust quickly after a change point, or it is too large and the model does a poor job at inferring the mean during stable phases [53]. References [52, 53] have suggested other methods that extend the delta rule and make use of an adaptive learning rate (through a mixture of delta rules or through an ‘approximately Bayesian delta rule’). Our model provides a simple account of why and how a learning rate that emerges from Bayesian reasoning is dynamically adapted to the surprise and to the run-length.
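The trade-off faced by a fixed learning rate can be illustrated with a toy simulation. The sketch below is a minimal example, not the task used in this study: the signal has a single change point, the noise is Gaussian rather than triangular, and the two learning rates, the noise level, and the size of the jump are arbitrary illustrative choices.

```python
import random


def delta_rule(signals, learning_rate, initial_estimate=0.0):
    """Update the estimate by a fixed fraction of each prediction error."""
    estimate = initial_estimate
    estimates = []
    for x in signals:
        estimate += learning_rate * (x - estimate)
        estimates.append(estimate)
    return estimates


def mean_abs_error(estimates, states):
    """Average absolute distance between the estimates and the true states."""
    return sum(abs(e - s) for e, s in zip(estimates, states)) / len(states)


rng = random.Random(0)
# Toy change-point signal: the state is 0 for 200 trials, then jumps to 30.
states = [0.0] * 200 + [30.0] * 200
signals = [s + rng.gauss(0.0, 8.0) for s in states]

results = {}
for alpha in (0.05, 0.9):
    estimates = delta_rule(signals, alpha)
    stable_error = mean_abs_error(estimates[100:200], states[100:200])
    jump_error = mean_abs_error(estimates[200:205], states[200:205])
    results[alpha] = (stable_error, jump_error)
    print(f"alpha = {alpha}: stable-phase error {stable_error:.2f}, "
          f"error just after the change point {jump_error:.2f}")
```

The small learning rate averages out the noise during the stable phase but lags behind after the jump, while the large learning rate tracks the jump almost immediately at the price of noisy estimates when nothing changes.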
Experimental data demonstrate that human learning rates indeed adapt. In Ref. [52], the authors find that in an HI condition (with the same 0.1 constant change probability), the subjects’ learning rate increases with surprise and decreases with run-length, in agreement with our observations. We add to these findings by exhibiting the behavior of human learning rates in an HD condition, and in particular we show that the learning rate is suppressed at short run-lengths and that it increases at long run-lengths. Furthermore, we highlight the variability of human responses, and we characterize its behavior as a function of run-length. A sampling mechanism, whereby responses are sampled from the posterior distribution on the states, qualitatively reproduces the experimental findings. It also reproduces the observed increase of the learning rate as a function of surprise. Reference [95] presents a detailed computational study of an array of suboptimal models motivated by our behavioral task. It adds weight to the picture in which human variability originates from probability sampling. But it also reveals that sampling may occur earlier in the inference process, namely, that probability distributions may be represented through a set of samples from the start.

Table of contents :

Introduction
Expected utility hypothesis
Bayesian probabilities
Bayesian models of human inference
Online inference
Questions
Outline
1 Human online inference in the presence of temporal structure
Results
Behavioral task, and history-independent vs. history-dependent stimuli
Learning rates adapt to the temporal statistics in the stimulus
The variability in subjects’ responses varies over the course of inference
A simple, approximate Bayesian model
Human repetition propensity
Discussion
HI and HD signals in the real world
Behavioral and neural responses in the presence of different temporal statistics
Inference with HI and HD change points
Methods
Details of the behavioral task
Subjects
Details of the signal
Training runs
Empirical run-length
Details of approximate model
Model self-consistency
Statistical tests
Supplementary tables: statistical tests
Supplementary figure: data analysis excluding all occurrences of repetitions
2 Cognitive models of human online inference
Results
Behavioral inference task
Optimal estimation: Bayesian update and maximization of expected reward
The optimal model captures qualitative trends in learning rate and repetition propensity
Impact of an erroneous belief on the temporal statistics of the signal
Impact of limited run-length memory
Does behavioral variability depend on the inference process?
Models with limited memory and variability in the inference step or the response-selection step
Stochastic model with sampling in time and in state space: the particle filter
Fitting models to experimental data favors sample-based inference
Discussion
Online Bayesian inference
Holding an incorrect belief on temporal statistics
Sampling versus noisy maximization
Stochastic inference and particle filters
Robustness of model fitting
Sample-based representations of probability
Methods
Bayesian update equation
Nodes model
Particle Filter
Model Fit
3 Sequential effects in the online inference of a Bernoulli parameter
Sequential effects
A framework of Bayesian inference under constraint
Predictability-cost models
The Bernoulli case
Inferring conditional probabilities
Representation-cost models
The Bernoulli observer
The conditional-probabilities observer
Discussion
Preliminary experimental results
Leaky integration
Variational approach to approximate Bayesian inference
Representation cost and the theory of rational inattention
Bibliography