Home
This FAQ is curated by Luigi Acerbi and is continually being expanded.
For an IBS tutorial and example, see ibs_example.m (in MATLAB); other languages to be added.
If you have questions not covered here, please feel free to ask in the lab Discussions forum.
Acknowledgments: Many of the questions currently answered here originated in a live Q&A session with the Ma lab; thanks to Hsin-Hung Li for taking notes.
-
No, this is not okay: doing so would essentially turn IBS back into a fixed-sampling method, with all the associated problems discussed in the paper. A more principled alternative is to put an early-stopping threshold on the log-likelihood, as described in the paper.
-
Is it important to provide the standard deviation of the IBS estimator to the optimization/inference algorithm for every parameter combination evaluated?
It depends:
- If you are optimizing the target log-likelihood (e.g., for maximum-likelihood or maximum-a-posteriori estimation), then it might help but it is not necessary, because the IBS estimator variance, somewhat surprisingly, is nearly constant across the parameter space. Moreover, the BADS optimizer (which we recommend using in combination with IBS; see also below) does not currently support user-provided, input-dependent noise, so in that case it is not even an option.
- If you are performing Bayesian inference, for example using the VBMC toolbox, then it is necessary to provide the standard deviation of the IBS estimate to the algorithm. Bayesian inference is very sensitive to noisy estimates of the log-likelihood (or log-posterior), so it is crucial to provide the inference algorithm with all available information about the magnitude of observation noise.
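As an illustration of where such a standard deviation can come from, here is a minimal Python sketch of a single IBS run that returns both an estimate and its standard deviation. It is not the toolbox code: the function names `ibs_trial` and `ibs_dataset` and the toy Bernoulli model are hypothetical, and the per-trial variance estimate used here (the sum of 1/j² for j up to K-1, where K is the draw on which the first match occurs) is assumed from the estimator described in the paper.

```python
import math
import random


def ibs_trial(simulate, observed, rng):
    """One IBS run for one trial: draw from the simulator until a draw matches `observed`.

    Returns (log-likelihood estimate, variance estimate) for this trial.
    """
    k = 1
    while simulate(rng) != observed:
        k += 1
    # The first match occurred on draw k.
    loglik = -sum(1.0 / j for j in range(1, k))         # -(1 + 1/2 + ... + 1/(k-1))
    variance = sum(1.0 / j ** 2 for j in range(1, k))   # 1 + 1/4 + ... + 1/(k-1)^2
    return loglik, variance


def ibs_dataset(simulators, responses, rng):
    """Sum the per-trial estimates; variances add because trials are sampled independently."""
    total, total_var = 0.0, 0.0
    for simulate, observed in zip(simulators, responses):
        ll, var = ibs_trial(simulate, observed, rng)
        total += ll
        total_var += var
    return total, math.sqrt(total_var)


# Toy model (hypothetical): every trial is a Bernoulli response with P(1) = 0.7.
rng = random.Random(0)
simulators = [lambda r: int(r.random() < 0.7)] * 50
responses = [1] * 35 + [0] * 15
loglik, sd = ibs_dataset(simulators, responses, rng)
```

The pair `(loglik, sd)` is the kind of information a noise-aware inference algorithm would be given at each evaluated parameter vector; how it is passed depends on the algorithm's own interface.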
-
I am interested in using IBS for problems with continuous responses. Should I try and implement ABC-IBS or is discretization enough? And how do I set the number of bins in the discretization (or, equivalently, the epsilon radius for ABC-IBS)?
While ABC-IBS as briefly described in the paper is a slightly better approach statistically, for most problems it would not make a big difference if one simply discretizes the response space. Using ABC-IBS (or discretizing the space) is roughly equivalent to adding localized uniform noise to the response of the model being fit, with radius equal to half the bin size (or equal to epsilon). So, as a rule of thumb, one wants this added noise to be (much) less than the magnitude of the noise present in the data.
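The rule of thumb above can be sketched in a few lines of Python. The helper names and the default ratio of 0.2 between the bin half-width and the noise standard deviation are illustrative assumptions, not values recommended by the paper.

```python
import math


def bin_width_for_noise(noise_sd, fraction=0.2):
    """Pick a bin width whose half-width is `fraction` times the noise SD in the data.

    `fraction` is an arbitrary illustrative choice; it should be well below 1.
    """
    return 2.0 * fraction * noise_sd


def discretize(response, bin_width, lower=0.0):
    """Map a continuous response to the integer index of the bin that contains it."""
    return math.floor((response - lower) / bin_width)


# Example: responses in degrees, with response noise of about 10 degrees in the data.
width = bin_width_for_noise(10.0)                   # half-width of 2 degrees
observed_bin = discretize(37.3, width)              # applied to the observed response
simulated_bin = discretize(35.1, width)             # applied to a simulated response
is_hit = observed_bin == simulated_bin              # IBS counts a hit when the bins match
```

The same discretization has to be applied to both the observed and the simulated responses, so that a "hit" means falling in the same bin.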
-
BADS is a robust optimizer that works well with stochastic target functions, and in particular with the noisy estimates produced by IBS. Answers to many questions about the usage of BADS can be found in the BADS general FAQ. In particular, you might want to start with the section of the FAQ dedicated to noisy objective functions (but do not stop there: all sections of the FAQ are relevant).
-
If you are interested in Bayesian inference, i.e. in recovering the Bayesian posterior over model parameters or computing the marginal likelihood, we recommend using Variational Bayesian Monte Carlo (VBMC), a toolbox for approximate Bayesian inference that supports potentially noisy estimates of the log-likelihood, such as those produced by IBS. In a large empirical benchmark, VBMC has been shown to work very well in combination with IBS (see paper).
IBS affords a simple way to change the precision of its estimates: using multiple "repeats", each amounting to an independent run of the estimator. This section answers questions about the precision of the IBS estimate and how it depends on the number of repeats.
-
For computational reasons, we can often not afford to evaluate the log likelihood of every parameter combination with high precision while optimizing the parameters. However, once we have found the (supposedly) best parameter combination, we could increase the precision (e.g., the number of IBS repeats). Is this advisable?
Yes, absolutely. It should be considered standard practice, regardless of IBS. Whenever optimizing a noisy target function, after obtaining a candidate solution from an optimization method, one should evaluate the target function at the solution with higher precision.
-
In an ideal world, would you let the number of IBS repeats depend on how close the optimization algorithm thinks it is to the maximum — i.e. some form of adaptive precision?
Yes, this is a good idea and topic of ongoing research.
-
I am trying to get more precise results. As I increase the number of IBS repeats, the standard deviation of the estimated log-likelihood goes down slowly, but the computational time increases linearly. Is this trend normal?
Yes, this is expected. The number of repeats is literally the number of times the IBS estimator is run, so the computational time has to be linear in the number of repeats. On the other hand, the number of repeats is also the number of independent log-likelihood estimates you are averaging over. As is well known, the standard error of the mean decreases as the inverse square root of the number of independent estimates (in this case, the number of repeats), so halving the standard deviation takes about four times as many repeats.
-
Any guideline on how to balance computation time and precision of the IBS estimates (i.e., number of repeats)?
The algorithm you are using (e.g., BADS or VBMC) will often have some recommendation for how much noise in the log-likelihood it can handle, usually of order ~1. If you cannot decrease the log-likelihood observation noise to be ~1 or less, try to be as precise as possible within the available computational resources.
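As a rough worked example of this trade-off, the following Python sketch uses only the square-root scaling of the standard error with the number of repeats. The function names are hypothetical, and the target of 1 is simply the order of magnitude mentioned above.

```python
import math


def sd_with_repeats(sd_one_repeat, n_repeats):
    """Standard deviation of the average of `n_repeats` independent IBS runs."""
    return sd_one_repeat / math.sqrt(n_repeats)


def repeats_for_target(sd_one_repeat, target_sd=1.0):
    """Smallest number of repeats R such that sd_one_repeat / sqrt(R) <= target_sd."""
    return max(1, math.ceil((sd_one_repeat / target_sd) ** 2))


# If a single IBS run has a standard deviation of about 3 at a typical parameter
# vector, reaching a noise level of about 1 takes 9 repeats, i.e. 9 times the cost.
n_needed = repeats_for_target(3.0)
sd_reached = sd_with_repeats(3.0, n_needed)
```

In practice one would measure the standard deviation of a single run at a few representative parameter vectors, then check whether the resulting number of repeats fits the computational budget.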
-
The main assumption for it to work well is roughly that the trial likelihoods are correlated across (reasonable) regions of parameter space, such that you can compute the resource allocation for a given "representative" set of parameters, and that allocation of resources remains beneficial across iterations of the optimization or inference algorithm.
-
In which cases are trial-dependent repeats preferable to fixed repeats (e.g., 20 repeats for every trial)?
In theory whenever the above assumption holds, which seems to hold often in practice. However, more empirical studies are needed.
-
The responses of my model depend both on the current trial and on responses or stimuli from previous trials. How can I tell this to `ibslike`?
First, note that the inputs to `ibslike` are `ibslike(fun,params,respMat,designMat,options,varargin)`, where `varargin` denotes additional arguments that are passed to `fun`, the function handle to your simulator model. If your simulator depends on data that go beyond the current stimulus and current response in a trial, you should:
- leave `designMat` empty, which will tell `ibslike` to call `fun` with a list of trial indices;
- pass the full matrix of stimuli and responses, and any further data, as additional arguments (as `varargin`);
- write your `fun` simulator to take as input a parameter vector `PARAMS`, an array of trial numbers `T`, and any other input argument represented by `varargin`;
- inside `fun`, for each trial indexed by trial number `T`, generate a synthetic response using the appropriate information contained in `varargin` (e.g., the full matrix of stimuli and responses).
For more information on variable-length input argument lists (`varargin`) in MATLAB, see the official documentation here.
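The steps above can be sketched in Python as follows. This is an analogue of the pattern, not the MATLAB `ibslike` interface: the function `simulate`, its argument names, and the toy "repeat the previous response" model are all hypothetical, and indices are 0-based here whereas MATLAB trial numbers are 1-based.

```python
import random


def simulate(params, trial_idx, stimuli, responses, rng):
    """Return one synthetic response for each requested trial index.

    `stimuli` and `responses` hold the full, ordered data set (the extra
    arguments); `trial_idx` lists the trials to simulate in this call.
    """
    (p_repeat,) = params
    out = []
    for t in trial_idx:
        if t > 0 and rng.random() < p_repeat:
            out.append(responses[t - 1])            # repeat the previous observed response
        else:
            out.append(1 if stimuli[t] > 0 else 0)  # otherwise respond to the current stimulus
    return out


# The caller asks for an arbitrary subset of trials; the history is still available
# because the full data set is passed along as extra arguments.
stimuli = [-1.0, 2.0, 3.0, -0.5]
responses = [0, 1, 1, 0]
rng = random.Random(1)
synthetic = simulate((0.3,), [3, 1], stimuli, responses, rng)
```

Because the simulator looks up the history by trial index in the full data set, it gives valid responses regardless of which subset of trials is requested or in which order.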
-
My model uses data structures which are not easily converted to numerical arrays. Can I still use `ibslike`?
Yes, absolutely.
- The responses of your model have to be expressed as numerical arrays, but given that `ibslike` works only with discrete responses, it should always be possible to map the responses of your model to a finite set of numbers.
- Any other data used to compute such responses need not be a numerical array. If you need your simulator `fun` to accept inputs which are not numerical arrays, you should use the `varargin` input argument, as explained in the question above.
-
Be aware that `ibslike` in MATLAB (and similarly in other implementations) will call the simulator multiple times with different subsets `idx` of trials (i.e., different rows of the response and design matrices).
This means that your simulator function should work "row-wise" and not depend on the ordering of the rows nor on any information that depends on the other rows. For example, assuming that the column `designMat(:,1)` contains for each trial the contrast of the stimulus presented in the trial, note that `min(designMat(:,1))` inside the simulator function will not be the minimum contrast across all trials. Instead, it will be the minimum contrast for the trials that IBS is considering in that call, i.e. something like `min(designMat(idx,1))`, which will keep changing from call to call based on the set of indices `idx` (likely not what your model needs). If your simulator function needs access to global information, this needs to be given separately; see for example the question above.
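The pitfall can be reproduced in a few lines of Python. The design matrix, the threshold model, and the function name `simulate_rowwise` are made up for illustration; the point is only that a quantity computed from the rows received in one call differs from the same quantity computed once over all trials.

```python
def simulate_rowwise(params, design_rows, global_min_contrast):
    """Row-wise simulator: each response uses only its own row plus precomputed global info."""
    (threshold,) = params
    return [1 if (row[0] - global_min_contrast) > threshold else 0 for row in design_rows]


# Column 0 of the (hypothetical) design matrix is the stimulus contrast on each trial.
design = [[0.8], [0.1], [0.5], [0.3]]

# Correct: compute global quantities once, outside the simulator, over all trials.
global_min = min(row[0] for row in design)

# What the simulator would see if it computed the minimum itself on a subset of rows.
idx = [0, 2]
subset_rows = [design[i] for i in idx]
subset_min = min(row[0] for row in subset_rows)

correct = simulate_rowwise((0.35,), subset_rows, global_min)
wrong = simulate_rowwise((0.35,), subset_rows, subset_min)
```

Here `global_min` is 0.1 while `subset_min` is 0.5, so the two calls return different responses for the same trials; passing the precomputed value as an extra argument keeps the simulator independent of which rows it receives.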