
Further improving the 'meta' simulator #226

Open
paul-buerkner opened this issue Oct 27, 2024 · 0 comments
@paul-buerkner (Contributor)

The existing meta simulator feature is useful because it lets us tell the simulator about meta variables that affect the shape of other simulated variables and therefore must remain constant within a batch.

I believe we can improve this feature further as I illustrate below using our linear regression example. We use the simulator:

# TODO: do we have to require a "batch_shape" argument in the function passed to meta_fn?
def meta(batch_shape):
    # N: number of observations in a dataset
    N = np.random.randint(5, 15)
    return dict(N=N)

def prior():
    # beta: regression coefficients (intercept, slope)
    beta = np.random.normal([2, 0], [3, 1])
    # sigma: residual standard deviation
    sigma = np.random.gamma(1, 1)
    return dict(beta=beta, sigma=sigma)

def likelihood(beta, sigma, N):
    # x: predictor variable
    x = np.random.normal(0, 1, size=N)
    # y: response variable
    y = np.random.normal(beta[0] + beta[1] * x, sigma, size=N)
    return dict(y=y, x=x)

simulator = bf.simulators.make_simulator([prior, likelihood], meta_fn=meta)
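To make the meta/batch distinction concrete, here is a minimal NumPy-only sketch (not the BayesFlow internals; `sample_batch` is a hypothetical helper) of how a batch could be assembled by hand: meta is called once per batch, while prior and likelihood are called once per dataset.

```python
import numpy as np

def meta():
    # N: number of observations per dataset (shared across the whole batch)
    return dict(N=np.random.randint(5, 15))

def prior():
    beta = np.random.normal([2, 0], [3, 1])
    sigma = np.random.gamma(1, 1)
    return dict(beta=beta, sigma=sigma)

def likelihood(beta, sigma, N):
    x = np.random.normal(0, 1, size=N)
    y = np.random.normal(beta[0] + beta[1] * x, sigma, size=N)
    return dict(y=y, x=x)

def sample_batch(batch_size):
    # meta is drawn exactly once: its values are constant within the batch
    m = meta()
    draws = []
    for _ in range(batch_size):
        theta = prior()
        draws.append({**theta, **likelihood(**theta, **m)})
    # stack per-dataset draws into arrays with a leading batch dimension
    batch = {k: np.stack([d[k] for d in draws]) for k in draws[0]}
    batch.update(m)  # meta variables stay unbatched (scalar N)
    return batch

batch = sample_batch(4)
# batch["y"] has shape (4, N); batch["N"] is a scalar with no batch dimension
```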

I see the following three problems:

  1. By passing meta_fn=meta, we are internally computing
meta = make_simulator(meta, is_batched=True)

but this is kind of a lie. meta is not really batched. It just should not be auto-batched because its variables need to remain constant within each batch.

  2. Since we will treat meta as "already batched", we have to have batch_shape as the first argument of meta, or at least a _ there to indicate the presence of an argument even if it is unused. This is quite a technical requirement that is hard to communicate to users. They may just do it because we told them to, but it will remain a bit weird, I believe.

  3. Later on, in the adapter, we have to figure out which of the meta variables are actually already batched and which should just be constant within each batch. This is what adapter.broadcast does, and it is good functionality. But perhaps we can relieve it of some of its burden by better informing the adapter about which variables are meta, i.e., which variables definitely do not come with a batch_size dimension and need one such that their values are constant within each batch.
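The shape mismatch in problem 3 can be illustrated directly. This sketch uses plain NumPy rather than adapter.broadcast itself:

```python
import numpy as np

batch_size = 32

# sigma comes from an auto-batched prior: it already has a batch dimension
sigma = np.random.gamma(1, 1, size=batch_size)

# N is a meta variable: drawn once, with no batch dimension
N = np.random.randint(5, 15)

# the adapter has to detect this and add the batch dimension itself,
# repeating the same value so N stays constant within the batch
N_batched = np.broadcast_to(np.asarray(N), (batch_size,))
```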

I propose the following solution: add a new is_meta flag to make_simulator, which is also automatically set to True for simulators passed to meta_fn. Any variables produced by such a meta simulator will carry the information that they definitely do not come with a batch_size dimension and need one such that their values are constant within each batch. This would solve all three problems, I think:

  1. We are completely transparent to the user about what a meta simulator does.
  2. We no longer need an unused batch_size (or batch_shape) argument. In fact, we will prohibit it from being present in meta simulators.
  3. The adapter can safely (and automatically, via the default adapters) broadcast all meta variables to include batch_size as the first dimension.
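A standalone sketch of the proposed semantics (plain Python, not the actual BayesFlow implementation; make_meta_simulator and default_broadcast are hypothetical names following the proposal):

```python
import inspect
import numpy as np

def make_meta_simulator(fn):
    # proposed rule: meta simulators must NOT accept a batch argument
    params = inspect.signature(fn).parameters
    if "batch_shape" in params or "batch_size" in params:
        raise ValueError("meta simulators must not accept a batch argument")

    def sample(batch_size):
        out = fn()  # called once: values are constant within the batch
        # tag outputs so a downstream adapter knows they carry no batch dim
        return {k: ("meta", v) for k, v in out.items()}

    return sample

def default_broadcast(tagged, batch_size):
    # adapter side: prepend the batch dimension to every meta variable
    return {
        k: np.broadcast_to(np.asarray(v), (batch_size,) + np.shape(v))
        for k, (_, v) in tagged.items()
    }

def meta():
    return dict(N=np.random.randint(5, 15))

sim = make_meta_simulator(meta)
batched = default_broadcast(sim(32), batch_size=32)
# batched["N"] now has shape (32,), with the same value repeated in every row
```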

Any thoughts on this proposal would be much appreciated!

@paul-buerkner paul-buerkner added the feature New feature or request label Oct 27, 2024
@paul-buerkner paul-buerkner added this to the BayesFlow 2.0 milestone Oct 27, 2024