
Further improving the 'meta' simulator #226

Open
paul-buerkner opened this issue Oct 27, 2024 · 0 comments
@paul-buerkner (Contributor)

The existing meta simulator feature is useful because it lets us tell the simulator about meta variables that affect the shape of other simulated variables and therefore must remain constant within a batch.

I believe we can improve this feature further as I illustrate below using our linear regression example. We use the simulator:

# TODO: do we have to require a "batch_shape" argument in the function passed to meta_fn?
def meta(batch_shape):
    # N: number of observations in a dataset
    N = np.random.randint(5, 15)
    return dict(N=N)

def prior():
    # beta: regression coefficients (intercept, slope)
    beta = np.random.normal([2, 0], [3, 1])
    # sigma: residual standard deviation
    sigma = np.random.gamma(1, 1)
    return dict(beta=beta, sigma=sigma)

def likelihood(beta, sigma, N):
    # x: predictor variable
    x = np.random.normal(0, 1, size=N)
    # y: response variable
    y = np.random.normal(beta[0] + beta[1] * x, sigma, size=N)
    return dict(y=y, x=x)

simulator = bf.simulators.make_simulator([prior, likelihood], meta_fn=meta)
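To make the meta/batch distinction concrete, here is a minimal NumPy-only sketch (not the BayesFlow internals; `sample_batch` is a hypothetical helper) of how a batch could be assembled by hand: meta is called once per batch, while prior and likelihood are called once per dataset.

```python
import numpy as np

def meta():
    # N: number of observations per dataset (shared across the whole batch)
    return dict(N=np.random.randint(5, 15))

def prior():
    beta = np.random.normal([2, 0], [3, 1])
    sigma = np.random.gamma(1, 1)
    return dict(beta=beta, sigma=sigma)

def likelihood(beta, sigma, N):
    x = np.random.normal(0, 1, size=N)
    y = np.random.normal(beta[0] + beta[1] * x, sigma, size=N)
    return dict(y=y, x=x)

def sample_batch(batch_size):
    # meta is drawn exactly once: its values are constant within the batch
    m = meta()
    draws = []
    for _ in range(batch_size):
        theta = prior()
        draws.append({**theta, **likelihood(**theta, **m)})
    # stack per-dataset draws into arrays with a leading batch dimension
    batch = {k: np.stack([d[k] for d in draws]) for k in draws[0]}
    batch.update(m)  # meta variables stay unbatched (scalar N)
    return batch

batch = sample_batch(4)
# batch["y"] has shape (4, N); batch["N"] is a scalar with no batch dimension
```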

I see the following three problems:

  1. By passing meta_fn=meta, we are internally computing
meta = make_simulator(meta, is_batched=True)

but this is kind of a lie. meta is not really batched. It just should not be auto-batched because its variables need to remain constant within each batch.

  2. Since we will treat meta as "already batched", we have to have batch_shape as the first argument of meta, or at least a _ there to indicate the presence of an argument even if it is unused. This is quite a technical requirement that is hard to communicate to users. They may just do it because we told them to, but it will remain a bit weird, I believe.

  3. Later on, in the adapter, we have to figure out which of the meta variables are actually already batched and which should just be constant within each batch. This is what adapter.broadcast does, and it is good functionality. But perhaps we can relieve it of some of its burden by better informing the adapter about which variables are meta, i.e., which variables definitely do not come with a batch_size dimension and need one such that their values are constant within each batch.
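The shape mismatch in problem 3 can be illustrated directly. This sketch uses plain NumPy rather than adapter.broadcast itself:

```python
import numpy as np

batch_size = 32

# sigma comes from an auto-batched prior: it already has a batch dimension
sigma = np.random.gamma(1, 1, size=batch_size)

# N is a meta variable: drawn once, with no batch dimension
N = np.random.randint(5, 15)

# the adapter has to detect this and add the batch dimension itself,
# repeating the same value so N stays constant within the batch
N_batched = np.broadcast_to(np.asarray(N), (batch_size,))
```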

I propose the following solution: add a new is_meta flag to make_simulator, which is also automatically set to True for simulators passed to meta_fn. Any variables produced by such a meta simulator will carry the information that they definitely do not come with a batch_size dimension and need one such that their values are constant within each batch. This would solve all three problems, I think:

  1. We are completely transparent to the user about what a meta simulator does.
  2. We no longer need an unused batch_size (or batch_shape) argument. In fact, we will prohibit it from being present in meta simulators.
  3. The adapter can safely (and automatically, via the default adapters) broadcast all meta variables to include batch_size as the first dimension.
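A standalone sketch of the proposed semantics (plain Python, not the actual BayesFlow implementation; make_meta_simulator and default_broadcast are hypothetical names following the proposal):

```python
import inspect
import numpy as np

def make_meta_simulator(fn):
    # proposed rule: meta simulators must NOT accept a batch argument
    params = inspect.signature(fn).parameters
    if "batch_shape" in params or "batch_size" in params:
        raise ValueError("meta simulators must not accept a batch argument")

    def sample(batch_size):
        out = fn()  # called once: values are constant within the batch
        # tag outputs so a downstream adapter knows they carry no batch dim
        return {k: ("meta", v) for k, v in out.items()}

    return sample

def default_broadcast(tagged, batch_size):
    # adapter side: prepend the batch dimension to every meta variable
    return {
        k: np.broadcast_to(np.asarray(v), (batch_size,) + np.shape(v))
        for k, (_, v) in tagged.items()
    }

def meta():
    return dict(N=np.random.randint(5, 15))

sim = make_meta_simulator(meta)
batched = default_broadcast(sim(32), batch_size=32)
# batched["N"] now has shape (32,), with the same value repeated in every row
```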

Any thoughts on this proposal would be much appreciated!

@paul-buerkner paul-buerkner added the feature New feature or request label Oct 27, 2024
@paul-buerkner paul-buerkner added this to the BayesFlow 2.0 milestone Oct 27, 2024