Add constrained transform #764

tomicapretto · 2023-11-29T17:58:29Z

This PR adds a new transformation that can be used in response terms. This is similar to truncated but it directly constrains the bounds of the response distribution (remember, truncated refers to the missing data mechanism).

The usage is constrained(value, lower, upper) ~ whatever. lower and upper must be scalar or None.

It uses the original bounds when obtaining draws from the posterior predictive distribution.

Edit closes #750

bambi/terms/utils.py

GStechschulte · 2023-12-02T05:58:46Z

tests/test_models.py

@@ -1132,6 +1132,33 @@ def test_truncated_response(self, truncated_data):
        assert isinstance(model.backend.model.observed_RVs[0]._owner.op, pm.TruncatedNormal.rv_type)


+class TestConstrainedResponse(FitPredictParent):


Not a question regarding the test, but it made me think if we should make sure this also works with a categorical or multinomial model by chance? Did you by chance do this? If not, I can test one.

This makes me wonder, does it make sense to constrain such distributions?

I don't think there exists such a thing as a truncated categorical distribution. What would be the domain we would like to constrain? For the record, before thinking about it I tried the following, which doesn't make sense and Bambi/formulae failed (the reason is not connected to our problem, it's that Bambi is looking for the levels of the response variable but it's numeric)

import bambi as bmb import numpy as np import pandas as pd rng = np.random.default_rng(1234) y = rng.choice(10, 200) y_ = y[(y >= 2) & (y <= 8)] df = pd.DataFrame({"y": y_}) model = bmb.Model("constrained(y, 2, 8) ~ 1", df, family="categorical") model.build()

Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/home/tomas/Desktop/OSS/bambinos/bambi/bambi/models.py", line 360, in build self.backend.build(self) File "/home/tomas/Desktop/OSS/bambinos/bambi/bambi/backend/pymc.py", line 62, in build for name, values in spec.response_component.response_term.coords.items(): File "/home/tomas/Desktop/OSS/bambinos/bambi/bambi/terms/response.py", line 59, in coords return self.family.get_coords(self) File "/home/tomas/Desktop/OSS/bambinos/bambi/bambi/families/univariate.py", line 143, in get_coords return {name: [level for level in response.levels if level != response.reference]} TypeError: 'NoneType' object is not iterable

However, one may want to truncate/constrain a binomial or a multinomial distribution. For example, if n=30 we may want to truncate the binomial so the number of successes can only be between 5 and 25. I don't know in which scenarios that would apply, but I think there's nothing that prevents us from implementing it.

Anyway, it doesn't work and I don't think supporting it will be easy. See

rng = np.random.default_rng(1234) y = rng.binomial(20, 0.5, size=200) y_ = y[(y >= 6) & (y <= 12)] df = pd.DataFrame({"y": y_}) model = bmb.Model("constrained(p(y, 20), 6, 12) ~ 1", df, family="binomial") model.build()

Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/home/tomas/Desktop/OSS/bambinos/bambi/bambi/models.py", line 170, in __init__ design = fm.design_matrices( File "/home/tomas/anaconda3/envs/bambi/lib/python3.10/site-packages/formulae/matrices.py", line 556, in design_matrices design = DesignMatrices(description, data, env) File "/home/tomas/anaconda3/envs/bambi/lib/python3.10/site-packages/formulae/matrices.py", line 58, in __init__ self.response.evaluate(data, env) File "/home/tomas/anaconda3/envs/bambi/lib/python3.10/site-packages/formulae/matrices.py", line 137, in evaluate self.term.set_type(self.data, self.env) File "/home/tomas/anaconda3/envs/bambi/lib/python3.10/site-packages/formulae/terms/terms.py", line 820, in set_type self.term.set_type(data, env) File "/home/tomas/anaconda3/envs/bambi/lib/python3.10/site-packages/formulae/terms/terms.py", line 433, in set_type component.set_type(data, env) File "/home/tomas/anaconda3/envs/bambi/lib/python3.10/site-packages/formulae/terms/call.py", line 104, in set_type x = self.call.eval(data_mask, self.env) File "/home/tomas/anaconda3/envs/bambi/lib/python3.10/site-packages/formulae/terms/call_resolver.py", line 272, in eval return callee(*args, **kwargs) File "/home/tomas/Desktop/OSS/bambinos/bambi/bambi/transformations.py", line 145, in constrained return truncated(x, lb, ub) File "/home/tomas/Desktop/OSS/bambinos/bambi/bambi/transformations.py", line 93, in truncated raise ValueError("'truncated' only works with 1-dimensional arrays") ValueError: 'truncated' only works with 1-dimensional arrays

@tomicapretto I have a situation where the response is indeed a constrained binomial (sum score) for a clinical risk scale. I am thinking that there would be situations where a user might have a constrained multinomial as well. I have not run into that personally.

The documentation in brms says truncation should work with the binomial family but I cannot make it work and I don't find examples where someone does it (doc: https://paul-buerkner.github.io/brms/reference/brmsformula.html). See:

library(brms) y <- rbinom(200, 20, 0.5) df <- data.frame(y = y[(y >= 8 & y <= 14)]) # model <- brm(y | trials(20) ~ 1, data=df, family="binomial") model1 <- brm( y | trials(20) | trunc(lb=0, ub=14) ~ 1, data=df, family="binomial" ) # Error in trunc(lb = 0, ub = 14) : # supplied argument name 'lb' does not match 'x' model2 <- brm( y | trunc(lb=0, ub=14) | trials(20) ~ 1, data=df, family="binomial" ) # Error in trials(20) : could not find function "trials"

@tomicapretto I have a situation where the response is indeed a constrained binomial (sum score) for a clinical risk scale. I am thinking that there would be situations where a user might have a constrained multinomial as well. I have not run into that personally.

Now that I think of the all the sum scores that I seen, I have only ever seen lower bound constrained values on the numerator.

@zwelitunyiswa do you mean that the lower bound is greater than zero?

I think in the future we can try to support that. But given how things are implemented now, it's not possible.

I would recommend using a custom family where the likelihood is a truncated binomial. The pattern would be the one shown here #750 (comment)

@tomicapretto yeah, there is a sum score for a risk scale in wound care whose min score is 4/23.

GStechschulte

This looks great! Thanks a lot. I left two comments that don't necessarily require changes, but rather if we have thought of that comment.

codecov-commenter · 2023-12-02T14:32:16Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (2d4b260) 89.83% compared to head (c0b52ac) 89.93%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #764      +/-   ##
==========================================
+ Coverage   89.83%   89.93%   +0.09%     
==========================================
  Files          45       45              
  Lines        3768     3784      +16     
==========================================
+ Hits         3385     3403      +18     
+ Misses        383      381       -2

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

tomicapretto added 2 commits November 29, 2023 14:43

Add constrained transformation

e83be83

Add tests for constrained transformation

15c0119

tomicapretto mentioned this pull request Nov 29, 2023

ValueError: Non-parent parameter not found in posterior #750

Closed

tomicapretto marked this pull request as ready for review November 29, 2023 18:01

tomicapretto assigned GStechschulte Dec 1, 2023

GStechschulte reviewed Dec 2, 2023

View reviewed changes

bambi/terms/utils.py Outdated Show resolved Hide resolved

GStechschulte reviewed Dec 2, 2023

View reviewed changes

GStechschulte approved these changes Dec 2, 2023

View reviewed changes

Simplify detection of responses of special kinds

c0b52ac

tomicapretto merged commit 312afa2 into bambinos:main Dec 2, 2023
4 checks passed

tomicapretto deleted the constrained branch December 2, 2023 15:11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add constrained transform #764

Add constrained transform #764

tomicapretto commented Nov 29, 2023 •

edited

Loading

GStechschulte Dec 2, 2023 •

edited

Loading

tomicapretto Dec 2, 2023

tomicapretto Dec 2, 2023

tomicapretto Dec 2, 2023

zwelitunyiswa Dec 2, 2023

tomicapretto Dec 2, 2023

zwelitunyiswa Dec 2, 2023

tomicapretto Dec 2, 2023

zwelitunyiswa Dec 2, 2023

GStechschulte left a comment

codecov-commenter commented Dec 2, 2023 •

edited

Loading

		@@ -1132,6 +1132,33 @@ def test_truncated_response(self, truncated_data):
		assert isinstance(model.backend.model.observed_RVs[0]._owner.op, pm.TruncatedNormal.rv_type)


		class TestConstrainedResponse(FitPredictParent):

Add constrained transform #764

Add constrained transform #764

Conversation

tomicapretto commented Nov 29, 2023 • edited Loading

GStechschulte Dec 2, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

GStechschulte left a comment

Choose a reason for hiding this comment

codecov-commenter commented Dec 2, 2023 • edited Loading

Codecov Report

tomicapretto commented Nov 29, 2023 •

edited

Loading

GStechschulte Dec 2, 2023 •

edited

Loading

codecov-commenter commented Dec 2, 2023 •

edited

Loading