We will be posting tutorials on exactly these kinds of situations soon.
But briefly:
- You should not compare a statistical test within a Bayesian model to a
frequentist t-test: the p-values mean very different things (and we don't
really refer to the Bayesian version as a p-value per se).
- You should not do a t-test after extracting parameters from a
hierarchical Bayesian model, because the samples are not independent (each
participant's posterior is drawn from the group distribution) and the
degrees of freedom will be inflated.
- It is better to do the Bayesian (group-difference hierarchical) test,
comparing the group distributions of high vs. low. Even better would be to
estimate the impact of condition (high vs. low) as a within-subject effect
in a single model fit to all of the data. For that it would make sense
to have hypotheses about which parameters you expect to differ, and then add
high vs. low as an interaction with those parameters (e.g. a ~ 1 +
C(cond) + (1 + C(cond)|subject), where cond = "high" or "low"). If you allow
all parameters to interact with cond, the model will be complex and may have
convergence issues.
- For your last question, the hierarchical model and the group-only model are
not the same. The group-only model tries to fit all the data with a single
set of parameters (i.e. for an 'uber subject'). But that model might not do a
good job of fitting any individual. The hierarchical model allows each
individual to deviate from the group to the extent their data warrant it.
The best-fitting model of an uber subject may not have the same group means
as the hierarchical model.
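A minimal sketch of the single-model setup suggested in the third point, reusing the `hssm.HSSM(...)` call style from the question below. It assumes the data frame has the 0/1 `high` column shown there; the `cond` column, the `COND_INCLUDE` name and the `build_condition_model` helper are made up for illustration, and only `a` is allowed to vary with condition here. Note the `+` between the fixed and the subject-level terms of the formula.

```python
# Specification for a single parameter (a) that varies with condition.
# Parameters not listed here are left at the package defaults.
COND_INCLUDE = [
    {
        "name": "a",
        # Group-level condition effect plus a per-subject intercept and
        # per-subject condition effect.
        "formula": "a ~ 1 + C(cond) + (1 + C(cond)|subject)",
        "link": "identity",
    },
]


def build_condition_model(data):
    """Build one model on ALL trials (hypothetical helper, not part of HSSM).

    `data` is assumed to be a pandas DataFrame with a 0/1 `high` column.
    """
    import hssm  # imported lazily so the spec above can be inspected alone

    # Derive the "high"/"low" label used in the formula from the 0/1 flag.
    data = data.assign(cond=data["high"].map({1: "high", 0: "low"}))
    return hssm.HSSM(data=data, model="ddm", include=COND_INCLUDE)
```

With this layout the high vs. low difference is itself a parameter of the model, so its posterior can be inspected directly instead of subtracting traces from two separate fits.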
Michael
On Wed, May 8, 2024 at 1:23 PM JoeSu112 ***@***.***> wrote:
Hi all,
I'm running a DDM estimation with HSSM. Here is an example of the data:
>>> data.head(8)
   subject  trial  choice  val_1  val_2  ...     beta1     beta2     beta3  high  low
0        1      1       2      2      3  ... -0.129059 -0.870941  0.148376     0    1
1        1      2       2      6      8  ... -0.405614 -1.594386  0.084912     1    0
2        1      3       2      7      7  ... -1.732451  1.732451 -0.247493     0    1
3        1      4       1      8      7  ...  3.452939 -2.452939  0.393725     0    1
4        1      5       1      6      9  ... -2.933824 -0.066176 -0.191176     0    1
5        1      6       1     10      8  ...  3.498771 -1.498771  0.277641     0    1
6        1      7       2      1      4  ... -1.253420 -1.746580  0.098632     0    1
7        1      8       1      9     10  ...  4.272727 -5.272727  0.502392     1    0
I separate the data into two parts, "HIGH" (high==1) and "LOW" (high==0),
then fit a model to each part separately. Here is the code for the model:
DDM = hssm.HSSM(
    data=HIGH,
    model="ddm",
    prior_settings="safe",
    loglik_kind="approx_differentiable",
    include=[
        {
            "name": "v",
            "prior": {
                "Intercept": {"name": "Normal", "mu": 0, "sigma": 5},
                "beta1": {"name": "Normal", "mu": 0, "sigma": 5},
                "beta2": {"name": "Normal", "mu": 0, "sigma": 5},
                "beta3": {"name": "Normal", "mu": 0, "sigma": 5},
            },
            "formula": "v ~ 1 + beta1 + beta2 + beta3 + (beta1|subject) + (beta2|subject) + (beta3|subject)",
            "link": "identity",
        },
        {
            "name": "a",
            "formula": "a ~ 1 + (1|subject)",
            "link": "identity",
        },
        {
            "name": "z",
            "formula": "z ~ 1 + (1|subject)",
            "link": "identity",
        },
        {
            "name": "t",
            "formula": "t ~ 1 + (1|subject)",
            "link": "identity",
        },
    ],
)
This is the model for estimating the HIGH data. For the LOW data, just
replace "data=HIGH" with "data=LOW". After sampling, I have a set
of group parameters (a_Intercept, t_Intercept, z_Intercept, v_Intercept,
v_beta1, v_beta2, v_beta3) and the subject effects for each parameter
(a_1|subject[i], t_1|subject[i], z_1|subject[i], ...) for HIGH, and another
set of group parameters and subject effects for LOW.
My question is how to compare the group parameters of HIGH and LOW and
test whether the difference is significant. I used the
following code (taking a as an example):
a_high = HIGH.posterior["a_Intercept"]
a_low = LOW.posterior["a_Intercept"]
mean_diff = a_high - a_low
sig = az.plot_posterior(mean_diff, ref_val=0)
sig.set_title("a")
sig_fig = sig.figure
sig_fig.savefig(cwd + "/a_sig.png")
As the code shows, it calculates the difference between the group parameters
of HIGH and LOW in each sample, then checks what proportion of the differences
is larger or smaller than zero. If the proportion larger (or smaller) than zero
is over 95%, I call the difference significant.
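The proportion check described above can be sketched in plain Python as follows. This is only an illustration: the draws are synthetic stand-ins for the real posterior samples, and the function name `prob_direction` and all numbers are made up rather than taken from HSSM or ArviZ.

```python
import random


def prob_direction(diff_draws):
    """Return the larger of P(diff > 0) and P(diff < 0) across the draws."""
    n = len(diff_draws)
    p_pos = sum(d > 0 for d in diff_draws) / n
    p_neg = sum(d < 0 for d in diff_draws) / n
    return max(p_pos, p_neg)


# Synthetic stand-ins for the a_Intercept draws of the two fits.
rng = random.Random(0)
a_high = [rng.gauss(1.40, 0.05) for _ in range(4000)]
a_low = [rng.gauss(1.10, 0.05) for _ in range(4000)]

# Draw-by-draw difference between the two group-level posteriors.
diff = [h - l for h, l in zip(a_high, a_low)]
p = prob_direction(diff)
print(f"share of draws on the dominant side of zero: {p:.3f}")
```

The returned value is a posterior probability about the sign of the difference, not a frequentist p-value, so it is not directly comparable to the output of a t-test.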
I also calculate the parameter of each subject (e.g., a_Intercept +
a_1|subject[i]) in HIGH/LOW, then do a paired t-test between them. It turns
out that the significance results differ somewhat between these two methods:
*group-difference-hierarchical*
          a        t        z        v_intercept  v_beta1  v_beta2  v_beta3
p-value   0.054    < 0.01   0.072    0.131        0.151    0.129    0.196

*subject paired t-test*
          a        t        z        v_intercept  v_beta1  v_beta2  v_beta3
p-value   0.03347  0.00031  0.53627  0.12849      0.1077   0.01518  0.5233
Moreover, I fit another model that estimates only the group parameters,
without subject effects:
DDM_high = hssm.HSSM(
    data=HIGH,
    model="ddm",
    prior_settings="safe",
    loglik_kind="approx_differentiable",
    include=[
        {
            "name": "v",
            "prior": {
                "Intercept": {"name": "Normal", "mu": 0, "sigma": 5},
                "beta1": {"name": "Normal", "mu": 0, "sigma": 5},
                "beta2": {"name": "Normal", "mu": 0, "sigma": 5},
                "beta3": {"name": "Normal", "mu": 0, "sigma": 5},
            },
            "formula": "v ~ 1 + beta1 + beta2 + beta3",
            "link": "identity",
        }
    ],
)
Then I only have a set of group parameters (a, t, z, v_Intercept,
v_beta1, v_beta2, v_beta3) for HIGH and LOW. I again use the difference
between them for the significance test, and the result is also somewhat
different from the first method:
*group-difference-hierarchical*
          a        t        z        v_intercept  v_beta1  v_beta2  v_beta3
p-value   0.054    < 0.01   0.072    0.131        0.151    0.129    0.196

*group-parameters-only*
          a        t        z        v_intercept  v_beta1  v_beta2  v_beta3
p-value   < 0.01   < 0.01   0.091    0.019        0.002    0.006    0.047
In my understanding, both models give me parameters for the whole group,
so the significance results should be the same, but the results above
are not.
I wonder which method is the correct way to do a significance test, or
whether there is another correct way to do so.
Thanks for any help!!
Answer selected by
JoeSu112
You can extract the group traces from each of the models (fit to the high
and low datasets); see the main tutorial on the HSSM site for how to do that.
Then just subtract the two traces for the group parameter you are
interested in (e.g. a_Intercept), and call the result e.g. a_diff. You can
then do
az.plot_posterior(a_diff, ref_val=0)
which will plot the posterior of the difference between the two
distributions, with 0 as a reference value for comparison. It will also plot
the HDI of that difference, so you can check whether the HDI includes 0 as a
measure of significance.
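To make the HDI check concrete, here is a rough sketch in plain Python of what "the HDI excludes 0" means. The `hdi` helper is a hand-rolled illustration, not the ArviZ implementation; the draws are synthetic stand-ins for the two group traces, and the 95% mass is an arbitrary choice for the example.

```python
import math
import random


def hdi(draws, prob=0.95):
    """Narrowest interval holding `prob` of the draws (assumes one mode)."""
    s = sorted(draws)
    n = len(s)
    k = math.ceil(prob * n)  # number of draws the interval must contain
    if k >= n:
        return s[0], s[-1]
    # Slide a window of k consecutive sorted draws and keep the narrowest.
    _, start = min((s[i + k - 1] - s[i], i) for i in range(n - k + 1))
    return s[start], s[start + k - 1]


# Synthetic stand-ins for the group-level traces of the two fits.
rng = random.Random(1)
a_high = [rng.gauss(1.40, 0.05) for _ in range(4000)]
a_low = [rng.gauss(1.10, 0.05) for _ in range(4000)]
a_diff = [h - l for h, l in zip(a_high, a_low)]

lo, hi = hdi(a_diff)
excludes_zero = lo > 0 or hi < 0
print(f"95% HDI of a_diff: [{lo:.3f}, {hi:.3f}], excludes 0: {excludes_zero}")
```

In practice az.plot_posterior draws this interval for you, and its hdi_prob argument sets the mass of the interval.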
On Thu, May 9, 2024 at 9:06 AM James ***@***.***> wrote:
@frankmj <https://github.com/frankmj> If we have already finished the
estimation with the hierarchical model (= subject) for both the high and low
data separately, what is the best practice for the "Bayesian (group-difference
hierarchical) test, comparing the group distributions of high vs low"? Is
there a one-line way to do this in code? Thanks for your reply.