Results on Nov. and Dec. 2010 for different combinations #63

Open
soroush-ziaeinejad opened this issue Jan 15, 2023 · 21 comments

Comments

@soroush-ziaeinejad
Contributor

@hosseinfani

This issue showcases the results of SEERa for different combinations of TML and GEL methods. A line chart and summary statistics will be provided for each combination.

@soroush-ziaeinejad
Contributor Author

soroush-ziaeinejad commented Jan 15, 2023

TML: LDA.gensim / GEL: DynAERNN

(figure: Pred Eval Mean Chart)

| Gensim DynAERNN | Mean | Std |
|---|---|---|
| nDCG | 0.0669 | 0.0349 |
| map | 0.0002 | 0.0005 |
| success | 0.7623 | 0.2355 |

@soroush-ziaeinejad
Contributor Author

soroush-ziaeinejad commented Jan 15, 2023

TML: Gsdmm / GEL: DynAERNN

(figure: Pred Eval Mean Chart)

| Gsdmm DynAERNN | Mean | Std |
|---|---|---|
| nDCG | 0.0609 | 0.0320 |
| map | 0.0002 | 0.00004 |
| success | 0.7638 | 0.2388 |

@hosseinfani
Member

@soroush-ziaeinejad
Please put the baselines for each metric in one figure, e.g., nDCG for all combinations in one figure, and likewise for the other metrics.

@soroush-ziaeinejad
Contributor Author

> @soroush-ziaeinejad please put the baselines for each metric in one figure, e.g., ndcgs for all combinations in one figure, ....

I had that in mind, but because the results come in gradually, I decided to post per-combination charts like these first. Once all combinations are complete, I will plot a combined chart to compare the baselines.

@soroush-ziaeinejad
Contributor Author

soroush-ziaeinejad commented Jan 16, 2023

TML: Gsdmm / GEL: DynAE

(figure: Pred Eval Mean Chart)

| Gsdmm DynAE | Mean | Std |
|---|---|---|
| nDCG | 0.0605 | 0.0321 |
| map | 0.0002 | 0.00004 |
| success | 0.7601 | 0.2334 |

@soroush-ziaeinejad
Contributor Author

soroush-ziaeinejad commented Jan 16, 2023

TML: Gsdmm / GEL: DynRNN

(figure: Pred Eval Mean Chart)

| Gsdmm DynRNN | Mean | Std |
|---|---|---|
| nDCG | 0.0618 | 0.0315 |
| map | 0.0002 | 0.00004 |
| success | 0.7761 | 0.2184 |

@soroush-ziaeinejad
Contributor Author

soroush-ziaeinejad commented Jan 16, 2023

TML: LDA.gensim / GEL: DynAE

(figure: Pred Eval Mean Chart)

| Gensim DynAE | Mean | Std |
|---|---|---|
| nDCG | 0.0669 | 0.0360 |
| map | 0.0002 | 0.0005 |
| success | 0.7659 | 0.2271 |

@soroush-ziaeinejad
Contributor Author

soroush-ziaeinejad commented Jan 16, 2023

TML: LDA.gensim / GEL: DynRNN

(figure: Pred Eval Mean Chart)

| Gensim DynRNN | Mean | Std |
|---|---|---|
| nDCG | 0.0671 | 0.0354 |
| map | 0.0003 | 0.0005 |
| success | 0.7654 | 0.2319 |

@hosseinfani
Member

@soroush-ziaeinejad
Also, add the min-max to the plots to show the +/- std. Drop the max.

@soroush-ziaeinejad
Contributor Author

soroush-ziaeinejad commented Jan 16, 2023

@hosseinfani,

Here is the comparison between these 6 combinations:
1. gsdmm / DynAE
2. gsdmm / DynAERNN
3. gsdmm / DynRNN
4. LDA / DynAE
5. LDA / DynAERNN
6. LDA / DynRNN

for 3 metrics:
1. nDCG
2. map
3. success

nDCG (figure: ndcg)

map (figure: map)

success (figure: success)
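For readers without access to the figures, the per-combination Mean values reported in the tables earlier in this thread can be regrouped by metric. The following is a minimal sketch, not part of SEERa: the `RESULTS` layout and the `by_metric` helper are hypothetical names chosen for illustration, and only the numbers come from this thread.

```python
# Mean values copied from the per-combination tables earlier in this thread.
# The dict layout and key naming are assumptions made for this illustration.
RESULTS = {
    ("gsdmm", "DynAE"):    {"nDCG": 0.0605, "map": 0.0002, "success": 0.7601},
    ("gsdmm", "DynAERNN"): {"nDCG": 0.0609, "map": 0.0002, "success": 0.7638},
    ("gsdmm", "DynRNN"):   {"nDCG": 0.0618, "map": 0.0002, "success": 0.7761},
    ("LDA", "DynAE"):      {"nDCG": 0.0669, "map": 0.0002, "success": 0.7659},
    ("LDA", "DynAERNN"):   {"nDCG": 0.0669, "map": 0.0002, "success": 0.7623},
    ("LDA", "DynRNN"):     {"nDCG": 0.0671, "map": 0.0003, "success": 0.7654},
}


def by_metric(results):
    """Regroup {combination: {metric: value}} into {metric: [(label, value), ...]}."""
    grouped = {}
    for (tml, gel), metrics in results.items():
        for metric, value in metrics.items():
            grouped.setdefault(metric, []).append((f"{tml} / {gel}", value))
    # Sort each metric's rows from highest to lowest mean for easy comparison.
    for rows in grouped.values():
        rows.sort(key=lambda row: row[1], reverse=True)
    return grouped


if __name__ == "__main__":
    for metric, rows in by_metric(RESULTS).items():
        print(metric)
        for label, value in rows:
            print(f"  {label:<18} {value:.4f}")
```

Each metric's list can then be drawn as one chart with one line or bar per combination.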

@hosseinfani
Member

@soroush-ziaeinejad
Thanks. Extend them up to 1,000, and also write your analysis of the figures here.

@soroush-ziaeinejad
Contributor Author

@hosseinfani,

Here is the comparison between these 6 combinations up to k=1000:
1. gsdmm / DynAE
2. gsdmm / DynAERNN
3. gsdmm / DynRNN
4. LDA / DynAE
5. LDA / DynAERNN
6. LDA / DynRNN

for 3 metrics:
1. nDCG
2. map
3. success

nDCG (figure: ndcg_1000)

map (figure: map_1000)

success (figure: success_1000)

@soroush-ziaeinejad
Contributor Author

@hosseinfani ,

I added ± std to the baselines chart; here is the result for the success metric.
(figure: success_1000_shadow)
I checked the std values: they are greater than the mean (up to 10 times in some cases), which causes wide shadows and heavily overlapping areas. Any suggestions?

@hosseinfani
Member

@soroush-ziaeinejad
Drop the legend entries for the stds. Clip the negative lower bounds (mean - std) to zero. Drill down into a sample cutoff such as k=400 and double-check the root cause of the high variation.

@soroush-ziaeinejad
Contributor Author

Figures: success_1000_shadow2, ndcg_1000_shadow2, map_1000_shadow2
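A minimal sketch of the two suggestions above, under stated assumptions: `clipped_band` and `spread_ratio` are hypothetical helper names, not SEERa functions, and the per-user scores are made-up data chosen only to show how a zero-heavy distribution at a single cutoff can push the std well above the mean.

```python
import statistics


def clipped_band(mean, spread):
    """Return (lower, upper) for a shaded band, clipping the lower edge at zero."""
    return max(0.0, mean - spread), mean + spread


def spread_ratio(scores):
    """Population std divided by the mean; a value above 1 means std exceeds the mean."""
    mean = statistics.fmean(scores)
    return statistics.pstdev(scores) / mean if mean else float("inf")


# map for LDA.gensim / DynAERNN, from the table earlier in this thread:
# mean 0.0002, std 0.0005, so mean - std would be negative without clipping.
lower, upper = clipped_band(0.0002, 0.0005)
print(lower, upper)

# Hypothetical per-user scores at one cutoff (e.g. k=400): 99 zeros and a single 1.
hypothetical_scores = [0.0] * 99 + [1.0]
print(spread_ratio(hypothetical_scores))
```

If the real per-user scores at k=400 look like the hypothetical list (mostly zeros with a few large values), that alone would account for std values several times larger than the mean.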

@soroush-ziaeinejad
Contributor Author

Using variance instead of std leads to these results for success, nDCG, and map, respectively.
Figures: success_1000_shadow3, ndcg_1000_shadow3, map_1000_shadow3
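Since variance is the square of the std, any std below 1 yields an even smaller variance, which would explain much narrower shadows. Note that the variance is in squared units of the metric, so the band no longer reads as "mean ± one std". A minimal sketch using the success row for LDA.gensim / DynAERNN from earlier in this thread; the helper names are hypothetical:

```python
def band(mean, spread):
    """Return (lower, upper) around the mean, with the lower edge clipped at zero."""
    return max(0.0, mean - spread), mean + spread


def variance_from_std(std):
    """Variance is the square of the standard deviation."""
    return std * std


# success for LDA.gensim / DynAERNN, from the table earlier in this thread.
mean, std = 0.7623, 0.2355
var = variance_from_std(std)

std_band = band(mean, std)
var_band = band(mean, var)

print(f"std band width: {std_band[1] - std_band[0]:.4f}")
print(f"var band width: {var_band[1] - var_band[0]:.4f}")
```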

@hosseinfani
Member

@soroush-ziaeinejad So let's proceed with variance then. By the way, the metric values are very low for practical use.

@soroush-ziaeinejad
Contributor Author

@hosseinfani, btm_DynAERNN and btm_DynRNN are added (btm_DynAE is still running). A new legend to better show the label for each line is also added.
Figures: success_1000_shadow7, ndcg_1000_shadow7, map_1000_shadow7

@soroush-ziaeinejad
Contributor Author

Meanwhile, here are the results for the toy dataset (Dec. 1-4, 2010). Right now, more combinations have results for this dataset than for the main one.

Figures: success_1000_shadow, ndcg_1000_shadow, map_1000_shadow

@hosseinfani
Member

@soroush-ziaeinejad
Thanks, Soroush. I am going to allocate more time to your paper draft. We need an 80-20 time split, with 80 going to the paper writeup :)

@soroush-ziaeinejad
Contributor Author

Thanks @hosseinfani
