Request for Release of Multimodal-CoT Large 738M Model #65

Open
Amyyyyeah opened this issue Nov 11, 2023 · 3 comments

Comments

@Amyyyyeah

Amyyyyeah commented Nov 11, 2023

I've recently come across your paper detailing the impressive capabilities of the Multimodal-CoT Large 738M model, particularly its performance across various metrics (95.91, 82.00, 90.82, 95.26, 88.80, 92.89, 92.44, 90.31, and 91.68).

I am writing to ask whether this model could be publicly released. We have noted that the version available on GitHub scores 90.45, which is lower than the 91.68 reported in your paper. Access to the paper's model could significantly aid ongoing research and development efforts in our field.

Thank you for your time and your contributions to the field. I look forward to your response and the opportunity to work with this innovative model.

@dingning97

Hi. Can you reproduce the 91.68% accuracy using the T5-large model?
I tried to reproduce the experiments with the "declare-lab/flan-alpaca-large" model, but only got ~90.5% accuracy on the ScienceQA test set.

@1-sf

1-sf commented Jan 15, 2024

Hi @dingning97 and @Amyyyyeah, I also get a similar average accuracy of 90.45%. I see that https://huggingface.co/cooelf/mm-cot/tree/main also has a similar accuracy, which is lower than the one reported in the paper.

First of all, thanks to the authors for such an innovative idea. It would be great if the authors could release the model weights, which would be very beneficial for people like us.

@cooelf @astonzhang

@cooelf
Contributor

cooelf commented May 19, 2024

Hi guys, thanks for your interest. The released models are ones I reproduced with limited computational resources after my internship finished. It is possible to obtain better results with more hyper-parameter searching.

BTW, we are inspired by an increase in the base model compared with the original one. We will update the paper with the latest results based on our released models for consistency.
