
Commit

Updates the index.md
mmaaz60 committed Jul 15, 2023
1 parent 089a8ab commit 447e900
Showing 2 changed files with 9 additions and 9 deletions.
9 changes: 4 additions & 5 deletions _layouts/default.html
@@ -40,6 +40,10 @@ <h2>{{ site.description | default: site.github.project_tagline }}</h2>

<aside id="sidebar">
{% if site.show_downloads %}
<a href="https://mbzuaiac-my.sharepoint.com/:f:/g/personal/hanoona_bangalath_mbzuai_ac_ae/EoS-mdm-KchDqCVbGv8v-9IB_ZZNXtcYAHtyvI06PqbF_A?e=1sNbaa" class="button">
<small>Download</small>
<span style="font-size:0.75em">Benchmarking QA</span>
</a>
<a href="https://mbzuaiac-my.sharepoint.com/:u:/g/personal/hanoona_bangalath_mbzuai_ac_ae/EatOpE7j68tLm2XAd0u6b8ABGGdVAwLMN6rqlDGM_DwhVA?e=90WIuW" class="button">
<small>Download</small>
<span style="font-size:0.9em">Videos</span>
@@ -48,11 +52,6 @@ <h2>{{ site.description | default: site.github.project_tagline }}</h2>
<small>Download</small>
<span style="font-size:0.8em">Dense Captions</span>
</a>
-</a>
-<a href="https://mbzuaiac-my.sharepoint.com/:f:/g/personal/hanoona_bangalath_mbzuai_ac_ae/EoS-mdm-KchDqCVbGv8v-9IB_ZZNXtcYAHtyvI06PqbF_A?e=1sNbaa" class="button">
-<small>Download</small>
-<span style="font-size:0.75em">Benchmarking QA</span>
-</a>
{% endif %}

<!-- {% if site.github.is_project_page %}-->
9 changes: 5 additions & 4 deletions index.md
@@ -13,8 +13,11 @@ The framework enables an in-depth evaluation of video-based conversational model

Our framework introduces a benchmark designed to assess the text generation performance of video-based conversational models. We leverage a test set of 500 samples curated from the ActivityNet-200 videos for this purpose.

-You can download the videos from [here](https://mbzuaiac-my.sharepoint.com/:u:/g/personal/hanoona_bangalath_mbzuai_ac_ae/EatOpE7j68tLm2XAd0u6b8ABGGdVAwLMN6rqlDGM_DwhVA?e=90WIuW) and
-corresponding human-generated detailed descriptions from [here](https://mbzuaiac-my.sharepoint.com/:u:/g/personal/hanoona_bangalath_mbzuai_ac_ae/EYqblLdszspJkayPvVIm5s0BCvl0m6q6B-ipmrNg-pqn6A?e=QFzc1U).
+For quantitative evaluation, we curate a test set based on the ActivityNet-200 dataset, featuring videos with rich, dense descriptive captions and associated question-answer pairs from human annotations.
+We develop an evaluation pipeline using the GPT-3.5 model that assigns a relative score to the generated predictions on a scale of 1-5.
+
+The generated question-answer pairs are available for download [here](https://mbzuaiac-my.sharepoint.com/:f:/g/personal/hanoona_bangalath_mbzuai_ac_ae/EoS-mdm-KchDqCVbGv8v-9IB_ZZNXtcYAHtyvI06PqbF_A?e=1sNbaa)
+and the corresponding videos can be downloaded from [here](https://mbzuaiac-my.sharepoint.com/:u:/g/personal/hanoona_bangalath_mbzuai_ac_ae/EatOpE7j68tLm2XAd0u6b8ABGGdVAwLMN6rqlDGM_DwhVA?e=90WIuW).
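
For illustration only (not part of this commit), here is a minimal sketch of what a GPT-3.5-based 1-5 scoring call could look like, using the 2023-era `openai` Python client. The prompt wording and score parsing are assumptions, not the project's actual evaluation script:

```python
# Illustrative only: a minimal GPT-3.5 scoring call, not the project's actual
# evaluation pipeline. Assumes the 2023-era `openai` Python client and that
# `openai.api_key` is already set.
import openai

def score_prediction(question: str, correct_answer: str, predicted_answer: str) -> int:
    """Ask gpt-3.5-turbo to rate a predicted answer on a 1-5 scale."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        temperature=0,
        messages=[
            {"role": "system",
             "content": "You evaluate answers to questions about videos. "
                        "Reply with a single integer from 1 (poor) to 5 (excellent)."},
            {"role": "user",
             "content": f"Question: {question}\n"
                        f"Correct answer: {correct_answer}\n"
                        f"Predicted answer: {predicted_answer}\n"
                        "Score:"},
        ],
    )
    # Assumes the model replies with just the integer; a real pipeline needs
    # more robust parsing and retry logic.
    return int(response["choices"][0]["message"]["content"].strip())
```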

Our benchmarks cover five key aspects:

@@ -35,8 +38,6 @@ Our benchmarks cover five key aspects:

&nbsp;

-We generate task-specific question-answers by querying the GPT-3.5-Turbo model using the human-generated detailed video descriptions. The generated question-answer pairs are available for download [here](https://mbzuaiac-my.sharepoint.com/:f:/g/personal/hanoona_bangalath_mbzuai_ac_ae/EoS-mdm-KchDqCVbGv8v-9IB_ZZNXtcYAHtyvI06PqbF_A?e=1sNbaa).
-
Follow the steps below to perform the quantitative benchmarking:

**Step 1:** Run the inference using the provided question-answer pairs for each criterion.
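
For illustration only (not part of this commit), a hypothetical sketch of what the Step 1 inference pass could look like; the file layout, JSON fields, and `answer_question` hook below are assumptions, not the project's actual scripts:

```python
# Hypothetical sketch of Step 1 (inference). File names, JSON fields, and the
# answer_question() hook are placeholders; the project's actual inference
# scripts may be organized differently.
import json

def run_inference(qa_pairs_path: str, predictions_path: str, answer_question) -> None:
    """Run a video conversational model over benchmark QA pairs and save its predictions."""
    with open(qa_pairs_path) as f:
        qa_pairs = json.load(f)  # assumed: list of {"video_id", "question", "answer"}

    predictions = []
    for item in qa_pairs:
        # answer_question(video_id, question) stands in for the model under test.
        pred = answer_question(item["video_id"], item["question"])
        predictions.append({
            "video_id": item["video_id"],
            "question": item["question"],
            "answer": item["answer"],      # ground truth, kept for the evaluation step
            "prediction": pred,
        })

    with open(predictions_path, "w") as f:
        json.dump(predictions, f, indent=2)
```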
