Semantic Visual Navigation by Watching YouTube Videos #8

Open
zhaoyucs opened this issue Jun 10, 2021 · 1 comment

Comments

@zhaoyucs
Collaborator

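Reinforcement-learning training for autonomous navigation, using the semantic information implicit in videos

Information

  • Main authors: Matthew Chang, Saurabh Gupta
  • Affiliation: University of Illinois at Urbana-Champaign
  • Paper link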

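1 New things I learned:

Pre-training on first-person videos: treat a video as a sequence of images, assume an action is implicit between consecutive images, and predict that action, similar to a masked language model.
Pseudo-labeling: train a model on a small labeled dataset and use it to automatically label a large dataset, which is somewhat like meta learning.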
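
To make the pseudo-labeling idea concrete, here is a minimal sketch, not the paper's implementation. It assumes each pair of consecutive frames has already been turned into a small feature vector, and it swaps in a plain nearest-centroid classifier as the labeling model. The action names, the 2-D toy features, and the helper names (`fit_centroids`, `predict`, `pseudo_label`) are all hypothetical.

```python
# Pseudo-labeling sketch: fit a tiny model on a small labeled set,
# then use it to assign actions to unlabeled frame pairs.
# Everything here (actions, features, helper names) is illustrative only.

def fit_centroids(features, labels):
    """Fit a nearest-centroid classifier: one mean feature vector per action."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Return the action whose centroid is closest to x (squared Euclidean)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: dist(centroids[y]))


def pseudo_label(centroids, unlabeled):
    """Attach a predicted action to every unlabeled frame-pair feature."""
    return [(x, predict(centroids, x)) for x in unlabeled]


# Small labeled set (assumed to come from a setting where actions are known).
small_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0],
                  [0.1, 0.9], [-1.0, 0.0], [-0.9, -0.1]]
small_labels = ["forward", "forward", "turn_left",
                "turn_left", "turn_right", "turn_right"]
centroids = fit_centroids(small_features, small_labels)

# Large unlabeled set (stand-in for frame pairs taken from videos).
video_pairs = [[0.8, 0.2], [0.2, 0.7], [-0.7, 0.1]]
labeled = pseudo_label(centroids, video_pairs)
```

The resulting `labeled` list pairs each video feature with a guessed action, so the videos can then be used as if they were action-annotated data.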

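2 What I learned from the Related Work

Some reinforcement learning material, such as Q-learning.
Ways to use video resources: compared with a single image, a video is a temporal sequence of images and carries more structured semantic information.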
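
For reference, the Q-learning mentioned above can be sketched as a generic tabular update over a fixed set of stored transitions. This is a textbook-style illustration under assumed toy data, not the paper's training code. The three-state chain, the rewards, and the hyperparameters (`gamma`, `alpha`, `sweeps`) are all hypothetical.

```python
from collections import defaultdict


def q_learning(transitions, actions, gamma=0.9, alpha=0.5, sweeps=50):
    """Tabular Q-learning over a fixed list of (s, a, r, s_next, done) tuples."""
    q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
    for _ in range(sweeps):
        for s, a, r, s_next, done in transitions:
            # Bootstrapped target: reward plus discounted best next-state value.
            if done:
                target = r
            else:
                target = r + gamma * max(q[(s_next, b)] for b in actions)
            # Move the current estimate a step of size alpha toward the target.
            q[(s, a)] += alpha * (target - q[(s, a)])
    return q


# Toy chain 0 -> 1 -> 2, where reaching state 2 ends the episode with reward 1.
actions = ["forward", "back"]
transitions = [
    (0, "forward", 0.0, 1, False),
    (1, "forward", 1.0, 2, True),
    (1, "back", 0.0, 0, False),
]
q = q_learning(transitions, actions)
```

Because the update only needs stored `(state, action, reward, next state)` tuples, it can learn from data that was recorded rather than collected by the agent itself.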

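3 Experimental task (described briefly if unfamiliar)

The final task is to train an agent to find objects, which is a simple navigation task.

4 Other tasks that could be tried, as far as I know

Go pre-train with videos!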

5 Good words, sentences, or paragraphs

As humans, we can efficiently solve such tasks in novel environments in a zero-shot manner.
Building computational systems that can similarly leverage such semantic regularities for navigation has been a long-standing goal.

@izhx
Owner

izhx commented Jun 16, 2021

NIPS 2021 has not finished reviewing yet.

zhaoyucs added 2020 and removed 2021 labels Jun 16, 2021