About
MMGR Workshop
Welcome to the 2nd MMGR Workshop, co-located with ACM Multimedia 2024!
Information generation (IG) and information retrieval (IR) are two key approaches to information acquisition, i.e., producing content either via generation or via retrieval. While traditional IG and IR have achieved great success within the scope of language, the under-utilization of varied data sources across modalities (i.e., text, images, audio, and video) hinders IG and IR techniques from realizing their full potential and thus limits their real-world applications. Since our world is replete with multimedia information, this workshop encourages the development of deep multimodal learning for IG and IR research. Benefiting from a variety of data types and modalities, recent techniques such as DALL-E, Stable Diffusion, GPT-4, and Sora have greatly advanced multimodal IG and IR. Despite the great potential shown by multimodal-empowered IG and IR, there remain unsolved challenges and open questions in these directions. With this workshop, we aim to encourage more explorations in Deep Multimodal Generation and Retrieval, providing a platform for researchers to share insights and advancements in this rapidly evolving domain.
Calls
Call for Papers
In this workshop, we welcome three types of submissions:
- Position or perspective papers (same format and template as the main conference, but the manuscript length is limited to one of two options: a) 4 pages plus a 1-page reference section; or b) 8 pages plus up to 2 pages of references): original ideas, perspectives, research vision, and open challenges in the topics of the workshop;
- Featured papers (title and abstract of the paper, plus the original paper): already published papers or papers summarizing existing publications in leading conferences and high-impact journals that are relevant for the topics of the workshop;
- Demonstration papers (up to 2 pages in length, plus unlimited pages for references): original or already published prototypes and operational evaluation approaches in the topics of the workshop.
We will select a Best Paper Award from the accepted papers; the award will be announced during the workshop.
Topics and Themes
Topics of interest include, but are not limited to:
- Multimodal Semantics Understanding, such as
- - Vision-Language Alignment Analysis
- - Multimodal Fusion and Embeddings
- - Large-scale Vision-Language Pre-training
- - Structured Vision-Language Learning
- - Visually Grounded Interaction of Language Modeling
- - Commonsense-aware Vision-Language Learning
- - Visually Grounded Language Parsing
- - Semantic-aware Vision-Language Discovery
- - Large Multimodal Models
- Generative Models for Image/Video Synthesis, such as
- - Text-free/conditioned Image Synthesis
- - Text-free/conditioned Video Synthesis
- - Temporal Coherence in Video Generation
- - Image/Video Editing/Inpainting
- - Visual Style Transfer
- - Image/Video Dialogue
- - Panoramic Scene Generation
- - Multimodal Dialogue Response Generation
- - LLM-empowered Multimodal Generation
- Multimodal Information Retrieval, such as
- - Image/Video-Text Compositional Retrieval
- - Image/Video Moment Retrieval
- - Image/Video Captioning
- - Image/Video Relation Detection
- - Image/Video Question Answering
- - Multimodal Retrieval with MLLMs
- - Hybrid Synthesis with Retrieval and Generation
- Explainable and Reliable Multimodal Learning, such as
- - Explainable Multimodal Retrieval
- - Hallucination Mitigation in LLMs
- - Adversarial Attack and Defense
- - Multimodal Learning for Social Good
- - Multimodal-based Reasoning
- - Multimodal Instruction Tuning
- - Efficient Learning of MLLMs
Submission Instructions
Page limits include diagrams and appendices. Submissions should be written in English and formatted according to the current ACM two-column conference format. Authors are responsible for anonymizing their submissions. Suitable LaTeX, Word, and Overleaf templates are available from the ACM website (use the “sigconf” proceedings template for LaTeX and the Interim Template for Word).
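For LaTeX users, a minimal skeleton along the following lines typically produces the required anonymized, two-column sigconf layout. This is only a sketch assuming the standard acmart class; adjust the title, metadata, and content to your paper.

```latex
% Minimal sketch of an anonymized ACM "sigconf" submission (acmart class).
% The class options below enable the two-column proceedings layout,
% review line numbers, and author anonymization.
\documentclass[sigconf,review,anonymous]{acmart}

\begin{document}

\title{Your Paper Title}

% Author information is hidden in the compiled PDF while the
% "anonymous" option is active; fill it in for the camera-ready version.
\author{Anonymous Author(s)}
\affiliation{
  \institution{Anonymous Institution}
  \country{Country}
}

\begin{abstract}
  A short abstract of the submission.
\end{abstract}

\maketitle

\section{Introduction}
Body text goes here.

\end{document}
```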
Review Process
All submissions will be peer-reviewed by at least two expert reviewers in the field. The reviewing process will be two-way anonymized. Acceptance will depend on relevance to the workshop topics, scientific novelty, and technical quality. Accepted workshop papers will be published in the ACM Digital Library.
Important Dates
- Paper Submission: Aug 7, 2024 (AoE), extended from July 19, 2024
- Notification of Acceptance: August 12, 2024 (AoE)
- Camera-ready Submission: August 19, 2024 (AoE) [Firm Deadline]
- Workshop Date: October 28, 2024
Papers
Accepted Papers
TBA
Workshop Schedule
Program
TBA
Committee
Program Committee
- Xun Yang, University of Science and Technology of China
- Chenliang Li, Wuhan University
- Yixin Cao, Singapore Management University
- Peiguang Jing, Tianjin University
- Junyu Gao, Chinese Academy of Sciences
- Ping Liu, A*STAR
- Qingji Guan, Beijing Jiaotong University
- Weijian Deng, Australian National University
- Jieming Zhu, Huawei Noah’s Ark Lab
- Jiayi Ji, Xiamen University
- Yuan Yao, National University of Singapore
- Xiangtai Li, ByteDance/Tiktok
Organization
Workshop Organizers
- Wei Ji, National University of Singapore
- Hao Fei, National University of Singapore
- Yinwei Wei, Monash University
- Zhedong Zheng, University of Macau
- Juncheng Li, Zhejiang University
- Zhiqi Ge, Zhejiang University
- Long Chen, The Hong Kong University of Science and Technology
- Lizi Liao, Singapore Management University
- Yueting Zhuang, Zhejiang University
- Roger Zimmermann, National University of Singapore
Contact
Contact us