About

MMIR Workshop

Welcome to the 1st MMIR Workshop, co-located with ACM Multimedia 2023!

Information retrieval (IR) is a fundamental technique that aims to acquire information from a collection of documents, web pages, or other sources. While traditional text-based IR has achieved great success, the under-utilization of data in other modalities (e.g., images, audio, and video) hinders IR techniques from reaching their full potential and thus limits their real-world applications. In recent years, the rapid development of deep multimodal learning has paved the way for advancing IR with multimodality. Benefiting from a variety of data types and modalities, recent techniques such as CLIP, ChatGPT, and GPT-4 have greatly facilitated multimodal learning and IR. In the context of IR, deep multimodal learning has shown prominent potential to improve the performance of retrieval systems by enabling them to better understand and process the diverse types of data they encounter. Despite this great potential, there remain unsolved challenges and open questions in related directions. With this workshop, we aim to provide a platform for discussion of multimodal IR among scholars, practitioners, and other interested parties.

Calls

Call for Papers

In this workshop, we welcome three types of submissions:

  1. Position or perspective papers (4-8 pages in length, plus unlimited pages for references): original ideas, perspectives, research vision, and open challenges on the topics of the workshop;
  2. Featured papers (title and abstract of the paper, plus the original paper): already published papers, or papers summarizing existing publications in leading conferences and high-impact journals, that are relevant to the topics of the workshop;
  3. Demonstration papers (up to 2 pages in length, plus unlimited pages for references): original or already published prototypes and operational evaluation approaches on the topics of the workshop.
All accepted papers will be archived in the ACM MM proceedings. Authors of accepted papers will present their work at the workshop.

A Best Paper Award will be selected from the accepted papers and announced during the workshop.


Topics and Themes

Topics of interest include, but are not limited to:

  • Image-text Multimodal Learning and Retrieval, such as
    • Vision-language Alignment Analysis
    • Multimodal Fusion and Embeddings
    • Vision-language Pre-training
    • Structured Vision-language Learning
    • Commonsense-aware Vision-language Learning
  • Video-text Understanding and Retrieval, such as
    • Video-text Retrieval
    • Video (Corpus) Moment Retrieval
    • Video Relation Detection
    • Video Question Answering
    • Video Dialogue
  • Dialogue Multimodal Retrieval, such as
    • Multimedia Pre-training in Dialogue
    • Multimedia Search and Recommendation
    • Multimodal Response Generation
    • User-centered Dialogue Retrieval
    • New Applications on ChatGPT & Visual-GPT and Beyond
  • Reliable Multimodal Retrieval, such as
    • Explainable Multimodal Retrieval
    • Typical Failures of ChatGPT and Other Large Models
    • Adversarial Attack and Defense
    • New Evaluation Metrics
  • Multimedia Retrieval Applications, such as
    • Multimodal-based Reasoning
    • Unpaired Image Captioning
    • Multimodal Information Extraction
    • Multimodal Translation
    • Opinion/Sentiment-oriented Multimodal Analysis for IR

Submission Instructions

Page limits include diagrams and appendices. Submissions should be written in English and formatted according to the current ACM two-column conference format. Authors are responsible for anonymizing their submissions. Suitable LaTeX, Word, and Overleaf templates are available from the ACM Website (use the “sigconf” proceedings template for LaTeX and the Interim Template for Word).


Review Process

All submissions will be peer-reviewed by at least two expert reviewers in the field. The reviewing process will be two-way anonymized. Acceptance will depend on relevance to the workshop topics, scientific novelty, and technical quality. Accepted workshop papers will be published in the ACM Digital Library.


Important Dates

  • Paper Submission: July 29, 2023 (AoE)
  • Notification of Acceptance: August 7, 2023 (AoE)
  • Camera-ready Submission: August 22, 2023 (AoE)
  • Workshop dates: October 28, 2023 - November 3, 2023 (AoE)

Papers

Accepted Papers

  1. Self-Distilled Dynamic Network for Language-based Fashion Retrieval
    Hangfei Li, Yiming Wu, Fangfang Wang
  2. Video Referring Expression Comprehension via Transformer with Content-conditioned Query
    Jiang Ji, Meng Cao, Tengtao Song, Long Chen, Yi Wang, Yuexian Zou
  3. Boon: A Neural Search Engine for Cross-Modal Information Retrieval
    Yan Gong, Georgina Cosma
  4. On Popularity Bias of Multimodal-aware Recommender Systems: a Modalities-driven Analysis
    Daniele Malitesta, Giandomenico Cornacchia, Claudio Pomo, Tommaso Di Noia
  5. TC-OCR: TableCraft OCR for Efficient Detection & Recognition of Table Structure & Content
    Avinash Anand, Raj Shivprakash Poonam Jaiswal, Pijush Bhuyan, Mohit Gupta, Siddhesh Bangar, Md Modassir Imam, Rajiv Ratn Shah, Shin'ichi Satoh
  6. Metaverse Retrieval: Finding the Best Metaverse Environment via Language
    Ali Abdari, Alex Falcon, Giuseppe Serra
  7. Prescription Recommendation based on Intention Retrieval Network and Multimodal Medical Indicator
    Feng Gao, Yao Chen, Maofu Liu

Workshop Schedule

Program

Date: November 2, 2023 (full day). Room: Provinces 1. Please note that the schedule is in the Ottawa time zone. The program at a glance can be downloaded here.

You can also join online via Zoom Meeting (click to enter), Meeting ID: 335 825 3206, Passcode: 118404


09:00 - 09:10   |   Welcome Message from the Chairs
09:10 - 09:40   |   Keynote 1: Hypergraphs for Multimedia Retrieval and Insight, by Prof. Marcel Worring
09:40 - 10:10   |   Keynote 2: Revisiting Pseudo Relevance Feedback: New Developments and Applications, by Prof. Shin’ichi Satoh
10:10 - 10:40   |   Coffee Break
10:40 - 11:00   |   Presentation 1: Video Referring Expression Comprehension via Transformer with Content-conditioned Query
11:00 - 11:20   |   Presentation 2: Prescription Recommendation based on Intention Retrieval Network and Multimodal Medical Indicator
11:20 - 11:40   |   Presentation 3: Self-Distilled Dynamic Network for Language-based Fashion Retrieval
11:40 - 12:00   |   Presentation 4: Metaverse Retrieval: Finding the Best Metaverse Environment via Language
12:00 - 12:30   |   Keynote 3: Task Focused IR in the Era of Generative AI, by Prof. Chirag Shah
12:30 - 15:00   |   Lunch Break
15:00 - 15:20   |   Presentation 5: On Popularity Bias of Multimodal-aware Recommender Systems: a Modalities-driven Analysis
15:20 - 15:40   |   Presentation 6: Boon: A Neural Search Engine for Cross-Modal Information Retrieval
15:40 - 16:10   |   Coffee Break
16:10 - 16:30   |   Presentation 7: TC-OCR: TableCraft OCR for Efficient Detection & Recognition of Table Structure & Content
16:30 - 16:50   |   Presentation 8: Zero-Shot Composed Image Retrieval with Textual Inversion
16:50   |   Workshop Closing

Talks

Invited Speakers

Marcel Worring

University of Amsterdam

Shin'ichi Satoh

National Institute of Informatics

Chirag Shah

University of Washington

Committee

Program Committee

Long Chen

The Hong Kong University of Science and Technology

Xun Yang

University of Science and Technology of China

Chenliang Li

Wuhan University

Yixin Cao

Singapore Management University

Peiguang Jing

Tianjin University

Junyu Gao

Chinese Academy of Sciences

Zheng Wang

Wuhan University

Ping Liu

A*STAR

Qingji Guan

Beijing Jiaotong University

Weijian Deng

Australian National University

Jieming Zhu

Huawei Noah’s Ark Lab

Organization

Workshop Organizers

Wei Ji

National University of Singapore

Yinwei Wei

Monash University

Zhedong Zheng

National University of Singapore

Hao Fei

National University of Singapore

Tat-seng Chua

National University of Singapore

Contact

Contact us

Join and post at our Google Group!
Email the organizers at mmir23@googlegroups.com.