Xiaohongshu
Duration: May 2023 - Present
Description: During this internship, I mainly work on multimodal recommendation systems, collaborating with the Application Model Group of the AI Technology Department at Xiaohongshu. I'm delighted to work with Jiarui Jin.
Shanghai Qi Zhi Institute
Duration: October 2023 - April 2024
Description: During the internship, I was delighted to work with Gu Zhang and Yanjie Ze, advised by Prof. Huazhe Xu, focusing on robot learning.
Publications
I'm interested in Data Mining, Recommendation and Robotics.
Learning ID-free Item Representation with Token Crossing for Multimodal Recommendation
Kangning Zhang, Jiarui Jin, Yingjie Qin, Ruilong Su, Jianghao Lin, Weinan Zhang, Yong Yu
arxiv
In this paper, we propose an ID-free Multimodal Token Representation scheme named MOTOR that represents each item using learnable multimodal tokens and uses a Token Cross Network to capture the implicit interaction patterns between these tokens.
DREAM: A Dual Representation Learning Model for Multimodal Recommendation
Kangning Zhang, Yingjie Qin, Ruilong Su, Yifan Liu, Jiarui Jin, Weinan Zhang, Yong Yu
arxiv
In this paper, we propose a novel Dual Representation learning framework called DREAM, which integrates behavioral and multimodal information through two separate lines and addresses the issue of Modal Information Forgetting.
Accepted by Robotics: Science and Systems (RSS), 2024
project / arxiv / code
We present 3D Diffusion Policy (DP3), a novel visual imitation learning approach that incorporates the power of 3D visual representations into diffusion policies.
Accepted by the International Conference on Information and Knowledge Management (CIKM), 2024
arxiv / code
In this paper, we first systematically investigate the misalignment issue in multimodal recommendation and then propose a solution named AlignRec.
We also find that the multimodal features generated by AlignRec are better than currently used ones.
Accepted by The Web Conference (WWW), 2024
arxiv / code
In this paper, we introduce ClickPrompt, which models both semantic knowledge and collaborative knowledge for accurate CTR estimation while also addressing the inference inefficiency issue.