MusiXQA: Advancing Visual Music Understanding in Multimodal Large Language Models I’m excited to share our new paper, MusiXQA: Advancing Visual Music Understanding in Multimodal Large Language M...
I’ve earned my Songwriting certificate from Berklee Online via Coursera.
I deployed a chatbot inspired by Kamisato Ayaka’s personality from Genshin Impact. It is available below and in the sidebar to the left; give it a try. She would be delighted...
Thrilled to share that I’ve successfully passed my dissertation defense! Update on 07/02/2025: I have received my official diploma! Here is the official certified PDF: And perm download link...
SV-RAG: LoRA-Contextualizing Adaptation of MLLMs for Long Document Understanding My Adobe internship work has been accepted as a conference paper at ICLR 2025: “SV-RAG: LoRA-Contextualizing Adaptat...
I am a TA for CSE-587: Data Intensive Computing in the Fall 2024 semester. While preparing assignment problems, I built a bird flock simulation using PySpark.
Our paper TextLap: Customizing Language Models for Text-to-Layout Planning has been accepted to EMNLP 2024 and is available on arXiv.
I’m exploring AI for music as a hobby and will be attending the 25th International Society for Music Information Retrieval (ISMIR) remotely to gain insights into the latest research trends and em...
We built MMR: Multi-Modal Reading Benchmark for Evaluating Reading Ability of Large Multimodal Models. The MMR Benchmark paper and code have been released, and the paper is available on arXiv.
For some unknown reason, LaTeX occasionally fails to find citations even when they are explicitly listed using \bibitem. This issue can often be resolved by compiling the document twice locally: th...