
Multimodal Large Language Models: A Survey - IEEE Xplore
Abstract: The exploration of multimodal language models integrates multiple data types, such as images, text, language, audio, and other heterogeneous data. While the latest large language …
[2306.13549] A Survey on Multimodal Large Language Models
Jun 23, 2023 · First of all, we present the basic formulation of MLLM and delineate its related concepts, including architecture, training strategy and data, as well as evaluation. Then, we …
A survey on multimodal large language models - PubMed
Nov 12, 2024 · Recently, the multimodal large language model (MLLM), represented by GPT-4V, has become a rising research hotspot, which uses powerful large language models (LLMs) …
Paper List and Resource Repository for Embodied AI
Recently, the emergence of Multi-modal Large Models (MLMs) and World Models (WMs) has attracted significant attention due to their remarkable perception, interaction, and reasoning …
(PDF) Multimodal Large Language Models: A Survey
Nov 23, 2023 · This paper begins by defining the concept of multimodal and examining the historical development of multimodal algorithms.
Multimodal Large Language Models: A Survey - Semantic Scholar
Nov 22, 2023 · This survey provides the first comprehensive analysis of mathematical reasoning in the era of multimodal large language models (MLLMs), and explores multimodal …
Large multimodal models evaluation: a survey | CoLab
Nov 18, 2025 · In this paper, we introduce a benchmark, SHIELD, to evaluate MLLMs for face spoofing and forgery detection. Specifically, we design true/false and multiple-choice …
This paper first reviews the basic architecture of current MLLM research, providing a detailed introduction to the model training strategies (pre-training, instruction-tuning, and alignment …
[2311.13165] Multimodal Large Language Models: A Survey
Nov 22, 2023 · A practical guide is provided, offering insights into the technical aspects of multimodal models. Moreover, we present a compilation of the latest algorithms and commonly …
[2306.13549] A Survey on Multimodal Large Language Models
Feb 28, 2024 · We write this survey to provide researchers with a grasp of the basic idea, main method, and current progress of MLLMs. Note that we mainly focus on visual and language …