  1. Multimodal Large Language Models: A Survey - IEEE Xplore

    Abstract: The exploration of multimodal language models integrates multiple data types, such as images, text, language, audio, and other heterogeneity. While the latest large language …

  2. [2306.13549] A Survey on Multimodal Large Language Models

    Jun 23, 2023 · First of all, we present the basic formulation of MLLM and delineate its related concepts, including architecture, training strategy and data, as well as evaluation. Then, we …

  3. A survey on multimodal large language models - PubMed

    Nov 12, 2024 · Recently, the multimodal large language model (MLLM) represented by GPT-4V has been a new rising research hotspot, which uses powerful large language models (LLMs) …

  4. Paper List and Resource Repository for Embodied AI

    Recently, the emergence of Multi-modal Large Models (MLMs) and World Models (WMs) has attracted significant attention due to their remarkable perception, interaction, and reasoning …

  5. (PDF) Multimodal Large Language Models: A Survey

    Nov 23, 2023 · This paper begins by defining the concept of multimodal and examining the historical development of multimodal algorithms.

  6. Multimodal Large Language Models: A Survey - Semantic Scholar

    Nov 22, 2023 · This survey provides the first comprehensive analysis of mathematical reasoning in the era of multimodal large language models (MLLMs), and explores multimodal …

  7. Large multimodal models evaluation: a survey | CoLab

    Nov 18, 2025 · In this paper, we introduce a benchmark, SHIELD, to evaluate MLLMs for face spoofing and forgery detection. Specifically, we design true/false and multiple-choice …

  8. This paper first reviews the basic architecture of current MLLM research, providing a detailed introduction to the model training strategies (pre-training, instruction-tuning, and alignment …

  9. [2311.13165] Multimodal Large Language Models: A Survey

    Nov 22, 2023 · A practical guide is provided, offering insights into the technical aspects of multimodal models. Moreover, we present a compilation of the latest algorithms and commonly …

  10. [2306.13549] A Survey on Multimodal Large Language Models

    Feb 28, 2024 · We write this survey to provide researchers with a grasp of the basic idea, main method, and current progress of MLLMs. Note that we mainly focus on visual and language …