GPT-4V Papers - Zhuanzhi


GPT-4V

RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness

Arxiv

0+ reads · October 29

GPT-5 Model Corrected GPT-4V's Chart Reading Errors, Not Prompting

Arxiv

0+ reads · October 8

JEEM: Vision-Language Understanding in Four Arabic Dialects

Arxiv

0+ reads · March 27

BACON: Improving Clarity of Image Captions via Bag-of-Concept Graphs

Arxiv

0+ reads · March 27

Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V

Arxiv

0+ reads · March 7

RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness

Arxiv

0+ reads · December 29, 2024

Improved GUI Grounding via Iterative Narrowing

Arxiv

1+ reads · December 20, 2024

Red Teaming GPT-4V: Are GPT-4V Safe Against Uni/Multi-Modal Jailbreak Attacks?

Arxiv

1+ reads · December 15, 2024

From Concept to Manufacturing: Evaluating Vision-Language Models for Engineering Design

Arxiv

0+ reads · December 9, 2024

Improved GUI Grounding via Iterative Narrowing

Arxiv

0+ reads · December 9, 2024

A Survey on Multimodal Large Language Models

Arxiv

0+ reads · November 29, 2024

A Survey on Multimodal Large Language Models

Arxiv

0+ reads · November 26, 2024

Improved GUI Grounding via Iterative Narrowing

Arxiv

0+ reads · November 24, 2024

GPT-4V Cannot Generate Radiology Reports Yet

Arxiv

0+ reads · November 14, 2024

MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations

Arxiv

0+ reads · November 12, 2024
