Anthropic Papers - 专知

Anthropic

Strategic Intelligence in Large Language Models: Evidence from evolutionary Game Theory

Arxiv

0+ reads · July 3

An Empirical Characterization of Outages and Incidents in Public Services for Large Language Models

Arxiv

0+ reads · March 15

An Empirical Characterization of Outages and Incidents in Public Services for Large Language Models

Arxiv

0+ reads · January 21

Toward Democracy Levels for AI

Arxiv

0+ reads · December 8, 2024

Toward Democracy Levels for AI

Arxiv

0+ reads · November 14, 2024

Benchmarking Floworks against OpenAI & Anthropic: A Novel Framework for Enhanced LLM Function Calling

Arxiv

0+ reads · October 23, 2024

Sabotage Evaluations for Frontier Models

Arxiv

0+ reads · October 28, 2024

Jailbreaking LLMs with Arabic Transliteration and Arabizi

Arxiv

0+ reads · October 3, 2024

Mapping Technical Safety Research at AI Companies: A literature review and incentives analysis

Arxiv

0+ reads · September 12, 2024

Overcoming Reward Overoptimization via Adversarial Policy Optimization with Lightweight Uncertainty Estimation

Arxiv

0+ reads · July 9, 2024

Public Constitutional AI

Arxiv

0+ reads · June 24, 2024

Self and Cross-Model Distillation for LLMs: Effective Methods for Refusal Pattern Alignment

Arxiv

0+ reads · June 17, 2024

Killer Apps: Low-Speed, Large-Scale AI Weapons

Arxiv

1+ reads · June 17, 2024

Identification of Stone Deterioration Patterns with Large Multimodal Models

Arxiv

0+ reads · June 5, 2024

Backdoor Removal for Generative Large Language Models

Arxiv

0+ reads · May 13, 2024
