Experience Replay Papers - 专知

Experience Replay

ReMoD: Rethinking Modality Contribution in Multimodal Stance Detection via Dual Reasoning

Arxiv

0+ reads · November 8

Turning Sand to Gold: Recycling Data to Bridge On-Policy and Off-Policy Learning via Causal Bound

Arxiv

0+ reads · October 17

Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning

Arxiv

0+ reads · October 9

CONTHER: Human-Like Contextual Robot Learning via Hindsight Experience Replay and Transformers without Expert Demonstrations

Arxiv

0+ reads · March 20

Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training

Arxiv

0+ reads · March 24

Simplifying Deep Temporal Difference Learning

Arxiv

0+ reads · March 25

Deep Policy Gradient Methods Without Batch Updates, Target Networks, or Replay Buffers

Arxiv

0+ reads · November 22, 2024

Using Diffusion Models as Generative Replay in Continual Federated Learning -- What will Happen?

Arxiv

0+ reads · November 10, 2024

Simplifying Deep Temporal Difference Learning

Arxiv

0+ reads · October 23, 2024

Which Experiences Are Influential for RL Agents? Efficiently Estimating The Influence of Experiences

Arxiv

0+ reads · October 4, 2024

Few-Shot Class-Incremental Learning with Non-IID Decentralized Data

Arxiv

0+ reads · September 18, 2024

Multi-State TD Target for Model-Free Reinforcement Learning

Arxiv

0+ reads · August 2, 2024

HiER: Highlight Experience Replay for Boosting Off-Policy Reinforcement Learning Agents

Arxiv

0+ reads · July 26, 2024

Q-Pensieve: Boosting Sample Efficiency of Multi-Objective RL Through Memory Sharing of Q-Snapshots

Arxiv

0+ reads · July 25, 2024

HiER: Highlight Experience Replay for Boosting Off-Policy Reinforcement Learning Agents

Arxiv

0+ reads · July 9, 2024
