MLinfo | Machine Learning and AI Paper Digest

MLinfo | Catch up on and search daily-updated technology

Search results for "RLHF"

5 results


arxiv | Paper only | 2026-07-23

Artificial Epanorthosis: Why large language models overuse a classical rhetorical figure, and how to mitigate it

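Artificial Epanorthosis examines the tendency of large language models to overuse a classical rhetorical figure, epanorthosis. The results suggest that the shape of the models' training data influences this tendency.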

Deep learning | Model compression and quantization | Classification | Generation | Text

Use case: Epanorthosis in large language models
Difficulty: Hard
Cost: High

arxiv | Paper only | 2026-07-22

How Fast Can Reward Models Score? A Systems Study of C++ and PyTorch Inference Runtimes for RLHF

In RLHF pipelines, reward scoring blocks policy updates. Slow scoring bottlenecks the entire loop, since no up

Easy to try on CPU | Computer vision | Segmentation | Generation

Use case: Generation
Difficulty: Hard
Cost: Medium

arxiv | GitHub available | 2026-07-22

Rushes: A Human Preference Dataset for Pluralistic Alignment

We introduce Rushes, a dataset and benchmark for studying revealed human engagement preferences in interactive

Natural language processing | Large language models | Generation | Text

Use case: Generation
Difficulty: Hard
Cost: High

arxiv | Paper only | 2026-07-15

The Dynamic Verifiable Multi-Agent Human Agentic Loyalty Loop (DVM-HALL) Model and the Net Human-Agent Score (NHAS) in Autonomous Commerce

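The trustworthiness of AI robots that interact with customers in automated stores needs to be established. This model aims to build a trust relationship between customers and robots and to support customers as they shop.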

Reinforcement learning | RLHF

Use case: Establishing the trustworthiness of AI robots that interact with customers in automated stores
Difficulty: Hard
Cost: Medium

arxiv | Paper only | 2026-06-27

The Two Genie Game: Adoption and Welfare in Audit-Grounded AI Governance

We ask under what conditions an agent with a harm-minimizing policy can displace an approval-seeking (RLHF) ag

Computer vision | Segmentation

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium