MLinfo | Machine Learning and AI Paper Summaries

Rethinking the Divergence Regularization in LLM RL

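This paper proposes a decentralized PPO that uses discretization and weighting to improve the stability of LLM RL. The improved stability in turn makes it possible to apply RL with large language models.

Use case: Improving LLM RL stability
Difficulty: Hard
Cost: High

Tags: Quality prediction/anomaly detection, Computer vision, Segmentation, Generation, Image, Text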

Echo-Memory: A Controlled Study of Memory in Action World Models

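This study designs and evaluates an episodic memory model in order to control episodic memory. The model retains important information within an episode and can identify correlations across episodes.

Use case: Episodic memory
Difficulty: Hard
Cost: High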

Bandits for Efficient Experimentation: Adapting to Control Group, Preferences, and Context Drifts

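This study designs an "observer book" for effective bandits. The observer book can adaptively select the effective bandit in response to user interactions and context changes.

Tags: Deep learning, Model compression/quantization, Regression, Text

Use case: "Observer book" for effective bandits
Difficulty: Hard
Cost: Medium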

Data Synthesis and Parameter-Efficient Fine-Tuning for Low-Resource NMT: A Case Study on Q'eqchi' Mayan

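This study develops a data synthesis method for low-resource-language NMT. The synthesized corpus can then be used to fine-tune NMT models parameter-efficiently.

Tags: Deep learning, Model compression/quantization, Generation, Translation, Text

Use case: Data synthesis for low-resource NMT
Difficulty: Hard
Cost: Medium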

Your Model Already Knows: Attention-Guided Safety Filter for Vision-Language-Action Models

Vision-Language-Action (VLA) models have demonstrated impressive end-to-end performance across a variety of ro

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Learning to Attack and Defend: Adaptive Red Teaming of Language Models via GRPO

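AI red teaming must respond continuously to evolving attackers and defenders. Reinforcement learning can be used to discover new attacks and, at the same time, to strengthen defenses. The new framework AdvGRPO

Use case: Responding to attacks
Difficulty: Hard
Cost: High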

What the Eyes See, the LLMs Miss: Exploiting Human Perception for Adversarial Text Attacks

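Content moderation systems built on large language models (LLMs) play an important role in preventing harmful online content. However, the main goal of these systems is simply to operate on tokenized text

Tags: Natural language processing, Large language models, Classification, Detection, Image

Use case: Document classification
Difficulty: Hard
Cost: High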

Correlation Is Not Enough: Embedding Human Metadata for Individual Causal Discovery

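A biomedical language model likewise returns a cosine similarity score of 0.83 when relating two topics, even though the two are in fact unrelated. This shows that off-the-shelf bio

Use case: Individual-level causal discovery
Difficulty: Hard
Cost: High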

Algorithm for Contextual Queueing Bandits with Rate-Optimal Queue Length Regret

Contextual queueing bandits provide a framework for learning to schedule heterogeneous jobs under unknown cont

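Tags: Natural language processing, Fine-tuning, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

Tags: Small-data-oriented, Tabular-oriented, Easy to try on CPU, MI-oriented, Deep learning, Model compression/quantization, Regression, Text, Tabular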

In-Context Learning for Latent Space Bayesian Optimization

Bayesian optimization (BO) is a central tool for sample-efficient design, and latent-space Bayesian optimizati

Use case: Regression
Difficulty: Hard
Cost: High

End-to-End Context Compression at Scale

Long-context language model inference is bottlenecked by memory, as the KV cache grows with context length. Re

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Muon Learns More Robust and Transferable Features than Adam

Muon has recently emerged as a state-of-the-art optimizer for pretraining Large Language Models (LLMs) and vis

Tags: Deep learning, Transformer, Classification, Image, Text

Use case: Classification
Difficulty: Hard
Cost: High

ReCoVLA: VLM-Guided Reward Compilation for Failure Recovery in Vision-Language-Action Policies

Vision-language-action (VLA) policies provide strong priors for language-conditioned manipulation, but remain

Tags: Natural language processing, RAG, Text, Multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Code Is More Than Text: Uncertainty Estimation for Code Generation

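A study aimed at making code generation safe and reliable. It proposes a method for estimating uncertainty in code generation, improving the interpretability and safety of the code.

Use case: Safe and reliable code generation
Difficulty: Hard
Cost: High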

PRISM: Recovering Instruction Sets from Language Model Activations

Proposes activation analysis for interpreting language models. Analyzing the model reveals what kind of code it is generating.

Use case: Activation analysis for interpreting language models
Difficulty: Hard
Cost: High

Streaming Interventions: Can Video Large Language Models Correct Mistakes as They Occur?

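Studies question answering with video large language models. It investigates the models' capabilities and limitations and proposes a method for generating answers to questions.

Tags: Deep learning, Model compression/quantization, Text, Video, Multimodal

Use case: Question answering with video large language models
Difficulty: Hard
Cost: High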

BUDDY: BUdget-Driven DYnamic Depth Routing for Adaptive Large Language Model Inference

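A study aimed at efficient language model inference. To make inference more efficient, it proposes a method that automatically adjusts the model's depth.

Use case: Efficient language model inference
Difficulty: Hard
Cost: High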

Breaking the Tokenizer Barrier: On-Policy Distillation across Model Families

On-Policy Distillation (OPD) has become a core technique in the post-training of Large Language Models (LLMs)

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Graph Mamba Operator: A Latent Simulator for Interacting Particle Systems

Modeling interacting dynamical systems requires capturing spatial interactions alongside long-range temporal d

Tags: Deep learning, Graph neural networks, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

LargeMonitor: Monitoring Online Task-Free Continual Learning via Large Pretrained Models

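In online continual learning, a model must continuously accumulate knowledge from a non-stationary data stream. The model's parameters need to be adjusted effectively during training, but parameter-efficient prompt tuning and

Tags: Deep learning, Model compression/quantization, Detection, Text, Multimodal

Use case: Online continual learning
Difficulty: Hard
Cost: High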

Now You (Still) See Me: Detecting Evasive Steganographic Payloads in LLMs

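Recent work has used linear probes to recover implied secrets from internal activations, improving the detection of steganographic intrusions. However, to detect steganographic intrusions and inspect internal activations, steganograph

Tags: Natural language processing, Large language models, Detection, Text

Use case: Detecting steganographic intrusions
Difficulty: Hard
Cost: High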

Benchmarking Empirical Privacy Protection for Adaptations of Large Language Models

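This study analyzes the empirical effectiveness of privacy protection when machine learning models are adapted with privacy-preserving methods.

Tags: Deep learning, Model compression/quantization, Anomaly detection, Text

Use case: Benchmarking privacy protection
Difficulty: Hard
Cost: High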

Distilling Safe LLM Systems via Soft Prompts for On Device Settings

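This study introduces a dual-model system that pairs a strong guard model with a low-parameter LLM and can be used to deploy LLMs safely.

Use case: Distributing safe LLMs
Difficulty: Hard
Cost: High

Tags: Quality prediction/anomaly detection, Deep learning, Model compression/quantization, Generation, Text, Reinforcement learning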

Reasoning Arena: Trace Tournaments When Verifiable Rewards Fall Short

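This study proposes Reasoning Arena to address the problem that, when the rewards used in reinforcement learning training are hard to verify, the reward is uninformative at the group level and quality cannot be compared across groups.

Use case: Reinforcement learning training
Difficulty: Hard
Cost: High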

Scaling Neural Network Verification with Tensor Parallelism and Fully Sharded Data Parallelism

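This study uses tensor parallelism and fully sharded data parallelism to lift the constraints of conventional verification architectures, which are limited by GPU memory, so that verification of machine learning networks

Tags: Deep learning, CNN, Text, Audio

Use case: Verification of prediction networks
Difficulty: Hard
Cost: High

Tags: Explainable, Sensor/time series, Deep learning, CNN, Image, Text, Multimodal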

Zero-Shot Semantic Re-Identification for Autonomous Driving: A VLM Baseline Study

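This study establishes a baseline for zero-shot semantic re-identification and automates semantic identification in images.

Use case: Semantic re-identification
Difficulty: Hard
Cost: High

Tags: Quality prediction/anomaly detection, Deep learning, Model compression/quantization, Text, Reinforcement learning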

PBSD: Privileged Bayesian Self-Distillation for Long-Horizon Credit Assignment

To address the credit assignment problem for long-horizon tasks, this study proposes Privileged Bayesian Self-Distillation (PBSD), which supports such long-horizon tasks.

Use case: Low-level task recurrence
Difficulty: Hard
Cost: High

Conan-embedding-v3: Fusing Modality-Specific Models for Omni-Modal Embedding

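This study builds an omni-modal retrieval system that unifies data from different modalities such as text, image, video, and audio.

Tags: Natural language processing, Fine-tuning, Regression, Retrieval, Image

Use case: Omni-modal retrieval
Difficulty: Hard
Cost: High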

PRISM: Topology-Aware Cross-Modal Imputation for Modality-Deficient Federated Graph Learning

Multimodal federated graph learning (MM-FGL) aims to collaboratively learn from decentralized graphs with text

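Tags: Natural language processing, RAG, Image, Text, Multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Tags: Explainable, Quality prediction/anomaly detection, Deep learning, Transformer, Classification, Segmentation, Text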

Intention Driven Identification of In-Possession Match Phases in Association Football through Temporal Graph Learning

Understanding tactical organisation of association football, hereafter referred to as football, requires ident

Use case: Classification
Difficulty: Hard
Cost: Low

Internalizing Geometric Law: Learning from Solver Residuals for Precision-Critical Generation

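The authors developed a system that can produce precise constructions, such as mechanical designs and technical drawings, from natural language. In this system, to produce precise constructions that satisfy geometric constraints, a Constraint DSL (D

Use case: Generating mechanical designs and technical drawings
Difficulty: Hard
Cost: High

Tags: Sensor/time series, Deep learning, Transformer, Detection, Generation, Embedding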

Multi-View Speech Representation Learning for Parkinson's Disease Detection Using Context-guided Cross-modal Attention

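As an effort toward early detection of Parkinson's disease (PD), the paper proposes diagnosing PD through speech analysis, examining the speech impairments associated with brain damage that arises before onset.

Use case: Early detection of Parkinson's disease
Difficulty: Hard
Cost: High

Tags: MI-oriented, Natural language processing, Fine-tuning, Image, Text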

Orange Lab: Lowering Barriers to Data Mining through Embedded Interactive Workflows

This paper proposes Orange Lab, a visual programming framework for data mining. It provides a web-based data analysis environment and, as a user-facing analysis tool, data

Use case: Data analysis workflows
Difficulty: Hard
Cost: Medium

The Injection Paradox: Brand-Level Suppression in Safety-Trained LLM Recommendations via RAG Context Injection

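This paper shows that attacking a safety-trained LLM via RAG suppresses the safety-trained LLM's inference. This is because, for the LLM trained with RAG, what was used to suppress inference

Use case: Safe LLM inference
Difficulty: Hard
Cost: High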

Asymptotic Optimality of Thompson Sampling for Risk-Averse Bandits with Sub-Gaussian Rewards

To reduce uncertainty and risk, this is $\rho$-NPTS (Nonparametric Thompson Sampling), an "array-free" nonparametric Thompson sampling method, with which risk

Use case: Optimization of risk-averse multi-armed bandits
Difficulty: Hard
Cost: Medium

Crop Recommendation and Agricultural Query Answering System Using Spatio-Temporal Graph Neural Networks and Hybrid Retrieval Augmentation

This paper presents a unified system designed to support precision agriculture by integrating advanced weather

Use case: Generation
Difficulty: Hard
Cost: Low

Late-Layer Fusion is Enough: Dual-Path Vision Token Routing for Multimodal Large Language Models under Visual Saturation

Multimodal large language models (MLLMs) commonly inherit the deep, symmetric Transformer backbone designed fo

Use case: Generation
Difficulty: Hard
Cost: High

Driving Video Retrieval for Complex Queries with Structured Grounding

Video retrieval at scale is central to data curation and safety validation in autonomous driving, where users

Tags: Computer vision, Multimodal, Text, Video

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

From Shortcuts to Reasoning: Robust Post-Training of Theory of Mind with Reinforcement Learning

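Theory of Mind is regarded as an essential skill for state-of-the-art foundation model systems to operate safely and effectively in the real world. However, progress on Theory of Mind faces a "shortcut" problem: a task can reach 99% accuracy while merely

Tags: Natural language processing, RAG, Text, Multimodal, Reinforcement learning

Use case: Strengthening Theory of Mind
Difficulty: Hard
Cost: High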

Beyond FLOPs: Benchmarking Real Inference Acceleration of LLM Pruning under a GEMM-Centric Taxonomy

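Pruning, which removes tokens, layers, heads, dimensions, and attention patterns to speed up LLM inference, has grown into a broad paradigm. As for the degree of acceleration actually achieved by a given pruning implementation, the hardware

Use case: Accelerating LLM inference
Difficulty: Hard
Cost: High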

FlashMemory-DeepSeek-V4: Lightning Index Ultra-Long Context via Lookahead Sparse Attention

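LLM inference often involves long contexts, which creates a GPU memory bottleneck. To address this, a component called the Neural Memory Indexer

Use case: GPU memory allocation
Difficulty: Hard
Cost: High

Tags: Deep learning, Model compression/quantization, Text, Multimodal, Reinforcement learning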

Stage-1 Controls the Entropy Regime, Not the Outcome

Two-stage post-training -- a Stage-1 warm-start (supervised fine-tuning, SFT, or on-policy distillation, OPD)

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

MilliVid: Hierarchical Latents for Long-Range Consistency in Video Generation

Video generative models have become increasingly powerful, but long-range consistency remains challenging to a

Tags: Deep learning, Transformer, Generation, Text, Video

Use case: Generation
Difficulty: Hard
Cost: High

INFUSER: Influence-Guided Self-Evolution Improves Reasoning

Self-evolution offers a scalable path to stronger reasoning: a pretrained language model improves itself with

Tags: Machine learning, Unsupervised learning, Text, Unsupervised

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Decoy-Calibrated Failure Audits for Language Models

Useful audits reveal not only how often a model fails, but also where its failures concentrate. An auditor may

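Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Tags: Explainable, Sensor/time series, Quality prediction/anomaly detection, Natural language processing, Large language models, Text, Time series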

TRIAGE: Dialectical Reasoning for Explainable Risk Prediction on Irregularly Sampled Medical Time Series with LLMs

Clinical early warning systems built on electronic health records, in which clinical observations are recorded

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Understanding Quantization-Aware Training: Gradients at Quantized Weights Bias to the Low-Loss Basin

Post-training quantization (PTQ) converts a trained full-precision model into low-bit weights without task-lev

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Structure-Aware Modeling of Multiple-Choice Questions Improves Automatic Difficulty Estimation

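Automatically estimating question difficulty reduces the effort of writing questions for teaching materials and can improve learner outcomes.

Use case: Question difficulty estimation
Difficulty: Hard
Cost: Medium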

Hardening Agent Benchmarks with Adversarial Hacker-Fixer Loops

To improve agent safety, the paper proposes a new approach that creates a hacker "fake auto" to assess risk.

Use case: Improving agent safety
Difficulty: Hard
Cost: High

From Hazard Functions to Language Space: Cox-Supervised Distillation of Survival Risk into a Large Language Model

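To apply language models to survival risk, the paper proposes a new approach that uses the Cox proportional hazards model.

Tags: Deep learning, Model compression/quantization, Generation, Image, Text

Use case: Applying language models to survival risk
Difficulty: Hard
Cost: High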

Diffuse AI Control on Fuzzy Tasks

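This paper presents a new framework that makes it easier to verify the safety of AI systems. As a result, AI system safety can be evaluated more effectively.

Use case: AI safety verification
Difficulty: Hard
Cost: High

Tags: Natural language processing, Large language models, Text, Multimodal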

OmniGameArena: A Unified UE5 Benchmark for VLM Game Agents with Improvement Dynamics

This paper provides an evaluation benchmark for VLM game agents, enabling comparison across different types of agents.

Use case: Evaluation benchmark for VLM game agents
Difficulty: Hard
Cost: High

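AHA-WAM: Asynchronous Horizon-Adaptive World-Action Modeling with Observation-Guided Context Routing

To improve the control of robotic surgery, this paper proposes a method that simultaneously models motion and manipulation in the robot's visual scene.

Tags: Deep learning, Transformer, Image, Text, Video

Use case: Control of remote handling
Difficulty: Hard
Cost: High

Tags: Explainable, Computer vision, Segmentation, Text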

Evaluation Cards: An Interpretive Layer for AI Evaluation Reporting

This paper proposes a new framework for interpreting AI evaluation results more effectively.

Use case: Interpreting AI evaluation results
Difficulty: Hard
Cost: Medium

SearchSwarm: Towards Delegation Intelligence in Agentic LLMs for Long-Horizon Deep Research

This paper proposes a new framework for improving agents' delegation ability, allowing agents to split tasks more efficiently.

Use case: Agent delegation
Difficulty: Hard
Cost: High

Beyond Probabilistic Similarity: Structural, Temporal, and Causal Limitations of Retrieval-Augmented Generation in the Legal Domain

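This paper proposes a method for detecting legal objections and minimizing legal violations.

Tags: Natural language processing, RAG, Detection, Generation, Text

Use case: Detecting legal objections
Difficulty: Hard
Cost: Low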

Observability for Delegated Execution in Agentic AI Systems

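This paper proposes a new framework that addresses the observability of distributed execution, allowing distributed execution to be evaluated more effectively.

Use case: Observability of distributed execution
Difficulty: Hard
Cost: High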

An 84-Format Numeric Catalog with Bit-Exact Conformance Vectors: A Vendor-Neutral Reference for FP8, BF16, MXFP4, and Microscaling Formats

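This paper proposes a standardization of numeric formats, allowing numbers to be interpreted and manipulated more efficiently.

Tags: Machine learning, Supervised learning, Text

Use case: Standardizing numeric formats
Difficulty: Easy
Cost: Medium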

(Auto)formalization is supposed to be easy: Trellis process semantics for spelling out rigorous proofs

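This paper proposes automated formalization, making the formalization process more efficient.

Use case: Automated formalization
Difficulty: Hard
Cost: High

Tags: Natural language processing, Large language models, Image, Text, Multimodal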

SpatialWorld: Benchmarking Interactive Spatial Reasoning of Multimodal Agents in Real-World Tasks

Spatial reasoning is a foundational capability for multimodal large language models (MLLMs) to perceive and op

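Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Tags: Quality prediction/anomaly detection, Computer vision, Video recognition, Detection, Image, Text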

ArtiFact: A Large-Scale Multi-Modal Cultural Heritage Dataset

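Designs an architecture that verifies generated text in order to support LLM-assisted drafting of clinical research manuscripts. This prevents false citations, inaccurately reported numbers, and guideline violations.

Use case: Support for writing medical papers
Difficulty: Hard
Cost: High

Tags: Sensor/time series, Deep learning, Transformer, Classification, Detection, Text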

ATN3D: Density-Aware LiDAR-Radar Early 3D Object Detection Under Extreme Sparsity

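Perception for automated vehicles, such as self-driving cars and intelligent transportation systems, requires 3D object detection. Long-range detection on the road is difficult, and at this "long range" the time available for perception and decision-making is about 1-2 seconds. Two main challenges

Use case: 3D object detection for long-range perception on a vehicle's deck
Difficulty: Hard
Cost: High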

AGENTSERVESIM: A Hardware-aware Simulator for Multi-Turn LLM Agent Serving

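With LLMs, model calls alternate with external tool calls, and serving shifts from stateless request handling to stateful program execution. For evaluating these workloads, dedicated accelerator time for each design point

Use case: Simulator for LLM serving
Difficulty: Hard
Cost: High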

Optical Reasoning: Rethinking Images as an Expressive Reasoning Medium Beyond Text

Chain-of-Thought (CoT) improves the performance of Large Language Models (LLMs) and has been extended to Multi

Tags: Deep learning, Model compression/quantization, Image, Text, Multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

TABVERSE: Benchmarking Cross-Format Table Understanding in LLMs and VLMs

Large Language Models (LLMs) and Vision-Language Models (VLMs) are increasingly evaluated on table reasoning t

Tags: Natural language processing, Large language models, QA, Image, Text

Use case: QA
Difficulty: Hard
Cost: High

AI Scientists Are Only as Good as Their Evidence: A Stratified Ablation of Proprietary Data and Reasoning Skills in Drug-Asset Valuation

AI Scientist agents are often evaluated as if capability were mainly a function of model quality, prompting, o

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

FuseFSS: Efficient Secure LLM Inference with Function Secret Sharing

Two-server secure inference allows a client to query a hosted large language model (LLM) without revealing pro

Use case: Generation
Difficulty: Hard
Cost: High

SecureClaw: Clawing Back Control of LLM Agents

Tool-using large language model (LLM) agents face two distinct security failures: unauthorized external action

Tags: Natural language processing, Large language models, Classification, Text

Use case: Classification
Difficulty: Hard
Cost: High

Model Poisoning Against Federated Model Adaptation with Chain of Bit-Flips

Federated Learning (FL) allows a set of clients to collectively train a global model without sharing local tra

Tags: Deep learning, CNN, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Emergence of Context Characteristics Sensitivity in Large Language Models

During instruction fine-tuning (IFT), large language models (LLMs) learn to follow instructions by using the p

Tags: Deep learning, Transformer, Text, Reinforcement learning

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Closing the Prior-Posterior Loop: Self-Reflective Molecular Design with Analysis-Driven LLM Iteration

Can a general-purpose large language model design molecules with the precision of a seasoned chemist? Current

Tags: MI-oriented, Natural language processing, Large language models, Generation, Text

Use case: Generation
Difficulty: Hard
Cost: High

From Rigid to Dynamic: Entropy-Guided Adaptive Inference for Long-Context LLMs

Existing sparse attention and KV cache compression methods for long-context LLM inference typically apply fixe

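Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Tags: Quality prediction/anomaly detection, Natural language processing, Large language models, Detection, Generation, Text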

Deterministic Integrity Gates for LLM-Assisted Clinical Manuscript Preparation: An Auditable Biomedical Informatics Architecture

Objective. Large language models (LLMs) increasingly draft clinical research manuscripts, but their fluency ca

Use case: Detection
Difficulty: Hard
Cost: High

Targeting World Models to Compromise Robot Learning Pipelines

The paper shows that introducing world models into robot learning pipelines can create a risk that unsafe robots are deployed.

Use case: Ensuring the safe use of robots
Difficulty: Hard
Cost: High

LLM-Orchestrated Conformance Checking in Stroke Care Without Computer-Interpretable Guidelines

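A conformance-checking framework was developed to automatically assess adherence to medical guidelines. It performs conformance checking with large language models (LLMs).

Tags: Explainable, Natural language processing, Large language models, Text

Use case: Supporting the application of guidelines in healthcare
Difficulty: Hard
Cost: High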

AliyunConsoleAgent: Training Web Agents in Real-World Cloud Environments via Distillation and Reinforcement Learning

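Proposes AliyunConsoleAgent for automating web agents, simplifying document verification and web agent development.

Tags: Deep learning, Model compression/quantization, Text, Reinforcement learning

Use case: Automating web agents
Difficulty: Hard
Cost: High

Tags: Quality prediction/anomaly detection, Deep learning, Attention mechanism, Generation, Text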

SIFT: Selective-Index For Fast Compute of RAG Prefill by Exploiting Attention Invariance

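Proposes SIFT, which aims to speed up RAG prefill, shortening TTFT and reducing cost.

Use case: Speeding up RAG prefill
Difficulty: Hard
Cost: High

Tags: MI-oriented, Quality prediction/anomaly detection, Computer vision, Multimodal, Classification, Detection, Image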

Context-Aware Deep Learning for Defect Classification in Atomic-Resolution STEM

Proposes context-aware deep learning for non-destructive inspection of materials, detecting defects in "airlocks".

Use case: Non-destructive inspection of materials
Difficulty: Hard
Cost: High

AI Assurance in UK Defence: Challenges in Operationalising JSP 936

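Proposes Capability-Aligned Hierarchical Learning, aimed at acquiring "skill adjust" capabilities, through which LLMs acquire the ability to operate external tools to carry out tasks.

Tags: Generative AI, GAN, Text

Use case: Acquiring "skill adjust" capabilities
Difficulty: Hard
Cost: High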

RunAgent SuperBrowser: A Theory of Autonomous Web Navigation Grounded in Human Browsing Behaviour

We present SUPERBROWSER, an autonomous web-navigation agent designed against a single guiding hypothesis: a we

Tags: MLOps, Pipeline construction, Image, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

Real-time body pose non-verbal communication with a consistency-based reliability measure

Body movement communicates intent at distances and in conditions where neither the face, nor speech can be cap

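Tags: Machine learning, Unsupervised learning, Classification, Prediction, Text

Use case: Classification
Difficulty: Hard
Cost: Low

Tags: Reinforcement learning, Policy gradient (PPO / A3C), Image, Text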

PhysScene: A Scene Graph Dataset for Scientific Visual Reasoning in Physics Experiments

Scene Graphs (SGs) provide structured representations of visual scenes by modeling objects and their pairwise

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

Experience Makes Skillful: Enabling Generalizable Medical Agent Reasoning via Self-Evolving Skill Memory

Medical agent systems are increasingly expected to support interactive clinical decision making rather than on

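Tags: Generative AI, GAN, QA, Text

Use case: QA
Difficulty: Hard
Cost: Low

Tags: Tabular-oriented, Quality prediction/anomaly detection, Deep learning, Model compression/quantization, Embedding, Text, Tabular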

TRL-Bench: Standardizing Cross-Paradigm Representation-Level Evaluation of Tabular Encoders

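Proposes TRL-Bench, a benchmark that makes it easier to evaluate representation models for learnable tabular signals across models with different training paradigms.

Use case: Standardizing evaluation of representation models for learnable tabular signals
Difficulty: Hard
Cost: High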

Anything2Skill: Compiling External Knowledge into Reusable Skills for Agents

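Proposes Anything2Skill, which lets agents draw on external knowledge to solve many tasks efficiently.

Tags: Natural language processing, RAG, Generation, Text

Use case: Developing techniques that let agents use external knowledge to solve problems efficiently
Difficulty: Hard
Cost: Low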

Brain-Prompt Injection: A Route-Safety Audit for BCI-LLM Agents

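Proposes a system that can safely handle brain-signal input over a brain-agent connection and can detect brain-signal input attacks.

Use case: System for ensuring safety when feeding brain signals into BCI-LLM agents
Difficulty: Hard
Cost: High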

End-to-End Training for Discrete Token LLM based TTS System

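Proposes a TTS system trained end to end and confirms the benefits of end-to-end training.

Tags: Natural language processing, Large language models, Classification, Generation, Text

Use case: Proposing an end-to-end trained TTS system
Difficulty: Hard
Cost: High

Tags: Quality prediction/anomaly detection, Deep learning, Transformer, Generation, Text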

MASS: Deep Research for Social Sciences with Memory-Augmented Social Simulation

Proposes a new research method for social science that uses deep learning with Memory-Augmented Social Simulation, and uses it to carry out social science research.

Use case: Conducting social science research using deep learning with Memory-Augmented Social Simulation
Difficulty: Hard
Cost: High

Culturally-Adapted Red-Teaming Across East and Southeast Asian Contexts: A Methodological and Comparative Analysis

Multilingual safety evaluation of large language models (LLMs) has predominantly relied on direct translation

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Tags: Natural language processing, Prompt engineering, Generation, Image, Text

IMUG-Bench: Benchmarking Unified Multimodal Models on Interleaved Understanding and Generation

In recent years, unified multimodal models (UMMs) have emerged to support both understanding and generation wi

Use case: Generation
Difficulty: Hard
Cost: High

Unified Energy for Invariant and Independent Decoding in Diffusion Language Models

Diffusion Language Models (DLMs) enable parallel text generation by iteratively denoising a full sequence, off

Use case: Generation
Difficulty: Hard
Cost: High

SEF-CLGC at SemEval-2026 Task 11: Logical Notation Impact on Language Model Performance

This paper revisits our pipeline called Syllogistic Evaluation Framework-Common Logic Grammar Construction (SE

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Decoding Pedestrian Crossing Intention from Egocentric Vision via Vision Language Models

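Uses egocentric vision to predict whether a pedestrian will cross the road. By framing the task as a closed-ended visual question answering (VQA) problem, it uses vision-language models

Tags: Deep learning, Transformer, QA, Image, Text

Use case: Predicting whether a pedestrian will cross the road
Difficulty: Hard
Cost: High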

Steganography Without Modification: Hidden Communication via LLM Seeds

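The inference stack of large language models (LLMs) contains a steganographic channel that allows covert communication without encryption and without modifying the model weights, sampling code, or output distribution. With the secret data, the sender

Use case: Without encryption: covert communication using LLM seeds
Difficulty: Hard
Cost: High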

From USD Scenes to Knowledge Graphs: Zero-Shot Ontology Grounding with LLMs

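Building knowledge graphs from 3D simulation scenes plays an important role in robot task reasoning, but the step of mapping scene objects to a formal taxonomy has not been realized in practice. Using LLMs, this mapping

Tags: Natural language processing, Large language models, Text, 3D

Use case: Building knowledge graphs from 3D simulation scenes
Difficulty: Hard
Cost: High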

Vision Language Model Helps Private Information De-Identification in Vision Data

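Vision-language models (VLMs) are highly effective for privacy protection. However, the privacy risks of handling visual data had received little attention until now. Using VLMs to ensure privacy protection

Tags: Computer vision, Object detection, Classification, Detection, Image

Use case: Privacy protection for visual data using vision-language models
Difficulty: Hard
Cost: High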

Unveiling Privacy Risks in Multi-modal Large Language Models: Task-specific Vulnerabilities and Mitigation Challenges

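The privacy risks of large language models have already been studied, but those of multimodal large language models (MLLMs) had not yet been adequately investigated. In MLLMs, not only text but also image data

Tags: Natural language processing, Large language models, Image, Text

Use case: Privacy risks in multimodal large language models
Difficulty: Hard
Cost: High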

A Regret Minimization Framework on Preference Learning in Large Language Models

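Reinforcement learning (RL) often aims to find the correct action for a given problem, but when learning from human feedback, because of the need to build a decision-making framework for human choices,

Use case: Decision-making framework for choosing among possible actions
Difficulty: Hard
Cost: High

Tags: Quality prediction/anomaly detection, Natural language processing, Large language models, Anomaly detection, Text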

ComplexConstraints and Beyond: Expert Rubrics for RLVR

To examine problem solving beyond the training data, proposes a new evaluation method: expert-curated rubric-based evaluation.

Use case: Problem solving beyond the training data
Difficulty: Hard
Cost: High

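Graph2Idea: Retrieval-Augmented Scientific Idea Generation with Graph-Structured Contexts

Scientific idea generation calls for high-quality ideas that are feasible in practice, but methods for this problem have been lacking; the paper therefore proposes a new method, Graph2Idea.

Use case: Scientific idea generation
Difficulty: Hard
Cost: High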

Context Rot in AI-Assisted Software Development: Repurposing Documentation Consistency for AI Configuration Artifacts

Uses AI assistants

Use case: Development practices for preserving context
Difficulty: Hard
Cost: High

DynaOD: Dynamic Origin-Destination Flow Generation with Discrete-to-Continuous Temporal Semantic Modeling

Dynamic origin-destination (OD) flow generation seeks to synthesize realistic mobility dynamics from temporal

Use case: Generation
Difficulty: Hard
Cost: High

Context-Fractured Decomposition Attacks on Tool-Using LLM Agents: Exploiting Artifact Provenance Gaps

Tool-using LLM agents interact with the world through actions that persist state in artifacts (e.g., workspace

Tags: MI-oriented, Natural language processing, Large language models, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

REFLECT: Intervention-Supported Error Attribution for Silent Failures in LLM Agent Traces

Large language model (LLM) agents now solve complex tasks through long plan-and-execution traces, yet the abil

Tags: Natural language processing, Large language models, Classification, Detection, Text

Use case: Classification
Difficulty: Hard
Cost: High

See More, Think Deeper: Query-Expanded Visual Evidence and Answer-Clue Guided Reflection for Long Video Understanding

Recent advances in Video Large Language Models (Video-LLMs) have enabled performance on long-video understandi

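Tags: Natural language processing, Large language models, Generation, Image, Text

Use case: Generation
Difficulty: Hard
Cost: High

Tags: Sensor/time series, Quality prediction/anomaly detection, Deep learning, Model compression/quantization, Generation, Text, Audio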

BareWave: Waveform-Native Flow-Matching Text-to-Speech

Removing intermediate representations and separately trained decoding stages has become an important direction

Use case: Generation
Difficulty: Hard
Cost: High

Tags: MI-oriented, Deep learning, Model compression/quantization, Text, Multimodal, Reinforcement learning

Personalization Meets Safety: Mechanisms, Risks, and Mitigations in Personalized LLMs

Large Language Models (LLMs) have enabled increasingly personalized interactions by adapting to users' prefere

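Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Tags: Explainable, Natural language processing, Large language models, Generation, Anomaly detection, Text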

A Multi-Agent System for IPMSM Design Optimization via an FEA-AI Hybrid Approach

Interior permanent magnet synchronous motor (IPMSM) design requires balancing conflicting objectives and multi

Use case: Generation
Difficulty: Hard
Cost: High

SafeRun: Enabling Determinism in LLM Planning for Running

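Proposes SafeRun, a framework for ensuring deterministic stability in LLM-based running plans. It separates the LLM from a deterministic solver to guarantee strict enforcement of safety rules.

Use case: Improving runner safety and stability
Difficulty: Hard
Cost: High

Tags: Quality prediction/anomaly detection, Deep learning, Model compression/quantization, Text, Audio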

TLDR: Compressing Audio Tokens for Efficient Autoregressive Text-to-Speech

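Codec-based autoregressive (AR) speech generators that model audio tokens together with text have achieved strong text-to-speech quality. However, in this approach the audio token sequence is longer than the text sequence, so AR

Use case: Making speech generators more efficient by compressing audio tokens
Difficulty: Hard
Cost: High

Tags: Small-data-oriented, Tabular-oriented, Natural language processing, Large language models, Classification, Generation, Regression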

LATTEArena: An Evaluation Framework for LLM-powered Tabular Feature Engineering (Extended Version)

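LLMs have made it possible to automate feature engineering in tabular data analysis. However, the lack of a standardized platform makes comparison and cost evaluation difficult. Owing to complex method designs, the specific contribution of each component

Use case: Comparative evaluation of LLM paradigms for tabular data analysis
Difficulty: Hard
Cost: High

Tags: Computer vision, Segmentation, Generation, Text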

The Token Not Taken: Sampling, State, and the Variability of AI Agent Outputs

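The uncertainty of agentic AI systems suggests that the same request can produce different plans, tool calls, and other outputs. To ensure system reliability under these conditions, it is important to establish the AI agent's parameters.

Use case: Helping establish AI agent parameters
Difficulty: Hard
Cost: High

Tags: Computer vision, Multimodal, QA, Image, Text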

Baichuan-M4: A Clinical-Grade Medical Agent System for Continuous Care

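Introduces Baichuan-M4, a clinical-grade LLM medical system suited to continuous care. Baichuan-M4, a clinical medical agent system, is built on an integrated medical agent system and, between medical agents,

Use case: LLM-based medical agent for integrated medical systems
Difficulty: Hard
Cost: High

Tags: Quality prediction/anomaly detection, Natural language processing, Large language models, Generation, Text, Self-supervised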

RTL-BenchLS: A Large-Scale Benchmark for RTL Reasoning and Generation with Large Language Models

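LLM-based RTL generation and reasoning point to a new direction for hardware design automation. However, benchmarks are limited in scale and task scope. On existing benchmarks, the performance of frontier models

Use case: Building a large-scale benchmark for RTL reasoning and generation
Difficulty: Hard
Cost: High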

Diverse Thinking Schemata Elicit Better Reasoning in Large Language Models

Large reasoning models (LRMs) have attracted increasing attention for their ability to solve complex mathemati

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

An Effective Router for Vision-Language Model Selection

Vision-language models (VLMs) with varying performance and resource requirements are widely deployed, making i

Tags: Natural language processing, Large language models, Anomaly detection, Image, Text

Use case: Anomaly detection
Difficulty: Hard
Cost: High

CARE: A Conformal Safety Layer for Medical Summarization

Large language models (LLMs) are increasingly used for medical summarization, but their outputs can omit medic

Tags: Natural language processing, Large language models, Detection, Summarization, Text

Use case: Detection
Difficulty: Hard
Cost: High

NutriMLLM: Multimodal Large Language Models for Dietary Micronutrient Analysis

Comprehensive estimation of dietary micronutrients from food images could improve clinical nutrition care, but

Tags: Natural language processing, Large language models, Generation, Image, Text

Use case: Generation
Difficulty: Hard
Cost: High

PACT: Learning Diverse Diagnostic Strategies via Privileged Synthesis and Branch Consensus

Clinical diagnosis requires flexible use of multiple reasoning paradigms under incomplete patient information.

Use case: Generation
Difficulty: Hard
Cost: High

Tags: Reinforcement learning, Policy gradient (PPO / A3C), Generation, Summarization, Retrieval

Report on CHIIR 2026 Workshop on Generative AI and Academic Search (GAI&AS)

This report summarizes the CHIIR 2026 Workshop on Generative AI and Academic Search (GAI&AS), which examined

Use case: Generation
Difficulty: Hard
Cost: Low

Tags: Quality prediction/anomaly detection, Natural language processing, Fine-tuning, Detection, Image, Text

Failure-Aware Refinement of Vision-Language Model for Lithography Defect Detection

Semiconductor lithography inspection requires reliable detection of small pattern defects such as bridge, burr

Use case: Detection
Difficulty: Hard
Cost: High

Order Matters: Unveiling the Hidden Impact of Macro Placement Sequences via Proxy-Guided LLM Evolution

Macro placement is a fundamental step in modern chip physical design, playing a crucial role in determining th

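Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Tags: Tabular-oriented, Easy to try on CPU, Sensor/time series, Deep learning, Transformer, Prediction, Text, Time series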

FAME: Forecastability-Aware Mixture of Experts for Heterogeneous Time Series Forecasting

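This study proposes FAME, which uses a model that combines multiple time series forecasters to make forecasts that account for the characteristics of each individual series. Accounting for those characteristics yields more accurate forecasts.

Use case: Forecasting diverse time series
Difficulty: Easy
Cost: Low

Tags: MI-oriented, Quality prediction/anomaly detection, Natural language processing, Fine-tuning, Classification, Generation, Text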

Quality-Diversity Search in Sound Generation: Investigating Innovation Engines for Audio Exploration

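This study develops an open-source framework for promoting diversity in music generation. To support this, the framework combines evolutionary processes with diversity-promoting algorithms

Use case: Promoting diversity in music generation
Difficulty: Hard
Cost: Low

Tags: Condition optimization, Computer vision, Segmentation, Text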

Quantitative Performance Analysis of Stopping Criteria for CMA-ES

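This study evaluates stopping criteria for the CMA-ES algorithm. It examines whether the stopping criteria work as intended and provides information for improving the algorithm.

Use case: Evaluating optimization algorithms
Difficulty: Hard
Cost: Medium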

Causally Evaluating the Learnability of Formal Language Tasks

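This study develops a method for evaluating the learnability of formal languages. The method can assess how much data learning a formal language requires.

Use case: Evaluating the learnability of formal languages
Difficulty: Hard
Cost: High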

The Neutral Mask: How RLHF Provides Shallow Alignment while Leaving Partisan Structure Intact in a Large Language Model

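This study develops PsychoSafe, a framework for evaluating the safety of large language models. The framework assesses the safety of large language models and can mitigate potential risks.

Tags: Natural language processing, Large language models, Generation, Text, Reinforcement learning

Use case: Safety evaluation of large language models
Difficulty: Hard
Cost: High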

IS-CoT: Breaking the Long-form Generation Collapse via Interleaved Structural Thinking

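This study develops IS-CoT, a framework for improving long-form generation models. The framework can improve the generative quality and controllability of long-form generation models.

Use case: Improving long-form generation models
Difficulty: Hard
Cost: High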

PsychoSafe: Eliciting Psychologically-Informed Refusals in Large Language Models

This study develops a framework for evaluating multimodal language models. The framework can evaluate their generative quality and controllability.

Use case: Evaluating multimodal language models
Difficulty: Hard
Cost: High

Where Does the Answer Come From? Benchmarking View-Level Visual Evidence Identification in Multi-View MLLMs for Autonomous Driving

Multimodal large language models (MLLMs) achieve strong results on visual reasoning benchmarks, but answer acc

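Tags: Natural language processing, Large language models, QA, Image, Text

Use case: QA
Difficulty: Hard
Cost: High

Tags: Quality prediction/anomaly detection, Deep learning, Model compression/quantization, Detection, Generation, Text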

Gradient-Guided Reward Optimization for Inference-time Alignment

Ensuring the reliability of Large Language Models (LLMs) under distribution drift requires inference-time adap

Use case: Detection
Difficulty: Hard
Cost: High

Civil Court Simulation with Large Language Models

Court simulation bridges legal education and judicial practice, yet human-based simulations are costly and dif

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Automated IEP Generation from Traditional Chinese Parent-Teacher Interviews via Corpus-Grounded Feature Diffusion

Writing Individualized Education Programs (IEPs) is a high-labor, knowledge-intensive document burden; English

Use case: Generation
Difficulty: Hard
Cost: High

Clinically Grounded Privacy Evaluation of Medical LMs

Medical language models (LMs) can memorize and reproduce protected health information, but privacy evaluations

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

UXBench: Benchmarking User Experience in AI Assistants

As AI assistants serve millions of users daily, evaluating user experience (UX) beyond general model capabilit

用途: 生成
難易度: Hard
コスト: High

品質予測/異常検知自然言語処理大規模言語モデル生成テキスト音声

OpenBibleTTS: Large-Scale Speech Resources and TTS Models for Low-Resource Languages

Recent advances in neural text-to-speech (TTS) and multilingual speech generation have substantially improved

用途: 生成
難易度: Hard
コスト: High

センサ/時系列深層学習Transformerテキスト音声

Overcoming Decoder Inconsistencies in Whisper for Dravidian and Low-Resource Languages

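To improve the speech recognition performance of multilingual ASR models such as Whisper on Dravidian languages, the authors used datasets and linguistic analysis to fine-tune the model, resolving decoder inconsistencies and reducing recognition errors.

用途: Improving speech recognition for Dravidian languages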
難易度: Hard
コスト: Medium

Detecting Differences Is Not Understanding Structure: Large Language Models Fail at Graph Isomorphism

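This study investigates whether large language models can reason about graph isomorphism. The models recognized isomorphism on small graphs, but failed to identify it once the node labels were swapped.

自然言語処理大規模言語モデル検出テキスト

用途: Reasoning about graph isomorphism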
難易度: Hard
コスト: High

DECSELFMASK: Leveraging Unlabeled Text via Self-Relevance-Guided Masking for Decoder-Only Classification

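Classification becomes difficult when little prior information is available or in specialized domains such as medicine. This study proposes DecSelfMask, a method in which the model makes use of unlabeled data to improve classification performance.

自然言語処理RAG分類生成テキスト

用途: Improving classification performance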
難易度: Hard
コスト: High

自然言語処理大規模言語モデル生成テキストマルチモーダル

H2HMem: A Multimodal Memory Benchmark for Agents in Human-Human Interactions

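Large language models have memory and reasoning capabilities, but how effective these are in interactions with users is not yet well understood. To address this, this study evaluates memory and reasoning abilities in human interaction, particularly conversation, with a multimoda

用途: Evaluation of multimodal memory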
難易度: Hard
コスト: High

品質予測/異常検知自然言語処理RAG生成検索テキスト

AbstRAG: Learning to Abstract for Retrieval Problems

This study proposes AbstRAG, a framework for closing the abstraction-level gap in retrieval tasks; with the gap closed, the model returned the correct information on retrieval tasks.

用途: Retrieval-augmented search
難易度: Hard
コスト: Low

品質予測/異常検知自然言語処理大規模言語モデル分類セグメンテーションテキスト

MUDIDI: A Two-Stage Framework for Multilingual Dictionary Digitization with Language Models

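Digitizing dictionaries of low-resource and extinct languages is important, but digitizing multimodal dictionaries has so far been difficult. This study uses recent vision-language models to make dictionary digitization easier, and the characters in the dictionary

用途: Digitization of multilingual dictionaries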
難易度: Hard
コスト: High

品質予測/異常検知コンピュータビジョンマルチモーダル分類画像テキスト

Guide Me Out: A Framework to Benchmark VLM Operators Communication in Crisis Scenarios

In crisis management, communication and geography

用途: Evaluating communication in crisis management
難易度: Hard
コスト: High

Toward Signing Activity Projection in Sign Language Interaction

Social robots must interact robustly not only with users assumed by speech-centered systems but also with dive

深層学習Transformerテキスト音声

用途: 技術検証・論文読解補助
難易度: Hard
コスト: Medium

What Should a Skill Remember? Quality-Cost Trade-offs in Cost-Aware Skill Rewriting for Language Model Agents

Large language model agents increasingly rely on skills: reusable procedural documents encoding workflows, too

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

LexRubric: A Rubric-Guided Diagnostic Benchmark for Open-Ended Legal Tasks

As large language models (LLMs) are increasingly applied to real-world legal tasks, evaluating the reliability

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

センサ/時系列深層学習Transformer分類テキスト音声

Is Text All You Need? Text as a Universal Information Bottleneck for Speech LLMs

Large language models (LLMs) provide a powerful reasoning backbone for speech understanding, but integrating c

用途: 分類
難易度: Hard
コスト: High

CPUで試しやすい自然言語処理大規模言語モデルテキスト

In-Context Learning for the Imputation of Public Opinion Data with Large Language Models

Large language models have been widely evaluated as simulators of individual survey responses. In practice, ho

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

Multi-Hop Knowledge Composition is Bound by Pretraining Exposure

Large Language Models fail at implicit multi-hop reasoning: a model answers "When was $X$ born?" and "Who is $

MI向き自然言語処理大規模言語モデルテキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

少数データ向き品質予測/異常検知自然言語処理大規模言語モデルテキスト

How Far Can Prompting Go for Minimal-Edit Ukrainian Grammatical Error Correction?

Fine-tuned Large Language Models (LLMs) dominate in Ukrainian grammatical error correction (GEC), while API-ac

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

センサ/時系列深層学習Transformer分類画像テキスト

NüshuVoice: Reviving the Voice of Endangered Nüshu with Pitch-Aware Text-to-Speech

Nüshu is an endangered phonetic script historically used by women in Jiangyong County, southern Hunan, China.

用途: 分類
難易度: Hard
コスト: Low

品質予測/異常検知自然言語処理大規模言語モデル画像テキスト

TruthSplit: Operationalizing Conditional Validity in Arguments Through Multi-Perspective Reasoning

We present TruthSplit, an interactive system for multi-perspective argument analysis. Existing argumentation t

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

Symbolic and Abstractive Reasoning with Complex Visual Queries

Understanding and reasoning over abstract visual content remains a challenge for current multi-modal large lan

自然言語処理大規模言語モデル画像テキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

説明可能自然言語処理RAG画像テキストマルチモーダル

Explicit Representation Alignment for Multimodal Sentiment Analysis

Multimodal affective analysis aims to understand human sentiment and emotion by jointly modeling heterogeneous

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

少数データ向き説明可能深層学習軽量化・量子化検出画像テキスト

MAAM: Anchor-Preserving Compression and Contextual Calibration for Chinese Discriminatory Language Detection

Chinese discriminatory-language detection is challenging because harmful intent is often implicit and context-

用途: 検出
難易度: Hard
コスト: High

Emergent Misalignment Can Be Induced by Sycophancy and Reversed via Alignment Gating

Prior work has shown that fine-tuning large language models on malicious or incorrect outputs in narrow domain

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

CRANE: Knowledge Editing for Reasoning MLLMs

The emergence of reasoning multimodal large language models (MLLMs), which generate explicit chain-of-thought

自然言語処理大規模言語モデル異常検知画像テキスト

用途: 異常検知
難易度: Hard
コスト: High

Bridging the Agent-World Gap: Text World Models for LLM-based Agents

Large language model (LLM)-based agents are increasingly used in interactive textual environments, from web na

用途: 生成
難易度: Hard
コスト: High

Personal Salience: Highlighting Is Social, but Individuality Lives in Selection

Social highlighters let people mark passages that matter to them. We ask how much of an individual is recovera

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

Document-Authored Control-Signal Impersonation: A Low-Cost Indirect Prompt Attack on RAG Safety Boundaries

Retrieval-augmented generation (RAG) systems often serialize user queries, retrieved documents, metadata, syst

用途: 生成
難易度: Hard
コスト: High

品質予測/異常検知自然言語処理大規模言語モデル要約テキスト

Language-Aware Token Boosting: LLM Language Confusion Reduction Without Tuning

Large language models (LLMs) sometimes exhibit language confusion when generating non-English text. Existing a

用途: 要約
難易度: Hard
コスト: High

表形式向き品質予測/異常検知自然言語処理RAG分類QA画像

ChinaHeritaQA: A Culturally-Grounded Visual Question Answering Dataset for World Heritage Sites in China

We introduce ChinaHeritaQA, a multimodal benchmark dataset for evaluating the cultural reasoning abilities of

用途: 分類
難易度: Hard
コスト: High

品質予測/異常検知自然言語処理大規模言語モデル要約テキスト強化学習

Multilingual Sentiment Aware Text Summarization A Reinforcement Learning Approach for Consistency Maintenance

Reinforcement Learning from Human Feedback (RLHF) has significantly improved the quality and fluency of large

用途: 要約
難易度: Hard
コスト: High

Are Reasoning Vision-Language Models Robust to Semantic Visual Distractions?

Reasoning Vision-Language Models (VLMs) achieve strong performance on complex multimodal tasks, but reliable r

コンピュータビジョンマルチモーダル画像テキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

品質予測/異常検知自然言語処理RAG生成画像テキスト

Latent Spatial Memory for Video World Models

Video world models that maintain 3D spatial consistency across generated frames typically rely on explicit poi

用途: 生成
難易度: Hard
コスト: High

コンピュータビジョン動画認識テキストマルチモーダル

MemoryVLA++: Temporal Modeling via Memory and Imagination in Vision-Language-Action Models

Temporal modeling is essential for robotic manipulation, as effective control requires both memory of past int

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

Beyond Spherical Harmonics: Rethinking Appearance Models for Radiance Reconstruction

View-dependent appearance modeling remains a challenging problem in novel-view synthesis and reconstruction. A

用途: 生成
難易度: Hard
コスト: Medium

POTATR: A Lightweight Image-to-Graph Model for Page-Level Table Extraction

Large-scale document processing requires contextually aware table extraction (TE) that is both accurate and ef

深層学習Transformer検出画像テキスト

用途: 検出
難易度: Hard
コスト: High

HDSL: A Hierarchical Domain-Specific Language for Structured 3D Indoor Scene Generation and Localized Editing with LLM Agents

Text-driven indoor scene generation and editing require an intermediate representation that language models ca

自然言語処理大規模言語モデル生成テキスト3D

用途: 生成
難易度: Hard
コスト: High

品質予測/異常検知コンピュータビジョンセグメンテーション生成画像テキスト

Cranio-Diff: Diffusion-based Cross-domain Craniofacial Reconstruction with 2D X-ray Skull Guidance and Structural Identity Constraints

The state-of-the-art generative models, such as CycleGAN, Pix2Pix, and diffusion models have demonstrated rema

用途: 生成
難易度: Hard
コスト: High

SoccerNet 2026 Player-Centric Ball-Action Spotting: Retraining and Post-Processing Extensions to the FOOTPASS Baselines

We describe our system for the SoccerNet 2026 Player-Centric Ball-Action Spotting Challenge, which requires pr

深層学習グラフニューラルネット画像テキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

説明可能深層学習Transformerテキスト動画

MAVIS: Multi-Agent Video Retrieval via Structured Video Understanding

The dominant paradigm in video retrieval relies on embedding-based full-corpus scanning, which suffers from in

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

品質予測/異常検知自然言語処理RAG生成テキスト音声

CineDance: Towards Next-Generation Multi-Shot Long-Form Cinematic Audio-Video Generation

The fidelity and structural diversity of training datasets fundamentally determine the capabilities of video g

用途: 生成
難易度: Hard
コスト: High

ContextShift: A Controlled Benchmark for Context Dependence in Object Detection

Modern object detectors achieve strong performance on standard benchmarks, yet their robustness to contextual

コンピュータビジョン物体検出検出画像テキスト

用途: 検出
難易度: Hard
コスト: High

少数データ向き自然言語処理プロンプトエンジニアリング分類セグメンテーション画像

Training-Free Generalized Few-Shot Segmentation through Open-Vocabulary Semantic Arbitration

Generalized Few-Shot Semantic Segmentation (GFSS) has traditionally been approached as a representation-learni

用途: 分類
難易度: Hard
コスト: High

説明可能深層学習Transformer分類検出画像

Leveraging Morphology for Historical Script Metrological Analysis

Advances in handwritten text recognition have enabled large-scale transcription of historical documents, but s

用途: 分類
難易度: Hard
コスト: High

品質予測/異常検知自然言語処理大規模言語モデル画像テキスト動画

CapRL++: Unified Reinforcement Learning with Verifiable Rewards for Dense Image and Video Captioning

Image and video captioning are fundamental tasks that bridge the visual and linguistic domains, playing a crit

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

品質予測/異常検知自然言語処理RAG画像テキスト音声

Echo-DM: Ultrasound Marker Removal via Conditional Latent Diffusion and Region-Aware Fusion

Clinical ultrasound images often contain artificial markers, such as measurement calipers and text, to assist

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

ExDet: Open-Domain Open-Vocabulary Detection with Cross-modal Extrapolation and Rectification

Open-domain open-vocabulary detection (ODOVD) requires detectors to generalize to both novel categories and un

深層学習軽量化・量子化分類検出画像

用途: 分類
難易度: Hard
コスト: High

センサ/時系列コンピュータビジョン動画認識画像テキストマルチモーダル

IB-HFN: Information Bottleneck-Driven SAR-Optical Fusion Network for High-Fidelity Cloud Removal

Synthetic aperture radar (SAR)-assisted optical cloud removal aims to recover surface information obscured by

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

深層学習正規化・最適化手法分類生成セグメンテーション

Reason Twice: Segmentation via Candidate Discovery and Comparative Reasoning

The rapid development of pretrained foundation models has enabled more general image segmentation. Multimodal

用途: 分類
難易度: Hard
コスト: High

Visual Para-Thinker++: A Single-Policy Multi-Agent Framework for Visual Reasoning

Visual reasoning requires integrating evidence distributed across regions, attributes, and relations, making s

深層学習軽量化・量子化画像テキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

See More, Match Better: Multi-Source Feature Fusion for Two-View Correspondence Learning

Two-view correspondence learning aims to distinguish true correspondences (inliers) from false ones (outliers)

自然言語処理RAG画像テキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: Low

品質予測/異常検知深層学習Transformer生成テキスト動画

LiteVSR: Lightweight Adaptation of Frozen Diffusion Transformers for Video Super-Resolution

Adapting large-scale pre-trained video generators for Video Super-Resolution (VSR) in novel domains remains co

用途: 生成
難易度: Hard
コスト: High

説明可能コンピュータビジョンマルチモーダル生成画像テキスト

MAGIS: Evidence-Based Multi-Agent Reasoning for Interpretable Strabismus Clinical Decision-Making

Strabismus is a common ocular disorder that requires fine-grained subtype diagnosis for individualized treatme

用途: 生成
難易度: Hard
コスト: High

品質予測/異常検知深層学習Transformer検出画像テキスト

Temporal-Aware Reasoning Optimization for Video Temporal Grounding

Multi-modal Large Language Models (MLLMs) have achieved remarkable progress in video temporal grounding with r

用途: 検出
難易度: Hard
コスト: High

CP4D: Compositional Physics-aware 4D Scene Generation

4D generation (i.e., dynamic 3D generation) has recently emerged as a rapidly growing research fronti

MI向き自然言語処理RAG生成画像テキスト

用途: 生成
難易度: Hard
コスト: High

Vision-Language Guided Hyperspectral Object Tracking via Semantics Fusion and Contextual Template Updating

Hyperspectral object tracking (HOT) leverages the rich spectral information provided by hyperspectral videos (

深層学習軽量化・量子化画像テキスト動画

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

コンピュータビジョンセグメンテーション生成画像テキスト

OmniGen-AR: AutoRegressive Any-to-Image Generation

Autoregressive (AR) models have demonstrated strong potential in visual generation, offering superior performa

用途: 生成
難易度: Hard
コスト: High

品質予測/異常検知自然言語処理大規模言語モデル生成画像テキスト

HDRAgent: An Agentic Framework for Multi-Exposure HDR Imaging

Most existing multi-exposure HDR methods follow a fixed feed-forward reconstruction paradigm, making them pron

用途: 生成
難易度: Hard
コスト: High

Beyond Scalar Rewards by Internalizing Reasoning into Score Distributions

Reward models are central to text-to-image post-training, but visual preference is subjective and better repre

深層学習軽量化・量子化生成画像テキスト

用途: 生成
難易度: Hard
コスト: High

Frequency Decoupled Framework for Screen Content Image Super-Resolution

Methods based on implicit neural representations have demonstrated superior performance in Screen Content Imag

用途: 技術検証・論文読解補助
難易度: Hard
コスト: Medium

コンピュータビジョンセグメンテーション異常検知テキストマルチモーダル

Scaling by Diversified Experience for Vision-Language-Action Models

Vision-Language-Action models face significant challenges in real-world deployment due to the entanglement of

用途: 異常検知
難易度: Hard
コスト: High

When Vision Misleads, Let Location Speak: A Worldwide Image Geo-Localization Method via Location Attention Mechanism and Large Multimodal Models

Worldwide image geo-localization aims to determine the capture location of an image on a global scale. Existin

深層学習Transformer検出画像テキスト

用途: 検出
難易度: Hard
コスト: High

Modeling Components and Connections in Cyber-Physical Systems

Text based configuration files for cyber-physical systems show the hierarchy of component modules well but oft

強化学習モデルベース画像テキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: Medium

センサ/時系列深層学習軽量化・量子化検出画像テキスト

VGP-Nav: Metric-Aware Visual Geometric Perception for Robot Navigation

Reliable robotic navigation necessitates the seamless integration of accurate global localization and dense, m

用途: 検出
難易度: Hard
コスト: High

RPO-PDT: Demonstrating Role-Play-Based Knowledge Adaptation for Student Support Dialogue (Demonstration System)

We present RPO-PDT: a retrieval-grounded, role-play-based dialogue system for adaptive student support in high

用途: 技術検証・論文読解補助
難易度: Hard
コスト: Low

LAEI: Layered Autonomous Edge Intelligence Framework for Robust UAV Swarm Operations

Autonomous UAV swarms require scalable coordination mechanisms that maintain mission performance under limited

用途: 技術検証・論文読解補助
難易度: Hard
コスト: Low

BLM-SGAN: Bidirectional Language Modeling for Semantic-Spatial Text-to-Image Generation

Despite the success of image generation from text descriptions, it still faces challenges that are difficult t

用途: 生成
難易度: Easy
コスト: Low

Momentum for Reasoning: Dense Intrinsic Signals in Policy Optimization

Reinforcement learning with verifiable rewards (RLVR) has emerged as a powerful paradigm for eliciting long-ch

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

Continuous Language Diffusion as a Decoder-Interface Problem

Gaussian-corrupted sentence embeddings have no direct linguistic interpretation, yet continuous diffusion lang

深層学習Transformer埋め込みテキスト

用途: 埋め込み
難易度: Hard
コスト: High

Q-Delta: Beyond Key-Value Associative State Evolution

Linear attention reformulates sequence modeling as recurrent state evolution, enabling efficient linear-time i

深層学習RNN / LSTMテキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: Low

How Many Counterfactuals Does It Take? Probing VLM Hallucinations Through Circuits and Causal Effects

Visual Language Models (VLMs) are known to produce hallucinated predictions that are not grounded in visual ev

自然言語処理RAG画像テキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: Low

TeamHerald@CHIPSAL 2026: Hate Speech Detection and Sentiment Analysis of Nepali Memes using Transformer-based Architectures and Ensemble Learning

The analysis of internet memes in the Nepali language is complicated by frequent code-mixing and a lack of est

深層学習Transformer分類検出画像

用途: 分類
難易度: Hard
コスト: Low

センサ/時系列深層学習軽量化・量子化生成画像テキスト

IR-SIM: A Lightweight Skill-Native Simulator for Navigation, Learning, and Benchmarking

Simulation plays a key role in automated robotics research supported by large language models (LLMs). However,

用途: 生成
難易度: Hard
コスト: High

MI向き自然言語処理大規模言語モデル生成テキストマルチモーダル

Artificial Intelligence for Mathematical Reasoning: An Integrated Survey of Language Models, Neuro-symbolic Systems, and Verified Discovery

Mathematical reasoning has long served as a stringent test of machine intelligence; over the past decade, it h

用途: 生成
難易度: Hard
コスト: High

SNR-ST-Mix: Sample-specific Neighborhood Regression Mixup for Augmented Spatial Transcriptomics Imputation with Deep Neural Network

Purpose: Spatial transcriptomics (ST) enables gene expression measurements within the tissue context. However,

深層学習軽量化・量子化分類回帰テキスト

用途: 分類
難易度: Hard
コスト: High

表形式向き品質予測/異常検知深層学習軽量化・量子化生成テキスト表形式

Agentic Search for Counterfactual Recourse under Fixed LLM Budgets

Counterfactual recourse aims to provide actionable feature changes that would alter an unfavorable decision ma

用途: 生成
難易度: Hard
コスト: High

Activation Steering Induces Emergent Misalignment: A More Comprehensive Evaluation

Activation steering has emerged as a popular inference-time technique for modulating the behavior of large lan

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

Rank Intervals for Leaderboards: A Hierarchical Framework for Model Evaluation

Pretrained models are often evaluated on multi-task leaderboards to measure their applicability in diverse con

機械学習教師あり学習テキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

A Comparison of SSL-Based Feature Extractors and Back-End Classifiers for Spoofing Detection: A Multi-Corpus Training and Cross-Linguistic Analysis

Voice biometric systems face growing threats from spoofing attacks, yet the evaluation of detection models rem

深層学習CNN分類検出テキスト

用途: 分類
難易度: Hard
コスト: High

SpectrumKV: Per-Token Mixed-Precision KV Cache Transfer for Prefill-Decode Disaggregated LLM Serving

Prefill-decode (PD) disaggregation decouples prompt processing from token generation, but it also turns the ke

用途: 生成
難易度: Hard
コスト: High

深層学習Transformer予測テキスト強化学習

Towards Long-Horizon Vessel Trajectory and Destination Forecasting with Reasoning Large Language Models

Long-horizon maritime trajectory prediction is important for shipping management, logistics planning, and mari

用途: 予測
難易度: Hard
コスト: High

センサ/時系列自然言語処理プロンプトエンジニアリング予測テキスト時系列

Tyan-WP: A Wind Power Foundation Model for Ultra-Short-Term Probabilistic Forecasting

Global wind power capacity, especially in China, is booming, with new farms spanning diverse terrains and clim

用途: 予測
難易度: Hard
コスト: High

Convolutional Sparse Coding via the Locally Competitive Algorithm on Loihi 2

Sparse coding provides a principled framework for signal representation by expressing an input as a linear com

深層学習CNNテキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: Medium

センサ/時系列自然言語処理大規模言語モデルテキスト時系列

Lost in the Non-convex Loss Landscape: How to Fine-tune the Large Time Series Model?

Recently, large time series models (LTSMs) have gained increasing attention due to their similarities to large

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

センサ/時系列自然言語処理大規模言語モデル分類テキスト音声

Titans-as-a-Layer: Test-Time Memory for Conversational Speech Emotion Recognition

Speech emotion recognition (SER) is commonly formulated as utterance-level classification, although conversati

用途: 分類
難易度: Hard
コスト: High

Calibration of Structured Ignorance Certificates for Diagnosing Unknown Unknowns in Reasoning Models

Large language models frequently fail in a characteristic way: rather than acknowledging ignorance, they produ

用途: 生成
難易度: Hard
コスト: High

表形式向き深層学習RNN / LSTM検出生成予測

Physics-Guided Dual Decoding and Spectral Supervision for Global 3D Hydrometeor Prediction

While global data-driven models excel at predicting continuous atmospheric variables, three-dimensional hydrom

用途: 検出
難易度: Hard
コスト: High

品質予測/異常検知深層学習軽量化・量子化テキスト強化学習

Autonomous Aerial Manipulation via Contextual Contrastive Meta Reinforcement Learning

Unmanned aerial vehicles (UAVs) are increasingly being deployed in logistics, service robotics, and other real

用途: 技術検証・論文読解補助
難易度: Hard
コスト: Low

Querying Counterfactuals on Tissue Graphs with Supervised Disentanglement

Tissue graph counterfactuals ask how a cell's expression would change under altered spatial neighbor

機械学習教師なし学習テキスト教師なし

用途: 技術検証・論文読解補助
難易度: Hard
コスト: Medium

Can the Environment Speak for Itself? T²-GRPO: A Turn-Trajectory Group Relative Policy Optimization for Caregiver Agents

Optimizing large language models (LLMs) for long-horizon caregiver agents requires balancing delayed task obje

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

Hybrid E-Assessment in Higher Education: Semi-Automated Grading of Paper-Based Written Examinations

This paper examines the limitations of fully digital and partially digital e-assessment approaches in summativ

自然言語処理大規模言語モデル分類テキスト

用途: 分類
難易度: Hard
コスト: High

少数データ向き深層学習Transformer生成画像テキスト

ZIPP: Zero-shot Image Personalization from Personas

Text-to-image diffusion models are increasingly deployed in open-ended creative contexts, yet their outputs re

用途: 生成
難易度: Hard
コスト: High

Beyond Pass Rate: A Multilingual, Execution-Grounded Evaluation of Open Code LLMs

Code generation models are typically compared using compact execution benchmarks and aggregate pass rates, but

用途: 生成
難易度: Hard
コスト: High

Inference-Time Conformal Reasoning with Valid Factuality Control for Large Language Models

Large language models (LLMs) increasingly perform multi-step reasoning, where intermediate claims form implici

用途: 生成
難易度: Hard
コスト: High

Governance Controls for AI-Generated Test Artifacts in Autonomous Software Testing

Artificial Intelligence (AI) and Large Language Models (LLMs) are increasingly used in autonomous software tes

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

表形式向き説明可能自然言語処理大規模言語モデル分類検出生成

Bridging Expert Knowledge and Automated Feature Engineering via Self-Evolution

In high-stakes settings such as brand compliance, clinical care, and content moderation, machine learning cann

用途: 分類
難易度: Hard
コスト: High

説明可能深層学習Transformer生成テキスト

RadOT-Eval: Auditable Structured-Evidence Transport for Radiology Report Evaluation

Automatic evaluation is critical for high-stakes text generation, where errors often involve omitted findings,

用途: 生成
難易度: Hard
コスト: High

APEX4: Efficient Pure W4A4 LLM Inference via Intra-SM Compute Rebalancing

W4A4 quantization promises full utilization of INT4 Tensor Cores, yet group dequantization overhead on CUDA Co

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

Structuring agentic AI for HPC code modernization

Modernization of legacy scientific codes is often necessary to keep up with the ever-evolving changes in the c

深層学習Transformerテキスト3D

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

Building Customer Support AI Agents at 100M-User Scale: An Evaluation-Driven Framework

The rapid rise in LLM capabilities has made AI agents increasingly viable across a broad range of tasks. Among

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

The Amplifying Mirror: Locating and Steering the Partisan Direction inside a Large Language Model

Large language models are rapidly replacing search engines as the primary interface between people and informa

説明可能自然言語処理大規模言語モデル生成テキスト

用途: 生成
難易度: Hard
コスト: High

Co-Evolving Skill Generation and Policy Optimization

Skill-augmented reinforcement learning improves language agents by storing reusable procedural knowledge acqui

深層学習軽量化・量子化生成テキスト強化学習

用途: 生成
難易度: Hard
コスト: High

品質予測/異常検知深層学習Transformer翻訳テキスト音声

HydraQE: OSU's Submission for the IWSLT 2026 Speech Translation Metrics Shared Task

We present HydraQE, our contribution to the IWSLT 2026 Speech Translation Metrics shared task. HydraQE is an e

用途: 翻訳
難易度: Hard
コスト: High

品質予測/異常検知深層学習Transformer分類生成テキスト

Can LLMs understand LilyPond? A benchmark for symbolic music generation and understanding

Symbolic music evaluation for large language models remains fragmented across representations, datasets, and m

用途: 分類
難易度: Hard
コスト: High

Operationalizing Linguistic Methods through Prompt-Engineering Skills: An Automatic Chinese Web Neologism Detection Pipeline

We present a method for automatic Chinese web neologism detection that operationalizes traditional linguistic

自然言語処理大規模言語モデル分類検出生成

用途: 分類
難易度: Hard
コスト: High

Analyzing the Correlation Between Hallucinations and Knowledge Conflicts in Large Language Models

Hallucinations -- factually incorrect or unverifiable outputs -- remain one of the most challenging limitation

説明可能自然言語処理大規模言語モデル検出テキスト

用途: 検出
難易度: Hard
コスト: High

Lost in the Flow with Code Talkers: Unveiling the Instruction-Tuning Tax of Large Language Models in Code Tasks

AI coding assistants have significantly improved developer productivity by automatically suggesting code that

深層学習Transformer分類生成テキスト

用途: 分類
難易度: Hard
コスト: High

ClinicalAligner26AM: A Cross-Lingual Aligner for Dataset Translation; Evidences from the MultiClinCorpus Shared Task

Word-level cross-lingual alignment is central to annotation projection, translation auditing, and cross-lingua

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

From Player to Master: Enhancing Test-Time Learning of LLM Agents via Reinforcement Learning over Memory

Large language model (LLM) agents are increasingly deployed in long-running settings where improving through e

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

A retrieval conditioned rebinding circuit for dynamic entity tracking in large language models

To interpret context correctly and retrieve relevant information, large language models must bind entities to

説明可能自然言語処理大規模言語モデルテキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

Sycophancy Towards Researchers Drives Performative Misalignment

The increasing situational awareness of language models raises safety concerns: models might be aware when the

用途: 技術検証・論文読解補助
難易度: Hard
コスト: Medium

品質予測/異常検知自然言語処理大規模言語モデル生成テキスト強化学習

From Holistic Evaluation to Structured Criteria: Rubrics Across the Evolving LLM Landscape

As Large Language Models (LLMs) advance toward open-ended autonomous agents, the mechanisms used to evaluate a

用途: 生成
難易度: Hard
コスト: High

表形式向きコンピュータビジョン動画認識テキスト動画マルチモーダル

Harnessing Streaming Video in the Wild

Vision-Language Models (VLMs) are increasingly required to process unbounded video streams in applications suc

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

Detection and Interpretability Analysis of Quotation Errors by Large Language Models

Purpose - Quotation error refers to the inconsistency between cited information and its original source. This

説明可能深層学習軽量化・量子化検出テキスト

用途: 検出
難易度: Hard
コスト: High

Inside the LLM Word Factory

Transformer language models process input provided as subword fragments, but natural language semantics usuall

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

Friend or Foe? Language as an ideological switch in open-weight LLMs under Russian disinformation stress

As Russia's war against Ukraine extends into generative AI, large language models (LLMs) adapted for local pos

MI向き自然言語処理大規模言語モデル生成テキスト

用途: 生成
難易度: Hard
コスト: High

Back on Track: Aligning Rewards and States for Reasoning in Diffusion Large Language Models

Reinforcement learning (RL) holds immense promise for enhancing the reasoning capabilities of diffusion large

自然言語処理大規模言語モデル生成テキスト強化学習

用途: 生成
難易度: Hard
コスト: High

説明可能深層学習Transformer異常検知テキスト

Explaining Black-Box Language Models: Learning to Optimize Linguistically-Structured Word Subsets

As deep language models (DLMs) are increasingly deployed in high-stakes domains such as healthcare, understand

用途: 異常検知
難易度: Hard
コスト: Medium

SAEExplainer: Interpreting SAE Features with Activation-Guided Preference Optimization

Although Sparse Autoencoders (SAEs) have mitigated the opacity of large language models (LLMs) by decomposing

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

センサ/時系列深層学習Transformer分類検出生成

TRADE: Transducer-Augmented Decoder for Speech LLM

Speech Large Language Models (Speech LLMs) lack a principled mechanism for streaming inference: their label-sy

用途: 分類
難易度: Hard
コスト: High

品質予測/異常検知コンピュータビジョンセグメンテーションテキスト

More Yap Less Meaning: Uncovering Self-Improvement Behavior in SLMs

Recently, language models have made rapid progress across various domains and applications. However, their cap

用途: 技術検証・論文読解補助
難易度: Hard
コスト: Medium

Beyond Linear Activation Steering: Invertible Latent Transformations for Controlling LLM Behavior

Activation steering provides a lightweight inference-time mechanism for controlling large language models (LLM

用途: 生成
難易度: Hard
コスト: High

Sycophancy as a Multilingual Alignment Failure: How Safety Degrades Across Languages, Topics, and Models

Safety-aligned large language models often exhibit sycophancy, which is the tendency to affirm users' opinions

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

品質予測/異常検知MLOpsパイプライン構築要約テキスト

Segment-level Tree Search for Long Meeting Document Summarization

Meeting documents are challenging to summarize due to their length and complex conversational structure. Exist

用途: 要約
難易度: Hard
コスト: High

センサ/時系列自然言語処理プロンプトエンジニアリングテキスト音声

TinyGiantALM: A Compact Audio-Language Model for Intent-Aware Reasoning under Resource Constraints

Current advancements in Audio Reasoning rely on massive Large Audio-Language Models (LALMs), hindering deploym

用途: 技術検証・論文読解補助
難易度: Hard
コスト: Medium

品質予測/異常検知自然言語処理大規模言語モデル生成テキスト音声

Hacking Generative Perplexity: Why Unconditional Text Evaluation Needs Distributional Metrics

Diffusion and continuous flow-based language models have emerged as the leading non-autoregressive alternative

用途: 生成
難易度: Hard
コスト: High

AsyncLane: Decoupling Refinement from Advancement in Diffusion Language Model Decoding

Block-wise semi-autoregressive decoding is the standard inference paradigm for diffusion large language models

用途: 生成
難易度: Hard
コスト: High

TimpaTeks: Automatic In-place Text Sequence Modification via Diffusion Language Model Steering

We extend activation steering to diffusion language models (DLMs) and study a novel problem that arose due to

生成AI拡散モデルテキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

Impacts of Histories and Models on LLM Grading: A Study in Advanced Software Engineering Courses

Graduate-level research reading report assessment creates a substantial labor burden for educators. While larg

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

TrustMargin: Training-Free Arbitration between Parametric Memory and Retrieved Evidence in Large Language Models

Large language models answer knowledge-intensive questions using both parametric memory and retrieved evidence

Use: Generation
Difficulty: Hard
Cost: High

Deep learning, Transformer, Image, Text, Multimodal

When Correct Decisions Hide Internal Stress: Decision-State Probing in Multimodal Language Models

Multimodal language models are typically evaluated through external behavior: selecting the correct image--tex

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Auditing Proprietary Alignment in Large Language Models: A Comparative Framework Without a Ground-Truth Standard

Large language models (LLMs) are increasingly released and deployed through opaque development and deployment

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Generalizing Geometry-Guided Mamba as a Plug-and-Play Context Module for CNN-based Semantic Segmentation

CNN-based semantic segmentation networks usually rely on context heads such as ASPP, PPM, or attention modules

Deep learning, CNN, Segmentation, Text

Use: Segmentation
Difficulty: Hard
Cost: Medium

DeepMine-Mamba: Mitigating Information Dilution in Mamba-Based State Space Models for Document Image Binarization

Document image binarization aims to separate foreground text from degraded backgrounds while preserving thin,

Use: Generation
Difficulty: Hard
Cost: Low

Stain-Aware Wavelet Regularization for Instant Adversarial Purification in Histopathology

Deep learning has become prevalent in computational pathology pipelines that support tasks such as cancer scre

NLP, RAG, Image, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: Low

PRPO: Perception-Reinforced Policy Optimization via Token-Level Dynamic Advantage Reshaping

Reinforcement Learning with Verifiable Rewards (RLVR) has become an effective paradigm for improving the reaso

NLP, RAG, Image, Text, Multimodal

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

PhysAgent: Automating Physics-Based 4D Synthesis via Trajectory-Grounded Multi-Agent Feedback

Achieving fully automated, physically plausible 3D motion synthesis is a core objective in graphics and genera

MI-oriented, Deep learning, Compression/quantization, Generation, Text, 3D

Use: Generation
Difficulty: Hard
Cost: High

BioVid: Autoregressive Video Generation with Biological Behavior Semantic Comprehension

Existing video generation frameworks treat sequence duration as an externally prescribed parameter -- fixed fr

Deep learning, Transformer, Generation, Text, Video

Use: Generation
Difficulty: Hard
Cost: High

Reconstructing Synthetic SDO/AIA 193 A EUV Images from He I 10830 A Observations with Diffusion Model Translator

Routine full-disk EUV imaging has been available only since the modern era, such as SOHO and SDO. To extend EU

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Learnable Token Sparsification for Efficient Gigapixel Whole Slide Image Reasoning

The processing of gigapixel whole slide images within vision language models faces a major difficulty due to a

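Deep learning, Compression/quantization, Image, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, NLP, LLMs, Image, Text, Audio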

OmniCap-IF: Benchmarking and Improving Instruction Following Abilities for Omni-Video Captioning

While Omni-modal Large Language Models (OLLMs) have demonstrated impressive capabilities in jointly processing

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Sensor/time series, Deep learning, Transformer, Detection, Segmentation, Anomaly detection

NGram-MoSE: Efficient Remote Sensing Super-Resolution via N-Gram Context and Mixture-of-Experts

Remote sensing applications for environmental monitoring and disaster management are frequently constrained by

Use: Detection
Difficulty: Hard
Cost: High

Suited to tabular data, Computer vision, Video recognition, Generation, Image, Text

DriveReward: A Comprehensive Dataset and Generative Vision-Language Reward Model for Autonomous Driving

Reward models play a pivotal role in reinforcement learning (RL) and multi-modal trajectory selection for auto

Use: Generation
Difficulty: Hard
Cost: High

Suited to small data, Deep learning, Transformer, Image, Text, Multimodal

Look Less, Reason More: Block-wise Attention Skipping for Efficient Multimodal LLMs

Multimodal Large Language Models (MLLMs) face a significant inference bottleneck due to the quadratic computat

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Computer vision, Segmentation, Generation, Prediction, Image

EgoPriMo: Egocentric Motion Generation for Interactive Humanoid Control

Humanoid robots require whole-body motions that adapt to scene context, task requirements, and user intent. Mo

Use: Generation
Difficulty: Hard
Cost: High

Seeing is Believing: Aligning Prompt Rewriting with Visual Anchors for Text-to-Image Generation

Despite the impressive capabilities of text-to-image (T2I) models, an intent-generation gap often persists due

Use: Generation
Difficulty: Hard
Cost: High

NLP, LLMs, Image, Text, Multimodal

TVI-CoT: Text-Visual Interleaved Chain-of-Thought Reasoning for Multimodal Understanding

Chain-of-thought (CoT) reasoning has proven effective for enhancing problem-solving in large language models.

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Reinforcing Temporal Answer Grounding in Instructional Video via Candidate-Aware Causal Reasoning

The task of temporal answer grounding in instructional video (TAGV), which aims to locate precise video segmen

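Deep learning, Transformer, Image, Text, Video

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Suited to tabular data, Computer vision, Segmentation, Generation, Image, Text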

Segmentation-Assisted Brain MRI Synthesis with Cross-Image Multi-Contrast Feature Memory Bank Retrieval Augmentation

Multi-contrast brain MRI provide complementary soft-tissue characteristics that aid in the screening and diagn

Use: Generation
Difficulty: Easy
Cost: Low

CheXanatomy: Anatomy-Aware Vision-Language Modeling for Chest Radiographs

Vision-language models (VLMs) pretrained on large-scale image-text pairs demonstrate strong image-level unders

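Deep learning, CNN, Detection, Generation, Segmentation

Use: Detection
Difficulty: Hard
Cost: High

Suited to tabular data, Quality prediction/anomaly detection, NLP, LLMs, Text, Video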

CoVEBench: Can Video Editing Models Handle Complex Instructions?

While recent text-guided video editing models excel at elementary tasks (e.g., style transfer, object insertio

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

MI-oriented, NLP, RAG, Generation, Segmentation, Image

SceneConductor: 3D Scene Generation from Single Image with Multi-Agent Orchestration

Generating complete 3D scenes from a single image requires inferring globally consistent geometry, object rela

Use: Generation
Difficulty: Hard
Cost: High

Safe, Fluent and Acceptable Motion Generation and Execution for Human--Robot Interaction in Manufacturing Environments

Robots operating in human environments must not only ensure physical safety but also exhibit behaviors that ar

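Quality prediction/anomaly detection, Reinforcement learning, Generation, Text

Use: Generation
Difficulty: Hard
Cost: Medium

Sensor/time series, Computer vision, 3D/point clouds, Text, 3D, Multimodal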

Language as a Sensor: Calibrated Spatial Belief Estimation in 3D Scenes from Natural Language

Robots deployed in human-centric environments routinely receive natural-language descriptions of spatial infor

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

NLP, Fine-tuning, Anomaly detection, Image, Text

Two Bridges, One Pathway: From VLMs to Generalizable VLAs with Embodied Trajectory-Coupled Data

Vision-language models (VLMs) are powerful general-purpose reasoners, yet converting them into robot control p

Use: Anomaly detection
Difficulty: Hard
Cost: High

Sensor/time series, Deep learning, Compression/quantization, Image, Text, Reinforcement learning

Towards End to End Motion Planning and Execution for Autonomous Underwater Vehicles Using Reinforcement Learning

Autonomous Underwater Vehicles (AUVs) traditionally rely on complex, heavily engineered pipelines for percepti

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: Medium

LUNA-AD: Lightweight Uncertainty-Aware Language Model with Lifelong Learning for Autonomous Driving

While large language models (LLMs) offer promising reasoning capabilities, their integration into safety-criti

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Personalized and Robust Proactive Robot Assistance with Uncertainty-Guided LLM Reasoning

Proactive robot assistance in household environments requires accurate prediction of human activities and obje

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Variational Proximal Policy Optimization

Reinforcement Learning from Human Feedback via Proximal Policy Optimization often suffers from policy mode col

NLP, RAG, Text, Reinforcement learning

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: Low

Forward-Free Diffusion Language Models

Diffusion language models generate text through iterative denoising, offering a powerful alternative to autore

Quality prediction/anomaly detection, NLP, RAG, Generation, Text

Use: Generation
Difficulty: Hard
Cost: High

Bayesian-Agent: Posterior-Guided Skill Evolution for LLM Agent Harnesses

LLM agents increasingly rely on external inference conditions: prompts, tools, memory, SOPs, skills, and harne

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Tensorizing Engram: Sharing Latents Across N-Gram Embeddings is Beneficial in LLMs

Modern language models represent text using discrete token-level embeddings, which forces recurring multi-toke

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

CATPO: Critique-Augmented Tree Policy Optimization

Reinforcement learning with verifiable rewards (RLVR) has become a dominant paradigm for improving the reasoni

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Chiaroscuro Attention: Spending Compute in the Dark

Standard transformers apply self-attention uniformly at every layer and token, regardless of whether the input

Deep learning, Transformer, Classification, Text

Use: Classification
Difficulty: Hard
Cost: Low

Understanding the Sociocultural Dimensions of Mental Health Discourse in Arabic-Language X Communities

Computational mental health research has predominantly centered on English-speaking populations, leaving Arabi

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

TLRD: Teaching LLMs to Reason over Tabular Data with Tri-Level Rationale Distillation

Tabular data is a primary medium for storing real-world information, driving many industrial applications of m

Suited to tabular data, Deep learning, Compression/quantization, Text, Tabular

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

AgriGov: A Structured Multilingual Dataset Curation for Indian Government Schemes for Farmers

AgriGov is a curated, trilingual (English-Hindi-Marathi) dataset designed to address the scarcity of domain-gr

Suited to tabular data, NLP, RAG, Translation, Summarization, QA

Use: Translation
Difficulty: Hard
Cost: Low

SSR: Can Simulated Patients Learn to Stigmatize Themselves? Modeling Self-Stigma through Internal Monologue

Simulating patients with large language models (LLMs) is a promising tool for mental health training, but exis

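Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Suited to small data, Quality prediction/anomaly detection, Deep learning, Compression/quantization, Generation, Text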

ZAS-SQL: Distilling Rules from Failures for Zero-Shot Text-to-SQL

Text-to-SQL translates natural language into executable SQL queries. Few-shot in-context learning methods buil

Use: Generation
Difficulty: Hard
Cost: High

Building Comparative Motivation Profiles with Instrumental Interventions

Safety evaluations often infer latent motivations from behavioral patterns, but the construct validity of thes

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

When No Answer Is Correct: Diagnosing Absent Answer Detection for MLLMs in Video Understanding

Multimodal large language models (MLLMs) have made substantial advancements in video understanding, yet the re

NLP, LLMs, Detection, Generation, Text

Use: Detection
Difficulty: Hard
Cost: High

Explainable, NLP, LLMs, Text, Unsupervised

Shared Semantics, Divergent Mechanisms: Unsupervised Feature Discovery by Aligning Semantics and Mechanisms

As large language models are increasingly deployed in high-stakes settings, there is a growing need for tools

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Explainable, Sensor/time series, Deep learning, Graph neural networks, Detection, Text, Audio

Paediatric-HGNN: A Hybrid Heterogeneous Graph Neural Network for Detecting Disfluency in Children's Speech via Multiscale Acoustic Fusion

Automated stuttering detection (ASD) systems struggle with paediatric speech due to high acoustic variability

Use: Detection
Difficulty: Hard
Cost: Medium

Quality prediction/anomaly detection, Image inspection, Deep learning, Compression/quantization, Text

AlignFed: Alignment-Aware Asynchronous Federated Fine-Tuning for Large Language Models in Heterogeneous Edge Environments

Large Language Models (LLMs) have significantly propelled the advancement of edge intelligence and have been w

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Sensor/time series, NLP, LLMs, Text, Audio

GlobeAudio: A Multilingual Multicultural Benchmark for Naturalistic Evaluation of Large Audio-Language Models

Large Audio-Language Models (LALMs) integrate audio perception and language understanding within a unified fra

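Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, Deep learning, Transformer, Generation, Summarization, Text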

TextEconomizer: Enhancing Lossy Text Compression with Denoising Transformers and Entropy Coding

Lossy text compression reduces data size while preserving core meaning, making it well-suited for summarizatio

Use: Generation
Difficulty: Hard
Cost: Low

Suited to small data, MI-oriented, Condition optimization, NLP, Fine-tuning, Text, Multimodal

CLASP: Language-Driven Robot Skill Selection and Composition using Task-Parameterized Learning

Enabling robots to understand and execute tasks from natural language commands while maintaining data efficien

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Constrained Paraphrase Consistency for LLM Hallucination Detection

Large language models (LLMs) can generate factually inconsistent claims, motivating accurate and scalable hall

Deep learning, Transformer, Detection, Generation, Text

Use: Detection
Difficulty: Hard
Cost: High

Cross Paraphrastic Invariance Learning for Hallucination Detection

Large language models (LLMs) frequently generate hallucinations, which are unsupported by a source document. T

Deep learning, Compression/quantization, Classification, Detection, Text

Use: Classification
Difficulty: Hard
Cost: High

ConSteer-RL: Steering Reasoning Capabilities in Large Language Models via Confidence-Aware Reinforcement Learning

Reinforcement Learning from Verifiable Rewards (RLVR) has recently become a key paradigm for improving the rea

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Assessing the Energy and Carbon Emissions of Neural Speaker Verification Model in Training and Inference

Deep-learning speaker verification (SV) increasingly relies on deep neural network backbones, whose environmen

Deep learning, CNN, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Aligned but Not Partner-Specific: Distinguishing How Multimodal LLM Agents Succeed in Reference Games Without Human-Like Conventions

Repeated reference games test whether interlocutors replace their initially long descriptions with shorter, pa

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Support Vector Rubrics: Closing the Gap Between Self-Generated and Human Rubrics

Rubric-based evaluation is a promising paradigm for judging large language model (LLM) outputs, yet self-gener

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

"I understand your perspective": LLM Persuasion and Sycophancy through the Lens of Communicative Action Theory

Large Language Models (LLMs) can generate high-quality arguments, yet their ability to engage in nuanced and p

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

SurgiQ: A Large-Scale Multi-Domain Benchmark for Evaluating Surgical Understanding in Large Language Models

Reliable evaluation of large language models in surgery remains underdeveloped. Broad medical benchmarks test

Use: Generation
Difficulty: Hard
Cost: High

Explainable, Quality prediction/anomaly detection, NLP, LLMs, Image, Text, Multimodal

Robust-U1: Can MLLMs Self-Recover Corrupted Visual Content for Robust Understanding?

Multimodal Large Language Models (MLLMs) have demonstrated remarkable success in visual understanding, yet the

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

What's the Point? Spatial Grammar & Index Resolution for Sign Language Processing

Sign language models are predominantly trained with gloss-sequence or text supervision, thereby under-modeling

Sensor/time series, Machine learning, Time series, Classification, Detection, Text

Use: Classification
Difficulty: Hard
Cost: High

Diffusion Language Model Parallel Decoding via Product-of-Experts Bridge

Diffusion language models (DLMs) offer substantial speed advantages through parallel decoding, but the lack of

Use: Generation
Difficulty: Hard
Cost: High

When Behavioral Safety Evaluation Fails: A Representation-Level Perspective

Large Language Model (LLM) safety has often been evaluated at the behavior level, which provides limited evide

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

GIScholarBench: Benchmarking LLM Overconfidence in GIS Research

Large language models (LLMs) are increasingly used in academic research workflows, but scholarly tasks require

Use: Generation
Difficulty: Hard
Cost: High

Sci-Rho: A Multilingual Visually-Grounded Symbolic Benchmark for STEM Problems

Symbolic benchmarks have emerged as a key approach to assess model robustness under minor modifications to STE

Quality prediction/anomaly detection, NLP, RAG, Image, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: Low

Arabic Sentence Segmentation Across Genres and Punctuation Conditions

Sentence segmentation in Arabic is challenging due to ambiguous and inconsistent punctuation, with many texts

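Deep learning, Compression/quantization, Segmentation, Text

Use: Segmentation
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, NLP, LLMs, Translation, Text, Reinforcement learning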

Rewrite to Translate, Translate to Reward: Reinforcement Learning for Source Rewriting in Machine Translation

Although directly prompting off-the-shelf Large Language Models (LLMs) to generate meaning-preserving source r

Use: Translation
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, NLP, LLMs, Generation, Summarization, Text

Summarization is Not Dead Yet

The progress of large language models (LLMs) has fueled claims that model-generated summaries rival or even su

Use: Generation
Difficulty: Hard
Cost: High

MC-PDD: Masked Corpus-Level Pretraining Data Detection for Black-Box Large Language Models

Pretraining is fundamental to the development of Large Language Models (LLMs), yet the opacity of pretraining

NLP, LLMs, Detection, Text

Use: Detection
Difficulty: Hard
Cost: High

Customer-Agent: Overcoming Context Limitations in Ultra-Long Shopping Trajectories via Tool-Augmented Agents and RLVR

Understanding customer shopping trajectories is essential for enabling personalized shopping experiences. Howe

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, NLP, Embedding, Image, Text

FMRFusion: Frequency-Aware Multi-View Representation Learning for Heterogeneous Image Fusion

Infrared and visible image fusion aims to generate a composite image that retains significant target informati

Use: Embedding
Difficulty: Hard
Cost: Low

MechLens: Late Crystallization of Factual Knowledge Explains Intervention Effectiveness in Language Models

Understanding where LLMs store factual knowledge is critical for hallucination mitigation. We systematically q

NLP, LLMs, Classification, Text

Use: Classification
Difficulty: Hard
Cost: High

Defending Against Malicious Finetuning by Scaling Train-time Adversarial Attacks

Current open-weight large language models (LLMs) are prone to malicious finetuning attacks, which could compro

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Neutrality Bites: Gender Representation in AI-Generated Animal Stories

Gender bias in AI-generated stories is a well-documented problem. While much attention has been paid to reduci

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Shared Latent Structures Enable Unified Backdoor Detection and Mitigation in LLMs

Backdoor attacks in large language models (LLMs) are often treated as isolated trigger-response failures, moti

Deep learning, Compression/quantization, Classification, Detection, Text

Use: Classification
Difficulty: Hard
Cost: High

From `May' to `Is': Certainty Distortion in Language Model Rewriting

Humans increasingly turn to Language Models (LMs) in ways that shape beliefs and drive decisions, including di

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

POISE: Position-Aware Undetectable Skill Injection on LLM Agents

Agent skills provide a lightweight mechanism for extending general-purpose agents, but their open format expos

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Illusions of the Gold Standard: A Large-scale Analysis of Human Evaluation Protocols for Long-form Text Generation

Human evaluation plays a critical role in assessing the quality of generated text. However, the reliability an

Use: Generation
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, NLP, LLMs, Summarization, Text

ROSUM-MCTS: Monte Carlo Tree Search-Inspired HDL Code Summarization with Structural Rewards

Large language models (LLMs) have shown promise in code summarization, yet their effectiveness for Hardware De

Use: Summarization
Difficulty: Hard
Cost: High

Decoupling Semantics and Logic: A Training-Free Coarse-to-Fine Pipeline for Video Retrieval-Augmented Generation

This paper presents our system description for the 2nd Workshop on Multimodal Augmented Generation via Multimo

Deep learning, Compression/quantization, Generation, Retrieval, Image

Use: Generation
Difficulty: Hard
Cost: High

MemToolAgent overview with a simple restaurant booking scenario where the agent retrieves similar memories, receives feedback on an invalid time format, and generates a reflection to update its memory

Modern large language model (LLM) agents can use external tools to help users solve complex tasks. However, fo

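Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, Deep learning, Compression/quantization, Generation, Image, Text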

HACK++: Towards More Effective Head-Aware Key-Value Compression for Efficient Visual Autoregressive Modeling

Visual Autoregressive (VAR) models adopt a next-scale prediction paradigm, offering high-quality generation wi

Use: Generation
Difficulty: Hard
Cost: High

TIDE: Task-Isolated Diffusion for Unified Video Editing and Generation

Recent advances in Diffusion Transformers have driven rapid progress in video generation and editing, yet thes

Use: Generation
Difficulty: Hard
Cost: High

Deep learning, Transformer, Classification, Segmentation, Regression

How Much MRI Preprocessing Is Enough? A Cost-Utility Study for Brain MRI Foundation Models

MRI preprocessing defines the input distribution seen by brain MRI foundation models, yet it is usually treate

Use: Classification
Difficulty: Hard
Cost: High

Property-Informed Diffusion-Based Text-to-Microstructure Generation

Designing 3D metamaterial microstructures that meet the intended functions remains a major challenge, as it ty

NLP, RAG, Generation, Text, 3D

Use: Generation
Difficulty: Hard
Cost: High

MI-oriented, Computer vision, Multimodal, Image, Text, Video

IMAGINE: Adaptive Schema-Imagery Enhanced Composition for Composed Video Retrieval

Composed Video Retrieval (CVR) is designed to retrieve a target video that matches a reference video modified

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Sensor/time series, Computer vision, Segmentation, Classification, Image, Text

One Stone, Three Birds: Self-adaptive Optimal Transport for Multi-VLM Selection, Adaptation, and Ensembling

Vision-language models (VLMs) enable visual recognition from semantic class descriptions, which makes them att

Use: Classification
Difficulty: Hard
Cost: High

MI-oriented, Quality prediction/anomaly detection, NLP, LLMs, Generation, Image, Text

VideoWeaver: Evaluating and Evolving Skills for Agentic Long Video Generation

Recent agent frameworks such as Claude Code, Codex, and OpenClaw are strong at tool use and orchestration, but

Use: Generation
Difficulty: Hard
Cost: High

Deep learning, Transformer, Detection, Segmentation, Text

OmniFaceRig: Fully Automatic Inner-Mouth-Aware Face Rigging Across Diverse 3D Character Topologies

Facial rigging - creating FACS-based blendshapes together with inner-mouth geometry (teeth, gums, and tongue)

Use: Detection
Difficulty: Hard
Cost: High

Uncertainty-Aware Intention Prediction for Human-to-Robot Assembly Teleoperation

In assisted teleoperation for human-robot collaboration, accurate intention prediction is critical for enablin

NLP, RAG, Classification, Detection, Segmentation

Use: Classification
Difficulty: Hard
Cost: High

MotionVLA: Injecting Geometric Motion into Vision-Language-Action Model

Vision-language-action (VLA) models increasingly condition robot policies on history, depth, or 4D features to

NLP, RAG, Generation, Image, Text

Use: Generation
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, NLP, LLMs, Text, 3D

Agentic Neuro-Symbolic Planning and Commissioning for Human-in-the-Loop Industrial Robotics with Digital Twins

Flexible robotic automation requires systems that interpret operator intent, verify physical feasibility, and

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Sensor/time series, Deep learning, Transformer, Generation, Image, Text

SynthICL: Scalable In-context Imitation Learning with Synthetic Data

In-context imitation learning (ICIL) enables robots to learn new tasks from a small number of demonstrations b

Use: Generation
Difficulty: Hard
Cost: High

Continual Quadruped Robots Coordination via Semantic Skill Discovery

Multi-quadruped coordination has attracted increasing attention due to its enhanced payload capacity, broader

NLP, RAG, Text, Video, Reinforcement learning

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Post-AGI Economies: Superposition and the Second Fundamental Theorem of Welfare Economics

The classical Second Welfare Theorem decentralizes any Pareto efficient allocation through prices and transfer

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: Medium

Large-scale empirical tuning and comparison of default optimizers for variational inference

Black-box variational inference (BBVI) is a methodology for posterior approximation that relies on stochastic

MLOps, Model deployment, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: Medium

Sensor/time series, NLP, Prompt engineering, Prediction, Text, Time series

Time series Foundation Models based on Physics-Informed Synthetic Histories for Cold-Start Photovoltaic Forecasting

At commissioning time, Photovoltaic (PV) operators must forecast production before target-site observations ar

Use: Prediction
Difficulty: Hard
Cost: Low

Online Pandora's Box for Contextual LLM Cascading

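This paper proposes a Pandora's Box model, a selection tool for coordinating LLM APIs. The Pandora's Box model serves as a tool for evaluating the outputs generated by multiple LLM APIs. The outp

Use: Decision tool for coordinating LLM APIs
Difficulty: Hard
Cost: High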

Transfer learning for causal forest

Transfer learning addresses the challenge of transferring knowledge from one domain to another. Traditional tra

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: Low

The Effect of Training Task Diversity on In-Context Learning through the Lens of Low-Dimensional Subspaces

The transformer's emergent ability to perform in-context learning (ICL) has sparked a wide range of studies de

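Deep learning, Transformer, Anomaly detection, Text

Use: Developing a method for efficiently performing distributed learning over a broad range of analysis targets
Difficulty: Hard
Cost: High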

Representational Similarity and Model Behavior in Multi-Agent Interaction

Researchers have shown that neural similarity among humans predicts social closeness and cooperative success,

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Sparsely gated tiny linear experts

Sparsity allows scaling model parameters without proportionally increasing computational cost. While mixture o

Explainable, Deep learning, Transformer, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: Medium

Explainable, NLP, LLMs, Classification, Image, Text

arXiv, GitHub available, 2026-06-05

LLM-Guided Evolution for Medical Decision Pipelines

Adapting large language models (LLMs) to clinical workflows often requires costly fine-tuning or manual prompt

Use: Classification
Difficulty: Hard
Cost: High

MI-oriented, Quality prediction/anomaly detection, NLP, RAG, Generation, Text

Beyond Individual Personas: Aligning Synthetic Dialogue to Population-Level Behavior Distributions

Synthetic dialogue corpora are increasingly used as proxies for target dialogue data, yet persona-grounded gen

Use: Generation
Difficulty: Hard
Cost: Low

Whose Norms? Disentangling Cultural and Personal Alignment in Large Language Models

Large language models are increasingly used for social decision-making situations that require balancing cultu

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Computer vision, Segmentation, Generation, Regression, Text

TBD-VLA: Temporal Block Diffusion Vision Language Action Model

Discrete Vision-Language-Action (VLA) models typically formulate action generation as next-token prediction ov

Use: Generation
Difficulty: Hard
Cost: High

DroneDAR: Long-Range Drone Distance Estimation Using Monocular Vision and Bounding-Box Features

Accurate distance estimation for small drones in long-range imagery is important for tracking and situational

Deep learning, Transformer, Detection, Regression, Image

Use: Detection
Difficulty: Hard
Cost: High

Planning-aligned Token Compression for Long-Context Autonomous Driving

Monolithic vision-action models represent an emerging paradigm in autonomous driving. However, this architectu

Use: Long-term memory for autonomous driving
Difficulty: Hard
Cost: High

Re-imagining ISO 26262 in the Age of Autonomous Vehicles: Enhancing Controllability through Transferability and Predictability

The ISO 26262 standard defines functional safety for road vehicles through risk assessments based on Severity,

Use: Improving the safety of autonomous vehicles
Difficulty: Hard
Cost: Medium

arXiv, GitHub available, 2026-06-05

RhinoVLA Technical Report

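This paper proposes a method, structured as a framework, for deploying VLA models on edge hardware. The method uses edge hardware to V

Deep learning, Compression/quantization, Image, Text, Multimodal

Use: Method for deploying VLA models on edge hardware
Difficulty: Hard
Cost: High

Sensor/time series, Quality prediction/anomaly detection, Computer vision, 3D/point clouds, Generation, Text, Video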

Dash2Sim: Closed-Loop Driving Simulation from in-the-wild Dashcam Videos

This paper proposes a machine-learning framework for driving simulation. The framework needs to handle large volumes of da

Use: Framework for driving simulation
Difficulty: Hard
Cost: High

Does Appearance Help? A Systematic Study of Image-Based Re-Identification in Online 3D Multi-Pedestrian Tracking

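In 3D Multi-Object Tracking (MOT), keeping track of people's movements requires estimating 3D human poses from 3D point cloud data. This relies mainly on geometric information, but depending on the situation, telling people apart

Deep learning, Transformer, Detection, Image, Text

Use: Usefulness of appearance in 3D human tracking systems
Difficulty: Hard
Cost: High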

Dreaming when Necessary: Advancing World Action Models with Adaptive Multi-Modal Reasoning

World Action Models (WAMs) offer a promising approach to embodied intelligence, yet existing methods rely heav

Deep learning, Compression/quantization, Image, Text, Video

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

Think Like a Pilot: Fine-Grained Long-Horizon UAV Navigation

VLN benchmarks use discrete or coarse actions, and UAV vision-language-action (VLA) tasks center on short operations; the fine-grained UAV navigation (FLIGHT) be

Computer vision, Multimodal, Text, Video

Use: Long-duration drone flight
Difficulty: Hard
Cost: High

Lane Change Trajectory Planning for Personalized Driving Comfort and Mobility Efficiency

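Proposes a method, based on local equations, for jointly optimizing vehicle ride comfort and mobility efficiency.

Machine learning, Supervised learning, Regression, Text

Use: Optimizing vehicle ride comfort and mobility efficiency
Difficulty: Hard
Cost: Medium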

Learning to Strategically Acquire Resources in Competition

We consider multiple agents competing to acquire some costly divisible resource (e.g. shares of a financial as

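Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: Low

Easy to try on CPU, Reinforcement learning, Policy gradient (PPO / A3C), Regression, Text

arXiv, GitHub available, 2026-06-04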

TorchKM: A GPU-Oriented Library for Kernel Learning and Model Selection

TorchKM is an open-source library for kernel machines, including support vector machines, kernel logistic regr

Use: Regression
Difficulty: Hard
Cost: High

MI-oriented, Deep learning, Transformer, Detection, Text

End-to-End Subgraph Detection with GraphDETR

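Introduces GraphDETR, a framework for detecting patterns within graphs, and views in-graph pattern detection as a set learning problem. Building on DETRObj, GraphDETR develops a way to represent target graphs within a graph

Use: Pattern detection in graphs
Difficulty: Hard
Cost: High

Sensor/time series, NLP, Prompt engineering, Text, Time series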

Causal Longitudinal Prior-Fitted Networks for Counterfactual Outcome Prediction

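This study addresses counterfactual prediction for time series in which the target variable is subject to causal relationships. Counterfactual prediction here estimates the causal effect on the target variable in a time series that contains it, but from past observations this

Use: Counterfactual prediction
Difficulty: Hard
Cost: High

Easy to try on CPU, Sensor/time series, Deep learning, RNN / LSTM, Prediction, Text, Time series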

Zero-Copy Semantic Contagion: An In-Memory Streaming Architecture for Evolving Attention Graphs

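Existing analytical models specialize in analysis centered on specific assets and have not reflected fluctuations that propagate across industries. By taking the attention paid to related companies into account and using a continuous-time graph, the analysis results can be represented more comprehensively.

Use: Displaying analysis results as a continuous-time graph
Difficulty: Hard
Cost: Low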

Mitigating the Curse of Dimensionality in Uniform Convergence of Deep Neural Networks via Smooth Activations

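This paper addresses non-uniform convergence in deep neural networks with smooth activations and proposes a theoretical framework for uniform convergence.

MI-oriented, NLP, RAG, Regression, Text

Use: Non-uniform convergence of deep neural networks
Difficulty: Hard
Cost: Low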

Emergent Language as an Approach to Conscious AI

The question of whether artificial systems can be conscious remains open, in part because existing approaches

Reinforcement learning, Multi-agent, Detection, Generation, Text

Use: Detection
Difficulty: Hard
Cost: Medium

HANDOFF: Humanoid Agentic Task-Space Whole-Body Control via Distilled Complementary Teachers

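HANDOFF is a framework built to control humanoid robots. The robot recognizes a task and generates motions. To form an agent that generates motions suited to the task, HANDOFF teacher and stu

Use: Enabling robot control with human-like agents
Difficulty: Hard
Cost: Medium

Quality prediction/anomaly detection, NLP, RAG, Segmentation, Text, Video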

VOLT: Vision and Language Trajectory Segmentation for Faster-than-Demonstration Policies

In this study, faster-than-demonstration automated operat

Use: High-speed motion for faster-than-demonstration automated operation
Difficulty: Hard
Cost: High

Synthetic Data Generation and Vision-based Wrinkle and Keypoint Detection for Bimanual Cloth Manipulation

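We developed a learning system for cloth manipulation. The system allows humans to learn cloth manipulation.

Quality prediction/anomaly detection, Deep learning, CNN, Detection, Generation, Image

Use: Learning cloth manipulation
Difficulty: Hard
Cost: Medium

Quality prediction/anomaly detection, NLP, Prompt engineering, Text, Multimodal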

MPCoT: Reward-Guided Multi-Path Latent Reasoning for Test-Time Scalable Vision-Language-Action

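Recognizing that Vision-Language-Action (VLA) policies are brittle in long-horizon, high-uncertainty control, and that VLA policies provide only single-pass action decoding, for long-horizon

Use: A solution to the brittleness of VLA policies in long-horizon and high-uncertainty control
Difficulty: Hard
Cost: High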

AffordanceVLA: A Vision-Language-Action Model Empowering Action Generation through Affordance-Aware Understanding

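This repository proposes a model that aims to give image recognition models the ability to generate actions. Using pretrained image recognition models, it can generate complex actions.

Deep learning, Transformer, Detection, Generation, Prediction

Use: Image recognition and action generation
Difficulty: Hard
Cost: High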

MotionDisco: Motion Discovery for Extreme Humanoid Loco-Manipulation

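This study proposes MotionDisco for humanoid robot loco-manipulation, enabling the robot to detect contacts and act autonomously.

Deep learning, Compression/quantization, Text, Video, Reinforcement learning

Use: Humanoid robot loco-manipulation
Difficulty: Hard
Cost: High

arXiv, GitHub available, 2026-06-04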

A Conversational Framework for Human-Robot Collaborative Manipulation with Distributed Generative AI models

This study proposes a distributed conversational framework for human-robot collaboration.

NLP, LLMs, Generation, Image, Text

Use: Human-robot collaboration
Difficulty: Hard
Cost: High

World-Language-Action Model for Unified World Modeling, Language Reasoning, and Action Synthesis

Proposes a unified vision-language-action model and uses it to improve task performance.

Use case: Unified vision-language-action model
Difficulty: Hard
Cost: High

T-FunS3D: Task-Driven Hierarchical Open-Vocabulary 3D Functionality Segmentation

Open-vocabulary 3D functionality segmentation enables robots to localize functional object components in 3D sc

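Natural language processing, RAG, Classification, Segmentation, Image

Use case: Classification
Difficulty: Hard
Cost: High

Computer vision, Multimodal, Anomaly detection, Text, Video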

Towards a Data Flywheel for Embodied Intelligence in Logistics

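In autonomous driving, a robot must decide its actions based on visually perceived information, but a spatial model built from past data makes it hard to predict the robot's behavior, so by building a spatial model

Use case: Building a space suited to predicting robot behavior
Difficulty: Hard
Cost: High

Suited to tabular data, Computer vision, Segmentation, Detection, Text, Tabular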

TabSODA: Tabular Diffusion based Imputation with Skip Pattern Detection and Ordinal Awareness

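This paper addresses imputation for tabular data with missing values and proposes a diffusion-based imputation algorithm, with skip-pattern detection and ordinal awareness, that behaves as it would on cells without missing values.

Use case: Imputing missing values in tabular data
Difficulty: Hard
Cost: High

Computer vision, Segmentation, Detection, Generation, Text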

Global Sketch-Based Watermarking for Diffusion Language Models

Watermarking methods for language models have been studied extensively in the autoregressive setting, where to

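Use case: Detection
Difficulty: Hard
Cost: High

Sensor / time series, Natural language processing, Embedding and retrieval, Generation, Text, Time series

arXiv, GitHub available, 2026-06-03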

HyFAD: Hybrid Time-Frequency Diffusion with Frequency-Aware Embedding for Time Series Imputation

Diffusion models have demonstrated strong performance in time series modeling due to their ability to progress

Use case: Generation
Difficulty: Hard
Cost: High

Knockoffs-based False Discovery Rate Control and Simplification for Deep Neural Networks

The deep neural network is a widely used framework in machine learning that has been widely applied in various

Machine learning, Supervised learning, Regression, Text

Use case: Regression
Difficulty: Hard
Cost: Medium

When Do Fewer Coordinates Suffice in DP-SGD?

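Proposes a method that uses differential privacy to protect privacy and estimates at least the subset of coordinates the model needs to update.

Deep learning, Normalization and optimization methods, Text

Use case: Privacy protection
Difficulty: Hard
Cost: High

Sensor / time series, Deep learning, RNN / LSTM, Classification, Text, Time series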

Seq103: A Unified Neuroevolution Framework for Compact Sequence Architecture Discovery

Neuroevolution is a representative neural architecture search paradigm that evolves both network topology and

Use case: Classification
Difficulty: Hard
Cost: Low

Mean-based algorithms: A lower bound and regret

Mean-based algorithms are a class of online learning algorithms that assign low probability to actions with lo

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

Improved Approximation Guarantees for Groupwise Maximin Share Fairness

We study the problem of fairly allocating a set of indivisible goods to a set of $n$ agents with additive valu

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

Learning to cooperate with emergent reputation via multi-agent reinforcement learning

Reputation, the aggregation of peer assessments diffused through social networks, is a pivotal mechanism for p

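Deep learning, Model compression and quantization, Text, Reinforcement learning

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

Sensor / time series, Quality prediction / anomaly detection, Natural language processing, Fine-tuning, Detection, Anomaly detection, Text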

TPA-AD: A Two-Stage Pseudo Anomaly-Guided Method for Bearing Time-Series Anomaly Detection

This paper proposes a two-stage pseudo anomaly-guided anomaly detection method (Two-stage Ps

Use case: Detection
Difficulty: Hard
Cost: High

Conformal Language Modeling via Posterior Sampling

Large Language Models remain plagued by hallucinations. Recent work has sought to tame their prevalence using

Use case: Generation
Difficulty: Hard
Cost: High

Suited to tabular data, Deep learning, Transformer, Generation, Text, Tabular

AugMask: Training Diffusion Models on Incomplete Tabular Data via Stochastic Augmentation and Masking

Score-based diffusion models have emerged as prominent deep generative models; however, their application to t

Use case: Generation
Difficulty: Hard
Cost: High

An Asymptotic Theory of Chain-of-Thought in In-Context Learning

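This research is designed to help healthcare workers detect symptoms of illness and make diagnoses. The researchers developed an AI algorithm and validated it in a clinical trial, where the AI reached roughly the same level of accuracy as medical professionals in disease sym

Natural language processing, Large language models, Regression, Text

Use case: Symptom detection and disease diagnosis
Difficulty: Hard
Cost: High

Easy to try on CPU, Quality prediction / anomaly detection, Deep learning, Transformer, Text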

Spike-Aware C++ INT8 Inference for Sparse Spiking Language Models on Commodity CPUs

Spiking language models expose activation sparsity that dense Transformer runtimes do not directly exploit. Th

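Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Suited to tabular data, Sensor / time series, Quality prediction / anomaly detection, Deep learning, Model compression and quantization, Classification, Text, Audio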

When Tabular Foundation Models Transfer Across Modalities: A Systematic Evaluation Across 95 Datasets, 7 Modalities, and Two Regimes

We present a single classification pipeline that combines an Equiangular Tight Frame (ETF) preprocessing stage

Use case: Classification
Difficulty: Hard
Cost: High

Decision-calibrated prediction sets for robust power system operations

Robust optimization offers a tractable approach to balance operating costs and reliability in power systems do

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

PliableBVS: A flexible Bayesian variable selection method for modeling interactions with mandatory modifying variables

High-dimensional interaction models are useful for studying, for example, how a large set of variables of inte

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

Data-Automated Policy Learning for Nonlinear Welfare

This paper explores policy learning from observational data, focusing on a nonlinear welfare criterion in a bi

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

Computer vision, Segmentation, Regression, Text

Simultaneous Model-Based Evolution of Constants and Expression Structure in GP-GOMEA for Symbolic Regression

Genetic programming (GP) approaches are among the state-of-the-art for symbolic regression, the task of constr

Use case: Regression
Difficulty: Hard
Cost: Medium

Pluralistic Leaderboards

Recent leaderboard-based evaluations of large language models aggregate user feedback by fitting a Bradley--Te

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Conditional Graph Diffusion for Negotiation Support: Overcoming Discrete Infeasibility and Preference Elicitation Gaps

Traditional bilateral negotiation support systems search over discrete allocation spaces. This approach encoun

Use case: Generation
Difficulty: Hard
Cost: High

Practical and Optimal Algorithm for Linear Contextual Bandits with Rare Parameter Updates

We study linear contextual bandits under rare parameter updates: the learner may incorporate reward feedback i

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

Truthful AI Advisors: A Pre-Specified Benchmark for Large Language Model Honesty Under Preference Misalignment

Large language models are increasingly deployed as advisors whose objective is not aligned with the user's: re

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Domination-Avoiding Learning Agents Cannot Collude

An influential paper of Calvano et al. empirically demonstrated that Q-learning agents spontaneously collude w

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Cheap Talk in Bilateral Trade

A single seller offers one or more goods to a single buyer. The buyer's values and the seller's costs are priv

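Quality prediction / anomaly detection, Natural language processing, RAG, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low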

Fairness in two-player zero-sum games with bandit feedback

We study two-player zero-sum games (TPZSGs) with bandit feedback under fairness constraints requiring every ac

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

arXiv, Paper only, 2026-05-30

Active Learning with Foundation Model Priors: Efficient Learning under Class Imbalance

Real-world datasets across image and text domains are often characterized by skewed class distributions and no

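Suited to small datasets, Condition optimization, Deep learning, Model compression and quantization, Image, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

arXiv, Paper only, 2026-05-30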

Position: Prioritize Identifying Structure, Not Complex Models, for Scientific Discovery

Modern Machine Learning (ML) and Artificial Intelligence (AI) models, especially large language models (LLMs),

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-29

Institutions and the transmission of upper-tail human capital: scientific lineages across a millennium

What made useful knowledge cumulative was not discovery alone but the institutions that transmitted it. We pro

Computer vision, Segmentation, Generation, Text

Use case: Generation
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-29

From Talking Words to Sharing Thoughts: Scalable Multi-LLM Aggregation via Structured Message Passing

The emergence of specialized, domain-tuned Large Language Models (LLMs) has demonstrated that smaller models c

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-29

Used Car Salesbots? Honesty and Credulity of LLMs as Bargaining Agents under Partial Information

In this work we study agents in simulated bargaining scenarios, where a buyer and a seller communicate through

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arXiv, GitHub available, 2026-05-29

Welfare, Improvability, and Variance: A Principal-Agent Approach to Optimal Benchmark Item Aggregation

AI benchmarks have well-documented limitations, with prior work examining contamination, saturation, and const

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: Low

Evolutionary Rule Extraction from Corporate Default Prediction Models

Small and medium-sized enterprises (SMEs) represent the majority of firms in most economies and often face fin

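Explainable, Condition optimization, Natural language processing, RAG, Classification, Generation, Regression

Use case: Classification
Difficulty: Hard
Cost: Low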

Runtime Analysis of a Compact Genetic Algorithm on a Truly Multi-valued OneMax Function

Recently, the runtime analysis of multi-valued estimation-of-distribution algorithms in the framework of Ben J

Deep learning, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

Compute Allocation in Evolutionary Search: From Depth-Breadth to Multi-Armed Bandits

LLM-guided evolutionary search (Evolve systems) has reached state-of-the-art results on mathematical and combi

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arXiv, GitHub available, 2026-05-28

PokerSkill: LLMs Can Play Expert-Level Poker without Training or Solvers

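Poker is a representative problem in AI. However, reaching a strong expert level has required extensive training and solvers. With an LLM, training and solvers become unnecessary, and playing poker

Explainable, Natural language processing, Large language models, Text

Use case: Poker
Difficulty: Hard
Cost: High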

Evolutionary Dynamics of Cooperation in Next-Generation LLM Agent Systems: A Cross-Provider Empirical Extension

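Investigates the factors that affect cooperativeness in next-generation LLMs. ChatGPT-4o and Claude 3.5 Sonnet showed similar cooperativeness, although they come from different providers.

Use case: Factors affecting cooperativeness in next-generation LLMs
Difficulty: Hard
Cost: High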

Bridging Semantics and Strategy: A Dual-Stream Graph Network for Equitable Negotiation Forecasting

Forecasting outcomes in mixed-motive negotiations requires integrating explicit linguistic cues with latent st

Deep learning, Transformer, Prediction, Text

Use case: Prediction
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-27

Performance and Explainability Requirements of Evolutionary Algorithms in Real-World Physics-Informed Optimization

Evolutionary computation offers a variety of tools to solve complex real-world optimization problems. However,

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

arXiv, Paper only, 2026-05-27

Evolving to the Aesthetics of a Vision-Language Model

Evolutionary systems have demonstrated remarkable results in creative domains, with recent applications in gen

Computer vision, Multimodal, Generation, Text

Use case: Generation
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-27

Adaptive Bandit Algorithms for Contextual Matching Markets

We study bandit learning in matching markets, where players and arms constitute the two market sides, and the

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

arXiv, Paper only, 2026-05-26

Why Prompt Optimization Works, and Why It Sometimes Doesn't: A Causal-Inspired Edit-Level Analysis

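When optimizing an LLM with reinforcement learning, choosing appropriate parameters is important. This study investigates how reinforcement-learning parameters affect LLM performance and aims to propose a method for optimizing them.

Use case: Reinforcement learning
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-26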

Constitutional Arms Races in the Public Goods Game: Co-Evolving LLM Constitutions Under Cooperation-Defection Pressure

Frontier LLM agents engage in blackmail, sabotage, and document leaks under goal conflicts in agentic settings

Explainable, Natural language processing, Large language models, Generation, Regression, Text

Use case: Generation
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-26

Proper Calibeating

The classic concept of "calibrated forecasts" and its more recent refinement, "calibeating," are defined with

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

arXiv, Paper only, 2026-05-25

AgentSociety: Incentivizing Agentic Social Intelligence

The success of deployed agents relies on their ability to handle open-ended user requests using their inherent

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-23

Cloud Computing Review: A Decade of Research

The popularity and rapid development of Cloud Computing in recent years has led to a vast number of publicatio

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

arXiv, Paper only, 2026-05-22

Planktonzilla: Multimodal dataset and models for understanding plankton ecosystems

Marine plankton underpin aquatic food webs and play a key role in global CO2 sequestration, making reliable sp

Suited to small datasets, Deep learning, Transformer, Classification, Image, Text

Use case: Classification
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-22

PeerBTS: Incentivizing Effort in Strategyproof Peer Selection

Peer selection, the evaluation and selection of agents by their peers, is an important problem in the field of

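Quality prediction / anomaly detection, Natural language processing, RAG, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

arXiv, Paper only, 2026-05-22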

GENSTRAT: Toward a Science of Strategic Reasoning in Large Language Models

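This paper proposes a method for evaluating strategic reasoning in large language models.

Use case: Evaluating strategic reasoning in large language models
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-21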

Vector Policy Optimization: Training for Diversity Improves Test-Time Search

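Language models are now expected to generalize to novel environments and to be combined with search methods that scale inference, such as AlphaEvolve. However, under the standard paradigm, LLMs are pre

Use case: Strengthening diversity so that language models can handle search tasks
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-21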

Not Yet: Humans Outperform LLMs in a Colonel Blotto Tournament

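Studies the advantage humans hold over LLMs and shows that humans beat LLMs in a Colonel Blotto tournament, a variant of the Colonel Blotto game.

Use case: Human advantage over LLMs in predicting behavior
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-19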

What Do Evolutionary Coding Agents Evolve?

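To evolve code generation, recent work combines LLMs with evolutionary search, using task-specific feedback to generate, edit, and select code. The best score under a task-specific evaluator

Use case: Solving the problem of evolving code generation
Difficulty: Hard
Cost: High

arXiv, GitHub available, 2026-05-19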

optimize_anything: A Universal API for Optimizing any Text Parameter

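We proposed a system that uses LLMs (large language models) to optimize text parameters. A single system could handle a variety of settings (single-task, multi-task, unseen inputs, and so on). In addition, the system optim

Use case: Optimizing arbitrary text parameters
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-19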

A Nash Equilibrium Framework For Training-Free Multimodal Step Verification

Multimodal large language models often generate reasoning chains containing subtle errors that lead to incorre

Natural language processing, Large language models, Text, Multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-19

Real-Time Parallel Counterfactual Regret Minimization

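This study aims to improve the CFR (Counterfactual Regret Minimization) algorithm to estimate optimal actions in real-time games. CFR, when the time available to make a decision is strictly limit

Easy to try on CPU, Deep learning, Model compression and quantization, Text

Use case: Estimating optimal actions in games
Difficulty: Hard
Cost: Medium

arXiv, Paper only, 2026-05-18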

Reinterpreting Safety Thresholds as Neuron Spiking Thresholds

Surrogate Safety Measures (SSMs) are extensively utilised in the evaluation of traffic risk in automated drivi

Explainable, Deep learning, Transformer, Text, 3D

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-17

On the Complexity of Correlated Equilibria Beyond Normal-Form Games

Correlated equilibria are a fundamental solution concept in game theory. However, despite decades of research,

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

arXiv, Paper only, 2026-05-16

A Truthful Multiunit Profit-Optimal Mechanism for Synthesizing Social Laws

This paper studies Social Law Synthesis (SLS) in strategic multi-agent environments as a new multi-unit mechan

Computer vision, Video recognition, Generation, Text

Use case: Generation
Difficulty: Hard
Cost: High

MO-CAPO: Multi-Objective Cost-Aware Prompt Optimization

Large language models (LLMs) achieve strong performance across a wide range of tasks but are highly sensitive

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Structure Abstraction and Generalization in a Hippocampal-Entorhinal Inspired World Model

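Incorporates hippocampal-entorhinal structure to learn abstract representations and a predictive world model.

Natural language processing, RAG, Image, Text, Supervised

Use case: Hippocampal-entorhinal world model
Difficulty: Hard
Cost: Low

Explainable, Quality prediction / anomaly detection, Deep learning, Model compression and quantization, Regression, Text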

Towards Code-Oriented LM Embeddings for Surrogate-Assisted Neural Architecture Search

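To reduce model size while keeping performance high, this work verifies that a Perforated Neural Network can be applied to a keyword-detection task and, on Edge Impulse, a keyword-detection sy

Use case: Keyword detection
Difficulty: Hard
Cost: Low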

Domain-Independent Game Abstraction using Word Embedding Techniques

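Research proposing game-abstraction methods helps reduce the factors that make games large. However, existing methods require game-by-game analysis when applied to a different game. This is one reason abstraction is hard to generalize.

Natural language processing, Embedding and retrieval, Text

Use case: Game abstraction
Difficulty: Hard
Cost: Low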

On the Stability of Growth in Structural Plasticity

Standard deep-learning pipelines usually choose the network architecture before training and keep it fixed thr

Deep learning, CNN, Classification, Image, Text

Use case: Classification
Difficulty: Hard
Cost: High

Darwin Family: MRI-Trust-Weighted Evolutionary Merging for Training-Free Scaling of Language-Model Reasoning

Darwin Family

Use case: Training-free scaling of large language models, targeting self-evolving language models
Difficulty: Hard
Cost: High

Learning to Persuade a Biased Receiver

We study a repeated information design setting in which the receiver, who is also the decision-maker, updates

Natural language processing, Fine-tuning, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

Watermarking Game-Playing Agents in Perfect-Information Extensive-Form Games

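Development of watermarking techniques for extensive-form games to detect and prevent the illicit use of AI tools in games.

Use case: Countering cheating in games
Difficulty: Hard
Cost: High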

Dual-axis attribution of zebrafish tectal microcircuits for energy-efficient and robust neurocomputing

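Designs a brain model intended to save energy and proposes a method aimed at interpreting the model's intermediate representations.

Use case: Designing a brain model intended to save energy
Difficulty: Hard
Cost: Medium

Suited to MI, Quality prediction / anomaly detection, Deep learning, Model compression and quantization, Generation, Text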

Texture Regenerating and Grafting Using Genome-Driven Neural Cellular Automata

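Proposes a method that enables texture regeneration and grafting, with the goal of using neural cellular automata (NCAs) for texture generation.

Use case: Texture regeneration and grafting
Difficulty: Hard
Cost: High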

The Geno-Synthetic Algorithm: Type-Factored Coevolutionary Optimization for Heterogeneous Genotypes and Assembled Phenotypes

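Proposes a type-factored coevolutionary method for handling parameters of multiple types, and this method

Use case: Type-factored coevolution for handling parameters of multiple types
Difficulty: Hard
Cost: High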

Extended Scenario Bundle Analysis: A Formal Framework for Strategic Scenario Modeling

Strategic crisis analysis needs representations that combine qualitative expert judgement, explicit interdepen

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

TERMS-Bench: Diagnosing LLM Negotiation Agents Beyond Deal Rate

Negotiation is a central mechanism of economic exchange, shaping markets, procurement, labor agreements, and r

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Offline Two-Player Zero-Sum Markov Games with KL Regularization

We study the problem of learning Nash equilibria in offline two-player zero-sum Markov games. While existing a

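Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

Suited to MI, Quality prediction / anomaly detection, Deep learning, Model compression and quantization, Generation, Text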

ToolMol: Evolutionary Agentic Framework for Multi-objective Drug Discovery

Advances in large language models (LLMs) have recently opened new and promising avenues for small-molecule dru

Use case: Generation
Difficulty: Hard
Cost: High

Solve the Loop: Attractor Models for Language and Reasoning

Solve the Loop is a study that introduces an algorithm that helps improve refraction Transformers.

Use case: Improving refraction Transformers
Difficulty: Hard
Cost: High

Black-Box Optimization of Mixed Binary-Continuous Variables: Challenges and Opportunities in Evolutionary Model Merging

Model merging has emerged as a cost-effective alternative to training large language models (LLMs) from scratc

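Condition optimization, Natural language processing, Large language models, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Suited to tabular data, Quality prediction / anomaly detection, Deep learning, Model compression and quantization, Text, Tabular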

Graph-Grounded Optimization: Rao-Family Metaheuristics, Classical OR, and SLM-Driven Formulation over Knowledge Graphs

We propose graph-grounded optimization: a paradigm in which the decision variables, constraints, and objective

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Scaling Laws and Tradeoffs in Recurrent Networks of Expressive Neurons

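Recurrent networks have complex processors, which makes them hard to optimize. Under a limited compute budget, the allocation of parameters has to be balanced.

Use case: Optimizing the structure of recurrent networks
Difficulty: Hard
Cost: Medium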

Position Auctions with a Capacity Constraint

Sponsored search auctions are commonly modeled as an assignment of a fixed set of slots (positions) to a set o

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

Decomposing Evolutionary Mixture-of-LoRA Architectures: The Routing Lever, the Lifecycle Penalty, and a Substrate-Conditional Boundary

We decompose an evolutionary mixture-of-LoRA system on a from-scratch ~150M-parameter widened-D substrate (D=1

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

Energy-Efficient Implementation of Spiking Recurrent Cells on FPGA

Proposes a method for implementing a spiking neural network model on an FPGA and reducing energy consumption.

Use case: Energy efficiency
Difficulty: Hard
Cost: Medium

A Theory of Multilevel Interactive Equilibrium in NeuroAI

Aims to build a game-theoretic framework for multi-agent systems and to provide a theoretical foundation for excitability.

Use case: Multi-agent systems
Difficulty: Hard
Cost: High

Joint sparse coding and temporal dynamics support context reconfiguration

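This study investigates the relationship between adaptability and "リメインリング", an important factor for learning in dynamic environments.

Use case: Behavioral adaptability and "リメインリング"
Difficulty: Hard
Cost: High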

Prospective Compression in Human Abstraction Learning

Proposes a new approach for estimating human-like abstraction, enabling efficient learning of unseen tasks.

Use case: Human-like abstraction
Difficulty: Hard
Cost: High

When to Ask a Question: Understanding Communication Strategies in Generative AI Tools

Generative AI models differ from traditional machine learning tools in that they allow users to provide as muc

Use case: Generation
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-10

Parameter-Efficient Neuroevolution for Diverse LLM Generation: Quality-Diversity Optimization via Prompt Embedding Evolution

Large Language Models exhibit mode collapse, producing homogeneous outputs that fail to explore valid solution

Use case: Generation
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-10

EvoPref: Multi-Objective Evolutionary Optimization Discovers Diverse LLM Alignments Beyond Gradient Descent

Gradient-based preference optimization methods for large language model (LLM) alignment suffer from preference

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-09

Evolutionary Ensemble of Agents

We introduce Evolutionary Ensemble (EvE), a decentralized framework that organizes existing, highly capable co

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arXiv, GitHub available, 2026-05-09

ARES-LSHADE: Autoresearch-Enhanced LSHADE with Memetic Polish for the GNBG Benchmark

We present ARES-LSHADE, a memetic differential-evolution variant submitted to the GECCO 2026 competition on LL

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-09

AHD Agent: Agentic Reinforcement Learning for Automatic Heuristic Design

Automatic heuristic design (AHD) has emerged as a promising paradigm for solving NP-hard combinatorial optimiz

Deep learning, Model compression and quantization, Generation, Text, Reinforcement learning

Use case: Generation
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-08

Kernel Foundry: A Diagnosis-driven Evolutionary Kernel Optimizer with Multi-Experts

Generating high-performance GPU kernels remains challenging due to the need for both correctness and hardware-

Use case: Generation
Difficulty: Hard
Cost: High

arXiv, GitHub available, 2026-05-07

CoupleEvo: Evolving Heuristics for Coupled Optimization Problems Using Large Language Models

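CoupleEvo proposes an LLM-based automatic heuristic design approach for coupled optimization problems. Three evolutionary coordination strategies are presented.

Use case: Solving coupled optimization problems
Difficulty: Hard
Cost: High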