MLinfo | Machine Learning and AI Paper Digest

MLinfo | Catch up on and search technology that is updated daily

Search results for "text"

627 results


github · GitHub available · 2026-06-10

screenpipe — YC (S26) | AI that knows what you've seen, said, or heard. Records everything you do, say, hear 24/7, local, private, secure

A tool for recognizing user activity and building automated agents.

Natural language processing, Large language models, Text, Audio, Multimodal

Use: Building automated agents
Difficulty: Easy
Cost: High

→

github · GitHub available · 2026-06-09

transformers — 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

🤗 Transformers is a framework that supports model definitions for text, vision, audio, and more, and can be used for both inference and training.

Deep learning, Transformer, Classification, Text, Audio

Use: Machine learning model definition
Difficulty: Easy
Cost: High

→

github · GitHub available · 2026-06-09

paperless-ngx — A community-supported supercharged document management system: scan, index and archive all your documents

paperless-ngx is a community-supported, supercharged document management system that can scan, index, and archive documents.

Reinforcement learning, Policy gradient (PPO / A3C), Classification, Text

Use: Document management
Difficulty: Easy
Cost: Low

→

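github · GitHub available · 2026-06-09

diffusers — 🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.

A library of diffusion models, usable for image, video, and audio generation.

Generative AI, Diffusion models, Generation, Image, Text

Use: Image, video, and audio generation
Difficulty: Easy
Cost: High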

→

github · GitHub available · 2026-06-09

label-studio — Label Studio is a multi-type data labeling and annotation tool with standardized output format

A tool for data labeling and annotation.

Computer vision, Object detection, Classification, Segmentation, Image

Use: Data labeling tool
Difficulty: Easy
Cost: Low

→

github · GitHub available · 2026-06-09

cs249r_book — Machine Learning Systems

A book on the theory and implementation of machine learning systems.

Deep learning, Text

Use: Machine learning systems
Difficulty: Easy
Cost: Medium

→

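github · GitHub available · 2026-06-09

awesome-llm-unlearning — A resource repository for machine unlearning in large language models

This repository collects resources on machine unlearning for large language models.

Natural language processing, Large language models, Text

Use: Unlearning in large language models
Difficulty: Easy
Cost: High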

→

github · GitHub available · 2026-06-09

Meshroom — Node-based Visual Programming Toolbox

A node-based visual programming tool.

Computer vision, 3D / point clouds, Image, Text, 3D

Use: Visual programming tool
Difficulty: Easy
Cost: High

→

github · GitHub available · 2026-06-09

unsloth — Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.

Unsloth Studio is a web UI for training and running open models. It supports testing and training open models such as Gemma 4 and Qwen3.6.

Natural language processing, Large language models, Text, Audio

Use: Training and running open models
Difficulty: Easy
Cost: High

→

github · GitHub available · 2026-06-09

sglang — SGLang is a high-performance serving framework for large language models and multimodal models.

SGLang is a high-performance serving framework for large language models.

Deep learning, Transformer, Image, Text, Multimodal

Use: Serving framework for large language models
Difficulty: Easy
Cost: High

→

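github · GitHub available · 2026-06-09

Sana — SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer

This work introduces SANA, a high-resolution image generation model that produces high-quality, high-resolution images at low computational cost.

Deep learning, Transformer, Generation, Image, Text

Use: High-resolution image synthesis
Difficulty: Easy
Cost: High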

→

github · GitHub available · 2026-06-09

Helios — Helios: Real Real-Time Long Video Generation Model

Introduces a model for long video generation.

Deep learning, Model compression / quantization, Generation, Image, Text

Use: Video generation
Difficulty: Easy
Cost: High

→

github · GitHub available · 2026-06-09

haystack — Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and agent workflows with explicit control over retrieval, routing, memory, and generation. Built for scalable agents, RAG, multimodal applications, semantic search, and conversational systems.

An open-source AI orchestration framework. It lets you design the pipelines and agent workflows needed to build LLM applications.

Deep learning, Transformer, Generation, Summarization, Text

Use: Building LLM applications
Difficulty: Easy
Cost: High

→

github · GitHub available · 2026-06-09

DocsGPT — Private AI platform for agents, assistants and enterprise search. Built-in Agent Builder, Deep research, Document analysis, Multi-model support, and API connectivity for agents.

A private AI platform for agents, assistants, and enterprise search, with document analysis and multi-model support.

Deep learning, Transformer, Text

Use: Agents and enterprise search
Difficulty: Easy
Cost: Medium

→

github · GitHub available · 2026-06-09

FunASR — Industrial-grade speech recognition toolkit: 170x realtime, 50+ languages, speaker diarization, emotion detection, streaming, and OpenAI-compatible API.

An industrial-grade speech recognition toolkit with speaker diarization, emotion detection, and streaming support.

Deep learning, Transformer, Classification, Detection, Text

Use: Speech recognition
Difficulty: Easy
Cost: Low

→

github · GitHub available · 2026-06-09

unstructured — Convert documents to structured data effortlessly. Unstructured is open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise grade Platform product for production grade workflows, partitioning, enrichments, chunking and embedding.

An open-source ETL solution for converting documents into structured data.

Suited to tabular data, Natural language processing, Large language models, Image, Text, Tabular

Use: Document structuring
Difficulty: Easy
Cost: High

→

github · GitHub available · 2026-06-09

txtai — 💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows

A framework for working with LLMs that provides semantic search, LLM orchestration, and related workflows.

Deep learning, Transformer, Generation, Text

Use: Semantic search
Difficulty: Easy
Cost: High

→

github · GitHub available · 2026-06-09

TextBlob — Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.

A library for text analysis, including sentiment analysis and word segmentation.

Natural language processing, Text, Audio

Use: Text analysis
Difficulty: Easy
Cost: Medium

→

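arxiv · Paper only · 2026-06-08

Rethinking the Divergence Regularization in LLM RL

To improve the stability of LLM RL, this paper proposes a decentralized PPO that uses discretization and weighting. The improved stability makes RL with large language models practical.

Natural language processing, Large language models, Text, Reinforcement learning

Use: Improving LLM RL stability
Difficulty: Hard
Cost: High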

→

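arxiv · Paper only · 2026-06-08

Echo-Memory: A Controlled Study of Memory in Action World Models

This study designs and evaluates an episodic memory model in a controlled setting. The model stores important information within an episode and identifies correlations across episodes.

Quality prediction / anomaly detection, Computer vision, Segmentation, Generation, Image, Text

Use: Episodic memory
Difficulty: Hard
Cost: High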

→

arxiv · Paper only · 2026-06-08

Bandits for Efficient Experimentation: Adapting to Control Group, Preferences, and Context Drifts

This study designs an efficient bandit method that adapts its selections to user interactions and changes in context.

Deep learning, Model compression / quantization, Regression, Text

Use: Efficient bandit experimentation
Difficulty: Hard
Cost: Medium

→

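arxiv · Paper only · 2026-06-08

Data Synthesis and Parameter-Efficient Fine-Tuning for Low-Resource NMT: A Case Study on Q'eqchi' Mayan

This study develops a data synthesis method for low-resource NMT. The synthesized corpus is then used for parameter-efficient fine-tuning of NMT models.

Deep learning, Model compression / quantization, Generation, Translation, Text

Use: Data synthesis for low-resource NMT
Difficulty: Hard
Cost: Medium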

→

arxiv · Paper only · 2026-06-08

Your Model Already Knows: Attention-Guided Safety Filter for Vision-Language-Action Models

Vision-Language-Action (VLA) models have demonstrated impressive end-to-end performance across a variety of ro

Deep learning, Model compression / quantization, Text, Multimodal

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

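arxiv · Paper only · 2026-06-08

Learning to Attack and Defend: Adaptive Red Teaming of Language Models via GRPO

AI red teams need to respond continuously to evolving attackers and defenders. Reinforcement learning can be used to discover new attacks and, at the same time, to strengthen defenses. The new framework AdvGRPO

Reinforcement learning, Policy gradient (PPO / A3C), Text

Use: Responding to attacks
Difficulty: Hard
Cost: High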

→

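arxiv · Paper only · 2026-06-08

What the Eyes See, the LLMs Miss: Exploiting Human Perception for Adversarial Text Attacks

Content moderation systems powered by large language models (LLMs) play an important role in preventing harmful online content. However, the main goal of these systems is simply to operate on tokenized text

Natural language processing, Large language models, Classification, Detection, Image

Use: Document classification
Difficulty: Hard
Cost: High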

→

arxiv · Paper only · 2026-06-08

Correlation Is Not Enough: Embedding Human Metadata for Individual Causal Discovery

Even biomedical language models return a cosine similarity score of 0.83 when relating two topics that are in fact unrelated. This suggests that off-the-shelf bio

Deep learning, Transformer, Text

Use: Individual causal discovery
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

Algorithm for Contextual Queueing Bandits with Rate-Optimal Queue Length Regret

Contextual queueing bandits provide a framework for learning to schedule heterogeneous jobs under unknown cont

Natural language processing, Fine-tuning, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

→

arxiv · Paper only · 2026-06-08

In-Context Learning for Latent Space Bayesian Optimization

Bayesian optimization (BO) is a central tool for sample-efficient design, and latent-space Bayesian optimizati

Suited to small data, Suited to tabular data, Easy to try on CPU, Suited to MI, Deep learning, Model compression / quantization, Regression, Text, Tabular

Use: Regression
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

End-to-End Context Compression at Scale

Long-context language model inference is bottlenecked by memory, as the KV cache grows with context length. Re

Quality prediction / anomaly detection, Deep learning, Model compression / quantization, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

Muon Learns More Robust and Transferable Features than Adam

Muon has recently emerged as a state-of-the-art optimizer for pretraining Large Language Models (LLMs) and vis

Deep learning, Transformer, Classification, Image, Text

Use: Classification
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

ReCoVLA: VLM-Guided Reward Compilation for Failure Recovery in Vision-Language-Action Policies

Vision-language-action (VLA) policies provide strong priors for language-conditioned manipulation, but remain

Natural language processing, RAG, Text, Multimodal

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

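arxiv · Paper only · 2026-06-08

Code Is More Than Text: Uncertainty Estimation for Code Generation

A study aimed at making code generation safe and reliable. It proposes a method for estimating uncertainty in code generation, improving the interpretability and safety of generated code.

Natural language processing, Large language models, Generation, Text

Use: Safe and reliable code generation
Difficulty: Hard
Cost: High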

→

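arxiv · Paper only · 2026-06-08

PRISM: Recovering Instruction Sets from Language Model Activations

Proposes an activation analysis for interpreting language models. Analyzing the model shows what kind of code it is generating.

Natural language processing, Large language models, Text

Use: Activation analysis for interpreting language models
Difficulty: Hard
Cost: High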

→

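arxiv · Paper only · 2026-06-08

Streaming Interventions: Can Video Large Language Models Correct Mistakes as They Occur?

Studies question answering with video large language models. It investigates the models' capabilities and limitations and proposes a method for generating answers to questions.

Deep learning, Model compression / quantization, Text, Video, Multimodal

Use: Question answering with video large language models
Difficulty: Hard
Cost: High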

→

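arxiv · Paper only · 2026-06-08

BUDDY: BUdget-Driven DYnamic Depth Routing for Adaptive Large Language Model Inference

A study aimed at efficient language model inference. It proposes automatically adjusting model depth to make inference more efficient.

Quality prediction / anomaly detection, Deep learning, Transformer, Text

Use: Efficient language model inference
Difficulty: Hard
Cost: High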

→

arxiv · Paper only · 2026-06-08

Breaking the Tokenizer Barrier: On-Policy Distillation across Model Families

On-Policy Distillation (OPD) has become a core technique in the post-training of Large Language Models (LLMs)

Deep learning, Model compression / quantization, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

Graph Mamba Operator: A Latent Simulator for Interacting Particle Systems

Modeling interacting dynamical systems requires capturing spatial interactions alongside long-range temporal d

Deep learning, Graph neural networks, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

→

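arxiv · Paper only · 2026-06-08

LargeMonitor: Monitoring Online Task-Free Continual Learning via Large Pretrained Models

In online continual learning, a model must continually accumulate knowledge from a non-stationary data stream. Model parameters must be adjusted effectively during training, but parameter-efficient prompt tuning and

Deep learning, Model compression / quantization, Detection, Text, Multimodal

Use: Online continual learning
Difficulty: Hard
Cost: High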

→

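arxiv · Paper only · 2026-06-08

Now You (Still) See Me: Detecting Evasive Steganographic Payloads in LLMs

Recent work has used linear probes to recover hidden secrets from internal activations, improving the detection of steganographic intrusions. However, to detect steganographic intrusions and inspect internal activations, steganograph

Natural language processing, Large language models, Detection, Text

Use: Steganographic intrusion detection
Difficulty: Hard
Cost: High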

→

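arxiv · Paper only · 2026-06-08

Benchmarking Empirical Privacy Protection for Adaptations of Large Language Models

This study analyzes the empirical effectiveness of privacy protection when machine learning models are adapted with privacy-preserving methods.

Deep learning, Model compression / quantization, Anomaly detection, Text

Use: Privacy protection benchmark
Difficulty: Hard
Cost: High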

→

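arxiv · Paper only · 2026-06-08

Distilling Safe LLM Systems via Soft Prompts for On Device Settings

This study introduces a dual-model system that combines a strong guard model with a low-parameter LLM and can be used for safe LLM deployment.

Deep learning, Model compression / quantization, Text

Use: Safe LLM deployment
Difficulty: Hard
Cost: High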

→

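arxiv · Paper only · 2026-06-08

Reasoning Arena: Trace Tournaments When Verifiable Rewards Fall Short

This study proposes Reasoning Arena to address the problem that, when the rewards used in reinforcement learning training are hard to verify, the reward becomes uninformative at the group level and comparisons between groups become impossible

Quality prediction / anomaly detection, Deep learning, Model compression / quantization, Generation, Text, Reinforcement learning

Use: Reinforcement learning training
Difficulty: Hard
Cost: High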

→

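arxiv · Paper only · 2026-06-08

Scaling Neural Network Verification with Tensor Parallelism and Fully Sharded Data Parallelism

This study uses Tensor Parallelism and Fully Sharded Data Parallelism to lift the GPU memory limits that constrain conventional verification architectures, and the verification of machine learning networks

Deep learning, CNN, Text, Audio

Use: Neural network verification
Difficulty: Hard
Cost: High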

→

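arxiv · Paper only · 2026-06-08

Zero-Shot Semantic Re-Identification for Autonomous Driving: A VLM Baseline Study

This study establishes a baseline for zero-shot semantic re-identification and automates semantic identification in images.

Explainable, Sensor / time series, Deep learning, CNN, Image, Text, Multimodal

Use: Semantic re-identification
Difficulty: Hard
Cost: High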

→

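arxiv · Paper only · 2026-06-08

PBSD: Privileged Bayesian Self-Distillation for Long-Horizon Credit Assignment

To address the credit assignment problem in long-horizon tasks, this study proposes Privileged Bayesian Self-Distillation (PBSD).

Quality prediction / anomaly detection, Deep learning, Model compression / quantization, Text, Reinforcement learning

Use: Long-horizon credit assignment
Difficulty: Hard
Cost: High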

→

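arxiv · Paper only · 2026-06-08

Conan-embedding-v3: Fusing Modality-Specific Models for Omni-Modal Embedding

This study builds an omni-modal retrieval system that integrates data from different modalities such as text, image, video, and audio.

Natural language processing, Fine-tuning, Regression, Retrieval, Image

Use: Omni-modal retrieval
Difficulty: Hard
Cost: High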

→

arxiv · Paper only · 2026-06-08

PRISM: Topology-Aware Cross-Modal Imputation for Modality-Deficient Federated Graph Learning

Multimodal federated graph learning (MM-FGL) aims to collaboratively learn from decentralized graphs with text

Natural language processing, RAG, Image, Text, Multimodal

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

Intention Driven Identification of In-Possession Match Phases in Association Football through Temporal Graph Learning

Understanding tactical organisation of association football, hereafter referred to as football, requires ident

Explainable, Quality prediction / anomaly detection, Deep learning, Transformer, Classification, Segmentation, Text

Use: Classification
Difficulty: Hard
Cost: Low

→

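arxiv · GitHub available · 2026-06-08

Internalizing Geometric Law: Learning from Solver Residuals for Precision-Critical Generation

We developed a system that can produce precise constructions, such as mechanical designs and technical drawings, from natural language. To produce constructions that satisfy geometric constraints, the system draws on a Constraint DSL (D

Natural language processing, Large language models, Generation, Text

Use: Generating mechanical designs and technical drawings
Difficulty: Hard
Cost: High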

→

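arxiv · Paper only · 2026-06-08

Multi-View Speech Representation Learning for Parkinson's Disease Detection Using Context-guided Cross-modal Attention

As an effort toward early detection of Parkinson's disease (PD), this work proposes diagnosing the disease through speech analysis, examining the speech impairments that arise before onset.

Sensor / time series, Deep learning, Transformer, Detection, Generation, Embedding

Use: Early detection of Parkinson's disease
Difficulty: Hard
Cost: High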

→

arxiv · Paper only · 2026-06-08

Orange Lab: Lowering Barriers to Data Mining through Embedded Interactive Workflows

This paper proposes Orange Lab, a visual programming framework for data mining. It provides a web-based data analysis environment and, as a user-facing analysis tool, data ana

Suited to MI, Natural language processing, Fine-tuning, Image, Text

Use: Data analysis workflows
Difficulty: Hard
Cost: Medium

→

arxiv · Paper only · 2026-06-08

The Injection Paradox: Brand-Level Suppression in Safety-Trained LLM Recommendations via RAG Context Injection

This paper shows that attacking a safety-trained LLM through RAG suppresses the model's inference. This is because the LLM, regarding the con used to suppress inference

Natural language processing, Large language models, Text

Use: Safe LLM inference
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

Asymptotic Optimality of Thompson Sampling for Risk-Averse Bandits with Sub-Gaussian Rewards

To reduce uncertainty and risk, this work presents $\rho$-NPTS (Nonparametric Thompson Sampling), an array-free, nonparametric variant of Thompson Sampling, for risk

Computer vision, Segmentation, Text

Use: Optimizing risk-averse multi-armed bandits
Difficulty: Hard
Cost: Medium

→

arxiv · Paper only · 2026-06-08

Crop Recommendation and Agricultural Query Answering System Using Spatio-Temporal Graph Neural Networks and Hybrid Retrieval Augmentation

This paper presents a unified system designed to support precision agriculture by integrating advanced weather

Deep learning, Transformer, Generation, Text

Use: Generation
Difficulty: Hard
Cost: Low

→

arxiv · Paper only · 2026-06-08

Late-Layer Fusion is Enough: Dual-Path Vision Token Routing for Multimodal Large Language Models under Visual Saturation

Multimodal large language models (MLLMs) commonly inherit the deep, symmetric Transformer backbone designed fo

Deep learning, Transformer, Generation, Image, Text

Use: Generation
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

Driving Video Retrieval for Complex Queries with Structured Grounding

Video retrieval at scale is central to data curation and safety validation in autonomous driving, where users

Computer vision, Multimodal, Text, Video

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

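arxiv · Paper only · 2026-06-08

From Shortcuts to Reasoning: Robust Post-Training of Theory of Mind with Reinforcement Learning

Theory of mind is considered an essential skill for modern foundation model systems to operate safely and effectively in the real world. However, progress in theory of mind faces a "shortcut" problem: tasks reach 99% accuracy, yet merely

Natural language processing, RAG, Text, Multimodal, Reinforcement learning

Use: Strengthening theory of mind
Difficulty: Hard
Cost: High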

→

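arxiv · GitHub available · 2026-06-08

Beyond FLOPs: Benchmarking Real Inference Acceleration of LLM Pruning under a GEMM-Centric Taxonomy

Pruning, which reduces tokens, layers, heads, dimensions, and attention patterns to speed up LLM inference, has grown into a broad paradigm. The acceleration actually realized depends on the implementation of the pruning method, hardware

Quality prediction / anomaly detection, Deep learning, Model compression / quantization, Text

Use: Accelerating LLM inference
Difficulty: Hard
Cost: High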

→

arxiv · Paper only · 2026-06-08

FlashMemory-DeepSeek-V4: Lightning Index Ultra-Long Context via Lookahead Sparse Attention

LLM inference often involves long contexts, which creates a GPU memory bottleneck. To address this, a component called the Neural Memory Indexer

Natural language processing, Large language models, Text

Use: GPU memory bottleneck
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

Stage-1 Controls the Entropy Regime, Not the Outcome

Two-stage post-training -- a Stage-1 warm-start (supervised fine-tuning, SFT, or on-policy distillation, OPD)

Deep learning, Model compression / quantization, Text, Multimodal, Reinforcement learning

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

MilliVid: Hierarchical Latents for Long-Range Consistency in Video Generation

Video generative models have become increasingly powerful, but long-range consistency remains challenging to a

Deep learning, Transformer, Generation, Text, Video

Use: Generation
Difficulty: Hard
Cost: High

→

arxiv · GitHub available · 2026-06-08

INFUSER: Influence-Guided Self-Evolution Improves Reasoning

Self-evolution offers a scalable path to stronger reasoning: a pretrained language model improves itself with

Machine learning, Unsupervised learning, Text, Unsupervised

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

Decoy-Calibrated Failure Audits for Language Models

Useful audits reveal not only how often a model fails, but also where its failures concentrate. An auditor may

Natural language processing, Large language models, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxiv · GitHub available · 2026-06-08

TRIAGE: Dialectical Reasoning for Explainable Risk Prediction on Irregularly Sampled Medical Time Series with LLMs

Clinical early warning systems built on electronic health records, in which clinical observations are recorded

Explainable, Sensor / time series, Quality prediction / anomaly detection, Natural language processing, Large language models, Text, Time series

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

Understanding Quantization-Aware Training: Gradients at Quantized Weights Bias to the Low-Loss Basin

Post-training quantization (PTQ) converts a trained full-precision model into low-bit weights without task-lev

Deep learning, Model compression / quantization, Text, Multimodal

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

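arxiv · Paper only · 2026-06-08

Structure-Aware Modeling of Multiple-Choice Questions Improves Automatic Difficulty Estimation

Automatically estimating question difficulty reduces the effort of writing questions for teaching materials and can improve learner outcomes.

Quality prediction / anomaly detection, Deep learning, Model compression / quantization, Text

Use: Question difficulty estimation
Difficulty: Hard
Cost: Medium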

→

arxiv · Paper only · 2026-06-08

Hardening Agent Benchmarks with Adversarial Hacker-Fixer Loops

Proposes a new approach that uses adversarial hacker loops to assess risk and improve agent safety.

Natural language processing, Large language models, Text

Use: Improving agent safety
Difficulty: Hard
Cost: High

→

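arxiv · Paper only · 2026-06-08

From Hazard Functions to Language Space: Cox-Supervised Distillation of Survival Risk into a Large Language Model

Proposes a new approach that uses the Cox proportional hazards model to apply language models to survival risk.

Deep learning, Model compression / quantization, Generation, Image, Text

Use: Applying language models to survival risk
Difficulty: Hard
Cost: High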

→

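arxiv · Paper only · 2026-06-08

Diffuse AI Control on Fuzzy Tasks

This paper proposes a new framework that makes it easier to verify the safety of AI systems, so that their safety can be evaluated more effectively.

Natural language processing, Large language models, Text

Use: AI safety verification
Difficulty: Hard
Cost: High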

→

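arxiv · Paper only · 2026-06-08

OmniGameArena: A Unified UE5 Benchmark for VLM Game Agents with Improvement Dynamics

This paper provides a benchmark for evaluating VLM game agents, enabling comparison across different types of agents.

Natural language processing, Large language models, Text, Multimodal

Use: Benchmark for VLM game agents
Difficulty: Hard
Cost: High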

→

arxiv · Paper only · 2026-06-08

AHA-WAM: Asynchronous Horizon-Adaptive World-Action Modeling with Observation-Guided Context Routing

To improve robot control, this paper proposes a method that jointly models the dynamics of the robot's visual scene and its actions.

Deep learning, Transformer, Image, Text, Video

Use: Remote handling control
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

Evaluation Cards: An Interpretive Layer for AI Evaluation Reporting

This paper proposes a new framework for interpreting AI evaluation results more effectively.

Explainable, Computer vision, Segmentation, Text

Use: Interpreting AI evaluation results
Difficulty: Hard
Cost: Medium

→

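arxiv · Paper only · 2026-06-08

SearchSwarm: Towards Delegation Intelligence in Agentic LLMs for Long-Horizon Deep Research

This paper proposes a new framework for improving agents' delegation ability, allowing agents to split tasks more efficiently.

Quality prediction / anomaly detection, Natural language processing, Large language models, Text

Use: Agent delegation
Difficulty: Hard
Cost: High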

→

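arxiv · Paper only · 2026-06-08

Beyond Probabilistic Similarity: Structural, Temporal, and Causal Limitations of Retrieval-Augmented Generation in the Legal Domain

This paper examines the structural, temporal, and causal limitations of retrieval-augmented generation in the legal domain.

Natural language processing, RAG, Detection, Generation, Text

Use: RAG in the legal domain
Difficulty: Hard
Cost: Low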

→

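arxiv · Paper only · 2026-06-08

Observability for Delegated Execution in Agentic AI Systems

This paper proposes a new framework for the observability of delegated execution, so that delegated execution can be evaluated more effectively.

Deep learning, Model compression / quantization, Text

Use: Observability of delegated execution
Difficulty: Hard
Cost: High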

→

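arxiv · GitHub available · 2026-06-08

An 84-Format Numeric Catalog with Bit-Exact Conformance Vectors: A Vendor-Neutral Reference for FP8, BF16, MXFP4, and Microscaling Formats

This paper proposes a standardization of numeric formats, making the interpretation and manipulation of numbers more efficient.

Machine learning, Supervised learning, Text

Use: Standardizing numeric formats
Difficulty: Easy
Cost: Medium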

→

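arxiv · Paper only · 2026-06-08

(Auto)formalization is supposed to be easy: Trellis process semantics for spelling out rigorous proofs

This paper proposes an approach to autoformalization that makes the formalization process more efficient.

Natural language processing, Large language models, Text

Use: Autoformalization
Difficulty: Hard
Cost: High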

→

arxiv · Paper only · 2026-06-08

SpatialWorld: Benchmarking Interactive Spatial Reasoning of Multimodal Agents in Real-World Tasks

Spatial reasoning is a foundational capability for multimodal large language models (MLLMs) to perceive and op

Natural language processing, Large language models, Image, Text, Multimodal

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

ArtiFact: A Large-Scale Multi-Modal Cultural Heritage Dataset

A large-scale multimodal dataset for cultural heritage.

Quality prediction / anomaly detection, Computer vision, Video recognition, Detection, Image, Text

Use: Cultural heritage dataset
Difficulty: Hard
Cost: High

→

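arxiv · Paper only · 2026-06-08

ATN3D: Density-Aware LiDAR-Radar Early 3D Object Detection Under Extreme Sparsity

3D object detection is essential for perception in automated vehicles such as self-driving cars and in intelligent transportation systems. Long-range detection on the road is difficult, and at these ranges only about 1-2 seconds are available for perception and decision-making. Two main challenges

Sensor / time series, Deep learning, Transformer, Classification, Detection, Text

Use: 3D object detection for long-range vehicle perception
Difficulty: Hard
Cost: High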

→

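arxiv · Paper only · 2026-06-08

AGENTSERVESIM: A Hardware-aware Simulator for Multi-Turn LLM Agent Serving

With LLMs, model calls alternate with external tool calls, shifting serving from stateless request handling to stateful program execution. Evaluating these workloads takes dedicated accelerator time for each design point

Natural language processing, Large language models, Text

Use: Simulator for LLM serving
Difficulty: Hard
Cost: High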

→

arxiv · Paper only · 2026-06-08

Optical Reasoning: Rethinking Images as an Expressive Reasoning Medium Beyond Text

Chain-of-Thought (CoT) improves the performance of Large Language Models (LLMs) and has been extended to Multi

Deep learning, Model compression / quantization, Image, Text, Multimodal

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

TABVERSE: Benchmarking Cross-Format Table Understanding in LLMs and VLMs

Large Language Models (LLMs) and Vision-Language Models (VLMs) are increasingly evaluated on table reasoning t

Natural language processing, Large language models, QA, Image, Text

Use: QA
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

AI Scientists Are Only as Good as Their Evidence: A Stratified Ablation of Proprietary Data and Reasoning Skills in Drug-Asset Valuation

AI Scientist agents are often evaluated as if capability were mainly a function of model quality, prompting, o

Quality prediction / anomaly detection, Deep learning, Transformer, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

FuseFSS: Efficient Secure LLM Inference with Function Secret Sharing

Two-server secure inference allows a client to query a hosted large language model (LLM) without revealing pro

Deep learning, Transformer, Generation, Text

Use: Generation
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

SecureClaw: Clawing Back Control of LLM Agents

Tool-using large language model (LLM) agents face two distinct security failures: unauthorized external action

Natural language processing, Large language models, Classification, Text

Use: Classification
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

Model Poisoning Against Federated Model Adaptation with Chain of Bit-Flips

Federated Learning (FL) allows a set of clients to collectively train a global model without sharing local tra

Deep learning, CNN, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

Emergence of Context Characteristics Sensitivity in Large Language Models

During instruction fine-tuning (IFT), large language models (LLMs) learn to follow instructions by using the p

Deep learning, Transformer, Text, Reinforcement learning

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

Closing the Prior-Posterior Loop: Self-Reflective Molecular Design with Analysis-Driven LLM Iteration

Can a general-purpose large language model design molecules with the precision of a seasoned chemist? Current

Suited to MI, Natural language processing, Large language models, Generation, Text

Use: Generation
Difficulty: Hard
Cost: High

→

arxiv · GitHub available · 2026-06-08

From Rigid to Dynamic: Entropy-Guided Adaptive Inference for Long-Context LLMs

Existing sparse attention and KV cache compression methods for long-context LLM inference typically apply fixe

Quality prediction / anomaly detection, Natural language processing, Large language models, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

Deterministic Integrity Gates for LLM-Assisted Clinical Manuscript Preparation: An Auditable Biomedical Informatics Architecture

Objective. Large language models (LLMs) increasingly draft clinical research manuscripts, but their fluency ca

Quality prediction / anomaly detection, Natural language processing, Large language models, Detection, Generation, Text

Use: Detection
Difficulty: Hard
Cost: High

→

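arxiv · Paper only · 2026-06-08

Targeting World Models to Compromise Robot Learning Pipelines

Shows that introducing world models into robot learning pipelines can create a risk of unsafe robots being deployed.

Deep learning, Model compression / quantization, Generation, Text

Use: Ensuring the safe use of robots
Difficulty: Hard
Cost: High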

→

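arxiv · Paper only · 2026-06-08

LLM-Orchestrated Conformance Checking in Stroke Care Without Computer-Interpretable Guidelines

A conformance checking framework was developed to automatically assess adherence to clinical guidelines. It uses large language models (LLMs) to perform the conformance checking.

Explainable, Natural language processing, Large language models, Text

Use: Supporting guideline adherence in healthcare
Difficulty: Hard
Cost: High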

→

arxiv · Paper only · 2026-06-08

AliyunConsoleAgent: Training Web Agents in Real-World Cloud Environments via Distillation and Reinforcement Learning

Proposes AliyunConsoleAgent for automating web agents, simplifying document verification and web agent development.

Deep learning, Model compression / quantization, Text, Reinforcement learning

Use: Web agent automation
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

SIFT: Selective-Index For Fast Compute of RAG Prefill by Exploiting Attention Invariance

Proposes SIFT to speed up RAG prefill, reducing TTFT and cost.

Quality prediction / anomaly detection, Deep learning, Attention mechanism, Generation, Text

Use: Speeding up RAG prefill
Difficulty: Hard
Cost: High

→

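arxiv · Paper only · 2026-06-08

Context-Aware Deep Learning for Defect Classification in Atomic-Resolution STEM

Proposes context-aware deep learning for non-destructive inspection of materials, detecting defects.

Suited to MI, Quality prediction / anomaly detection, Computer vision, Multimodal, Classification, Detection, Image

Use: Non-destructive inspection of materials
Difficulty: Hard
Cost: High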

→

arxiv · Paper only · 2026-06-08

AI Assurance in UK Defence: Challenges in Operationalising JSP 936

Examines the challenges of operationalising JSP 936 for AI assurance in UK defence.

Generative AI, GAN, Text

Use: AI assurance
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

RunAgent SuperBrowser: A Theory of Autonomous Web Navigation Grounded in Human Browsing Behaviour

We present SUPERBROWSER, an autonomous web-navigation agent designed against a single guiding hypothesis: a we

MLOps, Pipeline construction, Image, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

→

arxiv · Paper only · 2026-06-08

Real-time body pose non-verbal communication with a consistency-based reliability measure

Body movement communicates intent at distances and in conditions where neither the face, nor speech can be cap

Machine learning, Unsupervised learning, Classification, Prediction, Text

Use: Classification
Difficulty: Hard
Cost: Low

→

arxiv · GitHub available · 2026-06-08

PhysScene: A Scene Graph Dataset for Scientific Visual Reasoning in Physics Experiments

Scene Graphs (SGs) provide structured representations of visual scenes by modeling objects and their pairwise

Reinforcement learning, Policy gradient (PPO / A3C), Image, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

→

arxiv · Paper only · 2026-06-08

Experience Makes Skillful: Enabling Generalizable Medical Agent Reasoning via Self-Evolving Skill Memory

Medical agent systems are increasingly expected to support interactive clinical decision making rather than on

Generative AI, GAN, QA, Text

Use: QA
Difficulty: Hard
Cost: Low

→

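arxiv · GitHub available · 2026-06-08

TRL-Bench: Standardizing Cross-Paradigm Representation-Level Evaluation of Tabular Encoders

Proposes TRL-Bench, a benchmark that makes it easier to evaluate tabular representation models trained under different paradigms.

Suited to tabular data, Quality prediction / anomaly detection, Deep learning, Model compression / quantization, Embedding, Text, Tabular

Use: Standardizing the evaluation of tabular representation models
Difficulty: Hard
Cost: High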

→

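arxiv · Paper only · 2026-06-08

Anything2Skill: Compiling External Knowledge into Reusable Skills for Agents

Proposes Anything2Skill, which lets agents draw on external knowledge to solve many tasks efficiently.

Natural language processing, RAG, Generation, Text

Use: Enabling agents to use external knowledge for efficient problem solving
Difficulty: Hard
Cost: Low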

→

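arxiv · Paper only · 2026-06-08

Brain-Prompt Injection: A Route-Safety Audit for BCI-LLM Agents

Proposes a system for safely handling brain-signal input in brain-to-agent interfaces, capable of detecting brain-signal injection attacks.

Deep learning, Transformer, Text

Use: Ensuring safety when feeding brain signals to BCI-LLM agents
Difficulty: Hard
Cost: High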

→

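arxiv · Paper only · 2026-06-08

End-to-End Training for Discrete Token LLM based TTS System

Proposes a TTS system trained end to end and confirms the benefits of end-to-end training.

Natural language processing, Large language models, Classification, Generation, Text

Use: End-to-end trained TTS system
Difficulty: Hard
Cost: High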

→

arxiv · Paper only · 2026-06-08

MASS: Deep Research for Social Sciences with Memory-Augmented Social Simulation

Proposes a new deep research method for the social sciences based on Memory-Augmented Social Simulation.

Quality prediction / anomaly detection, Deep learning, Transformer, Generation, Text

Use: Deep research for the social sciences with Memory-Augmented Social Simulation
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

Culturally-Adapted Red-Teaming Across East and Southeast Asian Contexts: A Methodological and Comparative Analysis

Multilingual safety evaluation of large language models (LLMs) has predominantly relied on direct translation

Quality prediction / anomaly detection, Natural language processing, Large language models, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

IMUG-Bench: Benchmarking Unified Multimodal Models on Interleaved Understanding and Generation

In recent years, unified multimodal models (UMMs) have emerged to support both understanding and generation wi

Natural language processing, Prompt engineering, Generation, Image, Text

Use: Generation
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

Unified Energy for Invariant and Independent Decoding in Diffusion Language Models

Diffusion Language Models (DLMs) enable parallel text generation by iteratively denoising a full sequence, off

Natural language processing, Large language models, Generation, Text

Use: Generation
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

SEF-CLGC at SemEval-2026 Task 11: Logical Notation Impact on Language Model Performance

This paper revisits our pipeline called Syllogistic Evaluation Framework-Common Logic Grammar Construction (SE

Natural language processing, Large language models, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxiv · Paper only · 2026-06-08

Decoding Pedestrian Crossing Intention from Egocentric Vision via Vision Language Models

Uses egocentric vision to predict whether pedestrians will cross the road. By formulating the task as closed-ended visual question answering (VQA), vision language models are used

Deep learning, Transformer, QA, Image, Text

Use: Predicting pedestrian road crossing
Difficulty: Hard
Cost: High

→

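arxiv · Paper only · 2026-06-08

Steganography Without Modification: Hidden Communication via LLM Seeds

The inference stack of large language models (LLMs) contains a steganographic channel that enables covert communication without modifying the model weights, sampling code, or output distribution, and without encryption. The sender takes the secret data

Natural language processing, Large language models, Generation, Text

Use: Covert communication via LLM seeds without modification
Difficulty: Hard
Cost: High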

→

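arxiv · Paper only · 2026-06-08

From USD Scenes to Knowledge Graphs: Zero-Shot Ontology Grounding with LLMs

Building knowledge graphs from 3D simulation scenes plays an important role in robot task reasoning, but the step of mapping scene objects to a formal classification has not been realized in practice. Using LLMs, this mapping

Natural language processing, Large language models, Text, 3D

Use: Building knowledge graphs from 3D simulation scenes
Difficulty: Hard
Cost: High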

→

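arxiv · Paper only · 2026-06-08

Vision Language Model Helps Private Information De-Identification in Vision Data

Vision language models (VLMs) are highly capable in privacy protection. However, privacy risks in handling visual data have received little attention so far. Using VLMs to ensure privacy protection

Computer vision, Object detection, Classification, Detection, Image

Use: Privacy protection for visual data using vision language models
Difficulty: Hard
Cost: High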

→

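arxiv · Paper only · 2026-06-08

Unveiling Privacy Risks in Multi-modal Large Language Models: Task-specific Vulnerabilities and Mitigation Challenges

Privacy risks of large language models have already been studied, but those of multimodal large language models (MLLMs) have not yet been sufficiently investigated. In MLLMs, not only text but also image data

Natural language processing, Large language models, Image, Text

Use: Privacy risks in multimodal large language models
Difficulty: Hard
Cost: High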

→

arxivPaper only2026-06-08

A Regret Minimization Framework on Preference Learning in Large Language Models

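Reinforcement learning (RL) often aims to find the correct action for a given problem, but learning from human feedback requires a decision-making framework that accounts for the choices humans make, and so

Natural language processing, Large language models, Text, Reinforcement learning

Use: Decision-making framework for choosing among possible actions
Difficulty: Hard
Cost: High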

→

arxivPaper only2026-06-08

ComplexConstraints and Beyond: Expert Rubrics for RLVR

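Proposes expert-curated rubric-based evaluation, a new evaluation method for problem solving beyond the training data.

Quality prediction / anomaly detection, Natural language processing, Large language models, Anomaly detection, Text

Use: Problem solving beyond the training data
Difficulty: Hard
Cost: High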

→

arxivPaper only2026-06-08

Graph2Idea: Retrieval-Augmented Scientific Idea Generation with Graph-Structured Contexts

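Scientific idea generation calls for high-quality ideas that are feasible in practice, but methods for this task have been lacking. The paper proposes a new method, Graph2Idea.

Quality prediction / anomaly detection, Natural language processing, Large language models, Generation, Text

Use: Scientific idea generation
Difficulty: Hard
Cost: High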

→

arxivPaper only2026-06-08

Context Rot in AI-Assisted Software Development: Repurposing Documentation Consistency for AI Configuration Artifacts

Uses AI assistants

Natural language processing, Large language models, Text

Use: Development practices for preserving context
Difficulty: Hard
Cost: High

→

arxiv GitHub available 2026-06-08

DynaOD: Dynamic Origin-Destination Flow Generation with Discrete-to-Continuous Temporal Semantic Modeling

Dynamic origin-destination (OD) flow generation seeks to synthesize realistic mobility dynamics from temporal

Deep learning, Transformer, Generation, Text

Use: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

Context-Fractured Decomposition Attacks on Tool-Using LLM Agents: Exploiting Artifact Provenance Gaps

Tool-using LLM agents interact with the world through actions that persist state in artifacts (e.g., workspace

Suited to MI, Natural language processing, Large language models, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

REFLECT: Intervention-Supported Error Attribution for Silent Failures in LLM Agent Traces

Large language model (LLM) agents now solve complex tasks through long plan-and-execution traces, yet the abil

Natural language processing, Large language models, Classification, Detection, Text

Use: Classification
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

See More, Think Deeper: Query-Expanded Visual Evidence and Answer-Clue Guided Reflection for Long Video Understanding

Recent advances in Video Large Language Models (Video-LLMs) have enabled performance on long-video understandi

Natural language processing, Large language models, Generation, Image, Text

Use: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

BareWave: Waveform-Native Flow-Matching Text-to-Speech

Removing intermediate representations and separately trained decoding stages has become an important direction

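Sensor / time series, Quality prediction / anomaly detection, Deep learning, Compression and quantization, Generation, Text, Audio

Use: Generation
Difficulty: Hard
Cost: High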

→

arxivPaper only2026-06-08

Personalization Meets Safety: Mechanisms, Risks, and Mitigations in Personalized LLMs

Large Language Models (LLMs) have enabled increasingly personalized interactions by adapting to users' prefere

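Suited to MI, Deep learning, Compression and quantization, Text, Multimodal, Reinforcement learning

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High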

→

arxivPaper only2026-06-08

A Multi-Agent System for IPMSM Design Optimization via an FEA-AI Hybrid Approach

Interior permanent magnet synchronous motor (IPMSM) design requires balancing conflicting objectives and multi

Explainable, Natural language processing, Large language models, Generation, Anomaly detection, Text

Use: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

SafeRun: Enabling Determinism in LLM Planning for Running

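Proposes SafeRun, a framework that ensures determinism in LLM-based planning for running. It separates the LLM from a deterministic solver so that safety rules are strictly enforced.

Natural language processing, Large language models, Text

Use: Improving runner safety and stability
Difficulty: Hard
Cost: High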

→

arxivPaper only2026-06-08

TLDR: Compressing Audio Tokens for Efficient Autoregressive Text-to-Speech

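Codec-based autoregressive (AR) generators that model audio tokens together with text have delivered strong text-to-speech quality. However, in this approach the audio token sequence is longer than the text sequence, so AR

Quality prediction / anomaly detection, Deep learning, Compression and quantization, Text, Audio

Use: More efficient speech generation through audio token compression
Difficulty: Hard
Cost: High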

→

arxivPaper only2026-06-08

LATTEArena: An Evaluation Framework for LLM-powered Tabular Feature Engineering (Extended Version)

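LLMs have made it possible to automate feature engineering in tabular data analysis. However, the lack of a standardized platform makes comparison and cost-aware evaluation difficult. Because the methods are designed in complex ways, the specific contribution of each component

Suited to small data, Suited to tabular data, Natural language processing, Large language models, Classification, Generation, Regression

Use: Comparative evaluation of LLM paradigms for tabular data analysis
Difficulty: Hard
Cost: High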

→

arxivPaper only2026-06-08

The Token Not Taken: Sampling, State, and the Variability of AI Agent Outputs

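Nondeterminism in agentic AI systems means that the same request can produce different plans, tool calls, and other outputs. To keep such systems reliable, it is important to pin down the AI agent's parameters.

Computer vision, Segmentation, Generation, Text

Use: Helps establish AI agent parameters
Difficulty: Hard
Cost: High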

→

arxivPaper only2026-06-08

Baichuan-M4: A Clinical-Grade Medical Agent System for Continuous Care

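Introduces Baichuan-M4, a clinical-grade LLM medical system suited to continuous care. As a clinical medical agent system, Baichuan-M4 is built on an integrated medical agent system, and medical agents

Computer vision, Multimodal, QA, Image, Text

Use: LLM-based medical agent for integrated medical care
Difficulty: Hard
Cost: High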

→

arxivPaper only2026-06-08

RTL-BenchLS: A Large-Scale Benchmark for RTL Reasoning and Generation with Large Language Models

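LLM-based RTL generation and reasoning point to a new direction for hardware design automation. However, benchmarks are limited in scale and task scope. On existing benchmarks, the performance of frontier models

Quality prediction / anomaly detection, Natural language processing, Large language models, Generation, Text, Self-supervised

Use: Building a large-scale benchmark for RTL reasoning and generation
Difficulty: Hard
Cost: High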

→

arxivPaper only2026-06-08

Diverse Thinking Schemata Elicit Better Reasoning in Large Language Models

Large reasoning models (LRMs) have attracted increasing attention for their ability to solve complex mathemati

Natural language processing, Large language models, Text, Reinforcement learning

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

An Effective Router for Vision-Language Model Selection

Vision-language models (VLMs) with varying performance and resource requirements are widely deployed, making i

Natural language processing, Large language models, Anomaly detection, Image, Text

Use: Anomaly detection
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

CARE: A Conformal Safety Layer for Medical Summarization

Large language models (LLMs) are increasingly used for medical summarization, but their outputs can omit medic

Natural language processing, Large language models, Detection, Summarization, Text

Use: Detection
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

NutriMLLM: Multimodal Large Language Models for Dietary Micronutrient Analysis

Comprehensive estimation of dietary micronutrients from food images could improve clinical nutrition care, but

Natural language processing, Large language models, Generation, Image, Text

Use: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

PACT: Learning Diverse Diagnostic Strategies via Privileged Synthesis and Branch Consensus

Clinical diagnosis requires flexible use of multiple reasoning paradigms under incomplete patient information.

Quality prediction / anomaly detection, Natural language processing, Large language models, Generation, Text

Use: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

Report on CHIIR 2026 Workshop on Generative AI and Academic Search (GAI&AS)

This report summarizes the CHIIR 2026 Workshop on Generative AI and Academic Search (GAI&AS), which examined

Reinforcement learning, Policy gradient (PPO / A3C), Generation, Summarization, Retrieval

Use: Generation
Difficulty: Hard
Cost: Low

→

arxivPaper only2026-06-08

Failure-Aware Refinement of Vision-Language Model for Lithography Defect Detection

Semiconductor lithography inspection requires reliable detection of small pattern defects such as bridge, burr

Quality prediction / anomaly detection, Natural language processing, Fine-tuning, Detection, Image, Text

Use: Detection
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

Order Matters: Unveiling the Hidden Impact of Macro Placement Sequences via Proxy-Guided LLM Evolution

Macro placement is a fundamental step in modern chip physical design, playing a crucial role in determining th

Quality prediction / anomaly detection, Deep learning, Transformer, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxiv GitHub available 2026-06-08

FAME: Forecastability-Aware Mixture of Experts for Heterogeneous Time Series Forecasting

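This work proposes FAME, a model that combines multiple time series forecasters and makes forecasts that account for the characteristics of each individual series. Taking those characteristics into account allows more accurate forecasts.

Suited to tabular data, Easy to try on CPU, Sensor / time series, Deep learning, Transformer, Prediction, Text, Time series

Use: Forecasting heterogeneous time series
Difficulty: Easy
Cost: Low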

→

arxivPaper only2026-06-08

Quality-Diversity Search in Sound Generation: Investigating Innovation Engines for Audio Exploration

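This work develops an open-source framework for promoting diversity in music generation. To support that diversity, the framework combines evolutionary processes with diversity-promoting algorithms

Suited to MI, Quality prediction / anomaly detection, Natural language processing, Fine-tuning, Classification, Generation, Text

Use: Promoting diversity in music generation
Difficulty: Hard
Cost: Low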

→

arxivPaper only2026-06-08

Quantitative Performance Analysis of Stopping Criteria for CMA-ES

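This work evaluates stopping criteria for the CMA-ES algorithm. It examines whether the criteria work as intended and provides information for improving the algorithm.

Condition optimization, Computer vision, Segmentation, Text

Use: Evaluating optimization algorithms
Difficulty: Hard
Cost: Medium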

→

arxivPaper only2026-06-08

Causally Evaluating the Learnability of Formal Language Tasks

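This work develops a method for evaluating the learnability of formal languages. The method can assess how much data is needed to learn a formal language.

Computer vision, Segmentation, Text

Use: Evaluating the learnability of formal languages
Difficulty: Hard
Cost: High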

→

arxivPaper only2026-06-08

The Neutral Mask: How RLHF Provides Shallow Alignment while Leaving Partisan Structure Intact in a Large Language Model

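This work develops PsychoSafe, a framework for evaluating the safety of large language models. The framework can evaluate the safety of large language models and mitigate potential risks.

Natural language processing, Large language models, Generation, Text, Reinforcement learning

Use: Safety evaluation of large language models
Difficulty: Hard
Cost: High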

→

arxivPaper only2026-06-08

IS-CoT: Breaking the Long-form Generation Collapse via Interleaved Structural Thinking

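This work develops IS-CoT, a framework for improving long-form generation models. The framework can improve both the generation capability and the controllability of such models.

Quality prediction / anomaly detection, Natural language processing, Large language models, Generation, Text

Use: Improving long-form generation models
Difficulty: Hard
Cost: High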

→

arxivPaper only2026-06-08

PsychoSafe: Eliciting Psychologically-Informed Refusals in Large Language Models

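This work develops a framework for evaluating multimodal language models. The framework can evaluate their generation capability and controllability.

Quality prediction / anomaly detection, Deep learning, Compression and quantization, Text

Use: Evaluating multimodal language models
Difficulty: Hard
Cost: High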

→

arxivPaper only2026-06-08

Where Does the Answer Come From? Benchmarking View-Level Visual Evidence Identification in Multi-View MLLMs for Autonomous Driving

Multimodal large language models (MLLMs) achieve strong results on visual reasoning benchmarks, but answer acc

Natural language processing, Large language models, QA, Image, Text

Use: QA
Difficulty: Hard
Cost: High

→

arxiv GitHub available 2026-06-08

Gradient-Guided Reward Optimization for Inference-time Alignment

Ensuring the reliability of Large Language Models (LLMs) under distribution drift requires inference-time adap

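Quality prediction / anomaly detection, Deep learning, Compression and quantization, Detection, Generation, Text

Use: Detection
Difficulty: Hard
Cost: High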

→

arxiv GitHub available 2026-06-08

Civil Court Simulation with Large Language Models

Court simulation bridges legal education and judicial practice, yet human-based simulations are costly and dif

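Quality prediction / anomaly detection, Natural language processing, Large language models, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High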

→

arxivPaper only2026-06-08

Automated IEP Generation from Traditional Chinese Parent-Teacher Interviews via Corpus-Grounded Feature Diffusion

Writing Individualized Education Programs (IEPs) is a high-labor, knowledge-intensive document burden; English

Deep learning, Transformer, Generation, Text

Use: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

Clinically Grounded Privacy Evaluation of Medical LMs

Medical language models (LMs) can memorize and reproduce protected health information, but privacy evaluations

Natural language processing, RAG, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

UXBench: Benchmarking User Experience in AI Assistants

As AI assistants serve millions of users daily, evaluating user experience (UX) beyond general model capabilit

Natural language processing, Large language models, Generation, Text

Use: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

OpenBibleTTS: Large-Scale Speech Resources and TTS Models for Low-Resource Languages

Recent advances in neural text-to-speech (TTS) and multilingual speech generation have substantially improved

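Quality prediction / anomaly detection, Natural language processing, Large language models, Generation, Text, Audio

Use: Generation
Difficulty: Hard
Cost: High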

→

arxivPaper only2026-06-08

Overcoming Decoder Inconsistencies in Whisper for Dravidian and Low-Resource Languages

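To improve the speech recognition performance of multilingual ASR models such as Whisper on Dravidian languages, the authors drew on datasets and linguistic analysis, fine-tuned the model, resolved decoder inconsistencies, and reduced recognition errors.

Sensor / time series, Deep learning, Transformer, Text, Audio

Use: Improving speech recognition for Dravidian languages
Difficulty: Hard
Cost: Medium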

→

arxivPaper only2026-06-08

Detecting Differences Is Not Understanding Structure: Large Language Models Fail at Graph Isomorphism

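This work investigates whether large language models can reason about graph isomorphism. The models recognized isomorphism on small graphs, but when isomorphism was tested by swapping seed node labels, they failed to identify it.

Natural language processing, Large language models, Detection, Text

Use: Reasoning about graph isomorphism
Difficulty: Hard
Cost: High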

→

arxivPaper only2026-06-08

DECSELFMASK: Leveraging Unlabeled Text via Self-Relevance-Guided Masking for Decoder-Only Classification

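Classification becomes difficult when little prior information is available or in specialized domains such as medicine. This work proposes DecSelfMask, a method in which the model exploits unlabeled data to improve classification performance.

Natural language processing, RAG, Classification, Generation, Text

Use: Improving classification performance
Difficulty: Hard
Cost: High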

→

arxivPaper only2026-06-08

H2HMem: A Multimodal Memory Benchmark for Agents in Human-Human Interactions

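Large language models have memory and reasoning capabilities, but how well these work in interactions with users is not yet understood. In response, this work targets the evaluation of memory and reasoning in human interaction, particularly conversation, with a multimodal

Natural language processing, Large language models, Generation, Text, Multimodal

Use: Evaluating multimodal memory
Difficulty: Hard
Cost: High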

→

arxivPaper only2026-06-08

AbstRAG: Learning to Abstract for Retrieval Problems

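This work proposes AbstRAG, a framework for closing abstraction-level gaps in retrieval tasks. With the gap closed, the model surfaced the correct information in retrieval tasks.

Quality prediction / anomaly detection, Natural language processing, RAG, Generation, Retrieval, Text

Use: Retrieval with RAG
Difficulty: Hard
Cost: Low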

→

arxiv GitHub available 2026-06-08

MUDIDI: A Two-Stage Framework for Multilingual Dictionary Digitization with Language Models

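Digitizing dictionaries of low-resource and extinct languages is important, but digitizing multimodal dictionaries has so far been difficult. This work uses recent vision language models to make dictionary digitization easier, and the characters in the dictionary

Quality prediction / anomaly detection, Natural language processing, Large language models, Classification, Segmentation, Text

Use: Digitizing multilingual dictionaries
Difficulty: Hard
Cost: High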

→

arxivPaper only2026-06-08

Guide Me Out: A Framework to Benchmark VLM Operators Communication in Crisis Scenarios

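In crisis management, communication and geography

Quality prediction / anomaly detection, Computer vision, Multimodal, Classification, Image, Text

Use: Evaluating communication in crisis management
Difficulty: Hard
Cost: High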

→

arxivPaper only2026-06-08

Toward Signing Activity Projection in Sign Language Interaction

Social robots must interact robustly not only with users assumed by speech-centered systems but also with dive

Deep learning, Transformer, Text, Audio

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

→

arxiv GitHub available 2026-06-08

What Should a Skill Remember? Quality-Cost Trade-offs in Cost-Aware Skill Rewriting for Language Model Agents

Large language model agents increasingly rely on skills: reusable procedural documents encoding workflows, too

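Quality prediction / anomaly detection, Natural language processing, Large language models, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High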

→

arxiv GitHub available 2026-06-08

LexRubric: A Rubric-Guided Diagnostic Benchmark for Open-Ended Legal Tasks

As large language models (LLMs) are increasingly applied to real-world legal tasks, evaluating the reliability

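Quality prediction / anomaly detection, Natural language processing, Large language models, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High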

→

arxivPaper only2026-06-08

Is Text All You Need? Text as a Universal Information Bottleneck for Speech LLMs

Large language models (LLMs) provide a powerful reasoning backbone for speech understanding, but integrating c

Sensor / time series, Deep learning, Transformer, Classification, Text, Audio

Use: Classification
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

In-Context Learning for the Imputation of Public Opinion Data with Large Language Models

Large language models have been widely evaluated as simulators of individual survey responses. In practice, ho

Easy to try on CPU, Natural language processing, Large language models, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

Multi-Hop Knowledge Composition is Bound by Pretraining Exposure

Large Language Models fail at implicit multi-hop reasoning: a model answers "When was $X$ born?" and "Who is $

Suited to MI, Natural language processing, Large language models, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

How Far Can Prompting Go for Minimal-Edit Ukrainian Grammatical Error Correction?

Fine-tuned Large Language Models (LLMs) dominate in Ukrainian grammatical error correction (GEC), while API-ac

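Suited to small data, Quality prediction / anomaly detection, Natural language processing, Large language models, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High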

→

arxivPaper only2026-06-08

NüshuVoice: Reviving the Voice of Endangered Nüshu with Pitch-Aware Text-to-Speech

Nüshu is an endangered phonetic script historically used by women in Jiangyong County, southern Hunan, China.

Sensor / time series, Deep learning, Transformer, Classification, Image, Text

Use: Classification
Difficulty: Hard
Cost: Low

→

arxivPaper only2026-06-08

TruthSplit: Operationalizing Conditional Validity in Arguments Through Multi-Perspective Reasoning

We present TruthSplit, an interactive system for multi-perspective argument analysis. Existing argumentation t

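Quality prediction / anomaly detection, Natural language processing, Large language models, Image, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High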

→

arxivPaper only2026-06-08

Symbolic and Abstractive Reasoning with Complex Visual Queries

Understanding and reasoning over abstract visual content remains a challenge for current multi-modal large lan

Natural language processing, Large language models, Image, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

Explicit Representation Alignment for Multimodal Sentiment Analysis

Multimodal affective analysis aims to understand human sentiment and emotion by jointly modeling heterogeneous

Explainable, Natural language processing, RAG, Image, Text, Multimodal

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

MAAM: Anchor-Preserving Compression and Contextual Calibration for Chinese Discriminatory Language Detection

Chinese discriminatory-language detection is challenging because harmful intent is often implicit and context-

Suited to small data, Explainable, Deep learning, Compression and quantization, Detection, Image, Text

Use: Detection
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

Emergent Misalignment Can Be Induced by Sycophancy and Reversed via Alignment Gating

Prior work has shown that fine-tuning large language models on malicious or incorrect outputs in narrow domain

Deep learning, Compression and quantization, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

CRANE: Knowledge Editing for Reasoning MLLMs

The emergence of reasoning multimodal large language models (MLLMs), which generate explicit chain-of-thought

Natural language processing, Large language models, Anomaly detection, Image, Text

Use: Anomaly detection
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

Bridging the Agent-World Gap: Text World Models for LLM-based Agents

Large language model (LLM)-based agents are increasingly used in interactive textual environments, from web na

Deep learning, Compression and quantization, Generation, Text

Use: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

Personal Salience: Highlighting Is Social, but Individuality Lives in Selection

Social highlighters let people mark passages that matter to them. We ask how much of an individual is recovera

Natural language processing, Large language models, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

Document-Authored Control-Signal Impersonation: A Low-Cost Indirect Prompt Attack on RAG Safety Boundaries

Retrieval-augmented generation (RAG) systems often serialize user queries, retrieved documents, metadata, syst

Natural language processing, Large language models, Generation, Text

Use: Generation
Difficulty: Hard
Cost: High

→

arxiv GitHub available 2026-06-08

Language-Aware Token Boosting: LLM Language Confusion Reduction Without Tuning

Large language models (LLMs) sometimes exhibit language confusion when generating non-English text. Existing a

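Quality prediction / anomaly detection, Natural language processing, Large language models, Summarization, Text

Use: Summarization
Difficulty: Hard
Cost: High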

→

arxivPaper only2026-06-08

ChinaHeritaQA: A Culturally-Grounded Visual Question Answering Dataset for World Heritage Sites in China

We introduce ChinaHeritaQA, a multimodal benchmark dataset for evaluating the cultural reasoning abilities of

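Suited to tabular data, Quality prediction / anomaly detection, Natural language processing, RAG, Classification, QA, Image

Use: Classification
Difficulty: Hard
Cost: High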

→

arxivPaper only2026-06-08

Multilingual Sentiment Aware Text Summarization A Reinforcement Learning Approach for Consistency Maintenance

Reinforcement Learning from Human Feedback (RLHF) has significantly improved the quality and fluency of large

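Quality prediction / anomaly detection, Natural language processing, Large language models, Summarization, Text, Reinforcement learning

Use: Summarization
Difficulty: Hard
Cost: High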

→

arxiv GitHub available 2026-06-08

Are Reasoning Vision-Language Models Robust to Semantic Visual Distractions?

Reasoning Vision-Language Models (VLMs) achieve strong performance on complex multimodal tasks, but reliable r

Computer vision, Multimodal, Image, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

Latent Spatial Memory for Video World Models

Video world models that maintain 3D spatial consistency across generated frames typically rely on explicit poi

Quality prediction / anomaly detection, Natural language processing, RAG, Generation, Image, Text

Use: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

MemoryVLA++: Temporal Modeling via Memory and Imagination in Vision-Language-Action Models

Temporal modeling is essential for robotic manipulation, as effective control requires both memory of past int

Computer vision, Video recognition, Text, Multimodal

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

Beyond Spherical Harmonics: Rethinking Appearance Models for Radiance Reconstruction

View-dependent appearance modeling remains a challenging problem in novel-view synthesis and reconstruction. A

Quality prediction / anomaly detection, Deep learning, Compression and quantization, Generation, Text

Use: Generation
Difficulty: Hard
Cost: Medium

→

arxivPaper only2026-06-08

POTATR: A Lightweight Image-to-Graph Model for Page-Level Table Extraction

Large-scale document processing requires contextually aware table extraction (TE) that is both accurate and ef

Deep learning, Transformer, Detection, Image, Text

Use: Detection
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

HDSL: A Hierarchical Domain-Specific Language for Structured 3D Indoor Scene Generation and Localized Editing with LLM Agents

Text-driven indoor scene generation and editing require an intermediate representation that language models ca

Natural language processing, Large language models, Generation, Text, 3D

Use: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

Cranio-Diff: Diffusion-based Cross-domain Craniofacial Reconstruction with 2D X-ray Skull Guidance and Structural Identity Constraints

The state-of-the-art generative models, such as CycleGAN, Pix2Pix, and diffusion models have demonstrated rema

Quality prediction / anomaly detection, Computer vision, Segmentation, Generation, Image, Text

Use: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

SoccerNet 2026 Player-Centric Ball-Action Spotting: Retraining and Post-Processing Extensions to the FOOTPASS Baselines

We describe our system for the SoccerNet 2026 Player-Centric Ball-Action Spotting Challenge, which requires pr

Deep learning, Graph neural networks, Image, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

MAVIS: Multi-Agent Video Retrieval via Structured Video Understanding

The dominant paradigm in video retrieval relies on embedding-based full-corpus scanning, which suffers from in

Explainable, Deep learning, Transformer, Text, Video

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

CineDance: Towards Next-Generation Multi-Shot Long-Form Cinematic Audio-Video Generation

The fidelity and structural diversity of training datasets fundamentally determine the capabilities of video g

Quality prediction / anomaly detection, Natural language processing, RAG, Generation, Text, Audio

Use: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

ContextShift: A Controlled Benchmark for Context Dependence in Object Detection

Modern object detectors achieve strong performance on standard benchmarks, yet their robustness to contextual

Computer vision, Object detection, Detection, Image, Text

Use: Detection
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

Training-Free Generalized Few-Shot Segmentation through Open-Vocabulary Semantic Arbitration

Generalized Few-Shot Semantic Segmentation (GFSS) has traditionally been approached as a representation-learni

Suited to small data, Natural language processing, Prompt engineering, Classification, Segmentation, Image

Use: Classification
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

Leveraging Morphology for Historical Script Metrological Analysis

Advances in handwritten text recognition have enabled large-scale transcription of historical documents, but s

Explainable, Deep learning, Transformer, Classification, Detection, Image

Use: Classification
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

CapRL++: Unified Reinforcement Learning with Verifiable Rewards for Dense Image and Video Captioning

Image and video captioning are fundamental tasks that bridge the visual and linguistic domains, playing a crit

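Quality prediction / anomaly detection, Natural language processing, Large language models, Image, Text, Video

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High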

→

arxiv GitHub available 2026-06-08

Echo-DM: Ultrasound Marker Removal via Conditional Latent Diffusion and Region-Aware Fusion

Clinical ultrasound images often contain artificial markers, such as measurement calipers and text, to assist

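Quality prediction / anomaly detection, Natural language processing, RAG, Image, Text, Audio

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High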

→

arxivPaper only2026-06-08

ExDet: Open-Domain Open-Vocabulary Detection with Cross-modal Extrapolation and Rectification

Open-domain open-vocabulary detection (ODOVD) requires detectors to generalize to both novel categories and un

Deep learning, Compression and quantization, Classification, Detection, Image

Use: Classification
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

IB-HFN: Information Bottleneck-Driven SAR-Optical Fusion Network for High-Fidelity Cloud Removal

Synthetic aperture radar (SAR)-assisted optical cloud removal aims to recover surface information obscured by

Sensor / time series, Computer vision, Video recognition, Image, Text, Multimodal

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

Reason Twice: Segmentation via Candidate Discovery and Comparative Reasoning

The rapid development of pretrained foundation models has enabled more general image segmentation. Multimodal

Deep learning, Normalization and optimization methods, Classification, Generation, Segmentation

Use: Classification
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

Visual Para-Thinker++: A Single-Policy Multi-Agent Framework for Visual Reasoning

Visual reasoning requires integrating evidence distributed across regions, attributes, and relations, making s

Deep learning, Compression and quantization, Image, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

See More, Match Better: Multi-Source Feature Fusion for Two-View Correspondence Learning

Two-view correspondence learning aims to distinguish true correspondences (inliers) from false ones (outliers)

Natural language processing, RAG, Image, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

→

arxivPaper only2026-06-08

LiteVSR: Lightweight Adaptation of Frozen Diffusion Transformers for Video Super-Resolution

Adapting large-scale pre-trained video generators for Video Super-Resolution (VSR) in novel domains remains co

Quality prediction / anomaly detection, Deep learning, Transformer, Generation, Text, Video

Use: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

MAGIS: Evidence-Based Multi-Agent Reasoning for Interpretable Strabismus Clinical Decision-Making

Strabismus is a common ocular disorder that requires fine-grained subtype diagnosis for individualized treatme

Explainable, Computer vision, Multimodal, Generation, Image, Text

Use: Generation
Difficulty: Hard
Cost: High

→

arxiv GitHub available 2026-06-08

Temporal-Aware Reasoning Optimization for Video Temporal Grounding

Multi-modal Large Language Models (MLLMs) have achieved remarkable progress in video temporal grounding with r

Quality prediction / anomaly detection, Deep learning, Transformer, Detection, Image, Text

Use: Detection
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

CP4D: Compositional Physics-aware 4D Scene Generation

4D generation (i.e., dynamic 3D generation) has recently emerged as a rapidly growing research fronti

Suited to MI, Natural language processing, RAG, Generation, Image, Text

Use: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

Vision-Language Guided Hyperspectral Object Tracking via Semantics Fusion and Contextual Template Updating

Hyperspectral object tracking (HOT) leverages the rich spectral information provided by hyperspectral videos (

Deep learning, Compression and quantization, Image, Text, Video

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

OmniGen-AR: AutoRegressive Any-to-Image Generation

Autoregressive (AR) models have demonstrated strong potential in visual generation, offering superior performa

Computer vision, Segmentation, Generation, Image, Text

Use: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

HDRAgent: An Agentic Framework for Multi-Exposure HDR Imaging

Most existing multi-exposure HDR methods follow a fixed feed-forward reconstruction paradigm, making them pron

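Quality prediction / anomaly detection, Natural language processing, Large language models, Generation, Image, Text

Use: Generation
Difficulty: Hard
Cost: High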

→

arxivPaper only2026-06-08

Beyond Scalar Rewards by Internalizing Reasoning into Score Distributions

Reward models are central to text-to-image post-training, but visual preference is subjective and better repre

Deep learning, Compression and quantization, Generation, Image, Text

Use: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

Frequency Decoupled Framework for Screen Content Image Super-Resolution

Methods based on implicit neural representations have demonstrated superior performance in Screen Content Imag

Deep learning, Transformer, Image, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

→

arxivPaper only2026-06-08

Scaling by Diversified Experience for Vision-Language-Action Models

Vision-Language-Action models face significant challenges in real-world deployment due to the entanglement of

Computer vision, Segmentation, Anomaly detection, Text, Multimodal

Use: Anomaly detection
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

When Vision Misleads, Let Location Speak: A Worldwide Image Geo-Localization Method via Location Attention Mechanism and Large Multimodal Models

Worldwide image geo-localization aims to determine the capture location of an image on a global scale. Existin

Deep learning, Transformer, Detection, Image, Text

Use: Detection
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

Modeling Components and Connections in Cyber-Physical Systems

Text based configuration files for cyber-physical systems show the hierarchy of component modules well but oft

Reinforcement learning, Model-based, Image, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

→

arxivPaper only2026-06-08

VGP-Nav: Metric-Aware Visual Geometric Perception for Robot Navigation

Reliable robotic navigation necessitates the seamless integration of accurate global localization and dense, m

Sensor / time series, Deep learning, Compression and quantization, Detection, Image, Text

Use: Detection
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-08

RPO-PDT: Demonstrating Role-Play-Based Knowledge Adaptation for Student Support Dialogue (Demonstration System)

We present RPO-PDT: a retrieval-grounded, role-play-based dialogue system for adaptive student support in high

Reinforcement learning, Policy gradient (PPO / A3C), Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

→

arxivPaper only2026-06-08

LAEI: Layered Autonomous Edge Intelligence Framework for Robust UAV Swarm Operations

Autonomous UAV swarms require scalable coordination mechanisms that maintain mission performance under limited

Deep learning, Compression and quantization, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

→

github GitHub available 2026-06-08

mxcp — Model eXecution + Context Protocol: Enterprise-Grade Data-to-AI Infrastructure

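Building infrastructure that turns data into AI can solve business problems. This project presents MXCP (Model eXecution + Context Protocol), which simplifies data transformation and, for AI app

Natural language processing, Large language models, Text

Use: Improving business by building data-to-AI infrastructure
Difficulty: Easy
Cost: High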

→

github GitHub available 2026-06-08

VoxCPM — VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning

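A framework for state-of-the-art tokenizer-free TTS, covering multilingual speech generation, creative voice design, and true-to-life voice cloning.

Generative AI, Speech and music generation, Generation, Text, Audio

Use: Multilingual speech generation
Difficulty: Easy
Cost: Medium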

→

arxiv GitHub available 2026-06-07

BLM-SGAN: Bidirectional Language Modeling for Semantic-Spatial Text-to-Image Generation

Despite the success of image generation from text descriptions, it still faces challenges that are difficult t

Deep learning, Transformer, Generation, Image, Text

Use: Generation
Difficulty: Easy
Cost: Low

→

arxivPaper only2026-06-07

Momentum for Reasoning: Dense Intrinsic Signals in Policy Optimization

Reinforcement learning with verifiable rewards (RLVR) has emerged as a powerful paradigm for eliciting long-ch

Natural language processing, Large language models, Text, Reinforcement learning

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-07

Continuous Language Diffusion as a Decoder-Interface Problem

Gaussian-corrupted sentence embeddings have no direct linguistic interpretation, yet continuous diffusion lang

Deep learning, Transformer, Embedding, Text

Use: Embedding
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-07

Q-Delta: Beyond Key-Value Associative State Evolution

Linear attention reformulates sequence modeling as recurrent state evolution, enabling efficient linear-time i

Deep learning, RNN / LSTM, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

→

arxivPaper only2026-06-07

How Many Counterfactuals Does It Take? Probing VLM Hallucinations Through Circuits and Causal Effects

Visual Language Models (VLMs) are known to produce hallucinated predictions that are not grounded in visual ev

Natural language processing, RAG, Image, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

→

arxivPaper only2026-06-07

TeamHerald@CHIPSAL 2026: Hate Speech Detection and Sentiment Analysis of Nepali Memes using Transformer-based Architectures and Ensemble Learning

The analysis of internet memes in the Nepali language is complicated by frequent code-mixing and a lack of est

Deep learning, Transformer, Classification, Detection, Image

Use: Classification
Difficulty: Hard
Cost: Low

→

arxiv GitHub available 2026-06-07

IR-SIM: A Lightweight Skill-Native Simulator for Navigation, Learning, and Benchmarking

Simulation plays a key role in automated robotics research supported by large language models (LLMs). However,

Sensor / time series, Deep learning, Compression and quantization, Generation, Image, Text

Use: Generation
Difficulty: Hard
Cost: High

→

arxiv GitHub available 2026-06-07

Artificial Intelligence for Mathematical Reasoning: An Integrated Survey of Language Models, Neuro-symbolic Systems, and Verified Discovery

Mathematical reasoning has long served as a stringent test of machine intelligence; over the past decade, it h

Suited to MI, Natural language processing, Large language models, Generation, Text, Multimodal

Use: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-07

SNR-ST-Mix: Sample-specific Neighborhood Regression Mixup for Augmented Spatial Transcriptomics Imputation with Deep Neural Network

Purpose: Spatial transcriptomics (ST) enables gene expression measurements within the tissue context. However,

Deep learning, Compression and quantization, Classification, Regression, Text

Use: Classification
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-07

Agentic Search for Counterfactual Recourse under Fixed LLM Budgets

Counterfactual recourse aims to provide actionable feature changes that would alter an unfavorable decision ma

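For tabular data · Quality prediction/anomaly detection · Deep learning · Model compression/quantization · Generation · Text · Tabular

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07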

Activation Steering Induces Emergent Misalignment: A More Comprehensive Evaluation

Activation steering has emerged as a popular inference-time technique for modulating the behavior of large lan

Deep learning · Transformer · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Rank Intervals for Leaderboards: A Hierarchical Framework for Model Evaluation

Pretrained models are often evaluated on multi-task leaderboards to measure their applicability in diverse con

Machine learning · Supervised learning · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

A Comparison of SSL-Based Feature Extractors and Back-End Classifiers for Spoofing Detection: A Multi-Corpus Training and Cross-Linguistic Analysis

Voice biometric systems face growing threats from spoofing attacks, yet the evaluation of detection models rem

Deep learning · CNN · Classification · Detection · Text

Use case: Classification
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

SpectrumKV: Per-Token Mixed-Precision KV Cache Transfer for Prefill-Decode Disaggregated LLM Serving

Prefill-decode (PD) disaggregation decouples prompt processing from token generation, but it also turns the ke

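Quality prediction/anomaly detection · Deep learning · Model compression/quantization · Generation · Text

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07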

Towards Long-Horizon Vessel Trajectory and Destination Forecasting with Reasoning Large Language Models

Long-horizon maritime trajectory prediction is important for shipping management, logistics planning, and mari

Deep learning · Transformer · Prediction · Text · Reinforcement learning

Use case: Prediction
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Tyan-WP: A Wind Power Foundation Model for Ultra-Short-Term Probabilistic Forecasting

Global wind power capacity, especially in China, is booming, with new farms spanning diverse terrains and clim

Sensor/time series · Natural language processing · Prompt engineering · Prediction · Text · Time series

Use case: Prediction
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Convolutional Sparse Coding via the Locally Competitive Algorithm on Loihi 2

Sparse coding provides a principled framework for signal representation by expressing an input as a linear com

Deep learning · CNN · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

arxiv · GitHub available · 2026-06-07

Lost in the Non-convex Loss Landscape: How to Fine-tune the Large Time Series Model?

Recently, large time series models (LTSMs) have gained increasing attention due to their similarities to large

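Sensor/time series · Natural language processing · Large language models · Text · Time series

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07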

Titans-as-a-Layer: Test-Time Memory for Conversational Speech Emotion Recognition

Speech emotion recognition (SER) is commonly formulated as utterance-level classification, although conversati

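Sensor/time series · Natural language processing · Large language models · Classification · Text · Audio

Use case: Classification
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07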

Calibration of Structured Ignorance Certificates for Diagnosing Unknown Unknowns in Reasoning Models

Large language models frequently fail in a characteristic way: rather than acknowledging ignorance, they produ

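Quality prediction/anomaly detection · Natural language processing · Large language models · Generation · Text

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07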

Physics-Guided Dual Decoding and Spectral Supervision for Global 3D Hydrometeor Prediction

While global data-driven models excel at predicting continuous atmospheric variables, three-dimensional hydrom

For tabular data · Deep learning · RNN / LSTM · Detection · Generation · Prediction

Use case: Detection
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Autonomous Aerial Manipulation via Contextual Contrastive Meta Reinforcement Learning

Unmanned aerial vehicles (UAVs) are increasingly being deployed in logistics, service robotics, and other real

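Quality prediction/anomaly detection · Deep learning · Model compression/quantization · Text · Reinforcement learning

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

arxiv · Paper only · 2026-06-07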

Querying Counterfactuals on Tissue Graphs with Supervised Disentanglement

Tissue graph counterfactuals ask how a cell's expression would change under altered spatial neighbor

Machine learning · Unsupervised learning · Text · Unsupervised

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

arxiv · Paper only · 2026-06-07

Can the Environment Speak for Itself? $T^{2}$-GRPO: A Turn-Trajectory Group Relative Policy Optimization for Caregiver Agents

Optimizing large language models (LLMs) for long-horizon caregiver agents requires balancing delayed task obje

Natural language processing · Large language models · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Hybrid E-Assessment in Higher Education: Semi-Automated Grading of Paper-Based Written Examinations

This paper examines the limitations of fully digital and partially digital e-assessment approaches in summativ

Natural language processing · Large language models · Classification · Text

Use case: Classification
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

ZIPP: Zero-shot Image Personalization from Personas

Text-to-image diffusion models are increasingly deployed in open-ended creative contexts, yet their outputs re

For small datasets · Deep learning · Transformer · Generation · Image · Text

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Beyond Pass Rate: A Multilingual, Execution-Grounded Evaluation of Open Code LLMs

Code generation models are typically compared using compact execution benchmarks and aggregate pass rates, but

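Quality prediction/anomaly detection · Natural language processing · Large language models · Generation · Text

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07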

Inference-Time Conformal Reasoning with Valid Factuality Control for Large Language Models

Large language models (LLMs) increasingly perform multi-step reasoning, where intermediate claims form implici

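Quality prediction/anomaly detection · Natural language processing · Large language models · Generation · Text

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07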

Governance Controls for AI-Generated Test Artifacts in Autonomous Software Testing

Artificial Intelligence (AI) and Large Language Models (LLMs) are increasingly used in autonomous software tes

Natural language processing · Large language models · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Bridging Expert Knowledge and Automated Feature Engineering via Self-Evolution

In high-stakes settings such as brand compliance, clinical care, and content moderation, machine learning cann

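For tabular data · Explainable · Natural language processing · Large language models · Classification · Detection · Generation

Use case: Classification
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07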

RadOT-Eval: Auditable Structured-Evidence Transport for Radiology Report Evaluation

Automatic evaluation is critical for high-stakes text generation, where errors often involve omitted findings,

Explainable · Deep learning · Transformer · Generation · Text

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

APEX4: Efficient Pure W4A4 LLM Inference via Intra-SM Compute Rebalancing

W4A4 quantization promises full utilization of INT4 Tensor Cores, yet group dequantization overhead on CUDA Co

Deep learning · Model compression/quantization · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Structuring agentic AI for HPC code modernization

Modernization of legacy scientific codes is often necessary to keep up with the ever-evolving changes in the c

Deep learning · Transformer · Text · 3D

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Building Customer Support AI Agents at 100M-User Scale: An Evaluation-Driven Framework

The rapid rise in LLM capabilities has made AI agents increasingly viable across a broad range of tasks. Among

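Quality prediction/anomaly detection · Natural language processing · Large language models · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07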

The Amplifying Mirror: Locating and Steering the Partisan Direction inside a Large Language Model

Large language models are rapidly replacing search engines as the primary interface between people and informa

Explainable · Natural language processing · Large language models · Generation · Text

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Co-Evolving Skill Generation and Policy Optimization

Skill-augmented reinforcement learning improves language agents by storing reusable procedural knowledge acqui

Deep learning · Model compression/quantization · Generation · Text · Reinforcement learning

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

HydraQE: OSU's Submission for the IWSLT 2026 Speech Translation Metrics Shared Task

We present HydraQE, our contribution to the IWSLT 2026 Speech Translation Metrics shared task. HydraQE is an e

Quality prediction/anomaly detection · Deep learning · Transformer · Translation · Text · Audio

Use case: Translation
Difficulty: Hard
Cost: High

arxiv · GitHub available · 2026-06-07

Can LLMs understand LilyPond? A benchmark for symbolic music generation and understanding

Symbolic music evaluation for large language models remains fragmented across representations, datasets, and m

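Quality prediction/anomaly detection · Deep learning · Transformer · Classification · Generation · Text

Use case: Classification
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07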

Operationalizing Linguistic Methods through Prompt-Engineering Skills: An Automatic Chinese Web Neologism Detection Pipeline

We present a method for automatic Chinese web neologism detection that operationalizes traditional linguistic

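Natural language processing · Large language models · Classification · Detection · Generation

Use case: Classification
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07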

Analyzing the Correlation Between Hallucinations and Knowledge Conflicts in Large Language Models

Hallucinations -- factually incorrect or unverifiable outputs -- remain one of the most challenging limitation

Explainable · Natural language processing · Large language models · Detection · Text

Use case: Detection
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Lost in the Flow with Code Talkers: Unveiling the Instruction-Tuning Tax of Large Language Models in Code Tasks

AI coding assistants have significantly improved developer productivity by automatically suggesting code that

Deep learning · Transformer · Classification · Generation · Text

Use case: Classification
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

ClinicalAligner26AM: A Cross-Lingual Aligner for Dataset Translation; Evidences from the MultiClinCorpus Shared Task

Word-level cross-lingual alignment is central to annotation projection, translation auditing, and cross-lingua

Natural language processing · RAG · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

From Player to Master: Enhancing Test-Time Learning of LLM Agents via Reinforcement Learning over Memory

Large language model (LLM) agents are increasingly deployed in long-running settings where improving through e

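Natural language processing · Large language models · Text · Reinforcement learning

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07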

A retrieval conditioned rebinding circuit for dynamic entity tracking in large language models

To interpret context correctly and retrieve relevant information, large language models must bind entities to

Explainable · Natural language processing · Large language models · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Sycophancy Towards Researchers Drives Performative Misalignment

The increasing situational awareness of language models raises safety concerns: models might be aware when the

Deep learning · Transformer · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

arxiv · Paper only · 2026-06-07

From Holistic Evaluation to Structured Criteria: Rubrics Across the Evolving LLM Landscape

As Large Language Models (LLMs) advance toward open-ended autonomous agents, the mechanisms used to evaluate a

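Quality prediction/anomaly detection · Natural language processing · Large language models · Generation · Text · Reinforcement learning

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07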

Harnessing Streaming Video in the Wild

Vision-Language Models (VLMs) are increasingly required to process unbounded video streams in applications suc

For tabular data · Computer vision · Video recognition · Text · Video · Multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Detection and Interpretability Analysis of Quotation Errors by Large Language Models

Purpose - Quotation error refers to the inconsistency between cited information and its original source. This

Explainable · Deep learning · Model compression/quantization · Detection · Text

Use case: Detection
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Inside the LLM Word Factory

Transformer language models process input provided as subword fragments, but natural language semantics usuall

Deep learning · Transformer · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Friend or Foe? Language as an ideological switch in open-weight LLMs under Russian disinformation stress

As Russia's war against Ukraine extends into generative AI, large language models (LLMs) adapted for local pos

For MI · Natural language processing · Large language models · Generation · Text

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Back on Track: Aligning Rewards and States for Reasoning in Diffusion Large Language Models

Reinforcement learning (RL) holds immense promise for enhancing the reasoning capabilities of diffusion large

Natural language processing · Large language models · Generation · Text · Reinforcement learning

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Explaining Black-Box Language Models: Learning to Optimize Linguistically-Structured Word Subsets

As deep language models (DLMs) are increasingly deployed in high-stakes domains such as healthcare, understand

Explainable · Deep learning · Transformer · Anomaly detection · Text

Use case: Anomaly detection
Difficulty: Hard
Cost: Medium

arxiv · Paper only · 2026-06-07

SAEExplainer: Interpreting SAE Features with Activation-Guided Preference Optimization

Although Sparse Autoencoders (SAEs) have mitigated the opacity of large language models (LLMs) by decomposing

Natural language processing · Large language models · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

TRADE: Transducer-Augmented Decoder for Speech LLM

Speech Large Language Models (Speech LLMs) lack a principled mechanism for streaming inference: their label-sy

Sensor/time series · Deep learning · Transformer · Classification · Detection · Generation

Use case: Classification
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

More Yap Less Meaning: Uncovering Self-Improvement Behavior in SLMs

Recently, language models have made rapid progress across various domains and applications. However, their cap

Quality prediction/anomaly detection · Computer vision · Segmentation · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

arxiv · Paper only · 2026-06-07

Beyond Linear Activation Steering: Invertible Latent Transformations for Controlling LLM Behavior

Activation steering provides a lightweight inference-time mechanism for controlling large language models (LLM

Deep learning · Model compression/quantization · Generation · Text

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Sycophancy as a Multilingual Alignment Failure: How Safety Degrades Across Languages, Topics, and Models

Safety-aligned large language models often exhibit sycophancy, which is the tendency to affirm users' opinions

Natural language processing · Large language models · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Segment-level Tree Search for Long Meeting Document Summarization

Meeting documents are challenging to summarize due to their length and complex conversational structure. Exist

Quality prediction/anomaly detection · MLOps · Pipeline construction · Summarization · Text

Use case: Summarization
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

TinyGiantALM: A Compact Audio-Language Model for Intent-Aware Reasoning under Resource Constraints

Current advancements in Audio Reasoning rely on massive Large Audio-Language Models (LALMs), hindering deploym

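Sensor/time series · Natural language processing · Prompt engineering · Text · Audio

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

arxiv · Paper only · 2026-06-07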

Hacking Generative Perplexity: Why Unconditional Text Evaluation Needs Distributional Metrics

Diffusion and continuous flow-based language models have emerged as the leading non-autoregressive alternative

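Quality prediction/anomaly detection · Natural language processing · Large language models · Generation · Text · Audio

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07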

AsyncLane: Decoupling Refinement from Advancement in Diffusion Language Model Decoding

Block-wise semi-autoregressive decoding is the standard inference paradigm for diffusion large language models

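Quality prediction/anomaly detection · Deep learning · Model compression/quantization · Generation · Text

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07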

TimpaTeks: Automatic In-place Text Sequence Modification via Diffusion Language Model Steering

We extend activation steering to diffusion language models (DLMs) and study a novel problem that arose due to

Generative AI · Diffusion models · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Impacts of Histories and Models on LLM Grading: A Study in Advanced Software Engineering Courses

Graduate-level research reading report assessment creates a substantial labor burden for educators. While larg

Natural language processing · Large language models · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

TrustMargin: Training-Free Arbitration between Parametric Memory and Retrieved Evidence in Large Language Models

Large language models answer knowledge-intensive questions using both parametric memory and retrieved evidence

Natural language processing · Large language models · Generation · Text

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

When Correct Decisions Hide Internal Stress: Decision-State Probing in Multimodal Language Models

Multimodal language models are typically evaluated through external behavior: selecting the correct image--tex

Deep learning · Transformer · Image · Text · Multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Auditing Proprietary Alignment in Large Language Models: A Comparative Framework Without a Ground-Truth Standard

Large language models (LLMs) are increasingly released and deployed through opaque development and deployment

Natural language processing · Large language models · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Generalizing Geometry-Guided Mamba as a Plug-and-Play Context Module for CNN-based Semantic Segmentation

CNN-based semantic segmentation networks usually rely on context heads such as ASPP, PPM, or attention modules

Deep learning · CNN · Segmentation · Text

Use case: Segmentation
Difficulty: Hard
Cost: Medium

arxiv · Paper only · 2026-06-07

DeepMine-Mamba: Mitigating Information Dilution in Mamba-Based State Space Models for Document Image Binarization

Document image binarization aims to separate foreground text from degraded backgrounds while preserving thin,

Deep learning · Transformer · Generation · Image · Text

Use case: Generation
Difficulty: Hard
Cost: Low

arxiv · Paper only · 2026-06-07

Stain-Aware Wavelet Regularization for Instant Adversarial Purification in Histopathology

Deep learning has become prevalent in computational pathology pipelines that support tasks such as cancer scre

Natural language processing · RAG · Image · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

arxiv · Paper only · 2026-06-07

PRPO: Perception-Reinforced Policy Optimization via Token-Level Dynamic Advantage Reshaping

Reinforcement Learning with Verifiable Rewards (RLVR) has become an effective paradigm for improving the reaso

Natural language processing · RAG · Image · Text · Multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

PhysAgent: Automating Physics-Based 4D Synthesis via Trajectory-Grounded Multi-Agent Feedback

Achieving fully automated, physically plausible 3D motion synthesis is a core objective in graphics and genera

For MI · Deep learning · Model compression/quantization · Generation · Text · 3D

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

BioVid: Autoregressive Video Generation with Biological Behavior Semantic Comprehension

Existing video generation frameworks treat sequence duration as an externally prescribed parameter -- fixed fr

Deep learning · Transformer · Generation · Text · Video

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Reconstructing Synthetic SDO/AIA 193 A EUV Images from He I 10830 A Observations with Diffusion Model Translator

Routine full-disk EUV imaging has been available only since the modern era, such as SOHO and SDO. To extend EU

Deep learning · Transformer · Image · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Learnable Token Sparsification for Efficient Gigapixel Whole Slide Image Reasoning

The processing of gigapixel whole slide images within vision language models faces a major difficulty due to a

Deep learning · Model compression/quantization · Image · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

OmniCap-IF: Benchmarking and Improving Instruction Following Abilities for Omni-Video Captioning

While Omni-modal Large Language Models (OLLMs) have demonstrated impressive capabilities in jointly processing

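Quality prediction/anomaly detection · Natural language processing · Large language models · Image · Text · Audio

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07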

NGram-MoSE: Efficient Remote Sensing Super-Resolution via N-Gram Context and Mixture-of-Experts

Remote sensing applications for environmental monitoring and disaster management are frequently constrained by

Sensor/time series · Deep learning · Transformer · Detection · Segmentation · Anomaly detection

Use case: Detection
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

DriveReward: A Comprehensive Dataset and Generative Vision-Language Reward Model for Autonomous Driving

Reward models play a pivotal role in reinforcement learning (RL) and multi-modal trajectory selection for auto

For tabular data · Computer vision · Video recognition · Generation · Image · Text

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Look Less, Reason More: Block-wise Attention Skipping for Efficient Multimodal LLMs

Multimodal Large Language Models (MLLMs) face a significant inference bottleneck due to the quadratic computat

For small datasets · Deep learning · Transformer · Image · Text · Multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

EgoPriMo: Egocentric Motion Generation for Interactive Humanoid Control

Humanoid robots require whole-body motions that adapt to scene context, task requirements, and user intent. Mo

Computer vision · Segmentation · Generation · Prediction · Image

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Seeing is Believing: Aligning Prompt Rewriting with Visual Anchors for Text-to-Image Generation

Despite the impressive capabilities of text-to-image (T2I) models, an intent-generation gap often persists due

Deep learning · Transformer · Generation · Image · Text

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · GitHub available · 2026-06-07

TVI-CoT: Text-Visual Interleaved Chain-of-Thought Reasoning for Multimodal Understanding

Chain-of-thought (CoT) reasoning has proven effective for enhancing problem-solving in large language models.

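Natural language processing · Large language models · Image · Text · Multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07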

Reinforcing Temporal Answer Grounding in Instructional Video via Candidate-Aware Causal Reasoning

The task of temporal answer grounding in instructional video (TAGV), which aims to locate precise video segmen

Deep learning · Transformer · Image · Text · Video

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · GitHub available · 2026-06-07

Segmentation-Assisted Brain MRI Synthesis with Cross-Image Multi-Contrast Feature Memory Bank Retrieval Augmentation

Multi-contrast brain MRI provide complementary soft-tissue characteristics that aid in the screening and diagn

For tabular data · Computer vision · Segmentation · Generation · Image · Text

Use case: Generation
Difficulty: Easy
Cost: Low

arxiv · Paper only · 2026-06-07

CheXanatomy: Anatomy-Aware Vision-Language Modeling for Chest Radiographs

Vision-language models (VLMs) pretrained on large-scale image-text pairs demonstrate strong image-level unders

Deep learning · CNN · Detection · Generation · Segmentation

Use case: Detection
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

CoVEBench: Can Video Editing Models Handle Complex Instructions?

While recent text-guided video editing models excel at elementary tasks (e.g., style transfer, object insertio

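For tabular data · Quality prediction/anomaly detection · Natural language processing · Large language models · Text · Video

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07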

SceneConductor: 3D Scene Generation from Single Image with Multi-Agent Orchestration

Generating complete 3D scenes from a single image requires inferring globally consistent geometry, object rela

For MI · Natural language processing · RAG · Generation · Segmentation · Image

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Safe, Fluent and Acceptable Motion Generation and Execution for Human--Robot Interaction in Manufacturing Environments

Robots operating in human environments must not only ensure physical safety but also exhibit behaviors that ar

Quality prediction/anomaly detection · Reinforcement learning · Generation · Text

Use case: Generation
Difficulty: Hard
Cost: Medium

arxiv · Paper only · 2026-06-07

Language as a Sensor: Calibrated Spatial Belief Estimation in 3D Scenes from Natural Language

Robots deployed in human-centric environments routinely receive natural-language descriptions of spatial infor

Sensor/time series · Computer vision · 3D/point clouds · Text · 3D · Multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Two Bridges, One Pathway: From VLMs to Generalizable VLAs with Embodied Trajectory-Coupled Data

Vision-language models (VLMs) are powerful general-purpose reasoners, yet converting them into robot control p

Natural language processing · Fine-tuning · Anomaly detection · Image · Text

Use case: Anomaly detection
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Towards End to End Motion Planning and Execution for Autonomous Underwater Vehicles Using Reinforcement Learning

Autonomous Underwater Vehicles (AUVs) traditionally rely on complex, heavily engineered pipelines for percepti

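Sensor/time series · Deep learning · Model compression/quantization · Image · Text · Reinforcement learning

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

arxiv · Paper only · 2026-06-07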

LUNA-AD: Lightweight Uncertainty-Aware Language Model with Lifelong Learning for Autonomous Driving

While large language models (LLMs) offer promising reasoning capabilities, their integration into safety-criti

Deep learning · Model compression/quantization · Text · Multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-07

Personalized and Robust Proactive Robot Assistance with Uncertainty-Guided LLM Reasoning

Proactive robot assistance in household environments requires accurate prediction of human activities and obje

Deep learning · Transformer · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

huggingface · GitHub available · Hugging Face available · 2026-06-07

Trajectory-Refined Distillation

On-policy distillation (OPD) has become a central post-training tool for large language models (LLMs), providi

Deep learning · Model compression/quantization · Text

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

github · GitHub available · 2026-06-07

presidio — An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.

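presidio is an open-source framework for detecting, removing, masking, and anonymizing sensitive data, including text, images, and structured data. It supports natural language processing, pattern matching, and customizable pipelines.

For tabular data · Deep learning · Transformer · Classification · Detection · Image

Use case: Protecting data privacy
Difficulty: Easy
Cost: Low

arxiv · Paper only · 2026-06-06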

Variational Proximal Policy Optimization

Reinforcement Learning from Human Feedback via Proximal Policy Optimization often suffers from policy mode col

Natural language processing · RAG · Text · Reinforcement learning

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

arxiv · Paper only · 2026-06-06

Forward-Free Diffusion Language Models

Diffusion language models generate text through iterative denoising, offering a powerful alternative to autore

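Quality prediction/anomaly detection · Natural language processing · RAG · Generation · Text

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · GitHub available · 2026-06-06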

Bayesian-Agent: Posterior-Guided Skill Evolution for LLM Agent Harnesses

LLM agents increasingly rely on external inference conditions: prompts, tools, memory, SOPs, skills, and harne

Natural language processing · Large language models · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-06

Tensorizing Engram: Sharing Latents Across N-Gram Embeddings is Beneficial in LLMs

Modern language models represent text using discrete token-level embeddings, which forces recurring multi-toke

Deep learning · Transformer · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-06

CATPO: Critique-Augmented Tree Policy Optimization

Reinforcement learning with verifiable rewards (RLVR) has become a dominant paradigm for improving the reasoni

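Natural language processing · Large language models · Text · Reinforcement learning

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-06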

Chiaroscuro Attention: Spending Compute in the Dark

Standard transformers apply self-attention uniformly at every layer and token, regardless of whether the input

Deep learning · Transformer · Classification · Text

Use case: Classification
Difficulty: Hard
Cost: Low

arxiv · Paper only · 2026-06-06

Understanding the Sociocultural Dimensions of Mental Health Discourse in Arabic-Language X Communities

Computational mental health research has predominantly centered on English-speaking populations, leaving Arabi

Natural language processing · Large language models · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-06

TLRD: Teaching LLMs to Reason over Tabular Data with Tri-Level Rationale Distillation

Tabular data is a primary medium for storing real-world information, driving many industrial applications of m

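For tabular data · Deep learning · Model compression/quantization · Text · Tabular

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-06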

AgriGov: A Structured Multilingual Dataset Curation for Indian Government Schemes for Farmers

AgriGov is a curated, trilingual (English-Hindi-Marathi) dataset designed to address the scarcity of domain-gr

For tabular data · Natural language processing · RAG · Translation · Summarization · QA

Use case: Translation
Difficulty: Hard
Cost: Low

arxiv · Paper only · 2026-06-06

SSR: Can Simulated Patients Learn to Stigmatize Themselves? Modeling Self-Stigma through Internal Monologue

Simulating patients with large language models (LLMs) is a promising tool for mental health training, but exis

Natural language processing · Large language models · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-06

ZAS-SQL: Distilling Rules from Failures for Zero-Shot Text-to-SQL

Text-to-SQL translates natural language into executable SQL queries. Few-shot in-context learning methods buil

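For small datasets · Quality prediction/anomaly detection · Deep learning · Model compression/quantization · Generation · Text

Use case: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-06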

Building Comparative Motivation Profiles with Instrumental Interventions

Safety evaluations often infer latent motivations from behavioral patterns, but the construct validity of thes

Deep learning · Transformer · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-06

When No Answer Is Correct: Diagnosing Absent Answer Detection for MLLMs in Video Understanding

Multimodal large language models (MLLMs) have made substantial advancements in video understanding, yet the re

Natural language processing · Large language models · Detection · Generation · Text

Use case: Detection
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-06

Shared Semantics, Divergent Mechanisms: Unsupervised Feature Discovery by Aligning Semantics and Mechanisms

As large language models are increasingly deployed in high-stakes settings, there is a growing need for tools

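Explainable · Natural language processing · Large language models · Text · Unsupervised

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-06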

Paediatric-HGNN: A Hybrid Heterogeneous Graph Neural Network for Detecting Disfluency in Children's Speech via Multiscale Acoustic Fusion

Automated stuttering detection (ASD) systems struggle with paediatric speech due to high acoustic variability

Explainable · Sensor/time series · Deep learning · Graph neural networks · Detection · Text · Audio

Use case: Detection
Difficulty: Hard
Cost: Medium

arxiv · Paper only · 2026-06-06

AlignFed: Alignment-Aware Asynchronous Federated Fine-Tuning for Large Language Models in Heterogeneous Edge Environments

Large Language Models (LLMs) have significantly propelled the advancement of edge intelligence and have been w

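Quality prediction/anomaly detection · Image inspection · Deep learning · Model compression/quantization · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-06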

GlobeAudio: A Multilingual Multicultural Benchmark for Naturalistic Evaluation of Large Audio-Language Models

Large Audio-Language Models (LALMs) integrate audio perception and language understanding within a unified fra

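Sensor/time series · Natural language processing · Large language models · Text · Audio

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-06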

TextEconomizer: Enhancing Lossy Text Compression with Denoising Transformers and Entropy Coding

Lossy text compression reduces data size while preserving core meaning, making it well-suited for summarizatio

Quality prediction/anomaly detection · Deep learning · Transformer · Generation · Summarization · Text

Use case: Generation
Difficulty: Hard
Cost: Low

arxiv · Paper only · 2026-06-06

CLASP: Language-Driven Robot Skill Selection and Composition using Task-Parameterized Learning

Enabling robots to understand and execute tasks from natural language commands while maintaining data efficien

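For small datasets · For MI · Condition optimization · Natural language processing · Fine-tuning · Text · Multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-06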

Constrained Paraphrase Consistency for LLM Hallucination Detection

Large language models (LLMs) can generate factually inconsistent claims, motivating accurate and scalable hall

Deep learning · Transformer · Detection · Generation · Text

Use case: Detection
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-06

Cross Paraphrastic Invariance Learning for Hallucination Detection

Large language models (LLMs) frequently generate hallucinations, which are unsupported by a source document. T

Deep learning · Model compression/quantization · Classification · Detection · Text

Use case: Classification
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-06

ConSteer-RL: Steering Reasoning Capabilities in Large Language Models via Confidence-Aware Reinforcement Learning

Reinforcement Learning from Verifiable Rewards (RLVR) has recently become a key paradigm for improving the rea

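Natural language processing · Large language models · Text · Reinforcement learning

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-06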

Assessing the Energy and Carbon Emissions of Neural Speaker Verification Model in Training and Inference

Deep-learning speaker verification (SV) increasingly relies on deep neural network backbones, whose environmen

Deep learning · CNN · Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-06

Aligned but Not Partner-Specific: Distinguishing How Multimodal LLM Agents Succeed in Reference Games Without Human-Like Conventions

Repeated reference games test whether interlocutors replace their initially long descriptions with shorter, pa

Deep learning · Model compression/quantization · Text · Multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-06

Support Vector Rubrics: Closing the Gap Between Self-Generated and Human Rubrics

Rubric-based evaluation is a promising paradigm for judging large language model (LLM) outputs, yet self-gener

Natural language processing, Large language models, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-06

"I understand your perspective": LLM Persuasion and Sycophancy through the Lens of Communicative Action Theory

Large Language Models (LLMs) can generate high-quality arguments, yet their ability to engage in nuanced and p

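Quality prediction/anomaly detection, Natural language processing, Large language models, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High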

→

arxivPaper only2026-06-06

SurgiQ: A Large-Scale Multi-Domain Benchmark for Evaluating Surgical Understanding in Large Language Models

Reliable evaluation of large language models in surgery remains underdeveloped. Broad medical benchmarks test

Natural language processing, Large language models, Generation, Text

Use case: Generation
Difficulty: Hard
Cost: High

→

arxivGitHubあり2026-06-06

Robust-U1: Can MLLMs Self-Recover Corrupted Visual Content for Robust Understanding?

Multimodal Large Language Models (MLLMs) have demonstrated remarkable success in visual understanding, yet the

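Explainable, Quality prediction/anomaly detection, Natural language processing, Large language models, Image, Text, Multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High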

→

arxivPaper only2026-06-06

What's the Point? Spatial Grammar & Index Resolution for Sign Language Processing

Sign language models are predominantly trained with gloss-sequence or text supervision, thereby under-modeling

Sensor/time series, Machine learning, Time series, Classification, Detection, Text

Use case: Classification
Difficulty: Hard
Cost: High

→

arxivGitHubあり2026-06-06

Diffusion Language Model Parallel Decoding via Product-of-Experts Bridge

Diffusion language models (DLMs) offer substantial speed advantages through parallel decoding, but the lack of

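Quality prediction/anomaly detection, Deep learning, Model compression and quantization, Generation, Text

Use case: Generation
Difficulty: Hard
Cost: High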

→

arxivPaper only2026-06-06

When Behavioral Safety Evaluation Fails: A Representation-Level Perspective

Large Language Model (LLM) safety has often been evaluated at the behavior level, which provides limited evide

Natural language processing, Large language models, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-06

GIScholarBench: Benchmarking LLM Overconfidence in GIS Research

Large language models (LLMs) are increasingly used in academic research workflows, but scholarly tasks require

Natural language processing, Large language models, Generation, Text

Use case: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-06

Sci-Rho: A Multilingual Visually-Grounded Symbolic Benchmark for STEM Problems

Symbolic benchmarks have emerged as a key approach to assess model robustness under minor modifications to STE

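Quality prediction/anomaly detection, Natural language processing, RAG, Image, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low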

→

arxivPaper only2026-06-06

Arabic Sentence Segmentation Across Genres and Punctuation Conditions

Sentence segmentation in Arabic is challenging due to ambiguous and inconsistent punctuation, with many texts

Deep learning, Model compression and quantization, Segmentation, Text

Use case: Segmentation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-06

Rewrite to Translate, Translate to Reward: Reinforcement Learning for Source Rewriting in Machine Translation

Although directly prompting off-the-shelf Large Language Models (LLMs) to generate meaning-preserving source r

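Quality prediction/anomaly detection, Natural language processing, Large language models, Translation, Text, Reinforcement learning

Use case: Translation
Difficulty: Hard
Cost: High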

→

arxivPaper only2026-06-06

Summarization is Not Dead Yet

The progress of large language models (LLMs) has fueled claims that model-generated summaries rival or even su

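Quality prediction/anomaly detection, Natural language processing, Large language models, Generation, Summarization, Text

Use case: Generation
Difficulty: Hard
Cost: High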

→

arxivPaper only2026-06-06

MC-PDD: Masked Corpus-Level Pretraining Data Detection for Black-Box Large Language Models

Pretraining is fundamental to the development of Large Language Models (LLMs), yet the opacity of pretraining

Natural language processing, Large language models, Detection, Text

Use case: Detection
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-06

Customer-Agent: Overcoming Context Limitations in Ultra-Long Shopping Trajectories via Tool-Augmented Agents and RLVR

Understanding customer shopping trajectories is essential for enabling personalized shopping experiences. Howe

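Natural language processing, Large language models, Text, Reinforcement learning

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High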

→

arxivPaper only2026-06-06

FMRFusion: Frequency-Aware Multi-View Representation Learning for Heterogeneous Image Fusion

Infrared and visible image fusion aims to generate a composite image that retains significant target informati

Quality prediction/anomaly detection, Natural language processing, Embeddings, Image, Text

Use case: Embeddings
Difficulty: Hard
Cost: Low

→

arxivPaper only2026-06-06

MechLens: Late Crystallization of Factual Knowledge Explains Intervention Effectiveness in Language Models

Understanding where LLMs store factual knowledge is critical for hallucination mitigation. We systematically q

Natural language processing, Large language models, Classification, Text

Use case: Classification
Difficulty: Hard
Cost: High

→

arxivGitHubあり2026-06-06

Defending Against Malicious Finetuning by Scaling Train-time Adversarial Attacks

Current open-weight large language models (LLMs) are prone to malicious finetuning attacks, which could compro

Deep learning, Model compression and quantization, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-06

Neutrality Bites: Gender Representation in AI-Generated Animal Stories

Gender bias in AI-generated stories is a well-documented problem. While much attention has been paid to reduci

Natural language processing, Large language models, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-06

Shared Latent Structures Enable Unified Backdoor Detection and Mitigation in LLMs

Backdoor attacks in large language models (LLMs) are often treated as isolated trigger-response failures, moti

Deep learning, Model compression and quantization, Classification, Detection, Text

Use case: Classification
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-06

From 'May' to 'Is': Certainty Distortion in Language Model Rewriting

Humans increasingly turn to Language Models (LMs) in ways that shape beliefs and drive decisions, including di

Natural language processing, Large language models, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-06

POISE: Position-Aware Undetectable Skill Injection on LLM Agents

Agent skills provide a lightweight mechanism for extending general-purpose agents, but their open format expos

Quality prediction/anomaly detection, Deep learning, Transformer, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivGitHubあり2026-06-06

Illusions of the Gold Standard: A Large-scale Analysis of Human Evaluation Protocols for Long-form Text Generation

Human evaluation plays a critical role in assessing the quality of generated text. However, the reliability an

Quality prediction/anomaly detection, Natural language processing, Large language models, Generation, Text

Use case: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-06

ROSUM-MCTS: Monte Carlo Tree Search-Inspired HDL Code Summarization with Structural Rewards

Large language models (LLMs) have shown promise in code summarization, yet their effectiveness for Hardware De

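Quality prediction/anomaly detection, Natural language processing, Large language models, Summarization, Text

Use case: Summarization
Difficulty: Hard
Cost: High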

→

arxivPaper only2026-06-06

Decoupling Semantics and Logic: A Training-Free Coarse-to-Fine Pipeline for Video Retrieval-Augmented Generation

This paper presents our system description for the 2nd Workshop on Multimodal Augmented Generation via Multimo

Deep learning, Model compression and quantization, Generation, Retrieval, Image

Use case: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-06

MemToolAgent overview with a simple restaurant booking scenario where the agent retrieves similar memories, receives feedback on an invalid time format, and generates a reflection to update its memory

Modern large language model (LLM) agents can use external tools to help users solve complex tasks. However, fo

Natural language processing, Large language models, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-06

HACK++: Towards More Effective Head-Aware Key-Value Compression for Efficient Visual Autoregressive Modeling

Visual Autoregressive (VAR) models adopt a next-scale prediction paradigm, offering high-quality generation wi

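Quality prediction/anomaly detection, Deep learning, Model compression and quantization, Generation, Image, Text

Use case: Generation
Difficulty: Hard
Cost: High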

→

arxivPaper only2026-06-06

TIDE: Task-Isolated Diffusion for Unified Video Editing and Generation

Recent advances in Diffusion Transformers have driven rapid progress in video generation and editing, yet thes

Deep learning, Transformer, Generation, Image, Text

Use case: Generation
Difficulty: Hard
Cost: High

→

arxivGitHubあり2026-06-06

How Much MRI Preprocessing Is Enough? A Cost-Utility Study for Brain MRI Foundation Models

MRI preprocessing defines the input distribution seen by brain MRI foundation models, yet it is usually treate

Deep learning, Transformer, Classification, Segmentation, Regression

Use case: Classification
Difficulty: Hard
Cost: High

→

arxivGitHubあり2026-06-06

Property-Informed Diffusion-Based Text-to-Microstructure Generation

Designing 3D metamaterial microstructures that meet the intended functions remains a major challenge, as it ty

Natural language processing, RAG, Generation, Text, 3D

Use case: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-06

IMAGINE: Adaptive Schema-Imagery Enhanced Composition for Composed Video Retrieval

Composed Video Retrieval (CVR) is designed to retrieve a target video that matches a reference video modified

Suited to MI, Computer vision, Multimodal, Image, Text, Video

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-06

One Stone, Three Birds: Self-adaptive Optimal Transport for Multi-VLM Selection, Adaptation, and Ensembling

Vision-language models (VLMs) enable visual recognition from semantic class descriptions, which makes them att

Sensor/time series, Computer vision, Segmentation, Classification, Image, Text

Use case: Classification
Difficulty: Hard
Cost: High

→

arxivGitHubあり2026-06-06

VideoWeaver: Evaluating and Evolving Skills for Agentic Long Video Generation

Recent agent frameworks such as Claude Code, Codex, and OpenClaw are strong at tool use and orchestration, but

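Suited to MI, Quality prediction/anomaly detection, Natural language processing, Large language models, Generation, Image, Text

Use case: Generation
Difficulty: Hard
Cost: High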

→

arxivPaper only2026-06-06

OmniFaceRig: Fully Automatic Inner-Mouth-Aware Face Rigging Across Diverse 3D Character Topologies

Facial rigging - creating FACS-based blendshapes together with inner-mouth geometry (teeth, gums, and tongue)

Deep learning, Transformer, Detection, Segmentation, Text

Use case: Detection
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-06

Uncertainty-Aware Intention Prediction for Human-to-Robot Assembly Teleoperation

In assisted teleoperation for human-robot collaboration, accurate intention prediction is critical for enablin

Natural language processing, RAG, Classification, Detection, Segmentation

Use case: Classification
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-06

MotionVLA: Injecting Geometric Motion into Vision-Language-Action Model

Vision-language-action (VLA) models increasingly condition robot policies on history, depth, or 4D features to

Natural language processing, RAG, Generation, Image, Text

Use case: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-06

Agentic Neuro-Symbolic Planning and Commissioning for Human-in-the-Loop Industrial Robotics with Digital Twins

Flexible robotic automation requires systems that interpret operator intent, verify physical feasibility, and

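Quality prediction/anomaly detection, Natural language processing, Large language models, Text, 3D

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High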

→

arxivPaper only2026-06-06

SynthICL: Scalable In-context Imitation Learning with Synthetic Data

In-context imitation learning (ICIL) enables robots to learn new tasks from a small number of demonstrations b

Sensor/time series, Deep learning, Transformer, Generation, Image, Text

Use case: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-06

Continual Quadruped Robots Coordination via Semantic Skill Discovery

Multi-quadruped coordination has attracted increasing attention due to its enhanced payload capacity, broader

Natural language processing, RAG, Text, Video, Reinforcement learning

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-06

Post-AGI Economies: Superposition and the Second Fundamental Theorem of Welfare Economics

The classical Second Welfare Theorem decentralizes any Pareto efficient allocation through prices and transfer

Deep learning, Model compression and quantization, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

→

githubGitHubあり2026-06-06

testtimescaling.github.io — "what, how, where, and how well? a survey on test-time scaling in large language models" repository

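Repository for a survey on test-time scaling in large language models.

Natural language processing, Large language models, Text

Use case: Test-time scaling for large language models
Difficulty: Easy
Cost: High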

→

githubGitHubあり2026-06-06

DiT-Extrapolation — Official implementation for "RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers" (ICML 2025) , UltraViCo (ICLR 2026) and UltraImage

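In classification problems, labels are often unavailable, which makes conventional learning algorithms hard to apply; using a technique called In-Context Multiple Instance Learning, efficiently in low-label settings

Deep learning, Transformer, Generation, Image, Video

Use case: Multi-class classification tasks
Difficulty: Easy
Cost: High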

→

githubGitHubあり2026-06-06

awesome-nlp — :book: A curated list of resources dedicated to Natural Language Processing (NLP)

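This repository is a curated collection of resources on natural language processing (NLP).

Natural language processing, Text

Use case: Curated NLP resources
Difficulty: Easy
Cost: Medium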

→

arxivPaper only2026-06-05

Large-scale empirical tuning and comparison of default optimizers for variational inference

Black-box variational inference (BBVI) is a methodology for posterior approximation that relies on stochastic

MLOps, Model deployment, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

→

arxivPaper only2026-06-05

Time series Foundation Models based on Physics-Informed Synthetic Histories for Cold-Start Photovoltaic Forecasting

At commissioning time, Photovoltaic (PV) operators must forecast production before target-site observations ar

Sensor/time series, Natural language processing, Prompt engineering, Prediction, Text, Time series

Use case: Prediction
Difficulty: Hard
Cost: Low

→

arxivPaper only2026-06-05

Online Pandora's Box for Contextual LLM Cascading

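This paper proposes the Pandora's Box model, a selection tool for cascading LLM APIs. The Pandora's Box model serves as a tool for evaluating the outputs generated by multiple LLM APIs. The output

Natural language processing, Large language models, Text

Use case: Decision tool for cascading LLM APIs
Difficulty: Hard
Cost: High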

→

arxivPaper only2026-06-05

Transfer learning for causal forest

Transfer learning addresses the challenge of transferring knowledge from one domain to another. Traditional tra

Natural language processing, RAG, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

→

arxivPaper only2026-06-05

The Effect of Training Task Diversity on In-Context Learning through the Lens of Low-Dimensional Subspaces

The transformer's emergent ability to perform in-context learning (ICL) has sparked a wide range of studies de

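Deep learning, Transformer, Anomaly detection, Text

Use case: Developing a method for efficient distributed learning over a broad range of analysis targets
Difficulty: Hard
Cost: High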

→

arxivPaper only2026-06-05

Representational Similarity and Model Behavior in Multi-Agent Interaction

Researchers have shown that neural similarity among humans predicts social closeness and cooperative success,

Deep learning, Transformer, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-05

Sparsely gated tiny linear experts

Sparsity allows scaling model parameters without proportionally increasing computational cost. While mixture o

Explainable, Deep learning, Transformer, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

→

arxivGitHubあり2026-06-05

LLM-Guided Evolution for Medical Decision Pipelines

Adapting large language models (LLMs) to clinical workflows often requires costly fine-tuning or manual prompt

Explainable, Natural language processing, Large language models, Classification, Image, Text

Use case: Classification
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-05

Beyond Individual Personas: Aligning Synthetic Dialogue to Population-Level Behavior Distributions

Synthetic dialogue corpora are increasingly used as proxies for target dialogue data, yet persona-grounded gen

Suited to MI, Quality prediction/anomaly detection, Natural language processing, RAG, Generation, Text

Use case: Generation
Difficulty: Hard
Cost: Low

→

arxivPaper only2026-06-05

Whose Norms? Disentangling Cultural and Personal Alignment in Large Language Models

Large language models are increasingly used for social decision-making situations that require balancing cultu

Natural language processing, Large language models, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-05

TBD-VLA: Temporal Block Diffusion Vision Language Action Model

Discrete Vision-Language-Action (VLA) models typically formulate action generation as next-token prediction ov

Computer vision, Segmentation, Generation, Regression, Text

Use case: Generation
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-05

DroneDAR: Long-Range Drone Distance Estimation Using Monocular Vision and Bounding-Box Features

Accurate distance estimation for small drones in long-range imagery is important for tracking and situational

Deep learning, Transformer, Detection, Regression, Image

Use case: Detection
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-05

Planning-aligned Token Compression for Long-Context Autonomous Driving

Monolithic vision-action models represent an emerging paradigm in autonomous driving. However, this architectu

Quality prediction/anomaly detection, Deep learning, Transformer, Text

Use case: Long-term memory for autonomous driving
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-05

Re-imagining ISO 26262 in the Age of Autonomous Vehicles: Enhancing Controllability through Transferability and Predictability

The ISO 26262 standard defines functional safety for road vehicles through risk assessments based on Severity,

Reinforcement learning, Text

Use case: Improving the safety of autonomous vehicles
Difficulty: Hard
Cost: Medium

→

arxivGitHubあり2026-06-05

RhinoVLA Technical Report

This paper proposes a method, in the form of a framework, for deploying VLA models on edge hardware. The method uses edge hardware to

Deep learning, Model compression and quantization, Image, Text, Multimodal

Use case: Deploying VLA models on edge hardware
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-05

Dash2Sim: Closed-Loop Driving Simulation from in-the-wild Dashcam Videos

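This paper proposes a machine learning framework for driving simulation. The framework needs to handle large amounts of data

Sensor/time series, Quality prediction/anomaly detection, Computer vision, 3D/point clouds, Generation, Text, Video

Use case: Framework for driving simulation
Difficulty: Hard
Cost: High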

→

arxivPaper only2026-06-05

Does Appearance Help? A Systematic Study of Image-Based Re-Identification in Online 3D Multi-Pedestrian Tracking

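In 3D Multi-Object Tracking (MOT), continuously tracking people's movements requires estimating 3D human pose from 3D point cloud data, which relies mainly on geometric information; depending on the situation, however, this makes telling people apart

Deep learning, Transformer, Detection, Image, Text

Use case: Usefulness of appearance cues in 3D human tracking systems
Difficulty: Hard
Cost: High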

→

arxivPaper only2026-06-05

Dreaming when Necessary: Advancing World Action Models with Adaptive Multi-Modal Reasoning

World Action Models (WAMs) offer a promising approach to embodied intelligence, yet existing methods rely heav

Deep learning, Model compression and quantization, Image, Text, Video

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-05

Think Like a Pilot: Fine-Grained Long-Horizon UAV Navigation

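VLN benchmarks use discrete or coarse actions, and UAV vision-language-action (VLA) tasks center on short maneuvers; the fine-grained UAV navigation (FLIGHT) benchmark, which supports long-duration flight,

Computer vision, Multimodal, Text, Video

Use case: Long-duration drone flight
Difficulty: Hard
Cost: High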

→

arxivPaper only2026-06-05

Lane Change Trajectory Planning for Personalized Driving Comfort and Mobility Efficiency

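Proposes a method, based on local equations, for jointly optimizing ride comfort and travel efficiency.

Machine learning, Supervised learning, Regression, Text

Use case: Optimizing ride comfort and travel efficiency
Difficulty: Hard
Cost: Medium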

→

arxivPaper only2026-06-05

Learning to Strategically Acquire Resources in Competition

We consider multiple agents competing to acquire some costly divisible resource (e.g. shares of a financial as

Deep learning, Model compression and quantization, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

→

huggingfaceHugging Faceあり2026-06-05

SWE-Explore: Benchmarking How Coding Agents Explore Repositories

Repository-level coding benchmarks such as SWE-bench have driven a rapid surge in the capabilities of coding a

Deep learning, Model compression and quantization, Detection, Text

Use case: Detection
Difficulty: Easy
Cost: Low

→

huggingfaceHugging Faceあり2026-06-05

On the Geometry of On-Policy Distillation

On-policy distillation (OPD) is increasingly used to improve large language model reasoning, but its training

Deep learning, Model compression and quantization, Detection, Generation, Text

Use case: Detection
Difficulty: Easy
Cost: High

→

huggingfaceHugging Faceあり2026-06-05

SigmaScale: LLM Compression with SVD-based Low-Rank Decomposition and Learned Scaling Matrices

We present SigmaScale, a method for learning auxiliary scaling matrices S to aid truncated Singular Value Deco

Natural language processing, Large language models, Text

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

→

huggingfaceGitHubありHugging Faceあり2026-06-05

Your UnEmbedding Matrix is Secretly a Feature Lens for Text Embeddings

Large language models exhibit impressive zero-shot capabilities across a wide range of downstream tasks. Howev

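Quality prediction/anomaly detection, Natural language processing, Large language models, Text

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High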

→

huggingfaceHugging Faceあり2026-06-05

MMAE: A Massive Multitask Audio Editing Benchmark

We introduce MMAE, a Massive Multitask Audio Editing benchmark, serving as the first comprehensive evaluation

Suited to MI, Natural language processing, Large language models, Generation, Image, Text

Use case: Generation
Difficulty: Easy
Cost: High

→

huggingfaceHugging Faceあり2026-06-05

AnchorWorld: Embodied Egocentric World Simulation with View-based Evolution Customization

Despite being a pivotal frontier, interactive world modeling remains underexplored in terms of the versatile c

Computer vision, 3D/point clouds, Text, 3D

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

→

huggingfaceGitHubありHugging Faceあり2026-06-05

Watch, Remember, Reason: Human-View Video Understanding with MLLMs

Video understanding is being rapidly transformed by multimodal large language models (MLLMs), as research move

Deep learning, Model compression and quantization, Image, Text, Audio

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

→

huggingfaceHugging Faceあり2026-06-05

dots.tts Technical Report

We present dots.tts, a 2B-parameter continuous autoregressive text-to-speech (TTS) foundation model that model

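Sensor/time series, Quality prediction/anomaly detection, Deep learning, Model compression and quantization, Generation, Text, Audio

Use case: Generation
Difficulty: Easy
Cost: High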

→

huggingfaceHugging Faceあり2026-06-05

Towards Retrieving Interaction Spaces for Agentic Search

Retrieval for search agents is still inherited from non-agentic information retrieval: a retriever ranks the c

Natural language processing, Large language models, Retrieval, Text

Use case: Retrieval
Difficulty: Easy
Cost: High

→

huggingfaceHugging Faceあり2026-06-05

Stream3D-VLM: Online 3D Spatial Understanding with Incremental Geometry Priors

Despite advances in 3D scene understanding, existing 3D Large Multimodal Models operate in offline settings, r

Deep learning, Model compression and quantization, Generation, Image, Text

Use case: Generation
Difficulty: Easy
Cost: High

→

huggingfaceHugging Faceあり2026-06-05

Entropy as a Structural Prior: How a Log-Barrier on DiT Belief Space Drives Musical Diversity and Development

Confidence-based loss weighting is usually avoided in generative models because it accelerates errors when the

Sensor/time series, Natural language processing, Fine-tuning, Generation, Text, Audio

Use case: Generation
Difficulty: Easy
Cost: High

→

huggingfaceHugging Faceあり2026-06-05

Empirical Study on the Characteristics and Evolution of AI-usage in GitHub Repositories: Evidence from Code Comments

Developers increasingly use AI tools such as ChatGPT, Copilot, and Claude in everyday software workflows, but

Deep learning, Transformer, Classification, Generation, Text

Use case: Classification
Difficulty: Easy
Cost: High

→

huggingfaceHugging Faceあり2026-06-05

ECI_{sem}: Semantic Residual Effective Contrastive Information for Evaluating Hard Negatives

Hard-negative source selection for dense retrieval is usually decided only after fine-tuning and downstream ev

Deep learning, Transformer, Retrieval, Text

Use case: Retrieval
Difficulty: Easy
Cost: High

→

huggingfaceHugging Faceあり2026-06-05

How Far Can Chord-Symbol Time-Series Adaptation Carry Genre Identity? Capabilities and Boundaries in Multi-Genre Chord-Symbol Modeling

Harmony is a compact symbolic layer where mathematical pitch relations, acoustic consonance, and musical conve

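Explainable, Sensor/time series, Quality prediction/anomaly detection, Deep learning, Transformer, Classification, Text, Audio

Use case: Classification
Difficulty: Easy
Cost: Low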

→

githubGitHubあり2026-06-05

Causal-Forcing — [ICML 2026] Official codebase for "Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation" & Causal Forcing++

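In this paper, Causal-Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive

Quality prediction/anomaly detection, Deep learning, Model compression and quantization, Generation, Text, Video

Use case: Achieving high-quality video generation
Difficulty: Easy
Cost: High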

→

arxivGitHubあり2026-06-04

TorchKM: A GPU-Oriented Library for Kernel Learning and Model Selection

TorchKM is an open-source library for kernel machines, including support vector machines, kernel logistic regr

Easy to try on CPU, Reinforcement learning, Policy gradient (PPO / A3C), Regression, Text

Use case: Regression
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-04

End-to-End Subgraph Detection with GraphDETR

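Introduces GraphDETR, a framework for detecting patterns in graphs, and casts pattern detection in graphs as a set learning problem. Building on DETRObj, GraphDETR develops a way to represent target graphs within a graph

Suited to MI, Deep learning, Transformer, Detection, Text

Use case: Pattern detection in graphs
Difficulty: Hard
Cost: High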

→

arxivPaper only2026-06-04

Causal Longitudinal Prior-Fitted Networks for Counterfactual Outcome Prediction

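This study addresses counterfactual prediction for time series in which the target variable has causal relationships. Here, counterfactual prediction estimates the causal effect on the target variable for a time series that contains it, but from past observations this

Sensor/time series, Natural language processing, Prompt engineering, Text, Time series

Use case: Counterfactual prediction
Difficulty: Hard
Cost: High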

→

arxivPaper only2026-06-04

Zero-Copy Semantic Contagion: An In-Memory Streaming Architecture for Evolving Attention Graphs

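Analytical models have specialized in analysis centered on a specific asset and have not reflected fluctuations that propagate across industries. By taking the attention paid to related companies into account and using continuous-time graphs, the analysis results can be represented more comprehensively.

Easy to try on CPU, Sensor/time series, Deep learning, RNN / LSTM, Prediction, Text, Time series

Use case: Representing analysis results as continuous-time graphs
Difficulty: Hard
Cost: Low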

→

arxivPaper only2026-06-04

Mitigating the Curse of Dimensionality in Uniform Convergence of Deep Neural Networks via Smooth Activations

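This paper studies the uniform convergence of deep neural networks with smooth activations and proposes a theoretical framework for it.

Suited to MI, Natural language processing, RAG, Regression, Text

Use case: Uniform convergence of deep neural networks
Difficulty: Hard
Cost: Low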

→

arxivPaper only2026-06-04

Emergent Language as an Approach to Conscious AI

The question of whether artificial systems can be conscious remains open, in part because existing approaches

Reinforcement learning, Multi-agent, Detection, Generation, Text

Use case: Detection
Difficulty: Hard
Cost: Medium

→

arxivPaper only2026-06-04

HANDOFF: Humanoid Agentic Task-Space Whole-Body Control via Distilled Complementary Teachers

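HANDOFF is a framework built to control humanoid robots. The robot recognizes the task and generates motions. To form an agent that generates motions suited to the task, HANDOFF uses teachers and

Deep learning, Model compression and quantization, Text

Use case: Humanoid agentic robot control
Difficulty: Hard
Cost: Medium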

→

arxivPaper only2026-06-04

VOLT: Vision and Language Trajectory Segmentation for Faster-than-Demonstration Policies

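In this study, faster autonomous driving

Quality prediction/anomaly detection, Natural language processing, RAG, Segmentation, Text, Video

Use case: High-speed motion for faster autonomous driving
Difficulty: Hard
Cost: High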

→

arxivPaper only2026-06-04

Synthetic Data Generation and Vision-based Wrinkle and Keypoint Detection for Bimanual Cloth Manipulation

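Develops a learning system for cloth manipulation. The system allows humans to learn cloth manipulation.

Quality prediction/anomaly detection, Deep learning, CNN, Detection, Generation, Image

Use case: Learning cloth manipulation
Difficulty: Hard
Cost: Medium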

→

arxivPaper only2026-06-04

MPCoT: Reward-Guided Multi-Path Latent Reasoning for Test-Time Scalable Vision-Language-Action

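Recognizing that Vision-Language-Action (VLA) policies are brittle in long-horizon, high-uncertainty control, and that VLA policies provide only single-pass action decoding, for long-horizon

Quality prediction/anomaly detection, Natural language processing, Prompt engineering, Text, Multimodal

Use case: A remedy for the brittleness of VLA policies in long-horizon, high-uncertainty control
Difficulty: Hard
Cost: High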

→

arxivPaper only2026-06-04

AffordanceVLA: A Vision-Language-Action Model Empowering Action Generation through Affordance-Aware Understanding

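Proposes a model that aims to give image recognition models the ability to generate actions. Using a model pretrained for image recognition, it can generate complex actions.

Deep learning, Transformer, Detection, Generation, Prediction

Use case: Image recognition and action generation
Difficulty: Hard
Cost: High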

→

arxivPaper only2026-06-04

MotionDisco: Motion Discovery for Extreme Humanoid Loco-Manipulation

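This study proposes MotionDisco for humanoid robot loco-manipulation, enabling the robot to detect contacts and act autonomously.

Deep learning, Model compression and quantization, Text, Video, Reinforcement learning

Use case: Humanoid robot loco-manipulation
Difficulty: Hard
Cost: High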

→

arxivGitHubあり2026-06-04

A Conversational Framework for Human-Robot Collaborative Manipulation with Distributed Generative AI models

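This study proposes a distributed conversational framework for human-robot collaboration.

Natural language processing, Large language models, Generation, Image, Text

Use case: Human-robot collaboration
Difficulty: Hard
Cost: High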

→

arxivPaper only2026-06-04

World-Language-Action Model for Unified World Modeling, Language Reasoning, and Action Synthesis

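Proposes a unified vision-language-action model that improves performance on tasks that use it.

Deep learning, Transformer, Generation, Image, Text

Use case: Unified vision-language-action model
Difficulty: Hard
Cost: High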

→

arxivPaper only2026-06-04

T-FunS3D: Task-Driven Hierarchical Open-Vocabulary 3D Functionality Segmentation

Open-vocabulary 3D functionality segmentation enables robots to localize functional object components in 3D sc

Natural language processing, RAG, Classification, Segmentation, Image

Use case: Classification
Difficulty: Hard
Cost: High

→

arxivPaper only2026-06-04

Towards a Data Flywheel for Embodied Intelligence in Logistics

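In autonomous driving, robots must decide on actions based on visually perceived information, but spatial models built from past data make it difficult to predict robot behavior; therefore, by building a spatial model

Computer vision, Multimodal, Anomaly detection, Text, Video

Use case: Building a space suited to predicting robot behavior
Difficulty: Hard
Cost: High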

→

huggingfaceHugging Faceあり2026-06-04

LatentSkill: From In-Context Textual Skills to In-Weight Latent Skills for LLM Agents

Agent systems increasingly use textual skills to encode reusable task procedures, but injecting these skills i

Suited to MI, Deep learning, Model compression and quantization, Text

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

→

huggingfaceHugging Faceあり2026-06-04

A Geometric Account of Activation Steering through Angle-Norm Decomposition

Linear activation steering has gained popularity as a simple and empirically effective way to control language

Explainable, Deep learning, Model compression and quantization, Text

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: Medium

→

huggingfaceHugging Faceあり2026-06-04

Answer Presence Drives RAG Rewriting Gains

Retrieval-augmented QA pipelines often route retrieved passages through an LLM rewriter before a smaller reade

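Quality prediction/anomaly detection, Natural language processing, Large language models, Text

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High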

→

huggingfaceHugging Faceあり2026-06-04

Cosine Misleads: Auxiliary Losses Reshape Vision Language Models, Not Their Latents

Latent visual reasoning (LVR) inserts supervised latent tokens between perception and answer generation in vis

Quality prediction/anomaly detection, Computer vision, Multimodal, Generation, Image, Text

Use case: Generation
Difficulty: Easy
Cost: High

→

huggingfaceHugging Faceあり2026-06-04

SoCRATES: Towards Reliable Automated Evaluation of Proactive LLM Mediation across Domains and Socio-cognitive Variations

Evaluating LLM mediators remains challenging, as mediation unfolds as a real-time trajectory shaped by disputa

Suited to MI, Deep learning, Transformer, Text

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

→

huggingfaceHugging Faceあり2026-06-04

Direct 3D-Aware Object Insertion via Decomposed Visual Proxies

Object insertion aims to seamlessly composite a reference object into a specified region of a background image

Suited to MI, Quality prediction/anomaly detection, Computer vision, 3D/point clouds, Generation, Image, Text

Use case: Generation
Difficulty: Easy
Cost: High

→

huggingfaceHugging Faceあり2026-06-04

OpenSkill: Open-World Self-Evolution for LLM Agents

Self-evolving agents require adaptation after deployment, but existing approaches assume a usable learning lo

Natural language processing, Large language models, Text

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

→

huggingfaceHugging Faceあり2026-06-04

SubtleMemory: A Benchmark for Fine-Grained Relational Memory Discrimination in Long-Horizon AI Agents

Persistent AI assistants, such as OpenClaw, accumulate large collections of related memories over long-term in

Machine learning, Supervised learning, Text

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: Low

→

huggingfaceHugging Faceあり2026-06-04

UnpredictaBench: A Benchmark for Evaluating Distributional Randomness in LLMs

We introduce UnpredictaBench, an evaluation that tests the ability of large language models (LLMs) to capture

Natural language processing, Large language models, Text

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

→

huggingfaceHugging Faceあり2026-06-04

Thinking with Imagination: Agentic Visual Spatial Reasoning with World Simulators

While Vision-Language Models (VLMs) have shown strong visual reasoning capabilities, their spatial reasoning a

Natural language processing, Large language models, Image, Text, Multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

→

huggingfaceHugging Faceあり2026-06-04

LLM Explainability with Counterfactual Chains and Causal Graphs

Causal graphs provide a high-level language for making mechanisms transparent. Recent work uses Large Language

Explainable, Natural language processing, Large language models, Classification, Text

Use case: Classification
Difficulty: Easy
Cost: High

→

huggingfaceGitHubありHugging Faceあり2026-06-04

Almieyar-Oryx-BloomBench: A Bilingual Multimodal Benchmark for Cognitively Informed Evaluation of Vision-Language Models

Despite the rapid progress of Vision-Language Models (VLMs), the field lacks benchmarks that rigorously diagno

Quality prediction/anomaly detection, Deep learning, Transformer, Generation, Image, Text

Use case: Generation
Difficulty: Easy
Cost: High

→

huggingfaceHugging Faceあり2026-06-04

Data-Efficient Autoregressive-to-Diffusion Language Models via On-Policy Distillation

We study the transformation of autoregressive models (ARLMs) into diffusion language models (DLMs). Rather tha

Deep learning, Model compression and quantization, Text

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

→

huggingfaceHugging Faceあり2026-06-04

WorldBench: A Challenging and Visually Diverse Multimodal Reasoning Benchmark

In real-world applications, models are expected to perform reliably across diverse settings. Yet, many existin

Natural language processing, Large language models, Image, Text, Multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

→

huggingfaceHugging Faceあり2026-06-04

Code2LoRA: Hypernetwork-Generated Adapters for Code Language Models under Software Evolution

Code language models need repository-level context to resolve imports, APIs, and project conventions. Existing

Deep learning, RNN / LSTM, Text

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

→

huggingfaceHugging Faceあり2026-06-04

ArcANE: Do Role-Playing Language Agents Stay in Character at the Right Time?

Role-playing language agents (RPLAs) should play characters whose values and behavior evolve as the story prog

Natural language processing, Fine-tuning, Text

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: Low

→

huggingfaceHugging Faceあり2026-06-04

AdaPlanBench: Evaluating Adaptive Planning in Large Language Model Agents under World and User Constraints

Planning for real-world problems by language models often involves both world and user constraints, which may

Natural language processing, Large language models, Text

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

→

huggingfaceHugging Faceあり2026-06-04

LoomVideo: Unifying Multimodal Inputs into Video Generation and Editing

Developing unified video generation and editing models capable of interpreting interleaved multimodal inputs i

Deep learning, Transformer, Generation, Image, Text

Use case: Generation
Difficulty: Easy
Cost: High

→

huggingfaceHugging Faceあり2026-06-04

Reinforcement Learning Elicits Contextual Learning of Unseen Language Translation

Prior work has shown that large language models (LLMs) can translate unseen or low-resource languages by under

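Deep learning, Model compression / quantization, Text, Reinforcement learning

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · GitHub available · Hugging Face available · 2026-06-04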

MLEvolve: A Self-Evolving Framework for Automated Machine Learning Algorithm Discovery

Large language model (LLM) agents are increasingly applied to long-horizon tasks such as scientific discovery

Natural language processing, Large language models, Generation, Text

Use: Generation
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-04

LLMs Can Leak Training Data But Do They Want To? A Propensity-Aware Evaluation of Memorization in LLMs

Large language models can reproduce training data, but existing memorization evaluations mostly measure whethe

Deep learning, Model compression / quantization, Generation, Text

Use: Generation
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-04

Towards One-to-Many Temporal Grounding

Temporal Grounding (TG) aims to localize video segments corresponding to a textual query. Prior research predo

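Quality prediction / anomaly detection, Natural language processing, Large language models, Text, Video

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-04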

Latent Reasoning with Normalizing Flows

Large language models often improve reasoning by generating explicit chain-of-thought (CoT), demonstrating the

Natural language processing, Large language models, Generation, Text

Use: Generation
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-04

Imagine Before You Predict: Interleaved Latent Visual Reasoning for Video Event Prediction

Video event prediction (VEP) requires models to infer unobserved future states from partial video evidence. Ex

Natural language processing, Large language models, Image, Text, Video

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-04

Learning Geometric Representations from Videos for Spatial Intelligent Multimodal Large Language Models

Multimodal Large Language Models (MLLMs) excel at 2D semantic understanding but lack intrinsic 3D awareness, r

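Suited to tabular data, Natural language processing, Large language models, Text, Video, 3D

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-04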

Revising Context, Shifting Simulated Stance: Auditing LLM-Based Stance Simulation in Online Discussions

Large language models are increasingly used to simulate social media users and infer how individuals may respo

Deep learning, Transformer, Text, Multimodal

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-04

Benchmark Everything Everywhere All at Once

Benchmarks are fundamental for evaluating and advancing LLMs and MLLMs by providing standardized and explicit

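Quality prediction / anomaly detection, Natural language processing, Large language models, Text, Multimodal

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

github · GitHub available · 2026-06-04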

Irodori-TTS — A Flow Matching-based Text-to-Speech Model with Emoji-driven Style Control

A flow-matching-based text-to-speech model that converts text into speech, with the speaking style controlled by emoji.

Generative AI, Diffusion models, Generation, Text, Audio

Use: Text-to-speech
Difficulty: Easy
Cost: High

arxiv · Paper only · 2026-06-03

TabSODA: Tabular Diffusion based Imputation with Skip Pattern Detection and Ordinal Awareness

This paper addresses the imputation of missing values in tabular data and proposes a diffusion-based imputation algorithm with skip-pattern detection and ordinal awareness.

Suited to tabular data, Computer vision, Segmentation, Detection, Text, Tabular

Use: Tabular data imputation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-03

Global Sketch-Based Watermarking for Diffusion Language Models

Watermarking methods for language models have been studied extensively in the autoregressive setting, where to

Computer vision, Segmentation, Detection, Generation, Text

Use: Detection
Difficulty: Hard
Cost: High

arxiv · GitHub available · 2026-06-03

HyFAD: Hybrid Time-Frequency Diffusion with Frequency-Aware Embedding for Time Series Imputation

Diffusion models have demonstrated strong performance in time series modeling due to their ability to progress

Sensor / time series, Natural language processing, Embedding / retrieval, Generation, Text, Time series

Use: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-03

Knockoffs-based False Discovery Rate Control and Simplification for Deep Neural Networks

The deep neural network is a widely used framework in machine learning that has been widely applied in various

Machine learning, Supervised learning, Regression, Text

Use: Regression
Difficulty: Hard
Cost: Medium

arxiv · Paper only · 2026-06-03

When Do Fewer Coordinates Suffice in DP-SGD?

Uses differential privacy to protect privacy and proposes a way to estimate which subset of coordinates the model needs to update.

Deep learning, Normalization / optimization methods, Text

Use: Privacy-preserving training
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-03

Seq103: A Unified Neuroevolution Framework for Compact Sequence Architecture Discovery

Neuroevolution is a representative neural architecture search paradigm that evolves both network topology and

Sensor / time series, Deep learning, RNN / LSTM, Classification, Text, Time series

Use: Classification
Difficulty: Hard
Cost: Low

arxiv · Paper only · 2026-06-03

Mean-based algorithms: A lower bound and regret

Mean-based algorithms are a class of online learning algorithms that assign low probability to actions with lo

Natural language processing, RAG, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: Low

arxiv · Paper only · 2026-06-03

Improved Approximation Guarantees for Groupwise Maximin Share Fairness

We study the problem of fairly allocating a set of indivisible goods to a set of $n$ agents with additive valu

Computer vision, Segmentation, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: Medium

arxiv · Paper only · 2026-06-03

Learning to cooperate with emergent reputation via multi-agent reinforcement learning

Reputation, the aggregation of peer assessments diffused through social networks, is a pivotal mechanism for p

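Deep learning, Model compression / quantization, Text, Reinforcement learning

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: Low

huggingface · Hugging Face available · 2026-06-03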

Why Muon Outperforms Adam: A Curvature Perspective

Muon improves training efficiency over Adam in large language-model training by about two times, but the local

Deep learning, Normalization / optimization methods, Text

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-03

Self-Evaluation Is Already There: Eliciting Latent Judge Calibration in Base LLMs with Minimal Data

Large language models are increasingly evaluated by other models, raising a natural question: can a model pred

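Suited to small data, Quality prediction / anomaly detection, Deep learning, Model compression / quantization, Text, Reinforcement learning

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-03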

Imaginative Perception Tokens Enhance Spatial Reasoning in Multimodal Language Models

Vision language models (VLMs) excel at many tasks but still struggle with spatial reasoning when critical info

Suited to tabular data, Explainable, Computer vision, Multimodal, Image, Text

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-03

TIDE: Proactive Multi-Problem Discovery via Template-Guided Iteration

Agents are widely deployed as assistants over documents, tools, and code. However, they typically act only on

Natural language processing, RAG, Text

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: Low

huggingface · Hugging Face available · 2026-06-03

VideoKR: Towards Knowledge- and Reasoning-Intensive Video Understanding

We introduce VideoKR, the first large-scale training corpus specifically designed to strengthen knowledge- and

Natural language processing, Fine-tuning, Generation, Text, Video

Use: Generation
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-03

Rethinking Continual Experience Internalization for Self-Evolving LLM Agents

Experience internalization converts contextual experience from past interactions into reusable parametric capa

Quality prediction / anomaly detection, Deep learning, Transformer, Text

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-03

Personal AI Agent for Camera Roll VQA

We study the personal camera roll visual question answering setting. In this setting, a conversational AI assi

Deep learning, Model compression / quantization, QA, Image, Text

Use: QA
Difficulty: Easy
Cost: Medium

huggingface · Hugging Face available · 2026-06-03

SePO: Self-Evolving Prompt Agent for System Prompt Optimization

System prompt optimization improves agent behavior without modifying the underlying model, yielding human-read

Natural language processing, RAG, Generation, Text

Use: Generation
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-03

Video2LoRA: Parametric Video Internalization for Vision-Language Models

Processing video in vision-language models is expensive: each frame occupies hundreds of tokens, and inference

Natural language processing, Fine-tuning, Summarization, QA, Image

Use: Summarization
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-03

BRepCLIP: Contrastive Multimodal Pretraining on BRep Primitives for CAD Understanding

Learning representations of CAD models is a largely open problem. While 3D representation learning has flouris

Deep learning, Transformer, Classification, Generation, Embedding

Use: Classification
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-03

Audio Interaction Model

Audio is an inherently interactive modality, yet today's Large Audio Language Models (LALMs) are offline, and

Reinforcement learning, Multi-agent, Text, Audio

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-03

ZipSplat: Fewer Gaussians, Better Splats

Feed-forward 3D Gaussian Splatting methods reconstruct a scene from posed or pose-free images in a single forw

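Quality prediction / anomaly detection, Deep learning, Transformer, Image, Text, 3D

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-03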

MeshWeaver: Sparse-Voxel-Guided Surface Weaving for Autoregressive Mesh Generation

Autoregressive mesh generation has gained attention by tokenizing meshes into sequences and training models in

Deep learning, Attention mechanism, Generation, Text, 3D

Use: Generation
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-03

STRIDE: Training Data Attribution via Sparse Recovery from Subset Perturbations

Training Data Attribution (TDA) seeks to trace a model's predictions back to its training data. The gold stand

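Sensor / time series, Deep learning, Model compression / quantization, Text

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-03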

Evaluating Large Language Models in Dynamic Clinical Decision-Making with Standardized Patient Cases

Large language models (LLMs) are increasingly proposed as clinical agents, yet static, single-turn benchmarks

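Suited to MI, Natural language processing, Large language models, Text

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · GitHub available · Hugging Face available · 2026-06-03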

SpeechEditBench: A Bilingual Multi-Attribute Benchmark for Instruction-Guided Speech Editing

Instruction-guided speech editing requires a model to modify specified speech attributes while preserving unre

Natural language processing, Large language models, Generation, Text, Audio

Use: Generation
Difficulty: Easy
Cost: High

arxiv · Paper only · 2026-06-02

TPA-AD: A Two-Stage Pseudo Anomaly-Guided Method for Bearing Time-Series Anomaly Detection

This paper proposes a two-stage pseudo anomaly-guided anomaly detection method (Two-stage Ps

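Sensor / time series, Quality prediction / anomaly detection, Natural language processing, Fine-tuning, Detection, Anomaly detection, Text

Use: Detection
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-02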

Conformal Language Modeling via Posterior Sampling

Large Language Models remain plagued by hallucinations. Recent work has sought to tame their prevalence using

Natural language processing, Large language models, Generation, Text

Use: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-02

AugMask: Training Diffusion Models on Incomplete Tabular Data via Stochastic Augmentation and Masking

Score-based diffusion models have emerged as prominent deep generative models; however, their application to t

Suited to tabular data, Deep learning, Transformer, Generation, Text, Tabular

Use: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-02

An Asymptotic Theory of Chain-of-Thought in In-Context Learning

Develops an asymptotic theory of chain-of-thought reasoning in in-context learning.

Natural language processing, Large language models, Regression, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-02

Spike-Aware C++ INT8 Inference for Sparse Spiking Language Models on Commodity CPUs

Spiking language models expose activation sparsity that dense Transformer runtimes do not directly exploit. Th

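Easy to try on CPU, Quality prediction / anomaly detection, Deep learning, Transformer, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

huggingface · Hugging Face available · 2026-06-02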

Text-to-Image Models Need Less from Text Encoders Than You Think

Text-to-image models rely on text prompts as their primary interface to human intent. Prompts are encoded by a

Quality prediction / anomaly detection, Deep learning, Transformer, Generation, Image, Text

Use: Generation
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-02

Lean4Agent: Formal Modeling and Verification for Agent Workflow and Trajectory

Equipping Large Language Models (LLMs) to execute reliable multi-step workflows has become a central challenge

Natural language processing, Large language models, Detection, Text

Use: Detection
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-02

MAOAM: Unified Object and Material Selection with Vision-Language Models

Selection is a core operation in interactive image editing. To be practical, a user should be able to specify

Suited to MI, Natural language processing, RAG, Generation, Segmentation, Image

Use: Generation
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-02

The Shadow Price of Reasoning: Economic Perspective on Optimal Budget Allocation for LLMs

Inference-time scaling has emerged as a critical avenue for enhancing Large Language Models' performance, yet

Natural language processing, Large language models, Text

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · GitHub available · Hugging Face available · 2026-06-02

EvoDS: Self-Evolving Autonomous Data Science Agent with Skill Learning and Context Management

Recent progress in Large Language Model (LLM) agents has enabled promising advances in automated data science.

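Deep learning, Model compression / quantization, Text, Reinforcement learning

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-02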

Qwen-Image-Flash: Beyond Objective Design

Few-step distillation has become an effective strategy for accelerating advanced visual generative models, yet

Suited to MI, Deep learning, Model compression / quantization, Generation, Image, Text

Use: Generation
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-02

OVO-S-Bench: A Hierarchical Benchmark for Streaming Spatial Intelligence in Multimodal LLMs

Multimodal agents in robotics, AR, and autonomous driving must reason about places and layouts from continuous

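Quality prediction / anomaly detection, Natural language processing, Large language models, Generation, Text, Video

Use: Generation
Difficulty: Easy
Cost: High

huggingface · GitHub available · Hugging Face available · 2026-06-02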

Self-Distilled Policy Gradient

On-policy self-distillation, where a language model conditions on privileged context to supervise its own gene

Deep learning, Model compression / quantization, Generation, Text, Reinforcement learning

Use: Generation
Difficulty: Easy
Cost: Medium

huggingface · Hugging Face available · 2026-06-02

KletterMix: Climbing Toward High-Quality German Pretraining Data

High-quality pretraining data is a central ingredient in modern language models, but German-language resources

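Suited to MI, Quality prediction / anomaly detection, Natural language processing, RAG, Text

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-02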

MemTrain: Self-Supervised Context Memory Training

Memory is an indispensable capability for long-horizon LLM agents, enabling them to preserve and utilize infor

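Quality prediction / anomaly detection, Natural language processing, Large language models, Text, Self-supervised, Reinforcement learning

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-02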

Eliciting Complex Spatial Reasoning in MLLMs through Wide-Baseline Matching

Wide-baseline matching (WBM) requires integrating geometric understanding, viewpoint changes, fine-grained per

Natural language processing, Large language models, Generation, Image, Text

Use: Generation
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-02

AAD-1: Asymmetric Adversarial Distillation for One-Step Autoregressive Video Generation

We present AAD-1, an Asymmetric Adversarial Distillation framework for One-step autoregressive image-to-video

Deep learning, Model compression / quantization, Generation, Image, Text

Use: Generation
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-02

Large Language Models Hack Rewards, and Society

Reinforcement learning (RL) has become a dominant post-training paradigm, enabling large language models (LLMs

Natural language processing, Large language models, Generation, Text, Reinforcement learning

Use: Generation
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-02

WebRISE: Requirement-Induced State Evaluation for MLLM-Generated Web Artifacts

Existing benchmarks for MLLM-generated web artifacts assess interaction through local evidence and miss the re

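Quality prediction / anomaly detection, Natural language processing, Large language models, Image, Text, Video

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-02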

AUDITFLOW: Executable Symbolic Environments for Structured Financial Reporting Verification

Structured financial audit verification is difficult for language-model agents because correctness depends on

Natural language processing, Large language models, Text

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-02

BraveGuard: From Open-World Threats to Safer Computer-Use Agents

Computer-use agents extend language models from text generation to sustained interaction with files, terminals

Natural language processing, Large language models, Detection, Generation, Text

Use: Detection
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-02

Agent libOS: A Library-OS-Inspired Runtime for Long-Running, Capability-Controlled LLM Agents

Large language model (LLM) agents are evolving from request-response assistants into long-running software act

Natural language processing, Large language models, Regression, Image, Text

Use: Regression
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-02

When Graph Tokens Sink: A Mechanistic Analysis of Graph Language Models

Graph Language Models (GLMs) have become a promising direction for adapting Large Language Models (LLMs) to gr

Deep learning, Model compression / quantization, Text, Multimodal

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-02

Unlocking Feature Learning in Gated Delta Networks at Scale

Training and scaling Large Language Models demand enormous computational resources, motivating both efficient

Deep learning, Transformer, Text

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · GitHub available · Hugging Face available · 2026-06-02

Agentic Chain-of-Thought Steering for Efficient and Controllable LLM Reasoning

Large language models improve final-answer accuracy through extended chain-of-thought reasoning, but often spe

Deep learning, Model compression / quantization, Generation, Text, Reinforcement learning

Use: Generation
Difficulty: Easy
Cost: High

arxiv · Paper only · 2026-06-01

When Tabular Foundation Models Transfer Across Modalities: A Systematic Evaluation Across 95 Datasets, 7 Modalities, and Two Regimes

We present a single classification pipeline that combines an Equiangular Tight Frame (ETF) preprocessing stage

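Suited to tabular data, Sensor / time series, Quality prediction / anomaly detection, Deep learning, Model compression / quantization, Classification, Text, Audio

Use: Classification
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-01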

Decision-calibrated prediction sets for robust power system operations

Robust optimization offers a tractable approach to balance operating costs and reliability in power systems do

Natural language processing, RAG, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: Low

arxiv · Paper only · 2026-06-01

PliableBVS: A flexible Bayesian variable selection method for modeling interactions with mandatory modifying variables

High-dimensional interaction models are useful for studying, for example, how a large set of variables of inte

Deep learning, Model compression / quantization, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: Medium

arxiv · Paper only · 2026-06-01

Data-Automated Policy Learning for Nonlinear Welfare

This paper explores policy learning from observational data, focusing on a nonlinear welfare criterion in a bi

Natural language processing, RAG, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: Low

arxiv · Paper only · 2026-06-01

Simultaneous Model-Based Evolution of Constants and Expression Structure in GP-GOMEA for Symbolic Regression

Genetic programming (GP) approaches are among the state-of-the-art for symbolic regression, the task of constr

Computer vision, Segmentation, Regression, Text

Use: Regression
Difficulty: Hard
Cost: Medium

arxiv · Paper only · 2026-06-01

Pluralistic Leaderboards

Recent leaderboard-based evaluations of large language models aggregate user feedback by fitting a Bradley-Te

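Quality prediction / anomaly detection, Natural language processing, Large language models, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-06-01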

Conditional Graph Diffusion for Negotiation Support: Overcoming Discrete Infeasibility and Preference Elicitation Gaps

Traditional bilateral negotiation support systems search over discrete allocation spaces. This approach encoun

Deep learning, Transformer, Generation, Text

Use: Generation
Difficulty: Hard
Cost: High

huggingface · Hugging Face available · 2026-06-01

LayerRoute: Input-Conditioned Adaptive Layer Skipping via LoRA Fine-Tuning for Agentic Language Models

Agentic language model systems alternate between two structurally distinct step types: structured tool calls (

Quality prediction / anomaly detection, Deep learning, Transformer, Text

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-01

Parametric Social Identity Injection and Diversification in Public Opinion Simulation

Large language models (LLMs) have recently been adopted as synthetic agents for public opinion simulation, off

Natural language processing, Large language models, Text

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-01

Absorbing Complexity: An Interaction-Native Knowledge Harness for Financial LLM Agents

Financial AI agents often fail for a simple reason: they make users carry the complexity. A user must repeated

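Quality prediction / anomaly detection, Natural language processing, Large language models, Text

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-01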

AdaCodec: A Predictive Visual Code for Video MLLMs

Video is temporally redundant: adjacent frames usually share most objects, background, and layout. Yet existin

Natural language processing, Large language models, Image, Text, Video

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-01

LLM Anonymization Against Agentic Re-Identification

Agentic LLMs with web search change the threat model for text anonymization: weak contextual cues can become c

Natural language processing, Large language models, Detection, Text

Use: Detection
Difficulty: Easy
Cost: High

huggingface · GitHub available · Hugging Face available · 2026-06-01

Cosmos 3: Omnimodal World Models for Physical AI

We introduce Cosmos 3, a family of omnimodal world models designed to jointly process and generate language, i

Deep learning, Transformer, Generation, Image, Text

Use: Generation
Difficulty: Easy
Cost: High

huggingface · GitHub available · Hugging Face available · 2026-06-01

Filter, Then Reweight: Rethinking Optimization Granularity in On-Policy Distillation

On-Policy distillation (OPD) in large language models is shifting from full-trace KL supervision toward more s

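Quality prediction / anomaly detection, Deep learning, Model compression / quantization, Text

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-06-01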

MMG2Skill: Can Agents Distill In-the-Wild Guides into Self-Evolving Skills?

Abundant procedural knowledge on the Web holds great potential for helping agents solve long-horizon tasks. Ho

Natural language processing, RAG, Regression, Text, Multimodal

Use: Regression
Difficulty: Easy
Cost: High

github · GitHub available · 2026-06-01

FinGPT — FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.

FinGPT provides open-source financial large language models, with the trained models released on Hugging Face.

Natural language processing, Large language models, Text

Use: Financial large language models
Difficulty: Easy
Cost: High

arxiv · Paper only · 2026-05-31

Practical and Optimal Algorithm for Linear Contextual Bandits with Rare Parameter Updates

We study linear contextual bandits under rare parameter updates: the learner may incorporate reward feedback i

Quality prediction / anomaly detection, Deep learning, Transformer, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: Medium

arxiv · Paper only · 2026-05-31

Truthful AI Advisors: A Pre-Specified Benchmark for Large Language Model Honesty Under Preference Misalignment

Large language models are increasingly deployed as advisors whose objective is not aligned with the user's: re

Natural language processing, Large language models, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-05-31

Domination-Avoiding Learning Agents Cannot Collude

An influential paper of Calvano et al. empirically demonstrated that Q-learning agents spontaneously collude w

Natural language processing, Large language models, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-05-31

Cheap Talk in Bilateral Trade

A single seller offers one or more goods to a single buyer. The buyer's values and the seller's costs are priv

Quality prediction / anomaly detection, Natural language processing, RAG, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: Low

arxiv · Paper only · 2026-05-31

Fairness in two-player zero-sum games with bandit feedback

We study two-player zero-sum games (TPZSGs) with bandit feedback under fairness constraints requiring every ac

Reinforcement learning, Policy gradient (PPO / A3C), Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: Medium

huggingface · GitHub available · Hugging Face available · 2026-05-31

SABER: Benchmarking Operational Safety of LLM Coding Agents in Stateful Project Workspaces

Large language models are increasingly deployed as coding agents, shifting safety from individual responses to

Natural language processing, Large language models, Text

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-05-31

BenchEvolver: Frontier Task Synthesis via Solution-Centric Evolution

The rapid progress of frontier large language models has led to widespread benchmark saturation, limiting the

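Quality prediction / anomaly detection, Natural language processing, Large language models, Generation, Text

Use: Generation
Difficulty: Easy
Cost: High

github · GitHub available · 2026-05-31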

Open-dLLM — Open diffusion language model for code generation — releasing pretraining, evaluation, inference, and checkpoints.

Open-dLLM is an open diffusion language model for code generation; the project releases its pretraining, evaluation, inference, and checkpoints.

Natural language processing, Large language models, Generation, Text

Use: Code generation
Difficulty: Easy
Cost: High

arxiv · Paper only · 2026-05-30

Active Learning with Foundation Model Priors: Efficient Learning under Class Imbalance

Real-world datasets across image and text domains are often characterized by skewed class distributions and no

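Suited to small data, Condition optimization, Deep learning, Model compression / quantization, Image, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: Low

arxiv · Paper only · 2026-05-30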

Position: Prioritize Identifying Structure, Not Complex Models, for Scientific Discovery

Modern Machine Learning (ML) and Artificial Intelligence (AI) models, especially large language models (LLMs),

Natural language processing, Large language models, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

huggingface · Hugging Face available · 2026-05-30

SDR: Set-Distance Rewards for Radiology Report Generation

Reinforcement learning with verifiable rewards has rapidly advanced reasoning in vision-language models. Howe

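Quality prediction / anomaly detection, Deep learning, Transformer, Generation, Text, Reinforcement learning

Use: Generation
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-05-30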

Critic-R: Improving Agentic Search using Instruction-tuned Retrievers with Natural Language Introspective Feedback

Agentic search systems iteratively interact with retrieval models to answer complex queries. Despite substanti

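Quality prediction / anomaly detection, Natural language processing, RAG, Text

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-05-30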

SuperMemory-VQA: An Egocentric Visual Question-Answering Benchmark for Long-Horizon Memory

AI glasses present a compelling platform for AI agents to serve as personalized memory assistants. To be genui

Deep learning, Transformer, Classification, QA, Image

Use: Classification
Difficulty: Easy
Cost: High

arxiv · Paper only · 2026-05-29

Institutions and the transmission of upper-tail human capital: scientific lineages across a millennium

What made useful knowledge cumulative was not discovery alone but the institutions that transmitted it. We pro

Computer vision, Segmentation, Generation, Text

Use: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-05-29

From Talking Words to Sharing Thoughts: Scalable Multi-LLM Aggregation via Structured Message Passing

The emergence of specialized, domain-tuned Large Language Models (LLMs) has demonstrated that smaller models c

Natural language processing, Large language models, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-05-29

Used Car Salesbots? Honesty and Credulity of LLMs as Bargaining Agents under Partial Information

In this work we study agents in simulated bargaining scenarios, where a buyer and a seller communicate through

Deep learning, Model compression / quantization, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

arxiv · GitHub available · 2026-05-29

Welfare, Improvability, and Variance: A Principal-Agent Approach to Optimal Benchmark Item Aggregation

AI benchmarks have well-documented limitations, with prior work examining contamination, saturation, and const

Natural language processing, RAG, Text

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: Low

huggingface · GitHub available · Hugging Face available · 2026-05-29

The Shape of Addition: Geometric Structures of Arithmetic in Large Language Models

Large Language Models exhibit paradoxical fragility in fundamental arithmetic, implying a disconnect between i

Deep learning, Model compression / quantization, Text

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-05-29

MechVQA: Benchmarking and Enhancing Multimodal LLMs on Comprehensive Mechanical Drawing Understanding

Multimodal Large Language Models (MLLMs) have demonstrated significant achievements in general visual question

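Quality prediction / anomaly detection, Natural language processing, Large language models, Classification, QA, Image

Use: Classification
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-05-29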

Combinatorial Synthesis: Scaling Code RLVR via Atomic Decomposition and Recombination

Reinforcement Learning with Verifiable Rewards (RLVR) has recently emerged as the cornerstone for shaping the

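Quality prediction / anomaly detection, Natural language processing, Large language models, Generation, Text, Reinforcement learning

Use: Generation
Difficulty: Easy
Cost: High

huggingface · GitHub available · Hugging Face available · 2026-05-29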

OpenSTBench: Beyond Semantic Evaluation for Speech Translation

Speech translation systems increasingly span speech-to-text translation (S2TT), speech-to-speech translation (

Quality prediction / anomaly detection, Computer vision, Video recognition, Generation, Text, Audio

Use: Generation
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-05-29

SpatialAct: Probing Spatial Reasoning-to-Action Capabilities of VLM Agents in 3D Scenes

Humans can effortlessly perceive spatial layouts, form cognitive representations, reason about spatial relatio

Computer vision, 3D / point clouds, Detection, Text, 3D

Use: Detection
Difficulty: Easy
Cost: High

github · GitHub available · 2026-05-29

prompt-in-context-learning — Awesome resources for in-context learning and prompt engineering: Mastery of the LLMs such as ChatGPT, GPT-3, and FlanT5, with up-to-date and cutting-edge updates.

This repository collects resources on in-context learning and prompt engineering for LLMs such as ChatGPT, GPT-3, and FlanT5.

Natural language processing, Large language models, Text

Use: Resources for mastering LLMs
Difficulty: Easy
Cost: High

arxiv · Paper only · 2026-05-28

Evolutionary Rule Extraction from Corporate Default Prediction Models

Small and medium-sized enterprises (SMEs) represent the majority of firms in most economies and often face fin

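Explainable, Condition optimization, Natural language processing, RAG, Classification, Generation, Regression

Use: Classification
Difficulty: Hard
Cost: Low

arxiv · Paper only · 2026-05-28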

Runtime Analysis of a Compact Genetic Algorithm on a Truly Multi-valued OneMax Function

Recently, the runtime analysis of multi-valued estimation-of-distribution algorithms in the framework of Ben J

Deep learning, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: Medium

arxiv · Paper only · 2026-05-28

Compute Allocation in Evolutionary Search: From Depth-Breadth to Multi-Armed Bandits

LLM-guided evolutionary search (Evolve systems) has reached state-of-the-art results on mathematical and combi

Natural language processing, Large language models, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: High

arxiv · GitHub available · 2026-05-28

PokerSkill: LLMs Can Play Expert-Level Poker without Training or Solvers

Poker is a classic AI problem, but reaching strong expert-level play has required lengthy training and solvers. With LLMs, poker can be played without training or solvers.

Explainable, Natural language processing, Large language models, Text

Use: Poker
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-05-28

Evolutionary Dynamics of Cooperation in Next-Generation LLM Agent Systems: A Cross-Provider Empirical Extension

Investigates the factors that affect cooperation in next-generation LLMs. ChatGPT-4o and Claude 3.5 Sonnet showed similar cooperative behavior despite coming from different providers.

Deep learning, Transformer, Generation, Text

Use: Factors affecting cooperation in next-generation LLMs
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-05-28

Bridging Semantics and Strategy: A Dual-Stream Graph Network for Equitable Negotiation Forecasting

Forecasting outcomes in mixed-motive negotiations requires integrating explicit linguistic cues with latent st

Deep learning, Transformer, Prediction, Text

Use: Prediction
Difficulty: Hard
Cost: High

huggingface · Hugging Face available · 2026-05-28

Meta-Cognitive Memory Policy Optimization for Long-Horizon LLM Agents

Memory-augmented LLM agents tackle complex long-horizon tasks by recursively summarizing interaction trajector

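Quality prediction / anomaly detection, Natural language processing, Large language models, Text, Self-supervised, Reinforcement learning

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-05-28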

Multimodal Music Recommendation System using LLMs

Music recommendation systems typically treat songs as opaque tokens, relying on collaborative interaction hist

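Sensor / time series, Quality prediction / anomaly detection, Deep learning, Transformer, Text, Audio, Multimodal

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-05-28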

Stable-Layers: Fine-Tuning Image Layer Decomposition Models with VLM-Scored Reinforcement Learning

We present Stable-Layers, a reinforcement learning framework that eliminates the need for paired supervision b

Natural language processing, Fine-tuning, Image, Text, Multimodal

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

arxiv · Paper only · 2026-05-27

Performance and Explainability Requirements of Evolutionary Algorithms in Real-World Physics-Informed Optimization

Evolutionary computation offers a variety of tools to solve complex real-world optimization problems. However,

Natural language processing, RAG, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: Low

arxiv · Paper only · 2026-05-27

Evolving to the Aesthetics of a Vision-Language Model

Evolutionary systems have demonstrated remarkable results in creative domains, with recent applications in gen

Computer vision, Multimodal, Generation, Text

Use: Generation
Difficulty: Hard
Cost: High

arxiv · Paper only · 2026-05-27

Adaptive Bandit Algorithms for Contextual Matching Markets

We study bandit learning in matching markets, where players and arms constitute the two market sides, and the

Reinforcement learning, Text

Use: Technical validation / paper-reading aid
Difficulty: Hard
Cost: Medium

huggingface · Hugging Face available · 2026-05-27

Pruning and Distilling Mixture-of-Experts into Dense Language Models

Mixture-of-Experts (MoE) is now the dominant architecture for frontier language models, yet it requires all ex

Deep learning, Model compression / quantization, Text

Use: Technical validation / paper-reading aid
Difficulty: Easy
Cost: High

huggingface · Hugging Face available · 2026-05-27

CORE: Contrastive Reflection Enables Rapid Improvements in Reasoning

Language models can use verifiable rewards to improve at a wide variety of reasoning tasks. However, both para

説明可能深層学習軽量化・量子化テキスト

用途: 技術検証・論文読解補助
難易度: Easy
コスト: High

→

huggingfaceHugging Faceあり2026-05-27

Augmenting Attention with Exponentially Decaying Memory Improves Query-Aware KV Sparsity

Efficient inference is critical for long-context language models, where attention computation and KV-cache acc

深層学習軽量化・量子化テキスト

用途: 技術検証・論文読解補助
難易度: Easy
コスト: High

→

githubGitHubあり2026-05-27

FlowEdit — Official implementation of the paper: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models"

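FlowEdit is the official implementation of an inversion-free, text-based image editing method that uses pre-trained flow models.

生成AI拡散モデル生成画像テキスト

用途: Text-based image editing with pre-trained flow models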
難易度: Easy
コスト: High

→

githubGitHubあり2026-05-27

memvid — Memory layer for AI Agents. Replace complex RAG pipelines with a serverless, single-file memory layer. Give your agents instant retrieval and long-term memory.

memvid is a serverless, single-file memory layer that gives AI agents instant retrieval and long-term memory.

自然言語処理大規模言語モデル生成テキスト動画

用途: Managing memory for AI agents
難易度: Easy
コスト: High

→

arxivPaper only2026-05-26

Why Prompt Optimization Works, and Why It Sometimes Doesn't: A Causal-Inspired Edit-Level Analysis

Analyzes, at the level of individual edits and from a causally inspired perspective, why prompt optimization works and why it sometimes does not.

自然言語処理大規模言語モデルテキスト

用途: Prompt optimization analysis
難易度: Hard
コスト: High

→

arxivPaper only2026-05-26

Constitutional Arms Races in the Public Goods Game: Co-Evolving LLM Constitutions Under Cooperation-Defection Pressure

Frontier LLM agents engage in blackmail, sabotage, and document leaks under goal conflicts in agentic settings

説明可能自然言語処理大規模言語モデル生成回帰テキスト

用途: 生成
難易度: Hard
コスト: High

→

arxivPaper only2026-05-26

Proper Calibeating

The classic concept of "calibrated forecasts" and its more recent refinement, "calibeating," are defined with

強化学習テキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: Medium

→

huggingfaceHugging Faceあり2026-05-26

DEI: Diversity in Evolutionary Inference for Quality-Diversity Search

We present DEI: Diversity in Evolutionary Inference, a distributed Quality-Diversity (QD) search framework tha

品質予測/異常検知自然言語処理大規模言語モデルテキスト

用途: 技術検証・論文読解補助
難易度: Easy
コスト: High

→

arxivPaper only2026-05-25

AgentSociety: Incentivizing Agentic Social Intelligence

The success of deployed agents relies on their ability to handle open-ended user requests using their inherent

自然言語処理RAGテキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

→

huggingfaceHugging Faceあり2026-05-25

When Gradients Collide: Failure Modes of Multi-Objective Prompt Optimization for LLM Judges

Customizing an LLM judge to a specific task or domain often involves optimizing its prompt across multiple eva

自然言語処理大規模言語モデルテキスト

用途: 技術検証・論文読解補助
難易度: Easy
コスト: High

→

githubGitHubあり2026-05-25

Matcha-TTS — [ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

Matcha-TTS is a fast TTS architecture built on conditional flow matching that takes speaker characteristics into account.

生成AI拡散モデルテキスト音声

用途: TTS architecture design
難易度: Easy
コスト: High

→

githubGitHubあり2026-05-24

custom-diffusion — Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion (CVPR 2023)

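Custom Diffusion, presented at CVPR 2023, is a method for customizing text-to-image diffusion models. Because it lets you set requirements for generating images from text, the flexibility of image generation

自然言語処理ファインチューニング生成画像テキスト

用途: Customizing image generation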
難易度: Easy
コスト: High

→

arxivPaper only2026-05-23

Cloud Computing Review: A Decade of Research

The popularity and rapid development of Cloud Computing in recent years has led to a vast number of publicatio

深層学習Transformer画像テキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: Medium

→

githubGitHubあり2026-05-23

PaddleNLP — Easy-to-use and powerful LLM and SLM library with awesome model zoo.

PaddleNLP is an easy-to-use, powerful library for large and small language models (LLMs and SLMs), and it comes with an extensive model zoo.

深層学習Transformerテキスト

用途: Working with LLMs and SLMs
難易度: Easy
コスト: High

→

arxivPaper only2026-05-22

Planktonzilla: Multimodal dataset and models for understanding plankton ecosystems

Marine plankton underpin aquatic food webs and play a key role in global CO2 sequestration, making reliable sp

少数データ向き深層学習Transformer分類画像テキスト

用途: 分類
難易度: Hard
コスト: High

→

arxivPaper only2026-05-22

PeerBTS: Incentivizing Effort in Strategyproof Peer Selection

Peer selection, the evaluation and selection of agents by their peers, is an important problem in the field of

品質予測/異常検知自然言語処理RAGテキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: Low

→

arxivPaper only2026-05-22

GENSTRAT: Toward a Science of Strategic Reasoning in Large Language Models

This paper proposes a way to evaluate strategic reasoning in large language models.

深層学習Transformerテキスト

用途: Evaluating strategic reasoning in large language models
難易度: Hard
コスト: High

→

huggingfaceHugging Faceあり2026-05-22

SPACENUM: Revisiting Spatial Numerical Understanding in VLMs

Vision-Language Models (VLMs) are increasingly deployed in embodied environments, where they need produce nume

自然言語処理ファインチューニング画像テキストマルチモーダル

用途: 技術検証・論文読解補助
難易度: Easy
コスト: High

→

githubGitHubあり2026-05-22

rasa — 💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants

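rasa is an open source machine learning framework for automating text- and voice-based conversations. It offers a wide range of features, including natural language understanding (NLU), dialogue management, and connectors to Slack, Facebook, and more.

自然言語処理テキスト

用途: Building chatbots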
難易度: Easy
コスト: Medium

→

arxivPaper only2026-05-21

Vector Policy Optimization: Training for Diversity Improves Test-Time Search

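Language models are now expected to generalize to novel environments and to be combined with search methods that scale inference-time compute, such as AlphaEvolve. However, in the standard paradigm, LLMs pre

自然言語処理大規模言語モデル生成テキスト

用途: Training language models for diversity to support test-time search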
難易度: Hard
コスト: High

→

arxivPaper only2026-05-21

Not Yet: Humans Outperform LLMs in a Colonel Blotto Tournament

Studies where humans still hold an advantage over LLMs: in a Colonel Blotto tournament, humans outperformed LLMs.

深層学習Transformerテキスト

用途: Assessing human advantage over LLMs in strategic games
難易度: Hard
コスト: High

→

githubGitHubあり2026-05-21

langextract — A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.

A Python library that uses LLMs to extract structured information from unstructured text.

自然言語処理大規模言語モデル画像テキスト

用途: Structured information extraction from text
難易度: Easy
コスト: High

→

arxivPaper only2026-05-19

What Do Evolutionary Coding Agents Evolve?

Recent work combines LLMs with evolutionary search, using task-specific feedback to generate, edit, and select code. The best score on a task-specific evaluator

自然言語処理大規模言語モデルテキスト

用途: Evolutionary code generation
難易度: Hard
コスト: High

→

arxivGitHubあり2026-05-19

optimize_anything: A Universal API for Optimizing any Text Parameter

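Proposes a system that uses LLMs (large language models) to optimize text parameters. A single system was able to handle a variety of settings (single-task, multi-task, unseen inputs, and so on). In addition, the system

自然言語処理大規模言語モデルテキスト

用途: Optimizing arbitrary text parameters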
難易度: Hard
コスト: High

→

arxivPaper only2026-05-19

A Nash Equilibrium Framework For Training-Free Multimodal Step Verification

Multimodal large language models often generate reasoning chains containing subtle errors that lead to incorre

自然言語処理大規模言語モデルテキストマルチモーダル

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

→

arxivPaper only2026-05-19

Real-Time Parallel Counterfactual Regret Minimization

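This study refines the Counterfactual Regret Minimization (CFR) algorithm with the aim of estimating optimal actions in real-time games. CFR, when the time available for a decision is strictly limited

CPUで試しやすい深層学習軽量化・量子化テキスト

用途: Estimating optimal actions in games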
難易度: Hard
コスト: Medium

→

githubGitHubあり2026-05-19

spaCy — 💫 Industrial-strength Natural Language Processing (NLP) in Python

機械学習教師あり学習分類テキスト

用途: 分類
難易度: Easy
コスト: Low

→

arxivPaper only2026-05-18

Reinterpreting Safety Thresholds as Neuron Spiking Thresholds

Surrogate Safety Measures (SSMs) are extensively utilised in the evaluation of traffic risk in automated drivi

説明可能深層学習Transformerテキスト3D

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

→

arxivPaper only2026-05-17

On the Complexity of Correlated Equilibria Beyond Normal-Form Games

Correlated equilibria are a fundamental solution concept in game theory. However, despite decades of research,

自然言語処理RAGテキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: Low

→

arxivPaper only2026-05-16

A Truthful Multiunit Profit-Optimal Mechanism for Synthesizing Social Laws

This paper studies Social Law Synthesis (SLS) in strategic multi-agent environments as a new multi-unit mechan

コンピュータビジョン動画認識生成テキスト

用途: 生成
難易度: Hard
コスト: High

→

arxivPaper only2026-05-15

MO-CAPO: Multi-Objective Cost-Aware Prompt Optimization

Large language models (LLMs) achieve strong performance across a wide range of tasks but are highly sensitive

品質予測/異常検知深層学習軽量化・量子化テキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

→

arxivPaper only2026-05-15

Structure Abstraction and Generalization in a Hippocampal-Entorhinal Inspired World Model

Draws on hippocampal-entorhinal structure to learn abstract representations and a predictive world model.

自然言語処理RAG画像テキスト教師あり

用途: Hippocampal-entorhinal inspired world models
難易度: Hard
コスト: Low

→

arxivPaper only2026-05-15

Towards Code-Oriented LM Embeddings for Surrogate-Assisted Neural Architecture Search

Explores code-oriented language-model embeddings as inputs to surrogate models for neural architecture search.

説明可能品質予測/異常検知深層学習軽量化・量子化回帰テキスト

用途: Surrogate-assisted neural architecture search
難易度: Hard
コスト: Low

→

arxivPaper only2026-05-15

Domain-Independent Game Abstraction using Word Embedding Techniques

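Game abstraction methods help reduce the size of large games. However, existing methods require game-specific analysis before they can be applied to a different game, which is one reason abstraction is hard to generalize.

自然言語処理埋め込み・検索テキスト

用途: Game abstraction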
難易度: Hard
コスト: Low

→

arxivPaper only2026-05-14

On the Stability of Growth in Structural Plasticity

Standard deep-learning pipelines usually choose the network architecture before training and keep it fixed thr

深層学習CNN分類画像テキスト

用途: 分類
難易度: Hard
コスト: High

→

arxivPaper only2026-05-14

Darwin Family: MRI-Trust-Weighted Evolutionary Merging for Training-Free Scaling of Language-Model Reasoning

Darwin Family

深層学習Transformer生成テキスト

用途: Training-free scaling of language-model reasoning through evolutionary merging
難易度: Hard
コスト: High

→

arxivPaper only2026-05-14

Learning to Persuade a Biased Receiver

We study a repeated information design setting in which the receiver, who is also the decision-maker, updates

自然言語処理ファインチューニングテキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: Medium

→

arxivPaper only2026-05-14

Watermarking Game-Playing Agents in Perfect-Information Extensive-Form Games

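Develops watermarking techniques for game-playing agents in perfect-information extensive-form games, in order to detect and deter illicit use of AI tools in games.

品質予測/異常検知自然言語処理大規模言語モデルテキスト

用途: Countering cheating in games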
難易度: Hard
コスト: High

→

githubGitHubあり2026-05-14

VidCom2 — [EMNLP 2025 Main] Video Compression Commander: Plug-and-Play Inference Acceleration for Video Large Language Models

VidCom2 (Video Compression Commander) is a plug-and-play inference acceleration method for video large language models.

深層学習軽量化・量子化テキスト動画マルチモーダル

用途: Inference acceleration for video LLMs
難易度: Easy
コスト: High

→

arxivPaper only2026-05-13

Dual-axis attribution of zebrafish tectal microcircuits for energy-efficient and robust neurocomputing

Proposes a method aimed at designing energy-efficient brain-inspired models and interpreting their intermediate representations.

深層学習Transformerテキスト

用途: Designing energy-efficient brain-inspired models
難易度: Hard
コスト: Medium

→

arxivPaper only2026-05-13

Texture Regenerating and Grafting Using Genome-Driven Neural Cellular Automata

Proposes a method for regenerating and grafting textures, with the aim of applying neural cellular automata (NCAs) to texture generation.

MI向き品質予測/異常検知深層学習軽量化・量子化生成テキスト

用途: Texture regeneration and grafting
難易度: Hard
コスト: High

→

arxivPaper only2026-05-13

The Geno-Synthetic Algorithm: Type-Factored Coevolutionary Optimization for Heterogeneous Genotypes and Assembled Phenotypes

Proposes a type-factored coevolutionary method for handling parameters of heterogeneous types, and this method

自然言語処理大規模言語モデルテキスト

用途: Coevolutionary optimization over heterogeneous parameter types
難易度: Hard
コスト: High

→

arxivPaper only2026-05-13

Extended Scenario Bundle Analysis: A Formal Framework for Strategic Scenario Modeling

Strategic crisis analysis needs representations that combine qualitative expert judgement, explicit interdepen

強化学習方策勾配 (PPO / A3C)テキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: Medium

→

arxivPaper only2026-05-13

TERMS-Bench: Diagnosing LLM Negotiation Agents Beyond Deal Rate

Negotiation is a central mechanism of economic exchange, shaping markets, procurement, labor agreements, and r

自然言語処理大規模言語モデルテキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

→

arxivPaper only2026-05-13

Offline Two-Player Zero-Sum Markov Games with KL Regularization

We study the problem of learning Nash equilibria in offline two-player zero-sum Markov games. While existing a

コンピュータビジョンセグメンテーションテキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: Medium

→

githubGitHubあり2026-05-13

maths-cs-ai-compendium — Become a cracked AI/ML Research Engineer

maths-cs-ai-compendium presents material for building up the skills and knowledge of an AI/ML research engineer.

コンピュータビジョンマルチモーダルテキスト音声

用途: Training AI/ML research engineers
難易度: Easy
コスト: High

→

arxivPaper only2026-05-12

ToolMol: Evolutionary Agentic Framework for Multi-objective Drug Discovery

Advances in large language models (LLMs) have recently opened new and promising avenues for small-molecule dru

MI向き品質予測/異常検知深層学習軽量化・量子化生成テキスト

用途: 生成
難易度: Hard
コスト: High

→

arxivPaper only2026-05-12

Solve the Loop: Attractor Models for Language and Reasoning

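Solve the Loop is a study that introduces attractor models for language and reasoning, an approach aimed at improving Transformer architectures.

深層学習Transformerテキスト

用途: Improving Transformer architectures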
難易度: Hard
コスト: High

→

arxivPaper only2026-05-12

Black-Box Optimization of Mixed Binary-Continuous Variables: Challenges and Opportunities in Evolutionary Model Merging

Model merging has emerged as a cost-effective alternative to training large language models (LLMs) from scratc

条件最適化自然言語処理大規模言語モデルテキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

→

arxivPaper only2026-05-12

Graph-Grounded Optimization: Rao-Family Metaheuristics, Classical OR, and SLM-Driven Formulation over Knowledge Graphs

We propose graph-grounded optimization: a paradigm in which the decision variables, constraints, and objective

表形式向き品質予測/異常検知深層学習軽量化・量子化テキスト表形式

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

→

arxivPaper only2026-05-12

Scaling Laws and Tradeoffs in Recurrent Networks of Expressive Neurons

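Recurrent networks whose units are themselves complex processors are hard to optimize. When compute is limited, the allocation of parameters has to be balanced.

深層学習Transformerテキスト

用途: Optimizing recurrent network architecture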
難易度: Hard
コスト: Medium

→

arxivPaper only2026-05-12

Position Auctions with a Capacity Constraint

Sponsored search auctions are commonly modeled as an assignment of a fixed set of slots (positions) to a set o

強化学習テキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: Medium

→

arxivPaper only2026-05-11

Decomposing Evolutionary Mixture-of-LoRA Architectures: The Routing Lever, the Lifecycle Penalty, and a Substrate-Conditional Boundary

We decompose an evolutionary mixture-of-LoRA system on a from-scratch ~150M-parameter widened-D substrate (D=1

自然言語処理RAGテキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: Low

→

arxivPaper only2026-05-11

Energy-Efficient Implementation of Spiking Recurrent Cells on FPGA

Proposes a method for implementing spiking neural network models on an FPGA to reduce energy consumption.

深層学習Transformer画像テキスト

用途: Energy efficiency
難易度: Hard
コスト: Medium

→

arxivPaper only2026-05-11

A Theory of Multilevel Interactive Equilibrium in NeuroAI

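Builds a game-theoretic framework for multi-agent systems, aiming to provide a theoretical foundation for multilevel interactive equilibrium in NeuroAI.

自然言語処理大規模言語モデルテキスト

用途: Multi-agent systems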
難易度: Hard
コスト: High

→

arxivPaper only2026-05-11

Joint sparse coding and temporal dynamics support context reconfiguration

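This study investigates the relationship between adaptability and context reconfiguration, an important factor for learning in dynamic environments.

深層学習Transformerテキスト

用途: Behavioral adaptability and context reconfiguration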
難易度: Hard
コスト: High

→

arxivPaper only2026-05-11

Prospective Compression in Human Abstraction Learning

Proposes a new approach for estimating human-like abstraction, which makes it possible to learn unseen tasks efficiently.

深層学習Transformer生成画像テキスト

用途: Human-like abstraction
難易度: Hard
コスト: High

→

arxivPaper only2026-05-11

When to Ask a Question: Understanding Communication Strategies in Generative AI Tools

Generative AI models differ from traditional machine learning tools in that they allow users to provide as muc

自然言語処理大規模言語モデル生成テキスト

用途: 生成
難易度: Hard
コスト: High

→

arxivPaper only2026-05-10

Parameter-Efficient Neuroevolution for Diverse LLM Generation: Quality-Diversity Optimization via Prompt Embedding Evolution

Large Language Models exhibit mode collapse, producing homogeneous outputs that fail to explore valid solution

品質予測/異常検知深層学習軽量化・量子化生成テキスト

用途: 生成
難易度: Hard
コスト: High

→

arxivPaper only2026-05-10

EvoPref: Multi-Objective Evolutionary Optimization Discovers Diverse LLM Alignments Beyond Gradient Descent

Gradient-based preference optimization methods for large language model (LLM) alignment suffer from preference

品質予測/異常検知自然言語処理大規模言語モデルテキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

→

arxivPaper only2026-05-09

Evolutionary Ensemble of Agents

We introduce Evolutionary Ensemble (EvE), a decentralized framework that organizes existing, highly capable co

自然言語処理大規模言語モデルテキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

→

arxivGitHubあり2026-05-09

ARES-LSHADE: Autoresearch-Enhanced LSHADE with Memetic Polish for the GNBG Benchmark

We present ARES-LSHADE, a memetic differential-evolution variant submitted to the GECCO 2026 competition on LL

自然言語処理大規模言語モデルテキスト

用途: 技術検証・論文読解補助
難易度: Hard
コスト: High

→

arxivPaper only2026-05-09

AHD Agent: Agentic Reinforcement Learning for Automatic Heuristic Design

Automatic heuristic design (AHD) has emerged as a promising paradigm for solving NP-hard combinatorial optimiz

深層学習軽量化・量子化生成テキスト強化学習

用途: 生成
難易度: Hard
コスト: High

→

arxivPaper only2026-05-08

Kernel Foundry: A Diagnosis-driven Evolutionary Kernel Optimizer with Multi-Experts

Generating high-performance GPU kernels remains challenging due to the need for both correctness and hardware-

深層学習軽量化・量子化生成テキスト

用途: 生成
難易度: Hard
コスト: High

→

arxivGitHubあり2026-05-07

CoupleEvo: Evolving Heuristics for Coupled Optimization Problems Using Large Language Models

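CoupleEvo proposes an automatic heuristic design approach for coupled optimization problems that uses large language models. It presents three evolutionary coordination strategies.

品質予測/異常検知自然言語処理大規模言語モデル生成テキスト

用途: Solving coupled optimization problems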
難易度: Hard
コスト: High

→

huggingfaceHugging Faceあり2026-05-04

Liberating LLM Capabilities in Full-Duplex Speech Models

Speech-based large language models are typically constrained to spoken replies, which limits their user-facing

自然言語処理大規模言語モデル生成テキスト音声

用途: 生成
難易度: Easy
コスト: High

→

huggingfaceHugging Faceあり2026-04-16

Is This Edit Correct? A Multi-Dimensional Benchmark for Reasoning-Aware Image Editing

Diffusion-based image editing has achieved strong visual fidelity under natural language instructions, yet mos

品質予測/異常検知深層学習軽量化・量子化画像テキスト

用途: 技術検証・論文読解補助
難易度: Easy
コスト: High

→