MLinfo | Machine Learning and AI Paper Summaries

Reflector: Arrangement-Aware Harmonic Retrieval for Sample-Based Composition

Sample retrieval tools can help composers find harmonically compatible material, but querying from a fixed ref

MI-oriented, Natural language processing, Embeddings/retrieval, Text, Audio

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Rethinking Multi-Branch and Cross-Backbone Fusion for Vehicle Re-Identification in the Foundation-Model Era

Multi-branch architectures and CNN-Transformer fusion have long been regarded as effective ways to improve veh

Deep learning, Transformer, Image

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Easy to try on CPU, Deep learning, Transformer, Text

RIS-Kernel: A Model-Agnostic Architecture for Long-Context LLM Inference via Sparse Attention

Full self-attention in large language models scales as O(N^2), which limits long-context document analysis to

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Teachy Mini: Development and Preliminary Evaluation of a Knowledge-Based Generative Social Robot for Higher Education

Generative social robots (GSRs) powered by large language models offer new possibilities for personalized tuto

Use: Generation
Difficulty: Hard
Cost: High

Suited to small datasets, Explainable, Deep learning, Model compression/quantization, Generation, Text

Towards Trustworthy and Cost-Efficient Data Integration: From Naïve RAG to Agentic RAG

Large language models (LLMs) and AI agents have demonstrated strong potential for data integration in zero-sho

Use: Generation
Difficulty: Hard
Cost: High

Natural language processing, Large language models, Generation, Text, Multimodal

Benchmarking Fine-tuning and Retrieval Strategies for a Multimodal Language Model on the NRC Reactor Operator Licensing Examination

The integration of large language models (LLMs) into the nuclear power industry requires outputs grounded in d

Use: Generation
Difficulty: Hard
Cost: High

A Factorial Study of Synthetic Data Generation for Low-Resource Machine Translation using Grammar Books

Most endangered languages lack the parallel data required for machine translation, despite the existence of de

Natural language processing, Large language models, Generation, Translation, Text

Use: Generation
Difficulty: Hard
Cost: High

Leveraging External Knowledge for Historical Document Restoration via Retrieval-Augmented Large Language Models

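Proposes a new method for supporting the restoration of historical documents and builds a model for that purpose.

Use: Supporting the restoration of historical documents
Difficulty: Hard
Cost: High

arXiv, GitHub available, 2026-07-24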

The Lift Spectrum: How Measurement-to-Space Adaptivity Shapes Robustness in Image-Free Single-Pixel Sensing

Single-pixel sensing encodes a scene as a short sequence of coded measurements, and image-free methods infer t

Sensors/time series, Deep learning, Transformer, Image

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Three-Body Alignment: Aligning Chess Agent with Human Reasoning through Reranked Rationale

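Aligning machine reasoning with human reasoning is a challenge for AI security and safety. Bringing the two into alignment can make AI systems safer and more predictable.

Explainable, Deep learning, Model compression/quantization, Retrieval, Image, Text

Use: Aligning human and machine reasoning
Difficulty: Hard
Cost: High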

Learning What Matters: Supervising Sparse Attention Routing with Causal Evidence Sets

Sparse attention reduces the cost of long contexts by allowing each query to read only selected parts of the i

Computer vision, Segmentation, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Encoding Invisible Causation for Bridge Diagnostic Agents: Triple-Guided Retrieval-Augmented Fine-Tuning with QLoRA

Bridge infrastructure deteriorates gradually, yet its root causes---salt intrusion, freezing, fatigue cracking

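Deep learning, Model compression/quantization, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, Natural language processing, Large language models, Text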

Data Quality over Capacity: Internalizing Documents into LoRA Adapters for Closed-Book QA

This study aims to generate accurate answers to questions using adapters for closed-book QA.

Use: Solving the problems that arise
Difficulty: Hard
Cost: High

MosaicJoin: Compact Semantic Sketches for Value-Level Join Discovery

Join discovery is a core task in dataset search, enabling users to find columns that can be joined with a give

Deep learning, Model compression/quantization

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Agentic Context Management: Solving Agent Memory and Cost by Treating Them as Lifecycle and Architecture Problems

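Agentic Context Management makes an agent's memory and cost manageable. The approach lets the agent manage itself while allowing a trainer to control it.

Natural language processing, RAG, Summarization, Text

Use: Addressing agent memory and cost
Difficulty: Hard
Cost: Low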

Toward Continuous Assurance for the Democratization of AI Agent Creation in Industry

Democratization of AI Agent Creation enables organizations to create open AI agents. The method, with respect to agent reliability,

Deep learning, Transformer

Use: Democratizing AI agents
Difficulty: Hard
Cost: Low

RUMBA: Russian User Memory Benchmark

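This study presents RUMBA, a new benchmark developed to evaluate the long-term memory capabilities of LLMs. The benchmark comprises a detailed taxonomy of memory-related questions for evaluating long-term retention and, to consider these jointly,

Natural language processing, Large language models, Text

Use: Addressing long-term memory
Difficulty: Hard
Cost: High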

GRADRAG: Cross-Component Prompt Adaptation for Coordinated Multi-Agent RAG

Retrieval-Augmented Generation (RAG) systems increasingly employ multiple LLM agents. Yet, most prior work opt

Use: Generation
Difficulty: Hard
Cost: High

A Comparative Evaluation of Embeddings and LLMs in a Greek Book Publisher Setting - The CUP Dataset

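This study uses large language models to evaluate a Greek-language book search system. Using large language models improved retrieval accuracy.

Deep learning, Transformer, Summarization

Use: Evaluating book search systems
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, Deep learning, Model compression/quantization, Generation, Text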

Enhancing SLMs for Sustainable Code Optimization in Radio-Astronomy

Recent Large Language Models (LLMs) can produce and optimize complex code. We investigate the use of LLMs to g

Use: Generation
Difficulty: Hard
Cost: High

CRAG-MM-Diagnostics: Enabling Stage-Wise Analysis of Knowledge-Intensive VQA

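Proposes new evaluation criteria for analyzing knowledge-intensive visual question answering (KI-VQA) systems. These criteria allow each VLM task to be evaluated individually.

Natural language processing, Large language models, Classification, QA, Image

Use: Analyzing knowledge-intensive question answering systems
Difficulty: Hard
Cost: High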

AttriMem: Attribution-Guided Process Feedback for Agent Memory Learning

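Agent memory learning means enabling a language model to retain, update, and process information effectively. This study proposes a method for optimizing agent memory using attribution-guided process feedback.

Natural language processing, Large language models, QA

Use: Agent memory learning
Difficulty: Hard
Cost: High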

Anti-Periodic Positional Encoding: Möbius Boundary Conditions Make In-Context Retrieval Reliable

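This paper applies a Möbius symmetry to a symmetric positional encoding. As a result, the holonomy between positions in the rotation plane becomes -1, and the two ends of the sequence are coupled deterministically. With this method, the increase in accuracy

Natural language processing, Prompt engineering, Text

Use: Symmetric positional encoding based on Möbius symmetry
Difficulty: Hard
Cost: High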

Capital Markets LLM Reliability Score (CM-LRS): From Plausible to Bankable

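This study proposes the Capital Markets LLM Reliability Score (CM-LRS) with the aim of improving reliability in risk-venture crowdwork, establishing the value of documents generated by LLMs

Use: Improving reliability in risk-venture crowdwork
Difficulty: Hard
Cost: High

Explainable, Quality prediction/anomaly detection, Natural language processing, Large language models, Generation, Text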

PrefReward: Learning User Preference Matrix for Personalized Text Generation

This study proposes PrefReward for personalization in text generation

Use: Personalization in text generation
Difficulty: Hard
Cost: High

LegalCiteTrust: Benchmarking Citation Trustworthiness in Chinese Long-Form Legal Research Reports

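Proposes LegalCiteTrust to assess the trustworthiness of citations in Chinese long-form legal research reports and to detect and evaluate untrustworthy citations.

Natural language processing, Large language models, Generation

Use: Improving the trustworthiness of legal research reports
Difficulty: Hard
Cost: High

arXiv, GitHub available, 2026-07-23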

REFACT: Adaptive Fact Restatement for Compact and Faithful Chain-of-Thought Reasoning

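Points out that language models for long-form reasoning can generate reasoning that drifts from the provided context and, to integrate context and reasoning more closely, REFACT (REstating Facts in Adapti

Use: Improving Chain-of-Thought (CoT)
Difficulty: Hard
Cost: High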

Causal-AgentIR: Self-Evolving Causal Memory for Adaptive Image Restoration Agents

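Image restoration agents are being developed rapidly as a flexible framework for handling a variety of degradations in real-world scenarios. Existing agents retrieve candidates for defects, tools, and restoration operations, and restoration

Quality prediction/anomaly detection, Generative AI, GAN, Image, Text

Use: Autonomously updating the knowledge of image restoration agents.
Difficulty: Hard
Cost: Low

Quality prediction/anomaly detection, Mathematics/theory, Probability/statistics, Text, Multimodal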

Achieving Text-based Person Retrieval with Any Granularity

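For text-based person retrieval, uncertainty in query granularity is a major challenge in real-world scenarios. In this paper, a new paradigm, Text-based Person Retrieval with

Use: Supports person retrieval at varying granularity.
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, Deep learning, Model compression/quantization, Text, Video, Multimodal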

ProCap: Prominence-guided Object Rectification for Faithful and Comprehensive Video Captioning

Improving video captioning quality typically demands retraining large vision-language models, an expensive and

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Distribution-Alignment Bridge for Uncertainty-Aware Text-to-Video Retrieval

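This paper proposes the Distribution-Alignment Bridge (DAB), which aligns text and video. DAB represents text and video entities as probability distributions and resolves the distributional gap between them. This

Natural language processing, Embeddings/retrieval, Generation, Text, Video

Use: Text-to-video retrieval
Difficulty: Hard
Cost: High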

Notes to Self: Can LLMs Benefit from Experiential Abstractions?

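The goal is to enable large language models to form heuristics and apply them to unseen tasks. This is expected to improve the practical usefulness of large language models.

Natural language processing, Large language models, Text, Reinforcement learning

Use: Extracting knowledge to improve the practical usefulness of large language models
Difficulty: Hard
Cost: High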

Reinforcement Learning for Large Language Model Selective Evidence Adoption from Contaminated Retrieval Results

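Retrieval-augmented large language models struggle with contexts that mix useful and incorrect information. Rejecting the context discards useful information, while adopting it unconditionally yields inaccurate or dangerous answers. Accurate

Natural language processing, Large language models, Text, Reinforcement learning

Use: Selective adoption of harmful information
Difficulty: Hard
Cost: High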

Overview of FinMMEval 2026 Task 2: Multilingual Financial Short-Answer Question Answering

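FinMMEval 2026 Task 2 targets short-answer financial questions posed in English. Evidence in languages other than English is also used.

Natural language processing, RAG, Generation, QA, Retrieval

Use: Solving financial questions
Difficulty: Hard
Cost: Low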

Overview of FinMMEval 2026 Task 1: Multilingual Financial Multiple-Choice Question Answering

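FinMMEval 2026 Task 1 evaluates the answering of multilingual financial questions in English, Chinese, Arabic, and Hindi.

Natural language processing, Large language models, QA, Text

Use: Solving financial questions
Difficulty: Hard
Cost: High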

VizRAG: Enhancing Retrieval-Augmented Generation with Hypergraph Visualization

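Retrieval-augmented generation using graphs and hypergraphs achieves higher accuracy than conventional graph-based approaches by making the relationships among entities explicit for complex n-ary atomic facts, rather than relying on a single relation between entities

Natural language processing, Large language models, Generation, Image, Text

Use: Evaluating the accuracy of retrieval-augmented generation
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, Natural language processing, Large language models, Generation, Text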

Beyond Relevance-Centric Retrieval: Rubric-Oriented Document Set Selection and Ranking

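3D occupancy prediction requires visual methods for interpreting the placement and density of objects. Conventional methods were too computationally expensive, but the newly proposed GaussianSeed algorithm uses a layered hierarchy so that the computational

Use: Predicting the placement and density of objects in 3D space
Difficulty: Hard
Cost: High

Natural language processing, Large language models, Image, Text, Multimodal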

Diverse-Intent Multi-Turn Fashion Image Retrieval

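Multi-turn fashion image retrieval is an important task in real-world fashion search. The Diverse-Intent Multi-Turn Fashion Image Retrieval algorithm, by handling different search intents

Use: Multi-turn fashion image retrieval
Difficulty: Hard
Cost: High

Computer vision, Segmentation, Generation, Image, Video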

Vera: Identity-Faithful Human Subject-to-Video Generation

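Proposes a high-fidelity generative model for representing individual human subjects, aiming to improve accuracy and flexibility when representing multiple human subjects.

Use: Generation
Difficulty: Hard
Cost: High

Sensors/time series, Deep learning, Model compression/quantization, Detection, Segmentation, Embeddings

arXiv, GitHub available, 2026-07-22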

Not All Patches are Equal: Sampling Matters for Visible-Infrared Pre-Training

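Proposes an approach for linking visible-light and non-visible-light data, aiming to improve the accuracy and efficiency of doing so.

Use: An approach for linking visible-light and non-visible-light data
Difficulty: Hard
Cost: High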

RIM: A Retrieval-In-Matching Framework for Cross-Domain Global Visual Localization of UAVs

Global visual localization of unmanned aerial vehicles (UAVs) using remote-sensing reference maps has attracte

Sensors/time series, Deep learning, Model compression/quantization, Detection, Image, 3D

Use: Detection
Difficulty: Hard
Cost: High

Copy Less, Ground More: Overcoming Repetitive Copying in Long-Context Reasoning via Evidence-Aware Reinforcement Learning

Large language models that generate step-by-step reasoning traces have achieved strong performance on complex

Natural language processing, Large language models, Text, Reinforcement learning

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Selective State-Space Adaptation and Retrieval for Language Model Reasoning

Low-rank adaptation introduces a static learned update applied identically to every input. The update provides

Deep learning, RNN / LSTM, Text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

MIRA-Ev: A Benchmark for Granular Evidence Detection and Relational Reasoning in Clinical Exams

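Clinical evaluation relies mainly on MCQA, which cannot reveal on what basis a model supports its answers. To address this, the study proposes MIRA-Ev, a benchmark for micro-details and relational reasoning.

Generative AI, GAN, Classification, Detection, QA

Use: Discovering micro-details and relational reasoning
Difficulty: Hard
Cost: Low

Quality prediction/anomaly detection, Natural language processing, Embeddings/retrieval, Classification, Generation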

Supra Cognitive Modes: A Routed Architecture for Agent Memory

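Observing that agent memory workloads combine direct fact lookup, reasoning over relation chains and current state, and synthesis across long histories, this study develops Supra Cognitive Modes. With this architecture

Use: Memory architecture design
Difficulty: Hard
Cost: Low

Quality prediction/anomaly detection, Natural language processing, Large language models, Classification, Detection, Generation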

AutoJourn: Multi-Perspective Summarisation, Bias Detection and Bias Neutralisation for LLM-Generated News in Automated Journalism

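This study builds the AutoJourn system for multi-perspective news generation, bias detection, and bias mitigation, and proposes new methods for diversifying perspectives and for detecting and mitigating bias.

Use: Diversifying perspectives and detecting and mitigating bias in automated news generation
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, Natural language processing, Large language models, Generation, QA, Text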

AILQA: Evaluating AI-Driven Legal Question Answering Systems for the Indian Legal System

This comprehensive study introduces an advanced Artificial Intelligence for Indian Legal Question Answering (A

Use: Generation
Difficulty: Hard
Cost: High

RF-Agent: A Practical Framework for Building Language Agents for RFIC Design

Large language models (LLMs) have driven rapid progress in electronic design automation (EDA), yet their appli

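Deep learning, Model compression/quantization, Generation, Text

Use: Generation
Difficulty: Hard
Cost: High

Natural language processing, Prompt engineering, Image, Text, Video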

WorldScape Policy 2.0: Empowering Steerable World Action Modeling with Reasoning-Augmented Memory

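World Action Models (WAMs) are a paradigm for modeling robot manipulation. WAMs jointly model visual state transitions and robot actions. However, existing WAMs, with a fixed

Use: Solving multi-purpose manipulation problems
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-07-20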

Vector Search As Nearest Neighbor Matching: RAG-based Policy Learning in Causal Inference

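Proposes policy learning based on causal inference, aiming to evaluate the effectiveness of an action by its nearest similar evidence when selecting a policy.

Deep learning, Transformer, Generation

Use: Policy learning in causal inference
Difficulty: Hard
Cost: Low

arXiv, Paper only, 2026-07-19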

Multi-Resolution Voxelized Map-Based Stereo Visual-Inertial Odometry

Incorporating prior maps significantly enhances the accuracy and robustness of pose estimation in visual-inert

Computer vision, 3D/point clouds, Image, 3D

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-07-18

How to Build Marcus's Algebraic Mind: From Minsky's Emotion-Machine Viewpoint

In The Algebraic Mind, Marcus identified three cognitive components: operations over variables, recursively st

Natural language processing, Large language models

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-07-08

FedMark-FM: Auditable, Risk-Adjusted Data Markets for Federated Foundation-Model Adaptation

Federated foundation-model adaptation increasingly relies on heterogeneous private artifacts (retrieval corpor

Quality prediction/anomaly detection, Natural language processing, RAG

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

arXiv, Paper only, 2026-06-30

A Lifecycle and Application-Stack Survey of Large Language Model Vulnerabilities: Attacks, Risks, Defenses, and Open Problems

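Defenses against LLM misuse. This study develops a defense framework for preventing LLM misuse and analyzes the risks that such misuse poses.

Natural language processing, Large language models, Text

Use: Defending against LLM misuse
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-06-13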

Controlled Dynamics Attractor Transformer

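This study proposes the Controlled Dynamics Attractor Transformer (CDAT). This Transformer, the self-attention mechanism and Associa

Explainable, Quality prediction/anomaly detection, Deep learning, Transformer, Classification, Detection, Anomaly detection

Use: Proposing the Controlled Dynamics Attractor Transformer (CDAT).
Difficulty: Hard
Cost: Low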