MLinfo | Machine Learning and AI Paper Roundup

OpenBB — Open Data Platform for analysts, quants and AI agents.

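OpenBB provides a financial data platform for analysts, quants, and AI agents.

Machine learning, Supervised learning

Use case: Data for financial analysis
Difficulty: Easy
Cost: Medium

Natural language processing, Large language models, Text, Audio, Multimodal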

screenpipe — YC (S26) | Record your screen 24/7 and plug into your agents. Local, private, secure. Connect to OpenClaw, Hermes agent and 100+ apps

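A tool that recognizes user activity and uses it to build autonomous agents.

Use case: Building autonomous agents
Difficulty: Easy
Cost: High

Natural language processing, Large language models, Text, Multimodal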

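ai-agent-book — Main open-source repository for the book 《深入理解 AI Agent：设计原理与工程实践》 (Understanding AI Agents in Depth: Design Principles and Engineering Practice) by 李博杰 (Li Bojie): full text of the book, compiled PDF, and companion code for each chapter

The repository collects the full text of the book, a compiled PDF, and chapter-by-chapter companion code on the design principles and engineering practice of AI agents.

Use case: Learning AI agent design principles and engineering practice
Difficulty: Easy
Cost: High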

ART — Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen3.6, GPT-OSS, Llama, and more!

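ART is a trainer for multi-step agents. It uses GRPO to apply reinforcement learning to agents working on real-world tasks.

Natural language processing, Large language models, Reinforcement learning

Use case: Reinforcement learning trainer for multi-step agents
Difficulty: Easy
Cost: High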

AReaL — The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.

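AReaL is a reinforcement learning framework for LLM-based agent applications, designed to be simple and flexible.

Use case: Reinforcement learning for LLM-based agents
Difficulty: Easy
Cost: High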

mlflow — The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while controlling costs and managing access to models and data.

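MLflow is an open-source AI engineering platform for debugging, evaluating, monitoring, and optimizing agents, LLMs, and ML models.

Quality prediction/anomaly detection, Natural language processing, Large language models

Use case: Platform for managing AI workloads
Difficulty: Easy
Cost: High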

zenml — ZenML 🙏: One AI Platform from Pipelines to Agents. https://zenml.io.

ZenML is an AI platform that spans pipelines through agents.

Use case: AI platform
Difficulty: Easy
Cost: High

haystack — Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and agent workflows with explicit control over retrieval, routing, memory, and generation. Built for scalable agents, RAG, multimodal applications, semantic search, and conversational systems.

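An open-source AI orchestration framework. It lets you design the pipelines and agent workflows needed to build LLM applications.

Deep learning, Transformer, Generation, Summarization, Text

Use case: Building LLM applications
Difficulty: Easy
Cost: High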

DocsGPT — Private AI platform for agents, assistants and enterprise search. Built-in Agent Builder, Deep research, Document analysis, Multi-model support, and API connectivity for agents.

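DocsGPT is a private AI platform for agents, assistants, and enterprise search, with document analysis and multi-model support.

Deep learning, Transformer, Text

Use case: Enterprise search and document analysis
Difficulty: Easy
Cost: Medium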

botpress — The open-source hub to build & deploy GPT/LLM Agents ⚡️

An open-source tool for building GPT/LLM agents.

Use case: Building GPT/LLM agents
Difficulty: Easy
Cost: High

arXiv | GitHub available | 2026-07-23

Agentic coding without the cloud: evaluating open-weight large language models on longitudinal data preparation tasks

Large language models (LLMs) and agents are now widely used tools in code development, with data typically sen

Natural language processing, Large language models, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arXiv | GitHub available | 2026-07-23

pAI-Econ-claude: A Gated Human-in-the-Loop Multi-Agent Architecture for AI-Assisted Economic Theory Development

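This study develops a system that uses large language models to support research in economics. It assists researchers in developing theoretical models.

Use case: Research support system for economics
Difficulty: Hard
Cost: High

arXiv | GitHub available | 2026-07-23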

Workflow-Localized Mechanism Learning: Attribution-Guided Repair and Knowledge Reuse for Structured Agent Skills

Agent Skills package reusable procedural knowledge as external artifacts for frozen language-model agents, yet

MI-oriented, Reinforcement learning, Policy gradient (PPO / A3C)

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

qlib — Qlib is an AI-oriented Quant investment platform that aims to use AI tech to empower Quant Research, from exploring ideas to implementing productions. Qlib supports diverse ML modeling paradigms, including supervised learning, market dynamics modeling, and RL, and is now equipped with https://github.com/microsoft/RD-Agent to automate R&D process.

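Qlib applies AI technology to build a quantitative investment platform.

Reinforcement learning, Policy gradient (PPO / A3C), Supervised

Use case: Quantitative investment platform
Difficulty: Easy
Cost: Medium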

AgentsMeetRL — Awesome List for Agentic RL

An awesome list of resources on agentic RL.

Use case: Agentic RL
Difficulty: Easy
Cost: High

ml-agents — The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.

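A toolkit for training machine learning agents using Unity games and simulations as environments.

Computer vision, 3D/point clouds, 3D, Reinforcement learning

Use case: Training machine learning agents in Unity
Difficulty: Easy
Cost: High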

giskard-oss — 🐢 Open-Source Evaluation & Testing library for LLM Agents

giskard-oss is an evaluation and testing library for LLM agents.

Use case: Evaluating and testing LLM agents
Difficulty: Easy
Cost: High

ArbiGraph: Arbitrarily Scalable Verifiable Task Graphs for Evaluating Context Management

We introduce ARBIGRAPH, a benchmark generator for evaluating whether tool-assisted language agents can retain,

MLOps, Model deployment, Text

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: Medium

PRO-LONG: Programmatic Memory Enables Long-Horizon Reasoning

Long-horizon tasks require sustained perception, reasoning, and exploration, and are a persistent challenge fo

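Deep learning, Model compression/quantization, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, Computer vision, Multimodal, QA, Image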

Silent Failures in Multimodal Agentic Search: A Diagnostic Taxonomy and Cross-Judge Evaluation

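This study proposes a new method for evaluating how systems respond to visual questions. It assesses not only whether an answer is correct but also the patterns and characteristics of the answers.

Use case: Evaluating responses to visual questions
Difficulty: Hard
Cost: High

Natural language processing, Prompt engineering, Detection, Image, Text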

ReferTrack: Referring Then Tracking for Embodied Visual Tracking

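ReferTrack is a system for embodied visual tracking of a target described in natural language. It first identifies the referred target and then tracks its movement.

Use case: Following a target specified in natural language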
Difficulty: Hard
Cost: High

Hugging Face | Hugging Face available | 2026-07-22

NexForge: Scaling Agent Capabilities through Requirement-Driven Task Synthesis for LLMs

Scaling executable agent training data for LLM post-training is bottlenecked by substrate-bound methods that t

Use case: Generation
Difficulty: Easy
Cost: High

GitHub | GitHub available | 2026-07-22

atomic-agents — Building AI agents, atomically

A library for assembling AI agents.

Use case: Building AI agents
Difficulty: Easy
Cost: High

GitHub | GitHub available | 2026-07-22

Finance-LLMs — Comprehensive Compilation of Real-World LLM & AI Agent Use Cases in Financial Services

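A compilation of real-world use cases of LLMs and AI agents in financial services.

Use case: Surveying LLM and AI agent use cases in financial services
Difficulty: Easy
Cost: High

arXiv | GitHub available | 2026-07-21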

Knowledge-Centric Self-Improvement

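Studies knowledge-centric self-improvement and proposes a method that makes self-improvement more effective by focusing on knowledge.

Deep learning, Model compression/quantization

Use case: Knowledge-centric self-improvement
Difficulty: Hard
Cost: High

arXiv | GitHub available | 2026-07-21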

AutoIndex: Learning Representation Programs for Retrieval

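Proposes a framework that learns representation programs for retrieval and uses them to build a retrieval system that annotates documents with labels.

Quality prediction/anomaly detection, Natural language processing, RAG, Text

Use case: Learning programs for retrieval
Difficulty: Easy
Cost: Low

Hugging Face | Hugging Face available | 2026-07-21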

FinanceComplexQA: Benchmarking Agentic Reasoning on Industrial-grade Financial Documents

Agentic Reasoning has become a transformative force in financial analysis due to its ability to integrate larg

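Quality prediction/anomaly detection, Natural language processing, RAG, Generation, Summarization, Text

Use case: Generation
Difficulty: Easy
Cost: Low

Hugging Face | Hugging Face available | 2026-07-21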

ABot-World-0: Infinite Interactive World Rollout on a Single Desktop GPU

We present ABot-World-0, an action-conditioned video world model for real-time, long-horizon closed-loop inter

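Quality prediction/anomaly detection, Deep learning, Model compression/quantization, Text, Video

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

GitHub | GitHub available | 2026-07-21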

agent-starter-pack — Ship AI Agents to Google Cloud in minutes, not months. Production-ready templates with built-in CI/CD, evaluation, and observability.

Deploys AI agents to Google Cloud using production-ready templates with built-in CI/CD, evaluation, and observability.

Use case: Deploying AI agents to Google Cloud
Difficulty: Easy
Cost: High

GitHub | GitHub available | 2026-07-21

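BettaFish — 微舆 (Weiyu): a multi-agent public-opinion analysis assistant that anyone can use. It breaks through information cocoons, restores the full picture of public opinion, predicts future trends, and supports decision-making. Implemented from scratch, with no dependency on any framework.

微舆 is a multi-agent assistant for public-opinion analysis, built to give a fuller view of public opinion, forecast where it is heading, and support decisions.

Use case: Public-opinion analysis
Difficulty: Easy
Cost: High

arXiv | GitHub available | 2026-07-20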

UniETP: Unifying Environments for Generalizable Embodied Task Planning

This paper focuses on the problem of Embodied Task Planning, where an agent is required to execute a sequence

Natural language processing, RAG

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Hugging Face | GitHub available | Hugging Face available | 2026-07-20

Differentiable Logic Gate Networks for Low-Latency EEG Classification on Edge Devices

Real-time EEG classification on edge devices is bottlenecked by the floating-point arithmetic of conventional

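Easy to try on CPU, Reinforcement learning, Multi-agent, Classification, Detection

Use case: Classification
Difficulty: Easy
Cost: Low

Explainable, Quality prediction/anomaly detection, Natural language processing, Large language models, Video, Multimodal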

EduPanel: A Three-Agent LLM Judge for Teaching Videos -- Reliability, Complementarity, and Human Trust Calibration

Teaching videos are becoming a major medium for education, creating a growing need for scalable evaluation of

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

FlashRT: Agent Harness for Guiding Agents to Deploy Real-Time Multimodal Applications

Real-time multimodal applications, including voice agents and interactive video generation, compose heterogene

Deep learning, Model compression/quantization, Generation, Text, Audio

Use case: Generation
Difficulty: Easy
Cost: High

Hugging Face | GitHub available | Hugging Face available | 2026-07-20

WorldCupArena: Fine-Grained Evaluation of Language Models and Deep-Research Agents on Football Forecasting

Predicting a football match before kickoff requires more than knowing past results: a model must use changing

Computer vision, Segmentation, Forecasting, Text

Use case: Forecasting
Difficulty: Easy
Cost: Low

Self-State Attacks on Self-Hosted AI Agents: How Far Can OS Defenses Go?

Self-hosted AI agents read and write their own memory and configuration files to function. An agent may get co

Deep learning, Transformer, Detection

Use case: Detection
Difficulty: Easy
Cost: Medium

Coercion and Deception in AI-to-AI Management: An Agentic Benchmark of Unprompted Escalation

Multi-agent systems routinely place one AI agent in authority over another. When a subordinate refuses a task,

Natural language processing, Large language models, Classification, Text

Use case: Classification
Difficulty: Easy
Cost: High

GitHub | GitHub available | 2026-07-20

Gymnasium — A standard API for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)

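Gymnasium is a standard API for single-agent reinforcement learning environments.

Reinforcement learning

Use case: Providing reinforcement learning environments
Difficulty: Easy
Cost: Medium

Hugging Face | Hugging Face available | 2026-07-19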

TimeLens2: Generalist Video Temporal Grounding with Multimodal LLMs

Video multimodal large language models (MLLMs) can describe what happens in a video, but rarely identify when

Natural language processing, Large language models, Detection, Text, Video

Use case: Detection
Difficulty: Easy
Cost: High

Hugging Face | Hugging Face available | 2026-07-19

EvolvingWorld: An Open-Schema Framework for Co-Evolving Role-Play Agents and World Model in Interactive Literary World

This paper introduces EvolvingWorld, a framework and benchmark for character and world co-evolution in interac

Use case: Generation
Difficulty: Easy
Cost: High

Hugging Face | Hugging Face available | 2026-07-18

DataFlow-Harness: A Grounded Code-Agent Platform for Constructing Editable LLM Data Pipelines

Large language models (LLMs) are increasingly used to automate data-processing workflows, yet coding agents ty

Natural language processing, Large language models, Generation, Image, Text

Use case: Generation
Difficulty: Easy
Cost: High

Hugging Face | Hugging Face available | 2026-07-18

Environment-free Synthetic Data Generation for API-Calling Agents

Training API-calling large language model (LLM) agents demands massive amounts of high-quality trajectories. H

Quality prediction/anomaly detection, Natural language processing, Large language models, Generation, Text

Use case: Generation
Difficulty: Easy
Cost: High

SeerGuard: A Safety Framework for Mobile GUI Agents via World Model Prediction

Mobile graphical user interface (GUI) agents have demonstrated remarkable capabilities in automating complex t

Reinforcement learning, Model-based

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: Medium

Nonuniformity Principle in Human-AI Coworking

As generative AI is increasingly applied to automate multi-step and high-stake workflows, human judgment and i

Quality prediction/anomaly detection, Machine learning, Supervised learning, Generation

Use case: Generation
Difficulty: Easy
Cost: Medium

RecGPT-V3 Technical Report

Large language models (LLMs) are transforming recommender systems from matching co-occurrence patterns in hist

Deep learning, Model compression/quantization, Text

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

Recursive Harness Self-Improvement

Under model--harness co-evolution, harnesses are not merely inference-time scaffolds but data-generating compo

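Quality prediction/anomaly detection, Deep learning, Model compression/quantization, Text

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High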

When Does Muon Help Agentic Reinforcement Learning?

Muon is competitive with AdamW in large-scale pre-training, but its value for reinforcement-learning (RL) post

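Deep learning, Normalization/optimization methods, Reinforcement learning

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High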

DSWorld: A Data Science World Model for Efficient Autonomous Agents

Despite strong capabilities in data understanding and decision-making, autonomous data science agents still he

Deep learning, Model compression/quantization, Reinforcement learning

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

Beyond Success Rate: Cost-Aware Evaluation of Offensive and Defensive Security Agents

Security-agent evaluations commonly measure peak offensive capability under generous inference budgets, emphas

Computer vision, Segmentation

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: Medium

Hugging Face | Hugging Face available | 2026-07-16

Multi-Turn On-Policy Distillation with Prefix Replay

We study on-policy distillation (OPD) for agentic tasks, where an LLM agent interacts with an environment over

Deep learning, Model compression/quantization

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

Hugging Face | Hugging Face available | 2026-07-16

RESOURCE2SKILL: Distilling Executable Agent Skills from Human-Created Multimodal Resources

Skills are a useful abstraction for software agents, turning human and agent experience into reusable procedur

Natural language processing, RAG, Image, Text, Video

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

GitHub | GitHub available | 2026-07-16

agent-lightning — The absolute trainer to light up AI agents.

A tool for training AI agents efficiently. With Agent Lightning you can set up a trainer and train a model on your data.

Use case: Setting up a trainer for AI agents easily
Difficulty: Easy
Cost: High

Hugging Face | Hugging Face available | 2026-07-15

Diagnosing and Calibrating Tool-Call Boundary Drift in Multi-Teacher On-Policy Distillation

Agentic language models must learn when to call tools, when to consume tool responses, and when to answer dire

Deep learning, Model compression/quantization, Generation, Text

Use case: Generation
Difficulty: Easy
Cost: High

Hugging Face | Hugging Face available | 2026-07-15

Cura 1T: Specialized Model for Agentic Healthcare

Healthcare spans high-stakes communication, expert reasoning, and workflow execution, yet specialized LLMs tha

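Natural language processing, Large language models, Image, Text

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

GitHub | GitHub available | 2026-07-15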

ai-engineering-hub — In-depth tutorials on LLMs, RAGs and real-world AI agent applications.

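This repository contains in-depth tutorials on AI engineering, covering LLMs, RAG, and real-world AI agent applications.

Use case: Learning AI engineering through tutorials
Difficulty: Easy
Cost: High

Hugging Face | Hugging Face available | 2026-07-14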

ReflectWorld-MM: An Entity-Oriented Multimodal Memory System for Open-Ended Video Streams

Building assistants that can continually watch the world, remember what they see, and reason over their accumu

Computer vision, Multimodal, Image, Text, Audio

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

Hugging Face | Hugging Face available | 2026-07-14

From Human-Centric to Agentic Code Review: The Impact of Different Generations of Generative AI Technology on Review Quality

Code review helps maintain software quality before code integration, but it also imposes a substantial workloa

Quality prediction/anomaly detection, Deep learning, Transformer, Generation, Text

Use case: Generation
Difficulty: Easy
Cost: High

Awesome-Embodied-Robotics-and-Agent — This is a curated list of "Embodied AI or robot with Large Language Models" research. Watch this repository for the latest updates! 🔥

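A repository of research that combines embodied AI or robots with large language models.

Natural language processing, Large language models, Text

Use case: Embodied AI and robotics research
Difficulty: Easy
Cost: High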

OpenRLHF — An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL)

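OpenRLHF is a reinforcement learning framework built on Ray. It supports a range of reinforcement learning algorithms, including PPO, DAPO, and REINFORCE++.

Deep learning, Transformer, Image

Use case: Reinforcement learning framework
Difficulty: Easy
Cost: High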

agents-towards-production — End-to-end, code-first tutorials for building production-grade GenAI agents. From prototype to enterprise deployment.

End-to-end, code-first tutorials for developing and deploying AI agents.

Use case: Developing and deploying AI agents
Difficulty: Easy
Cost: High

memvid — Memory layer for AI Agents. Replace complex RAG pipelines with a serverless, single-file memory layer. Give your agents instant retrieval and long-term memory.

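MemVid is a serverless, single-file memory layer that gives AI agents instant retrieval and long-term memory.

Natural language processing, Large language models, Generation, Text, Video

Use case: Managing memory for AI agents
Difficulty: Easy
Cost: High

GitHub | GitHub available | 2026-07-10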

multimind-sdk — Your SDK solves all of this. One interface. Unified logic. Local + hosted models. Fine-tuning. Agent tools. Enterprise-ready. Hybrid RAG. Star 🌟 if you like it!

An SDK that exposes local and hosted models through one interface, with support for fine-tuning, agent tools, and hybrid RAG.

Natural language processing, Large language models, Multimodal

Use case: Unified SDK for local and hosted models
Difficulty: Easy
Cost: High

Hugging Face | Hugging Face available | 2026-07-08

DeepSearch-World: Self-Distillation for Deep Search Agents in a Verifiable Environment

Training tool-use agents to improve from their own experience remains challenging, as supervised fine-tuning r

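Deep learning, Model compression/quantization, Generation, Reinforcement learning

Use case: Generation
Difficulty: Easy
Cost: High

Hugging Face | Hugging Face available | 2026-07-08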

Agon: Competitive Cross-Model RL with Implicit Rival Grading of Reasoning

Reinforcement learning from verifiable rewards (e.g. GRPO) is the engine behind today's reasoning models, yet

Computer vision, Segmentation, Text, Reinforcement learning

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

Hugging Face | Hugging Face available | 2026-07-07

Behavioral Privacy Leakage in Agentic Negotiation: Formalizing and Mitigating Inference Attacks via Randomized Policies

Autonomous negotiation agents are increasingly deployed in high-stakes settings such as insurance and procurem

Sensor/time series, Machine learning, Time series

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: Medium

GitHub | GitHub available | 2026-07-07

DATAGEN — DATAGEN: AI-driven multi-agent research assistant automating hypothesis generation, data analysis, and report writing.

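An AI-driven multi-agent research assistant that automates hypothesis generation, data analysis, and report writing.

Use case: AI research assistant
Difficulty: Easy
Cost: High

GitHub | GitHub available | 2026-06-30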

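CV — ✅ (Completed) Extremely comprehensive deep learning notes [土堆 Pytorch] [李沐 动手学深度学习 (Dive into Deep Learning)] [吴恩达 (Andrew Ng) Deep Learning] [大飞 大模型 Agent (large-model agents)]

A collection of deep learning notes. It covers 土堆's PyTorch material, 李沐's Dive into Deep Learning, 吴恩达's deep learning material, and 大飞's material on large-model agents.

Use case: Deep learning notes
Difficulty: Easy
Cost: High

arXiv | GitHub available | 2026-06-28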

When LLMs Develop Languages: Symbolic Communication for Efficient Multi-Agent Reasoning

Chain-of-Thought (CoT) improves large language models (LLMs) on difficult reasoning tasks, but it often incurs

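MI-oriented, Deep learning, Model compression/quantization, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Hugging Face | Hugging Face available | 2026-05-07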

Masked Diffusion Language Models are Strong and Steerable Text-Based World Models for Agentic RL

Recent growth in reinforcement learning (RL) has surfaced a need for diverse, specialized training environment

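Natural language processing, Large language models, Text, Reinforcement learning

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High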