MLinfo | Machine Learning and AI Paper Summaries

Reinforcement learning, Policy gradient (PPO / A3C), Classification, Text

paperless-ngx — A community-supported supercharged document management system: scan, index and archive all your documents

paperless-ngx is a community-supported document management system that scans, indexes, and archives documents.

Use case: Document management
Difficulty: Easy
Cost: Low

gradio — Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

A Python library for building and sharing machine learning apps.

Reinforcement learning, Policy gradient (PPO / A3C), Image

Use case: Building machine learning apps
Difficulty: Easy
Cost: Medium

MaaAssistantArknights — 《明日方舟》小助手，全日常一键长草！| A one-click tool for the daily tasks of Arknights, supporting all clients.

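A support tool for the game Arknights (明日方舟). It runs all daily tasks in one click.

Reinforcement learning, Policy gradient (PPO / A3C)

Use case: Game support tool
Difficulty: Easy
Cost: Medium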

machine-learning-for-trading — Code for Machine Learning for Trading, 3rd edition — from data sourcing to live execution.

This repository contains the code for Machine Learning for Trading, 3rd edition, covering the workflow from data sourcing to live execution.

Use case: Machine learning for trading
Difficulty: Easy
Cost: High

stable-baselines3 — PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

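This repository provides reliable implementations of reinforcement learning algorithms in PyTorch. It is the PyTorch version of Stable Baselines.

Use case: Reliable reinforcement learning algorithm implementations
Difficulty: Easy
Cost: High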

ART — Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen3.6, GPT-OSS, Llama, and more!

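ART is a reinforcement learning trainer for multi-step agents. It uses GRPO to train agents on real-world tasks.

Natural language processing, Large language models, Reinforcement learning

Use case: Reinforcement learning training for multi-step agents
Difficulty: Easy
Cost: High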
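
The ART entry above names GRPO. As a rough illustration of the group-relative idea behind it, the sketch below scores each rollout against the other rollouts of the same task. It is not ART's API: the function name `group_relative_advantages`, the reward values, and the choice of the population standard deviation are all assumptions made for this example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and spread of its own group.

    Hypothetical helper: rollouts that beat the group average get a
    positive advantage, the others a negative one.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four rollouts of the same task (1.0 = success).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Because the baseline is the group mean, no separate value network is needed to produce these advantages; when every rollout in a group earns the same reward, all advantages are zero and the group contributes no learning signal.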

PufferLib — Puffing up reinforcement learning

Use case: Reinforcement learning library
Difficulty: Easy
Cost: Medium

rllm — Democratizing Reinforcement Learning for LLMs

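This repository aims to make reinforcement learning for LLMs broadly accessible.

Natural language processing, Large language models, Reinforcement learning

Use case: Reinforcement learning for LLMs
Difficulty: Easy
Cost: High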

GitHub: available (2026-07-23)

qlib — Qlib is an AI-oriented Quant investment platform that aims to use AI tech to empower Quant Research, from exploring ideas to implementing productions. Qlib supports diverse ML modeling paradigms, including supervised learning, market dynamics modeling, and RL, and is now equipped with https://github.com/microsoft/RD-Agent to automate R&D process.

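Qlib is a quantitative investment platform that applies AI techniques to quant research, from exploring ideas to production.

Reinforcement learning, Policy gradient (PPO / A3C), Supervised learning

Use case: Quantitative investment platform
Difficulty: Easy
Cost: Medium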

GitHub: available (2026-07-23)

FinGPT — FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.

This repository provides open-source financial large language models. The trained models are released on HuggingFace.

Natural language processing, Large language models, Text

Use case: Open-source financial LLMs
Difficulty: Easy
Cost: High

GitHub: available (2026-07-23)

ml-agents — The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.

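A toolkit for training machine learning agents in Unity, using games and simulations as training environments.

Computer vision, 3D / point clouds, 3D, Reinforcement learning

Use case: Training machine learning agents in Unity
Difficulty: Easy
Cost: High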

GitHub: available (2026-07-21)

Book-Mathematical-Foundation-of-Reinforcement-Learning — This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."

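Mathematical Foundations of Reinforcement Learning is a book covering the mathematical foundations of reinforcement learning. This repository is its homepage.

Use case: Studying the mathematical foundations of reinforcement learning
Difficulty: Easy
Cost: Medium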

GitHub: available (2026-07-20)

Gymnasium — A standard API for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)

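Gymnasium provides a standard API for single-agent RL environments, together with reference environments and related utilities.

Use case: Standard API and simulated environments for reinforcement learning
Difficulty: Easy
Cost: Medium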
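
To make the "standard API" in the Gymnasium entry above concrete, here is a minimal sketch of the interaction loop. It deliberately does not import Gymnasium: `CoinFlipEnv` and `run_episode` are hypothetical names written for this example, and only the shapes of the return values are taken from Gymnasium's interface, where `reset` returns `(observation, info)` and `step` returns `(observation, reward, terminated, truncated, info)`.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment that mirrors Gymnasium's reset/step shape.

    The observation is a coin (0 or 1); the agent is rewarded for
    echoing the coin it currently sees.
    """

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self._rng = random.Random()
        self._steps = 0
        self._coin = 0

    def reset(self, seed=None):
        # Same return shape as Gymnasium: (observation, info)
        if seed is not None:
            self._rng.seed(seed)
        self._steps = 0
        self._coin = self._rng.randint(0, 1)
        return self._coin, {}

    def step(self, action):
        # Same return shape as Gymnasium:
        # (observation, reward, terminated, truncated, info)
        reward = 1.0 if action == self._coin else 0.0
        self._steps += 1
        self._coin = self._rng.randint(0, 1)
        terminated = False  # this toy task has no terminal state
        truncated = self._steps >= self.max_steps  # time limit reached
        return self._coin, reward, terminated, truncated, {}


def run_episode(env, policy, seed=0):
    """Run one episode and return the summed reward."""
    obs, info = env.reset(seed=seed)
    total = 0.0
    done = False
    while not done:
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total += reward
        done = terminated or truncated
    return total


total_reward = run_episode(CoinFlipEnv(max_steps=5), policy=lambda obs: obs)
print(total_reward)
```

Keeping `terminated` (the task itself ended) separate from `truncated` (an external limit such as a step budget cut the episode short) lets training code treat the two cases differently.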

GitHub: available (2026-07-17)

open_spiel — OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games.

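A collection of environments and algorithms for research in general reinforcement learning and search/planning in games.

Use case: Reinforcement learning and search/planning research in games
Difficulty: Easy
Cost: Medium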

GitHub: available (2026-07-15)

vowpal_wabbit — Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.

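Vowpal Wabbit is a machine learning system built around techniques such as online learning, hashing, allreduce, reductions, and active and interactive learning. It targets a broad range of learning problems.

Reinforcement learning, Text

Use case: Online and interactive machine learning system
Difficulty: Easy
Cost: Medium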
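
The Vowpal Wabbit entry above mentions online learning and hashing. The sketch below shows how the two ideas combine in general: feature names are hashed straight into a fixed-size weight table, and the weights are updated one example at a time. It does not use Vowpal Wabbit itself. The CRC32 hash, the table size, the learning rate, the squared-loss update, and the sample data are all assumptions chosen for illustration.

```python
import zlib

NUM_BITS = 18                 # arbitrary table size for this sketch
NUM_WEIGHTS = 1 << NUM_BITS


def feature_index(name):
    """Map a feature name to a slot in the weight table (hashing trick)."""
    # CRC32 is a stand-in hash for this example.
    return zlib.crc32(name.encode("utf-8")) & (NUM_WEIGHTS - 1)


def predict(weights, features):
    """Linear prediction over binary features."""
    return sum(weights[feature_index(f)] for f in features)


def update(weights, features, label, lr=0.1):
    """One online gradient step on squared loss; returns the pre-update error."""
    error = predict(weights, features) - label
    for f in features:
        weights[feature_index(f)] -= lr * error
    return error


weights = [0.0] * NUM_WEIGHTS

# Made-up stream of (features, label) pairs, processed one at a time.
stream = [
    (["word=good", "word=movie"], 1.0),
    (["word=bad", "word=movie"], -1.0),
] * 50

for features, label in stream:
    update(weights, features, label)

print(predict(weights, ["word=good", "word=movie"]))
print(predict(weights, ["word=bad", "word=movie"]))
```

Hashing removes the need for a feature dictionary, so memory stays fixed no matter how many distinct features appear in the stream; the price is that two features can occasionally collide in the same slot.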

GitHub: available (2026-07-08)

deep-reinforcement-learning — Repo for the Deep Reinforcement Learning Nanodegree program

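A learning repository for the Deep Reinforcement Learning Nanodegree program.

Reinforcement learning, Model-free (DQN / SAC)

Use case: Implementation and experimentation base
Difficulty: Easy
Cost: Medium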