MLinfo | Machine Learning and AI Paper Roundup

prompts.chat — f.k.a. Awesome ChatGPT Prompts. Share, discover, and collect prompts from the community. Free and open source — self-host for your organization with complete privacy.

prompts.chat is a place to discover and collect ChatGPT prompts shared by the community. It is free and open source.

Use: Sharing prompts for ChatGPT
Difficulty: Easy
Cost: High

ray — Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Ray is an AI compute engine made up of a core distributed runtime and a set of AI libraries, supporting scalable AI compute.

Use: AI compute
Difficulty: Easy
Cost: High

rig — ⚙️🦀 Build modular and scalable LLM Applications in Rust

A library for building modular LLM applications in Rust.

Use: Building modular LLM applications
Difficulty: Easy
Cost: High

tiny-llm — A course of learning LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.

A tutorial for systems engineers on building an LLM inference service on Apple Silicon.

Use: Tutorial on LLM inference serving
Difficulty: Easy
Cost: High

awesome-MLSecOps — A curated list of MLSecOps tools and resources for securing machine learning and AI systems - adversarial ML defense, LLM security, AI red teaming, model scanning, supply-chain protection, and MLOps pipeline security.

A curated list of MLSecOps tools and resources for securing machine learning and AI systems.

Use: Securing ML and AI systems
Difficulty: Easy
Cost: High

Medical_Image_Analysis — Foundation models based medical image analysis

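Medical image analysis is a research field that extracts information from image data to support diagnosis and treatment. This work proposes a new approach to medical image analysis using foundation models.

Tags: NLP, LLM, generation, image, text

Use: Medical image analysis
Difficulty: Easy
Cost: High

Tags: NLP, LLM, text, audio, multimodal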

screenpipe — YC (S26) | Record your screen 24/7 and plug into your agents. Local, private, secure. Connect to OpenClaw, Hermes agent and 100+ apps

A tool for capturing user activity and building automated agents on top of it.

Use: Building automated agents
Difficulty: Easy
Cost: High

unsloth — Unsloth is a local UI for training and running Gemma 4, Qwen3.6, DeepSeek, Kimi, GLM and other models.

Unsloth Studio is a web UI that supports training and running open models. It is used to test and train open models such as Gemma 4 and Qwen3.6.

Tags: NLP, LLM, text, audio

Use: Training and running open models
Difficulty: Easy
Cost: High

machine-learning-for-trading — Code for Machine Learning for Trading, 3rd edition — from data sourcing to live execution.

Code accompanying the third edition of Machine Learning for Trading, covering the workflow from data sourcing to live execution.

Tags: reinforcement learning

Use: Machine learning for trading
Difficulty: Easy
Cost: High

Tags: NLP, LLM, text, multimodal

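ai-agent-book — Main open-source repository for the book 《深入理解 AI Agent：设计原理与工程实践》 (Understanding AI Agents in Depth: Design Principles and Engineering Practice) by 李博杰 (Li Bojie): the full text, a compiled PDF, and companion code for each chapter.

The open-source repository for a book on AI agent design and engineering, bundling the manuscript, a compiled PDF, and per-chapter code.

Use: Learning AI agent design and engineering
Difficulty: Easy
Cost: High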

stable-baselines3 — PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

A PyTorch library of reliable implementations of reinforcement learning algorithms.

Tags: reinforcement learning

Use: Reinforcement learning algorithm implementations
Difficulty: Easy
Cost: High

ART — Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen3.6, GPT-OSS, Llama, and more!

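ART (Agent Reinforcement Trainer) uses GRPO to train multi-step agents with reinforcement learning on real-world tasks.

Tags: NLP, LLM, reinforcement learning

Use: Reinforcement learning trainer for multi-step agents
Difficulty: Easy
Cost: High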

Mooncake — Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

Describes Mooncake, a serving platform used to deliver LLMs. Mooncake serves Kimi, a leading LLM service.

Use: Serving platform for LLMs
Difficulty: Easy
Cost: High

rllm — Democratizing Reinforcement Learning for LLMs

A project that aims to make reinforcement learning for LLMs widely accessible.

Tags: NLP, LLM, reinforcement learning

Use: Reinforcement learning for LLMs
Difficulty: Easy
Cost: High

AReaL — The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.

A reinforcement learning bridge for LLM-based agent applications, designed to be simple and flexible.

Use: Reinforcement learning for LLM-based agents
Difficulty: Easy
Cost: High

mlflow — The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while controlling costs and managing access to models and data.

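MLflow is an open-source AI engineering platform for debugging, evaluating, monitoring, and optimizing AI applications built on agents, LLMs, and ML models.

Tags: quality prediction/anomaly detection, NLP, LLM

Use: AI engineering platform
Difficulty: Easy
Cost: High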

skypilot — The AI Compute Platform for frontier teams. SkyPilot turns fragmented AI compute into one AI supercomputer, so frontier AI teams build custom intelligence faster.

SkyPilot is an AI compute platform that unifies fragmented AI compute into a single pool.

Use: Managing AI compute
Difficulty: Easy
Cost: High

metaflow — Build, Manage and Deploy AI/ML Systems

Metaflow is a platform for building, managing, and deploying AI/ML systems.

Use: Building, managing, and deploying AI/ML systems
Difficulty: Easy
Cost: High

flyte — Dynamic, resilient AI orchestration. Coordinate data, models, and compute as you build AI workflows.

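Flyte is a dynamic, resilient AI orchestration platform that coordinates data, models, and compute when building AI workflows.

Use: Orchestrating and running AI workflows
Difficulty: Easy
Cost: High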

lance — Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, and PyTorch with more integrations coming.

An open lakehouse format for multimodal AI. Data can be converted from Parquet in two lines of code for 100x faster random access, and the format also supports vector indexing and data versioning.

Use: Open lakehouse format
Difficulty: Easy
Cost: High

kserve — Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes

KServe is a standardized platform for distributed generative and predictive AI inference, built for scalable, multi-framework deployment on Kubernetes.

Use: AI inference serving on Kubernetes
Difficulty: Easy
Cost: High

zenml — ZenML 🙏: One AI Platform from Pipelines to Agents. https://zenml.io.

ZenML is a single AI platform that spans pipelines through agents.

Use: AI platform
Difficulty: Easy
Cost: High

runanywhere-sdks — Production ready toolkit to run AI locally

A production-ready toolkit for running AI locally.

Use: Running AI locally
Difficulty: Easy
Cost: High

verl-omni — Multimodal RL training framework for diffusion & omni models

A multimodal reinforcement learning training framework for diffusion and omni models.

Use: Multimodal reinforcement learning training
Difficulty: Easy
Cost: High

vllm — A high-throughput and memory-efficient inference and serving engine for LLMs

vLLM is a high-throughput, memory-efficient inference and serving engine for LLMs.

Tags: deep learning, Transformer

Use: LLM inference and serving
Difficulty: Easy
Cost: High

haystack — Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and agent workflows with explicit control over retrieval, routing, memory, and generation. Built for scalable agents, RAG, multimodal applications, semantic search, and conversational systems.

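An open-source AI orchestration framework for designing the pipelines and agent workflows needed to build LLM applications.

Tags: deep learning, Transformer, generation, summarization, text

Use: Building LLM applications
Difficulty: Easy
Cost: High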

onnx — Open standard for machine learning interoperability

ONNX is an open standard for interoperability between machine learning models and tools.

Tags: MLOps, model deployment

Use: Machine learning model interoperability
Difficulty: Easy
Cost: High

LlamaFactory — Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

A library that simplifies fine-tuning of LLMs and VLMs.

Tags: deep learning, Transformer

Use: Fine-tuning LLMs
Difficulty: Easy
Cost: High

RAG_Techniques — This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. Each technique has a detailed notebook tutorial.

A collection of advanced techniques for Retrieval-Augmented Generation (RAG) systems, each with a detailed notebook tutorial.

Use: Learning RAG techniques
Difficulty: Easy
Cost: High

botpress — The open-source hub to build & deploy GPT/LLM Agents ⚡️

An open-source tool for building GPT/LLM agents.

Use: Building GPT/LLM agents
Difficulty: Easy
Cost: High

Agentic coding without the cloud: evaluating open-weight large language models on longitudinal data preparation tasks

Large language models (LLMs) and agents are now widely used tools in code development, with data typically sen

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

pAI-Econ-claude: A Gated Human-in-the-Loop Multi-Agent Architecture for AI-Assisted Economic Theory Development

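This work develops a system that uses large language models to support research in economics, helping scholars automate the development of theoretical models.

Use: Research support system for economics
Difficulty: Hard
Cost: High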

REFACT: Adaptive Fact Restatement for Compact and Faithful Chain-of-Thought Reasoning

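Points out that language models doing long-form reasoning can produce reasoning that drifts from the provided context and, to integrate context and reasoning more closely, proposes REFACT (REstating Facts in Adapti

Use: Improving chain-of-thought (CoT) reasoning
Difficulty: Hard
Cost: High

Tags: deep learning, Transformer, image, text, multimodal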

MVEI & EmObserver: Empowering MLLM-Oriented Visual Emotional Intelligence via Emotion Statement Judgement

Emotion recognition is essential for advancing modern AGI, but large-scale

Use: Emotion recognition
Difficulty: Hard
Cost: High

huggingface | Hugging Face available | 2026-07-23

K12-KGraph: A Curriculum-Aligned Knowledge Graph for Benchmarking and Training Educational LLMs

Large language models are increasingly used in K-12 education, but existing benchmarks mainly test exam questi

Tags: NLP, LLM, QA, image, text

Use: QA
Difficulty: Easy
Cost: High

minimind — 🧠 Train a 64M-parameter LLM from scratch in just 2h!

A project for training a small 64M-parameter LLM entirely from scratch in about two hours.

Use: Training a small LLM from scratch
Difficulty: Easy
Cost: High

nestia — NestJS Helper + AI Chatbot Development

A NestJS-based tool for developing AI chatbots.

Use: Building AI chatbots
Difficulty: Easy
Cost: High

AgentsMeetRL — Awesome List for Agentic RL

An awesome list of resources on agentic RL.

Use: Agentic RL
Difficulty: Easy
Cost: High

awesome-llm-unlearning — A resource repository for machine unlearning in large language models

A repository that collects resources on machine unlearning in large language models.

Use: Machine unlearning for large language models
Difficulty: Easy
Cost: High

FinGPT — FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.

FinGPT provides open-source financial large language models, with trained models released on HuggingFace.

Use: Financial LLMs
Difficulty: Easy
Cost: High

xtuner — A Next-Generation Training Engine Built for Ultra-Large MoE Models

xtuner is a training engine for fast training of ultra-large MoE models.

Tags: NLP, LLM, generation, multimodal

Use: Fast training of MoE models
Difficulty: Easy
Cost: High

giskard-oss — 🐢 Open-Source Evaluation & Testing library for LLM Agents

giskard-oss is an open-source library for evaluating and testing LLM agents.

Use: Evaluating and testing LLM agents
Difficulty: Easy
Cost: High

remove-ai-watermarks — AI watermark remover. CLI and Python library to strip visible and invisible AI watermarks (Gemini / Nano Banana sparkle, SynthID) and provenance metadata (C2PA, EXIF, IPTC) from images.

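A CLI and Python library that strips visible and invisible AI watermarks and provenance metadata from images.

Tags: NLP, LLM, generation, image

Use: Removing AI watermarks from images
Difficulty: Easy
Cost: High

Tags: tabular-oriented, NLP, LLM, image, text, tabular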

unstructured — Convert documents to structured data effortlessly. Unstructured is open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise grade Platform product for production grade workflows, partitioning, enrichments, chunking and embedding.

An open-source ETL solution for converting documents into structured data.

Use: Structuring documents
Difficulty: Easy
Cost: High

txtai — 💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows

A framework for working with LLMs that covers semantic search, LLM orchestration, and related workflows.

Tags: deep learning, Transformer, generation, text

Use: Semantic search
Difficulty: Easy
Cost: High

GaugeQuant: Online Learning of Quantization-Optimal Bases from LLM Symmetries

Transformers are known to have internal continuous symmetries that leave outputs invariant, while modifying qu

Tags: deep learning, Transformer, text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Tags: tabular-oriented, NLP, LLM, text, tabular

Auto-Fill: Learning to Predict Missing Values Accurately with Specialist Language Models

Predicting missing cell values in tabular data is a fundamental problem in data cleaning. While state-of-the-a

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

PRO-LONG: Programmatic Memory Enables Long-Horizon Reasoning

Long-horizon tasks require sustained perception, reasoning, and exploration, and are a persistent challenge fo

Tags: deep learning, compression/quantization, text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Rushes: A Human Preference Dataset for Pluralistic Alignment

We introduce Rushes, a dataset and benchmark for studying revealed human engagement preferences in interactive

Use: Generation
Difficulty: Hard
Cost: High

LKValues: Aligning Large Language Models with Sri Lankan Societal Values

Value alignment of Large Language Models (LLMs) has been shown to be culturally biased toward Western norms. T

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

D2VBench: Benchmarking Large Language Models with Value Dilemmas in Daily Scenarios

With the wide application of large language models (LLMs) in real-world scenarios, the value implication of th

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Tags: NLP, LLM, image, text, multimodal

Development of an automated, reliable, and clinically meaningful artificial intelligence (AI) tool for diagnosing cardiac disease from conventional cardiovascular magnetic resonance (CMR) images

Aims: Cardiovascular magnetic resonance (CMR) imaging enables non-invasive assessment of myocardial structure,

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

huggingface | Hugging Face available | 2026-07-22

NexForge: Scaling Agent Capabilities through Requirement-Driven Task Synthesis for LLMs

Scaling executable agent training data for LLM post-training is bottlenecked by substrate-bound methods that t

Use: Generation
Difficulty: Easy
Cost: High

opencompass — OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc) over 100+ datasets.

A platform for evaluating LLMs that supports a wide range of models and datasets.

Use: LLM evaluation
Difficulty: Easy
Cost: High

atomic-agents — Building AI agents, atomically

A library for assembling AI agents from small building blocks.

Use: Building AI agents
Difficulty: Easy
Cost: High

picollm — On-device LLM Inference Powered by X-Bit Quantization

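On-device LLM inference using X-bit quantization.

Tags: deep learning, compression/quantization, generation

Use: On-device LLM inference with quantization
Difficulty: Easy
Cost: High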

Finance-LLMs — Comprehensive Compilation of Real-World LLM & AI Agent Use Cases in Financial Services

A compilation of real-world LLM and AI agent use cases in financial services.

Use: LLM use cases in financial services
Difficulty: Easy
Cost: High

Total Variation Distance Estimation in Autoregressive Models

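A study of total variation distance estimation in autoregressive models, which helps evaluate how accurately LLMs can be identified. It proposes three access models and different estimation methods.

Tags: deep learning, compression/quantization

Use: Estimating TV distance to evaluate LLM identification accuracy
Difficulty: Hard
Cost: High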

Knowledge-Centric Self-Improvement

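A study of knowledge-centric self-improvement that proposes a way to make self-improvement more effective by focusing on knowledge.

Tags: deep learning, compression/quantization

Use: Knowledge-centric self-improvement
Difficulty: Hard
Cost: High

Tags: tabular-oriented, NLP, LLM, text, tabular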

Prompt Design at Scale: How Format, Instruction Count, and Context Length Shape Instruction Adherence and Hallucination in Large Language Models

Practitioners make three prompt-design decisions with almost no controlled evidence behind them: how to format

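Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Tags: explainable, quality prediction/anomaly detection, NLP, LLM, generation, text, reinforcement learning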

Beyond Score Prediction: LLM-Based Essay Scoring and Feedback Generation via Reinforcement Learning with Rubric Rewards

Large language models (LLMs) have been widely applied to automated essay scoring (AES) and automated feedback

Use: Generation
Difficulty: Hard
Cost: High

CASE: Causal Alignment and Structural Enforcement for Improving Chain-of-Thought Faithfulness

Chain-of-thought (CoT) reasoning is widely used to improve both the performance and interpretability of large

Tags: explainable, NLP, LLM, generation, text

Use: Generation
Difficulty: Hard
Cost: High

BaseRT: Advancing Best-in-Class LLM Inference with Apple M5 Neural Accelerators

Apple's M5 generation introduces a redesigned GPU architecture in which every core carries a dedicated Neural

Use: Generation
Difficulty: Hard
Cost: High

Tags: MI-oriented, quality prediction/anomaly detection, NLP, LLM, image, audio, video

OmniReasoner: Thinking with Long Audio-Video via Native Tool Use

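Proposes OmniReasoner, a method that combines the original data with a zoom-in tool to improve how omni-modal LLMs reason over long audio-video inputs.

Use: Improving reasoning over long audio-video inputs
Difficulty: Hard
Cost: High

github | GitHub available | 2026-07-21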

agent-starter-pack — Ship AI Agents to Google Cloud in minutes, not months. Production-ready templates with built-in CI/CD, evaluation, and observability.

Deploys AI agents to Google Cloud using production-ready templates with built-in CI/CD, evaluation, and observability.

Use: Deploying AI agents to Google Cloud
Difficulty: Easy
Cost: High

github | GitHub available | 2026-07-21

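BettaFish — 微舆: a multi-agent public opinion analysis assistant that anyone can use. It breaks through information cocoons, restores the true picture of public opinion, predicts future trends, and supports decision-making. Implemented from scratch, with no dependency on any framework.

微舆 is a multi-agent assistant for public opinion analysis that anyone can use. It aims to break through information cocoons, recover the true state of public opinion, predict where it is heading, and support decision-making.

Use: Public opinion analysis
Difficulty: Easy
Cost: High

huggingface | GitHub available | Hugging Face available | 2026-07-20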

SciForma: Structure-Faithful Generation of Scientific Diagrams

Structural fidelity is essential to scientific methodology diagrams. To communicate research logic, these diag

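Tags: quality prediction/anomaly detection, NLP, LLM, generation, image, text

Use: Generation
Difficulty: Easy
Cost: High

Tags: explainable, quality prediction/anomaly detection, NLP, LLM, video, multimodal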

EduPanel: A Three-Agent LLM Judge for Teaching Videos -- Reliability, Complementarity, and Human Trust Calibration

Teaching videos are becoming a major medium for education, creating a growing need for scalable evaluation of

Use: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

ConsiSpace: Learning Geometric Consistency Matters for Video Spatial Reasoning

Video spatial reasoning is essential for navigation-oriented perception and long-video question answering, whe

Tags: deep learning, compression/quantization, QA, text, video

Use: QA
Difficulty: Easy
Cost: High

HOMIE: Human-object Centric Video Personalization via Multimodal Intelligent Enchancement

Human-object centric video personalization (HOCVP) is a core task within subject-driven video generation. Howe

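Tags: deep learning, Transformer, generation, image, text

Use: Generation
Difficulty: Easy
Cost: High

Tags: quality prediction/anomaly detection, NLP, LLM, detection, generation, segmentation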

FlowMimic: Mask-free Visual Editing and Generation with Pixel-pair Warped Flow Field for Online Video Editing Data Generation and Modality Mimicry

In line with the prevailing direction of vision research, we explore the integration of both generation and ed

Use: Detection
Difficulty: Easy
Cost: High

FlashRT: Agent Harness for Guiding Agents to Deploy Real-Time Multimodal Applications

Real-time multimodal applications, including voice agents and interactive video generation, compose heterogene

Tags: deep learning, compression/quantization, generation, text, audio

Use: Generation
Difficulty: Easy
Cost: High

Coercion and Deception in AI-to-AI Management: An Agentic Benchmark of Unprompted Escalation

Multi-agent systems routinely place one AI agent in authority over another. When a subordinate refuses a task,

Tags: NLP, LLM, classification, text

Use: Classification
Difficulty: Easy
Cost: High

Scrapegraph-ai — Python scraper based on AI

An AI-based web scraping tool.

Use: Web scraping driven by natural language
Difficulty: Easy
Cost: High

ludwig — Low-code framework for building custom LLMs, neural networks, and other AI models

Ludwig is a low-code framework for customizing and building LLMs. It makes it easier for users to build and train custom LLMs.

Use: Low-code framework for building custom LLMs
Difficulty: Easy
Cost: High

OpenLLM — Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI compatible API endpoint in the cloud.

Runs open-source LLMs as OpenAI-compatible API endpoints in the cloud.

Use: Cloud API for LLMs
Difficulty: Easy
Cost: High

BentoML — The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

A library for serving AI models and applications.

Tags: NLP, LLM, generation, multimodal

Use: Model serving
Difficulty: Easy
Cost: High

Open-dLLM — Open diffusion language model for code generation — releasing pretraining, evaluation, inference, and checkpoints.

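Open-dLLM is an open diffusion language model for code generation, released together with its pretraining, evaluation, inference, and checkpoints.

Use: Code generation
Difficulty: Easy
Cost: High

Tags: explainable, quality prediction/anomaly detection, NLP, LLM, generation, text

arxiv | GitHub available | 2026-07-19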

CoEvoP&R: Co-Evolving Placement Objectives with Routing Feedback via Large Language Models

Analytical placers rely on differentiable objective functions to guide placement, typically combining intermed

Use: Generation
Difficulty: Hard
Cost: High

huggingface | Hugging Face available | 2026-07-19

TimeLens2: Generalist Video Temporal Grounding with Multimodal LLMs

Video multimodal large language models (MLLMs) can describe what happens in a video, but rarely identify when

Tags: NLP, LLM, detection, text, video

Use: Detection
Difficulty: Easy
Cost: High

huggingface | Hugging Face available | 2026-07-19

EvolvingWorld: An Open-Schema Framework for Co-Evolving Role-Play Agents and World Model in Interactive Literary World

This paper introduces EvolvingWorld, a framework and benchmark for character and world co-evolution in interac

Use: Generation
Difficulty: Easy
Cost: High

huggingface | GitHub available | Hugging Face available | 2026-07-19

Distilled Reinforcement Learning for LLM Post-training

Large language model (LLM) post-training is essential for improving reasoning, adaptation, and alignment. Exis

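Tags: explainable, quality prediction/anomaly detection, deep learning, compression/quantization, text, reinforcement learning

Use: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

github | GitHub available | 2026-07-19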

testtimescaling.github.io — "what, how, where, and how well? a survey on test-time scaling in large language models" repository

The repository for a survey on test-time scaling in large language models.

Use: Test-time scaling for large language models
Difficulty: Easy
Cost: High

DataFlow-Harness: A Grounded Code-Agent Platform for Constructing Editable LLM Data Pipelines

Large language models (LLMs) are increasingly used to automate data-processing workflows, yet coding agents ty

Tags: NLP, LLM, generation, image, text

Use: Generation
Difficulty: Easy
Cost: High

Group Entropy-Controlled Policy Optimization

Entropy control has become an effective tool in reinforcement learning (RL) of large language models (LLMs), h

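Tags: deep learning, compression/quantization, generation, text, reinforcement learning

Use: Generation
Difficulty: Easy
Cost: High

Tags: quality prediction/anomaly detection, NLP, LLM, generation, text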

Environment-free Synthetic Data Generation for API-Calling Agents

Training API-calling large language model (LLM) agents demands massive amounts of high-quality trajectories. H

Use: Generation
Difficulty: Easy
Cost: High

Tags: quality prediction/anomaly detection, NLP, LLM, classification, QA, image

Can Multimodal Large Language Models Understand OCT?

Optical coherence tomography (OCT) imaging is essential for the diagnosis and treatment of retinal diseases. A

Use: Classification
Difficulty: Easy
Cost: High

Tags: NLP, LLM, image, text, multimodal

An Exam for Active Observers

Human vision is a closed loop: gaze is continuously redirected by intermediate hypotheses rather than a single

Use: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

Apple-π: Benchmarking Thinking with Video Towards Law-Grounded Physical Intelligence

Modern video generation models are increasingly hailed as emerging world models with an internalized grasp of

Tags: NLP, LLM, generation, video

Use: Generation
Difficulty: Easy
Cost: High

RecGPT-V3 Technical Report

Large language models (LLMs) are transforming recommender systems from matching co-occurrence patterns in hist

Tags: deep learning, compression/quantization, text

Use: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

Understanding Reasoning from Pretraining to Post-Training

Reinforcement learning (RL) has become central to improving large language models (LLMs) on complex reasoning

Tags: NLP, LLM, text, reinforcement learning

Use: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

Tags: MI-oriented, NLP, LLM, generation, image, text

S1-Omni: A Unified Multimodal Reasoning Model for Scientific Understanding, Prediction, and Generation

We present S1-Omni, a unified multimodal reasoning model for scientific understanding, prediction, and generat

Use: Generation
Difficulty: Easy
Cost: High

DSWorld: A Data Science World Model for Efficient Autonomous Agents

Despite strong capabilities in data understanding and decision-making, autonomous data science agents still he

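Tags: deep learning, compression/quantization, reinforcement learning

Use: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

Tags: explainable, NLP, LLM, image, text, audio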

Audio-Visual Flamingo: Open Audio-Visual Intelligence for Long and Complex Videos

We present Audio-Visual Flamingo (AV-Flamingo), a fully open state-of-the-art audio-visual large language mode

Use: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

Tags: NLP, LLM, generation, text, multimodal

generative-ai — Comprehensive resources on Generative AI, including a detailed roadmap, projects, use cases, interview preparation, and coding preparation.

A collection of resources on generative AI.

Use: Generative AI
Difficulty: Easy
Cost: High

LLM-API-Key-Proxy — Universal LLM Gateway: One API, every LLM. OpenAI/Anthropic-compatible endpoints with multi-provider translation and intelligent load-balancing.

A library that serves as a gateway to many different LLMs.

Use: LLM gateway
Difficulty: Easy
Cost: High

clearml — ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution

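ClearML is an MLOps/LLMOps solution that combines experiment management, data management, pipelines, orchestration, scheduling, and serving.

Use: Managing AI workloads with MLOps/LLMOps tooling
Difficulty: Easy
Cost: High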

Awesome-Model-Merging-Methods-Theories-Applications — Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities. ACM Computing Surveys, 2026.

A survey resource on model merging for LLMs, giving an overview of methods, theories, and applications.

Use: Model merging for LLMs
Difficulty: Easy
Cost: High

huggingface | Hugging Face available | 2026-07-16

Multi-Turn On-Policy Distillation with Prefix Replay

We study on-policy distillation (OPD) for agentic tasks, where an LLM agent interacts with an environment over

Tags: deep learning, compression/quantization

Use: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

huggingface | Hugging Face available | 2026-07-16

xHC: Expanded Hyper-Connections

Hyper-Connections (HC) expand the residual stream of Transformers into N parallel streams, providing a form of

Tags: deep learning, Transformer, generation

Use: Generation
Difficulty: Easy
Cost: High

huggingface | GitHub available | Hugging Face available | 2026-07-16

On-Policy Delta Distillation

On-policy distillation is an alternative post-training method in reinforcement learning that alleviates the co

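Tags: deep learning, compression/quantization, reinforcement learning

Use: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

github | GitHub available | 2026-07-16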

agent-lightning — The absolute trainer to light up AI agents.

A tool for training AI agents efficiently. With Agent Lightning you can set up a trainer and train a model on your data.

Use: Setting up a trainer for AI agents with little effort
Difficulty: Easy
Cost: High

huggingface | Hugging Face available | 2026-07-15

Cura 1T: Specialized Model for Agentic Healthcare

Healthcare spans high-stakes communication, expert reasoning, and workflow execution, yet specialized LLMs tha

Tags: NLP, LLM, image, text

Use: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

huggingface | Hugging Face available | 2026-07-15

Partially Correlated Verifier Cascades in LLM Harnesses: Concave Log-Odds, Polynomial Reliability, and Blind-Spot Ceilings

Serial verification gates are a core reliability primitive in LLM harnesses: a candidate answer is returned on

Use: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

github | GitHub available | 2026-07-15

ai-engineering-hub — In-depth tutorials on LLMs, RAGs and real-world AI agent applications.

This repository contains in-depth tutorials for AI engineering, covering LLMs, RAG, and real-world AI agent applications.

Use: AI engineering tutorials
Difficulty: Easy
Cost: High

huggingface | Hugging Face available | 2026-07-14

From Human-Centric to Agentic Code Review: The Impact of Different Generations of Generative AI Technology on Review Quality

Code review helps maintain software quality before code integration, but it also imposes a substantial workloa

Tags: quality prediction/anomaly detection, deep learning, Transformer, generation, text

Use: Generation
Difficulty: Easy
Cost: High

Awesome-Embodied-Robotics-and-Agent — This is a curated list of "Embodied AI or robot with Large Language Models" research. Watch this repository for the latest updates! 🔥

A curated list of research that combines embodied AI or robots with large language models.

Use: Embodied AI and robotics research
Difficulty: Easy
Cost: High

OpenRLHF — An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL)

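OpenRLHF is a reinforcement learning framework built on Ray. It supports a range of reinforcement learning algorithms, including PPO, DAPO, and REINFORCE++.

Tags: deep learning, Transformer, image

Use: Reinforcement learning framework
Difficulty: Easy
Cost: High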

agents-towards-production — End-to-end, code-first tutorials for building production-grade GenAI agents. From prototype to enterprise deployment.

End-to-end, code-first tutorials for developing and deploying AI agents.

Use: Developing and deploying AI agents
Difficulty: Easy
Cost: High

memvid — Memory layer for AI Agents. Replace complex RAG pipelines with a serverless, single-file memory layer. Give your agents instant retrieval and long-term memory.

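MemVid is a serverless, single-file memory layer that gives AI agents instant retrieval and long-term memory.

Tags: NLP, LLM, generation, text, video

Use: Managing memory for AI agents
Difficulty: Easy
Cost: High

huggingface | GitHub available | Hugging Face available | 2026-07-13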

RAGU: A Multi-Step GraphRAG Engine with a Compact Domain-Adapted LLM

Graph retrieval-augmented generation (GraphRAG) enhances large language models with structured knowledge, yet

Tags: NLP, LLM, detection, generation, summarization

Use: Detection
Difficulty: Easy
Cost: High

huggingface | Hugging Face available | 2026-07-13

Qwen-Music Technical Report

In this report, we introduce Qwen-Music, a powerful music generation model capable of producing highly musical

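Tags: sensor/time series, quality prediction/anomaly detection, deep learning, Transformer, generation, text, audio

Use: Generation
Difficulty: Easy
Cost: High

github | GitHub available | 2026-07-13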

Awesome-Mixture-of-Experts — Awesome Mixture of Experts (MoE): A Curated List of Mixture of Experts (MoE) and Mixture of Multimodal Experts (MoME)

Use: Implementation and validation platform
Difficulty: Easy
Cost: High

huggingface | Hugging Face available | 2026-07-12

Predictive Divergence Masks for LLM RL

Reinforcement learning for large language models (LLMs) typically relies on trust-region masks to stabilize of

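Tags: deep learning, compression/quantization, text, reinforcement learning

Use: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

huggingface | Hugging Face available | 2026-07-11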

Beyond Euclidean Clipping: Overcoming Exploration Collapse in LLM RL via Riemannian Isometric Policy Optimization

Reinforcement learning (RL) has become a dominant paradigm for enhancing LLMs' reasoning capabilities. However

Tags: NLP, LLM, reinforcement learning

Use: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

huggingface | Hugging Face available | 2026-07-11

GigaChat Audio: Time-aware Large Audio Language Model

Temporal grounding in long recordings remains challenging for audio-conditioned LLMs. We present a time-aware

Tags: NLP, LLM, text, audio

Use: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

github | GitHub available | 2026-07-11

LLMs-from-scratch — Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

A step-by-step implementation of a ChatGPT-like LLM in PyTorch, built from scratch.

Tags: deep learning, Transformer, generation

Use: Learning to implement an LLM from scratch
Difficulty: Easy
Cost: High

github | GitHub available | 2026-07-10

multimind-sdk — Your SDK solves all of this. One interface. Unified logic. Local + hosted models. Fine-tuning. Agent tools. Enterprise-ready. Hybrid RAG. Star 🌟 if you like it!

An SDK that offers one interface over local and hosted models, with fine-tuning, agent tools, and hybrid RAG.

Use: Unified SDK for LLM applications
Difficulty: Easy
Cost: High

github | GitHub available | 2026-07-09

Awesome-Item-ID-Gen-RecSys — Updating curated list of research advancements on item identification and item tokenization in generative recommender systems. The survey is titled "A Survey of Item Identifiers in Generative Recommendation: Construction, Alignment, and Generation"

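A curated list of research on item identifiers in generative recommender systems, covering how item identifiers are constructed, aligned, and generated.

Use: Item identifiers in generative recommendation
Difficulty: Easy
Cost: High

huggingface | Hugging Face available | 2026-07-07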

UI2App: Benchmarking Visual Interaction Inference in Executable Web Application Generation

Large language models (LLMs) have demonstrated growing competence in web page generation. However, existing te

Tags: deep learning, Transformer, generation, image, text

Use: Generation
Difficulty: Easy
Cost: High

enchanted — Enchanted is iOS and macOS app for chatting with private self hosted language models such as Llama2, Mistral or Vicuna using Ollama.

Enchanted is an iOS and macOS app for chatting with privately self-hosted language models such as Llama2, Mistral, or Vicuna.

Use: iOS and macOS app for chatting with self-hosted language models
Difficulty: Easy
Cost: High

DATAGEN — DATAGEN: AI-driven multi-agent research assistant automating hypothesis generation, data analysis, and report writing.

An AI-driven multi-agent research assistant that automates hypothesis generation, data analysis, and report writing.

Use: AI research assistant
Difficulty: Easy
Cost: High

home-llm — A Home Assistant integration & Model to control your smart home using a Local LLM

home-llm is a Home Assistant integration and model for controlling a smart home with a local LLM.

Use: Smart home control
Difficulty: Easy
Cost: High

VLM-R1 — Solve Visual Understanding with Reinforced VLMs

VLM-R1 is a reinforced vision-language model designed to improve visual understanding.

Tags: NLP, LLM, image, multimodal

Use: Visual understanding
Difficulty: Easy
Cost: High

github | GitHub available | 2026-07-05

llm-app — Ready-to-run cloud templates for RAG, AI pipelines, and enterprise search with live data. 🐳Docker-friendly.⚡Always in sync with Sharepoint, Google Drive, S3, Kafka, PostgreSQL, real-time data APIs, and more.

llm-app provides ready-to-run cloud templates for RAG, AI pipelines, and enterprise search. It runs on Docker and stays in sync with sources such as Sharepoint and Google Drive.

Use: Building AI pipelines
Difficulty: Easy
Cost: High

github | GitHub available | 2026-07-02

langextract — A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.

A Python library that uses LLMs to extract structured information from unstructured text.

Tags: NLP, LLM, image, text

Use: Information extraction from text
Difficulty: Easy
Cost: High

github | GitHub available | 2026-07-02

learning — A log of things I'm learning

A log for organizing the ideas and knowledge the author is currently learning.

Use: Learning log
Difficulty: Easy
Cost: High

github | GitHub available | 2026-06-30

telegram-summary-bot — Summarize group chat with AI, LLM && query group chat, FREE to deploy your own, support img, link meta info, reply to, auto fold result, supports Chinese-language search.

telegramSummaryBot summarizes group chats with AI and can be deployed for free.

Use: AI-based group chat summarization
Difficulty: Easy
Cost: High

github | GitHub available | 2026-06-30

mxcp — Model eXecution + Context Protocol: Enterprise-Grade Data-to-AI Infrastructure

MXCP (Model eXecution + Context Protocol) is enterprise-grade infrastructure for connecting data to AI, intended to simplify how data is prepared for AI use.

Use: Data-to-AI infrastructure
Difficulty: Easy
Cost: High

github | GitHub available | 2026-06-30

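CV — ✅ (Completed) Extremely comprehensive deep learning notes: [土堆 Pytorch] [李沐 动手学深度学习 (Dive into Deep Learning)] [吴恩达 (Andrew Ng) Deep Learning] [大飞 LLM Agent]

A set of deep learning notes covering 土堆's PyTorch course, 李沐's Dive into Deep Learning, Andrew Ng's Deep Learning, and 大飞's material on LLM agents.

Use: Deep learning notes
Difficulty: Easy
Cost: High

arxiv | GitHub available | 2026-06-28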

When LLMs Develop Languages: Symbolic Communication for Efficient Multi-Agent Reasoning

Chain-of-Thought (CoT) improves large language models (LLMs) on difficult reasoning tasks, but it often incurs

Tags: MI-oriented, deep learning, compression/quantization, text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

github | GitHub available | 2026-06-28

awesome-japanese-llm — Overview of Japanese LLMs

A curated overview of LLMs for the Japanese language.

Tags: NLP, LLM, generation, multimodal

Use: Surveying Japanese LLMs
Difficulty: Easy
Cost: High

huggingface | Hugging Face available | 2026-05-07

Masked Diffusion Language Models are Strong and Steerable Text-Based World Models for Agentic RL

Recent growth in reinforcement learning (RL) has surfaced a need for diverse, specialized training environment

Tags: NLP, LLM, text, reinforcement learning

Use: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High