Natural Language Processing

prompts.chat — f.k.a. Awesome ChatGPT Prompts. Share, discover, and collect prompts from the community. Free and open source — self-host for your organization with complete privacy.

prompts.chat is a free, open-source site for discovering and collecting ChatGPT prompts shared by the community.

Use: Sharing prompts for ChatGPT
Difficulty: Easy
Cost: High

ray — Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Ray is an AI compute engine made up of a core distributed runtime and a set of AI libraries, and it supports scalable AI computation.

Use: AI compute
Difficulty: Easy
Cost: High

rig — ⚙️🦀 Build modular and scalable LLM Applications in Rust

A Rust library for building modular LLM applications.

Use: Building modular LLM applications
Difficulty: Easy
Cost: High

tiny-llm — A course of learning LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.

A tutorial course in which systems engineers build an LLM inference serving system on Apple Silicon.

Use: Tutorial on LLM inference serving
Difficulty: Easy
Cost: High

awesome-MLSecOps — A curated list of MLSecOps tools and resources for securing machine learning and AI systems - adversarial ML defense, LLM security, AI red teaming, model scanning, supply-chain protection, and MLOps pipeline security.

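A curated list of tools and resources for securing machine learning and AI systems, covering adversarial ML defense, LLM security, AI red teaming, model scanning, supply-chain protection, and MLOps pipeline security.

Use: Securing machine learning and AI systems
Difficulty: Easy
Cost: High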

Medical_Image_Analysis — Foundation models based medical image analysis

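Medical image analysis is the research field that extracts information from image data to support diagnosis and treatment. This work proposes a new approach to medical image analysis built on foundation models.

Use: Medical image analysis
Difficulty: Easy
Cost: High

Tags: natural language processing, large language models, text, audio, multimodal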

screenpipe — YC (S26) | Record your screen 24/7 and plug into your agents. Local, private, secure. Connect to OpenClaw, Hermes agent and 100+ apps

A tool for capturing user activity and building automated agents on top of it.

Use: Building automated agents
Difficulty: Easy
Cost: High

unsloth — Unsloth is a local UI for training and running Gemma 4, Qwen3.6, DeepSeek, Kimi, GLM and other models.

Use: Training and running open models
Difficulty: Easy
Cost: High

Tags: natural language processing, large language models, text, multimodal

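ai-agent-book — Open-source main repository for the book 《深入理解 AI Agent：设计原理与工程实践》 (Understanding AI Agents in Depth: Design Principles and Engineering Practice) by 李博杰 (Li Bojie): the full text, a compiled PDF, and companion code for each chapter

The repository collects the book's full text, a compiled PDF, and per-chapter companion code.

Use: Studying AI agent design principles and engineering practice
Difficulty: Easy
Cost: High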

wandb — The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.

Weights & Biases is an AI developer platform for training and fine-tuning models and for managing them from experimentation to production.

Use: AI developer platform
Difficulty: Easy
Cost: Medium

ART — Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen3.6, GPT-OSS, Llama, and more!

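ART (Agent Reinforcement Trainer) trains multi-step agents for real-world tasks with reinforcement learning, using GRPO.

Use: Reinforcement learning trainer for multi-step agents
Difficulty: Easy
Cost: High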

Mooncake — Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

Describes Mooncake, a serving platform for LLMs. Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

Use: Serving platform for LLMs
Difficulty: Easy
Cost: High

rllm — Democratizing Reinforcement Learning for LLMs

This repository aims to make reinforcement learning for LLMs broadly accessible.

Use: Reinforcement learning for LLMs
Difficulty: Easy
Cost: High

AReaL — The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.

AReaL is an RL bridge for LLM-based agent applications, designed to be simple and flexible.

Use: Reinforcement learning for LLM-based agent applications
Difficulty: Easy
Cost: High

qdrant — Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/

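Qdrant is a high-performance, massive-scale vector database and vector search engine, also available as a cloud service.

Tags: natural language processing, embedding and retrieval, generation, image

Use: High-performance, massive-scale vector database and vector search
Difficulty: Easy
Cost: Low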

mlflow — The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while controlling costs and managing access to models and data.

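MLflow is an open-source AI engineering platform for agents, LLMs, and ML models. It helps teams debug, evaluate, monitor, and optimize AI applications while controlling costs and managing access to models and data.

Tags: quality prediction/anomaly detection, natural language processing, large language models

Use: AI engineering platform for agents, LLMs, and ML models
Difficulty: Easy
Cost: High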

great_expectations — Always know what to expect from your data.

A framework for declaring and validating expectations about your data.

Use: Knowing what to expect from your data
Difficulty: Easy
Cost: Medium

skypilot — The AI Compute Platform for frontier teams. SkyPilot turns fragmented AI compute into one AI supercomputer, so frontier AI teams build custom intelligence faster.

SkyPilot is an AI compute platform that presents fragmented AI compute as a single pool.

Use: AI compute platform
Difficulty: Easy
Cost: High

metaflow — Build, Manage and Deploy AI/ML Systems

Metaflow is a platform for building, managing, and deploying AI/ML systems.

Use: Building, managing, and deploying AI/ML systems
Difficulty: Easy
Cost: High

flyte — Dynamic, resilient AI orchestration. Coordinate data, models, and compute as you build AI workflows.

Flyte is a dynamic, resilient AI orchestration platform that coordinates data, models, and compute to build AI workflows.

Use: Orchestrating AI workflows
Difficulty: Easy
Cost: High

lance — Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, and PyTorch with more integrations coming..

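An open lakehouse format suited to multimodal AI. Data can be converted from Parquet in two lines of code for 100x faster random access, and the format also supports vector indexing and data versioning.

Use: Open lakehouse format
Difficulty: Easy
Cost: High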

kserve — Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes

KServe is a standardized, distributed inference platform for generative and predictive AI, built for scalable, multi-framework deployment on Kubernetes.

Use: Model inference serving on Kubernetes
Difficulty: Easy
Cost: High

zenml — ZenML 🙏: One AI Platform from Pipelines to Agents. https://zenml.io.

ZenML is a single AI platform that spans pipelines to agents.

Use: AI platform
Difficulty: Easy
Cost: High

runanywhere-sdks — Production ready toolkit to run AI locally

A production-ready toolkit for running AI locally.

Use: Running AI locally
Difficulty: Easy
Cost: High

verl-omni — Multimodal RL training framework for diffusion & omni models

A multimodal RL training framework for diffusion and omni models.

Use: Multimodal RL training
Difficulty: Easy
Cost: High

RAG_Techniques — This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. Each technique has a detailed notebook tutorial.

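This repository showcases advanced techniques for Retrieval-Augmented Generation (RAG) systems, each with a detailed notebook tutorial.

Use: Learning advanced RAG techniques
Difficulty: Easy
Cost: High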

rasa — 💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants

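Rasa is an open-source machine learning framework for automating text- and voice-based conversations. It provides natural language understanding (NLU), dialogue management, and connectors to Slack, Facebook, and other channels.

Tags: natural language processing, text

Use: Building chatbots
Difficulty: Easy
Cost: Medium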

botpress — The open-source hub to build & deploy GPT/LLM Agents ⚡️

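An open-source tool for building GPT/LLM agents.

Use: Building GPT/LLM agents
Difficulty: Easy
Cost: High

Tags: quality prediction/anomaly detection, natural language processing, large language models, image, text, multimodal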

MIRROR: Learning from the Other View for Multi-Modal Reasoning

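Proposes MIRROR (Learning from the Other View), a new approach to multimodal understanding. MIRROR works by providing equivalent views drawn from text, figures, and combinations of the two.

Use: Developing multimodal understanding techniques
Difficulty: Hard
Cost: High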

Windowed-MTP: Removing the Full-Context Draft-KV Tax at Million-Token Context

Speculative decoding accelerates autoregressive generation by having a cheap draft propose tokens that a targe

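Use: Generation
Difficulty: Hard
Cost: High

Tags: sensor/time series, natural language processing, large language models, classification, detection, embedding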

Toward Generalizable Cognitive Impairment Detection with Speech-Based Multimodal Large Language Models

Cognitive impairment (CI) is a growing public health concern. Early and accurate diagnosis is critical for ena

Use: Classification
Difficulty: Hard
Cost: High

Compact Latent Coordination for Autonomous Vehicles at Unsignalized Intersections

Coordinating autonomous vehicles at unsignalized intersections remains a critical challenge for multi-agent re

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

Finite-Sample Coverage Audits for High-Recall Candidate Generation: Certification and Learning-Theoretic Design

An initial high-recall stage in an empirical pipeline decides which items pass to later review, labelling, or

Tags: natural language processing, RAG, generation, text

Use: Generation
Difficulty: Hard
Cost: Low

Error Certificates for KV-Cache Eviction via Randomized Design

Deterministic KV-cache eviction keeps the top-$k$ tokens under an importance score and deletes the rest. We pr

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low
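
The deterministic baseline named in the abstract above (keep the top-k tokens by importance score, delete the rest) can be sketched in a few lines. This is only an illustration of that baseline, not the paper's randomized design; the function names and the example scores are hypothetical.

```python
from typing import List, Sequence


def positions_to_keep(scores: Sequence[float], k: int) -> List[int]:
    """Return the cache positions of the k highest-scoring tokens, in original order."""
    if k <= 0:
        return []
    # Rank positions by descending score; ties go to the earlier position.
    ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return sorted(ranked[:k])


def evict(cache: Sequence[str], scores: Sequence[float], k: int) -> List[str]:
    """Keep only the top-k entries of a (hypothetical) per-token cache."""
    return [cache[i] for i in positions_to_keep(scores, k)]


# Hypothetical importance scores for four cached tokens.
tokens = ["the", "cat", "sat", "down"]
scores = [0.1, 0.9, 0.4, 0.7]
kept = evict(tokens, scores, k=2)
```

Because the rule is deterministic, the same low-scoring tokens are always dropped, which is the behavior the abstract contrasts with a randomized alternative.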

Test-Time Scaling via Error Localization

Scaling inference-time computation has emerged as a reliable method to improve the performance of large langua

Tags: natural language processing, large language models, detection, generation, text

Use: Detection
Difficulty: Hard
Cost: High

Token Budget Saturation and Mechanistic Early Detection of Reasoning Non-Convergence in Chain-of-Thought Models

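A study of reasoning non-convergence in chain-of-thought models. Non-convergence depends on the number of generated tokens, and a model cannot solve the problem unless its reasoning converges. The study looks at when generation should be stopped.

Tags: natural language processing, prompt engineering, detection, generation

Use: Deciding when chain-of-thought reasoning models should stop generating
Difficulty: Hard
Cost: High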

Semantic-Aware Task Clustering for Constructive and Cooperative Multi-Tasking

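Proposes a method for grouping tasks in multi-task learning. The study argues that taking the semantic relationships between tasks into account improves multi-task learning.

Use: Task grouping in multi-task learning
Difficulty: Hard
Cost: High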

Cautious optimism for deep parameterized quantum circuits

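Advances the understanding of generalization in quantum machine learning models. The study investigates the factors that affect how the performance of parameterized quantum circuits scales.

Use: Understanding generalization in quantum machine learning models
Difficulty: Hard
Cost: High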

Emergent Misalignment Recruits a Pre-existing Persona Subspace

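Advances the understanding of misaligned representations in aligned language models. The study analyzes the representations of aligned language models to characterize misalignment, and argues that this analysis can be used to correct it.

Tags: natural language processing, fine-tuning, generation, text

Use: Understanding misaligned representations in aligned language models
Difficulty: Hard
Cost: High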

M$^3$-Gen: Interpretable Multimodal Generation of Gene Expression Profiles Using Clinical and Imaging Data

Integrating heterogeneous biomedical data, including clinical metadata, histopathology images, and molecular p

Tags: explainable, natural language processing, RAG, generation, image, multimodal

Use: Generation
Difficulty: Hard
Cost: High

AI Assistants Overassist

Large language models (LLMs) are increasingly used as tutors and thought partners, helping users reason throug

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Multi-Task Learning for Heterogeneous Prediction from Video Game State with Transfer Learning

Multi-task learning (MTL) is a promising approach for prediction tasks derived from video game state data, as

Tags: natural language processing, fine-tuning, image, text, video

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Filter Learning for Subgraphs: Algebras and Performance Risk Bounds

Graph signal processing tasks that leverage spectral information typically assume access to the complete graph

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

Automated Synthesis and Adversarial Validation of Executable Causal Research Pipelines

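This study uses machine learning models to predict changes in blood glucose levels and stresses that preprocessing blood glucose data is important for diabetes management.

Use: Disease prediction
Difficulty: Hard
Cost: High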

Relative Value Learning

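Proposes relative value (RV) learning, in which a model built on an antisymmetric function predicts the difference in value between any two states. The approach has the potential to improve control and estimation.

Use: Predicting value differences
Difficulty: Hard
Cost: High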
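
The antisymmetry property mentioned in the summary above means that the predicted difference from state a to state b is the negative of the difference from b to a. A minimal sketch of one standard way to obtain such a function from an arbitrary pairwise scorer follows; the scorer `g` and the helper name are hypothetical, and the paper's actual parameterization is not given in this entry.

```python
from typing import Callable


def antisymmetrize(g: Callable[[int, int], int]) -> Callable[[int, int], int]:
    """Build f(a, b) = g(a, b) - g(b, a), which satisfies f(a, b) == -f(b, a)."""
    def f(a: int, b: int) -> int:
        return g(a, b) - g(b, a)
    return f


# Hypothetical pairwise scorer; any function of two states would do.
def g(a: int, b: int) -> int:
    return 2 * a - b * b


relative_value = antisymmetrize(g)
forward = relative_value(1, 2)    # value difference from state 1 to state 2
backward = relative_value(2, 1)   # same pair, opposite direction
```

By construction the function also returns zero when both arguments are the same state, which is what a value difference should do.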

Tags: quality prediction/anomaly detection, natural language processing, fine-tuning, reinforcement learning

TOUR: A Trajectory-Level Unlearning Benchmark for Offline Reinforcement Learning

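Proposes TOUR, a benchmark for evaluating data removal (unlearning) in offline reinforcement learning agents trained on fixed behavior trajectories, with the aim of making offline RL safer.

Use: Data removal in offline reinforcement learning
Difficulty: Hard
Cost: High

Tags: natural language processing, large language models, anomaly detection, text, reinforcement learning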

Training Large Language Models for Self-Explanation Faithfulness

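Proposes a reinforcement learning method that targets the faithfulness of self-explanations, exploring a new approach that optimizes faithfulness directly.

Use: Faithfulness of self-explanations
Difficulty: Hard
Cost: High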

Nipping the Butterfly Effect in the Bud: Self-Output Fine-Tuning for Autoregressive Weather Prediction

In this study, long-term

Tags: natural language processing, RAG, anomaly detection, prediction, text

Use: Weather forecasting
Difficulty: Hard
Cost: Low

From Evaluation to Optimisation: Hierarchy-Aware Training Signals for CWE Prediction in Python

The original ALPHA benchmark introduced a taxonomy-aware penalty for evaluating CWE-level vulnerability predic

Tags: natural language processing, fine-tuning, classification, reinforcement learning

Use: Classification
Difficulty: Hard
Cost: High

ADABORD: a novel AdaBoost approach for ordinal classification

Ordinal Classification (OC) deals with classification tasks where the classes follow a natural order. Despite

Use: Classification
Difficulty: Hard
Cost: Low

Tags: MI-oriented, natural language processing, fine-tuning, text, reinforcement learning

The Weight of Silence: A Causal Case for Weights Over the Scratchpad in Latent Chess Reasoning

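With latent reasoning models, the model's internal computation can be analyzed. That intermediate computation is carried out in a continuous vector space, and analyzing it can reveal how the model arrives at its results.

Use: Analyzing the intermediate computation of latent reasoning models
Difficulty: Hard
Cost: High

Tags: quality prediction/anomaly detection, natural language processing, fine-tuning, detection, generation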

RadioTrace: Transmitter-Aware Diffusion for Radio Map Estimation without Deployment-Time Fine-Tuning

Proposes transmitter-aware diffusion for estimating radio maps (radio-frequency maps), which allows the maps to be estimated efficiently.

Use: Radio map estimation
Difficulty: Hard
Cost: High

Multi-turn RL with Structural and Performance Aware Rewards for CUDA Kernel Generation

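Proposes CudaPerf, which supports CUDA kernel generation and makes it possible to generate high-performance CUDA kernels efficiently.

Tags: natural language processing, large language models, generation, reinforcement learning

Use: Supporting CUDA kernel generation
Difficulty: Hard
Cost: High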

Position Bias is Hidden Behind Ceiling Effects: A Permutation Diagnostic for LLM Benchmarks

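Proposes a method for analyzing position bias in LLM evaluation, making clear how position bias affects evaluation results.

Tags: natural language processing, large language models, detection, generation

Use: Analyzing position bias in LLM evaluation
Difficulty: Hard
Cost: High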

Offline RL with Hierarchical Action Chunking

Proposes hierarchical action chunking to support task decomposition in offline reinforcement learning.

Use: Task decomposition in offline RL
Difficulty: Hard
Cost: High

Robust Asynchronous Q-Learning under Reward and State Corruption via Batching

Motivated by reinforcement learning in harsh environments, we consider the problem of learning an optimal poli

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Self-Balancing Sequential Sampling: Fast Convergence with Controlled Predictability

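Proposes a method that removes the imbalance arising in random sampling and speeds up convergence of the sample distribution. The method varies the tendency of the samples so that they cannot be predicted.

Use: Speeding up sequential sampling
Difficulty: Hard
Cost: Low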

Beyond Sycophancy: Structured Resistance and Compliance in LLM Moral Reasoning

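Examines how language models can make socially sound judgments. The results show that the models became better at holding their position under disagreement and more receptive to other people's viewpoints.

Use: Improving social and moral judgment
Difficulty: Hard
Cost: High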

From Resource Flow to Executable Tests: Petri-Net-Guided LLM Test Generation for Concurrent Stateful Rust APIs

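Proposes a method that uses Petri nets modeling resource flow to automatically generate tests that exercise an API. The method generates scenarios that test the API's functionality and ensures that the tests execute correctly.

Use: Testing concurrent stateful Rust APIs
Difficulty: Hard
Cost: High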

GS-Agent: Creating 4D Physical Worlds With Generative Simulation

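GS-Agent generates physically plausible 4D worlds from natural language. To preserve physical correctness, it applies physical reasoning at generation time.

Tags: MI-oriented, natural language processing, RAG, generation, image, text

Use: Generating 4D physical worlds
Difficulty: Hard
Cost: High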

Same Dangerous Objective, Opposite Advice: Direct Exposure versus Multi-Agent Mediation

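This study examines the safety of LLMs. According to the results, the model was able to give safe advice for the dangerous objectives it faced.

Use: Direct exposure versus multi-agent mediation
Difficulty: Hard
Cost: High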

Improved lower bounds for the Shannon capacity of odd cycles

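This study examines lower bounds on the Shannon capacity of odd cycles. The bounds are computed from the sizes of independent sets in graphs.

Use: Lower bounds on Shannon capacity
Difficulty: Hard
Cost: High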

Agentic Context Management: Solving Agent Memory and Cost by Treating Them as Lifecycle and Architecture Problems

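Agentic Context Management makes agent memory and cost manageable. Its approach lets agents manage themselves while the trainer retains control.

Tags: natural language processing, RAG, summarization, text

Use: Addressing agent memory and cost
Difficulty: Hard
Cost: Low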

Agentic coding without the cloud: evaluating open-weight large language models on longitudinal data preparation tasks

Large language models (LLMs) and agents are now widely used tools in code development, with data typically sen

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Detecting LLM-Generated Tokens in Human--LLM Coauthored Text

The rise of human-AI collaborative writing has created a growing need for fine-grained detection methods that

Tags: natural language processing, large language models, classification, detection, text

Use: Classification
Difficulty: Hard
Cost: High

RUMBA: Russian User Memory Benchmark

The ability to handle long-term memory in LLMs is becoming increasingly critical, yet existing benchmarks rema

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

PATS: Policy-Aware Training Scaffolding for Agentic Reinforcement Learning

In long-horizon LLM agent reinforcement learning, weak policies often repeat similar failures, producing uninf

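Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Tags: suited to tabular data, natural language processing, large language models, generation, text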

Euclid-MCP: A Model Context Protocol Server for Deterministic Logical Reasoning via Prolog

Large Language Models (LLMs) excel at natural language understanding and generation but remain unreliable for

Use: Generation
Difficulty: Hard
Cost: High

Tags: explainable, quality prediction/anomaly detection, natural language processing, RAG, self-supervised

Cycle-Consistent and Uncertainty-Aware Neural Surrogates for Tokamak Edge Plasmas

The boundary and divertor plasma govern how a tokamak exhausts power and particles, setting heat fluxes, targe

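Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

Tags: sensor/time series, natural language processing, large language models, image, text, 3D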

VoLN: Vision-Only Long-Horizon Navigation---Paradigm, Benchmark, and Method

Vision-and-Language Navigation (VLN) enables embodied agents to follow natural-language instructions. However,

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

DINOde: Continuous Vision-Text Alignment for Open-Vocabulary Semantic Segmentation

Open-vocabulary semantic segmentation (OVSS) leverages textual semantics to segment objects beyond predefined

Tags: natural language processing, RAG, segmentation, image, text

Use: Segmentation
Difficulty: Hard
Cost: High

SPORD: A Simulation-Propose-then-OR-Dispose Approach for Supply Chain Planning

For years, supply chain planning at e-commerce firms has operated as a collection of isolated projects. Each p

Tags: easy to try on CPU, natural language processing, large language models

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Phonetic forced alignment for low-resource language varieties: Model training and evaluation on Chengdu Mandarin

Phonetic forced alignment is a key technique in phonetic research, yet existing alignment systems lack special

Tags: natural language processing, RAG, classification, text, audio

Use: Classification
Difficulty: Hard
Cost: High

From Static Bibliometrics to Dynamic Knowledge Graphs: An LLM-Powered Framework for Modernizing Science, Technology, and Innovation (STI) Analytics

Bibliometric indicators - citation counts, h-indexes, co-authorship networks - have long anchored science, tec

Tags: natural language processing, large language models, detection, text

Use: Detection
Difficulty: Hard
Cost: High

GRADRAG: Cross-Component Prompt Adaptation for Coordinated Multi-Agent RAG

Retrieval-Augmented Generation (RAG) systems increasingly employ multiple LLM agents. Yet, most prior work opt

Use: Generation
Difficulty: Hard
Cost: High

Scaling Up Formal Representation of Clinical Trial Protocols in Ensemble Logic Using LLMs: A Preliminary Study

The reliance on unstructured free text for documenting clinical trial protocols creates a significant barrier

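Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Tags: quality prediction/anomaly detection, natural language processing, large language models, QA, image, text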

Unlearning Under Imbalance: Benchmarking Fairness in Multimodal LLM Unlearning

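LLM unlearning approaches remove personal data using simulated human identities, or remove imbalanced data, but these approaches have limitations.

Use: Removing personal data from models
Difficulty: Hard
Cost: High

Tags: suited to small data, easy to try on CPU, condition optimization, natural language processing, large language models, generation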

An LLM-Driven Workflow for Automated Process Control Strategy Generation and Tuning from Dynamic Process Models

This workflow uses large language models to generate automated control strategies from dynamic process models.

Use: Generating automated control strategies
Difficulty: Hard
Cost: High

pAI-Econ-claude: A Gated Human-in-the-Loop Multi-Agent Architecture for AI-Assisted Economic Theory Development

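Develops a system that uses large language models to support research in economics. The system helps scholars automate the development of theoretical models.

Use: Research support system for economics
Difficulty: Hard
Cost: High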

Explainable Belief Harmonization under Dynamic Epistemic Partitions

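Develops a model, built on large language models, for combining shared beliefs. Using large language models made it possible to infer those shared beliefs.

Tags: explainable, natural language processing, RAG, detection

Use: Model for combining shared beliefs
Difficulty: Hard
Cost: Low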

Explainability Framework for Policy-Aware Autonomous Agents

Studies the explainability of autonomous agents using large language models, which made it possible to infer the agents' behavior.

Use: Explainability of autonomous agents
Difficulty: Hard
Cost: Low

Case study: solving P-99 with LPTP and an LLM

Ninety-Nine Prolog Problems (P-99) is a famous set of Prolog exercises. We solved the first thirty three just

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Encoding Event-B Proof Rules in Prolog: An Interactive Sequent Prover for ProB

Event-B is a formal method rooted in predicate logic and set theory. We encoded over 600 proof rules in Prolog

Tags: natural language processing, fine-tuning, image, text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

Case study: proving sqrt(2) irrational with LPTP and an LLM

Combines an LLM (large language model) with LP (logic programming) to prove that √2 is irrational. In the proof, the LLM generates the logical formulas and LP carries out the proof.

Use: Proving that √2 is irrational
Difficulty: Hard
Cost: High

Safeguards for Speech2Speech LLM-Assistants: A Case Study in Automotive Applications

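S2S (speech-to-speech) LLM assistants can converse in a human-like manner, but safeguards are hard to implement for them. This study implements safeguards for S2S LLM assistants using two approaches.

Use: Safeguards for S2S LLM assistants
Difficulty: Hard
Cost: High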

SafeStep: AI-powered Travel Assistance for Elderly People with Frailty or Dementia

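Older adults often have difficulty getting around, so this study develops a system that supports safe travel for them. The system combines an LLM with predictive models.

Use: Safe travel assistance for older adults
Difficulty: Hard
Cost: High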

CRAG-MM-Diagnostics: Enabling Stage-Wise Analysis of Knowledge-Intensive VQA

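Proposes new evaluation criteria for analyzing knowledge-intensive visual question answering (KI-VQA). The criteria allow each VLM task to be evaluated individually.

Tags: natural language processing, large language models, classification, QA, image

Use: Analyzing knowledge-intensive VQA systems
Difficulty: Hard
Cost: High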

V-DEAL: Diagnosing Video Safety De-Calibration as an Understanding-Refusal Coupling Failure

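Proposes a new diagnostic framework for checking the safety of video LMMs. The framework jointly considers model behavior, understanding, and semantics.

Tags: natural language processing, large language models, image, text, video

Use: Diagnosing video safety de-calibration
Difficulty: Hard
Cost: High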

Hardware-Software Co-Design for Float16 On-Device Training on RISC-V Single-Core

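Proposes a new framework to accelerate 16-bit floating-point (FP16) computation on a single-core RISC-V processor. The framework cuts the memory footprint by about 50% while maintaining model performance.

Use: Accelerating FP16 computation on single-core RISC-V
Difficulty: Hard
Cost: High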
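
As a rough illustration of where a saving of about half can come from, the sketch below stores the same values in 32-bit and 16-bit floating point with Python's `struct` module and compares the byte counts. It only shows the storage arithmetic; it is not the paper's hardware-software co-design, and the weight values are hypothetical.

```python
import struct


def pack_weights(weights, fmt_char):
    """Pack floats little-endian: "f" is 4-byte single precision, "e" is 2-byte half precision."""
    return struct.pack(f"<{len(weights)}{fmt_char}", *weights)


# Hypothetical model weights, chosen only for illustration.
weights = [0.5, -1.25, 3.0, 0.1]

fp32_bytes = pack_weights(weights, "f")
fp16_bytes = pack_weights(weights, "e")

# Round-trip through half precision to see the rounding it introduces.
restored = struct.unpack(f"<{len(weights)}e", fp16_bytes)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(len(fp32_bytes), len(fp16_bytes), max_error)
```

Values such as 0.5 survive the round trip exactly, while 0.1 picks up a small rounding error, which is the precision cost that FP16 training has to manage.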

AttriMem: Attribution-Guided Process Feedback for Agent Memory Learning

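Agent memory learning concerns enabling an LLM to retain, update, and process information effectively. This study proposes optimizing agent memory with attribution-guided process feedback.

Tags: natural language processing, large language models, QA

Use: Agent memory learning
Difficulty: Hard
Cost: High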

HiMe: Real-Time Self-Hosted Personal Agent Platform for Health Insights with Wearable Devices

Traditional approaches to wearable health signal analysis, such as smartwatches, are constrained by rigid anal

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

EmoAgent-R1: Towards Multimodal Emotion Understanding with Reinforcement Learning-based Dynamic Agent Specialization

Multimodal large language models (MLLMs) have achieved impressive performance in multimodal emotion recognitio

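Tags: natural language processing, large language models, classification, text, video

Use: Classification
Difficulty: Hard
Cost: High

Tags: natural language processing, prompt engineering, classification, image, text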

Sparse Concept Channels in Frozen 3D CT Vision Encoders

Large vision-language models are becoming increasingly dominant in 3D medical image interpretation, but we rar

Use: Classification
Difficulty: Hard
Cost: High

GuardianAgentBench: Where Agents Fail and How to Guard Them

GuardianAgentBench evaluates 580 scenarios across six domains, using three production frameworks: LangChain, LlamaIndex, and Vectara.

Use: Ensuring the safety and reliability of machine learning agents
Difficulty: Hard
Cost: High

Delivery, Not Storage: Cue-Anchored Working Memory as a Harness Property for Coding Agents

Coding agents ship with one kind of memory: documents. Instruction files, plan artifacts, and auto-written mem

Tags: MI-oriented, natural language processing, RAG, text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

SciExplore: Evaluating Autonomous Agents from Scientific Navigation to Information Integration

Scientific research involves complex information-seeking and reasoning workflows across heterogeneous sources.

Tags: natural language processing, large language models, generation, QA, text

Use: Generation
Difficulty: Hard
Cost: High

Representing Entity Importance in AI Knowledge Systems: A Dual-Signal Framework of Audience Evaluation and Structural Authority

AI knowledge systems require representations of entity importance for retrieval, recommendation, evidence sele

Tags: explainable, natural language processing, embedding and retrieval, text

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

Scientific exploration, collaboration and labor division in the large language model era

Large language models (LLMs) have rapidly and significantly entered scientific workflows, but it remains uncle

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Traceable Scholarship: Page Anchors and Ariadne's Thread for Humanistic Inquiry in the Age of Generative AI

Generative AI lets large language models produce scholarly-looking text within seconds, yet fluency does not e

Use: Generation
Difficulty: Hard
Cost: High

Is Deep Research Reliable? Misleading Knowledge Induces False Conclusions

Deep Research agents extend LLM-based assistants into long-horizon workflows involving planning, retrieval, ev

Use: Generation
Difficulty: Hard
Cost: High

Code Monitor Red Teaming for Public-Test-Passing Code

Visible tests are a common gate for LLM-generated code, but passing them does not certify specification correc

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Auditing Evidence Use in Medical LLM Diagnosis

Medical LLMs are often evaluated by whether they select the correct diagnosis, but diagnostic accuracy alone d

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Surprisal Theory is Tautological (without Rational Grounding)

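Surprisal theory holds that the cognitive difficulty a human experiences on a linguistic unit is a linear function of that unit's surprisal under a given language model. The theory has been criticized for failing to reflect reality.

Tags: natural language processing, text

Use: Relationship between cognitive difficulty and language models
Difficulty: Hard
Cost: High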
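
The linear relationship described in the summary above can be written out directly: surprisal is the negative log probability a language model assigns to a unit, and predicted difficulty is an intercept plus a slope times that surprisal. The sketch below assumes surprisal measured in bits; the intercept, slope, and probabilities are hypothetical values chosen for illustration, not estimates from the paper.

```python
import math


def surprisal_bits(prob: float) -> float:
    """Surprisal of a unit with model probability `prob`, in bits."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(prob)


def predicted_difficulty(prob: float, intercept: float, slope: float) -> float:
    """Linear link assumed by surprisal theory: difficulty = intercept + slope * surprisal."""
    return intercept + slope * surprisal_bits(prob)


# Hypothetical coefficients (for example, milliseconds of reading time).
likely_word = predicted_difficulty(0.5, intercept=200.0, slope=30.0)
unlikely_word = predicted_difficulty(0.25, intercept=200.0, slope=30.0)
```

Halving a word's probability adds one bit of surprisal, so under the linear link it adds exactly one slope's worth of predicted difficulty.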

MedGame: Storytelling Gamification Empowered by Large Language Models for Medical Education

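Large language models (LLMs) hold great promise for medical education, but current systems only answer questions or give one-off feedback.

Tags: natural language processing, large language models, generation, QA, text

Use: Applying large language models to medical education
Difficulty: Hard
Cost: High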

When Trivia Is Not Trivial: Everyday Knowledge Failures in Multilingual LLMs

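This paper examines large language models' command of everyday cultural knowledge. It poses questions in a quiz format, TriviaRoomQA, to assess that knowledge.

Use: Evaluating everyday knowledge in large language models
Difficulty: Hard
Cost: High

Tags: sensor/time series, natural language processing, large language models, generation, text, audio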

An Evaluation Framework for Structured Audio Captions Validated by Controlled Perturbations

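Proposes an evaluation method for audio captions that aims to overcome the limitations of existing methods. The proposed framework evaluates each aspect of a caption rather than relying on question-answering-style evaluation.

Use: Building an evaluation framework for audio captions
Difficulty: Hard
Cost: High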

Anti-Periodic Positional Encoding: Möbius Boundary Conditions Make In-Context Retrieval Reliable

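Applies Möbius boundary conditions to positional encoding, making it anti-periodic. The holonomy between positions in each rotation plane becomes -1, so the two ends of the sequence are coupled deterministically.

Tags: natural language processing, prompt engineering, text

Use: Anti-periodic positional encoding based on Möbius boundary conditions
Difficulty: Hard
Cost: High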

MemTools: A Unified Research Framework for Interoperable Agent Memory

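Builds MemTools, a framework that supports agent memory systems, with the aim of making them easier to develop. Developers can more easily build and test each component of a memory system.

Tags: natural language processing, RAG, multimodal

Use: Building a framework that supports agent memory
Difficulty: Hard
Cost: High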

Word meaning co-determines vowel-inherent spectral change. A corpus-based investigation of conversational Mandarin

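Investigates the relationship between word meaning and vowel properties in conversational Mandarin.

Tags: natural language processing, embedding and retrieval, text, audio

Use: Relationship between word meaning and vowel properties in conversational Mandarin
Difficulty: Hard
Cost: Low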

Capital Markets LLM Reliability Score (CM-LRS): From Plausible to Bankable

In capital-markets workflows the question is rarely whether a large language model can produce a fluent draft,

Use: Generation
Difficulty: Hard
Cost: High

Tags: quality prediction/anomaly detection, natural language processing, large language models, text

news-crawler-LM: A Small Long-Context Model For High-Quality News Crawling

Extracting structured content from news pages remains challenging due to heterogeneous HTML layouts, inconsist

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

A Unified Moral-Value Dataset for Instruction Tuning

Large language models (LLMs) have developed rapidly and become valuable tools in everyday life. However, how t

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Progressive Cramming: Reliable Token Compression and What It Reveals

Token cramming compresses sequences into learned embeddings with near-perfect reconstruction, but fixed token

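Tags: natural language processing, embedding and retrieval, generation, text

Use: Generation
Difficulty: Hard
Cost: Low

Tags: explainable, quality prediction/anomaly detection, natural language processing, large language models, generation, text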

PrefReward: Learning User Preference Matrix for Personalized Text Generation

Large Language Models (LLMs) have demonstrated remarkable ability in generating personalized content by levera

Use: Generation
Difficulty: Hard
Cost: High

CultureTalk-ID: A Multi-Task Dialogue Benchmark for Cultural Commonsense in Indonesian Local Languages

Culture is lived through conversation, yet existing Indonesian cultural commonsense benchmarks evaluate LLMs o

Tags: natural language processing, large language models, translation, text

Use: Translation
Difficulty: Hard
Cost: High

Where Animacy Lives in Large Language Models: Tracing the Circuits of the Animacy Concept

Distinguishing animate from inanimate concepts in written language requires more than shallow text processing,

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

From a Word-Level Dictionary to Sentence-Level Semantics: Multilingual Grievance Labelling with Contextual Models

Grievance is one of the warning signs analysts look for when assessing threats of violence. It is increasingly

Tags: natural language processing, RAG, text

Use: Technical validation and paper-reading aid
Difficulty: Easy
Cost: Low

Chemical Chain-of-Thought Functions as a Hallucination-Prone Molecular Scratchpad

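Points out that language models predicting chemical structures tend to generate unreliable information, and examines the causes and possible remedies.

Tags: MI-oriented, natural language processing, RAG, generation, text

Use: Chemical structure prediction
Difficulty: Hard
Cost: Low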

Tencent WorkBuddy Bench: A Multi-Domain Coding-Agent Benchmark with Contamination-Resistant Task Construction

Introduces an evaluation benchmark for coding agents, with tasks built from real-world commits and pull requests.

Use: Evaluating coding agents
Difficulty: Hard
Cost: High

LegalCiteTrust: Benchmarking Citation Trustworthiness in Chinese Long-Form Legal Research Reports

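Proposes LegalCiteTrust to assess the trustworthiness of citations in Chinese long-form legal research reports and to detect and evaluate untrustworthy citations.

Use: Improving the trustworthiness of legal research reports
Difficulty: Hard
Cost: High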

CSPF: A Constrained Shared-Private Fusion Method for Non-Verifiable Preference Evaluation

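Proposes CSPF (Constrained Shared-Private Fusion) to address two problems in non-verifiable preference evaluation: evaluation methods capture diverse criteria poorly, and evaluators are biased relative to one another.

Tags: natural language processing, RAG, anomaly detection

Use: Non-verifiable preference evaluation
Difficulty: Hard
Cost: Low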

REFACT: Adaptive Fact Restatement for Compact and Faithful Chain-of-Thought Reasoning

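Points out that language models performing long-form reasoning can generate reasoning that drifts away from the provided context, and proposes REFACT to integrate context and reasoning more closely.

Use: Improving chain-of-thought (CoT) reasoning
Difficulty: Hard
Cost: High

Tags: natural language processing, fine-tuning, classification, segmentation, image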

ASTRA-Net: Anatomy-Specific Transfer and Representation Alignment for Drug-Induced Sleep Endoscopy Segmentation

Quantitative drug-induced sleep endoscopy (DISE) requires reliable airway boundaries at specific anatomical le

Use: Classification
Difficulty: Hard
Cost: Low

Tags: quality prediction/anomaly detection, natural language processing, RAG, detection, image, text

Detectors Learn the Wrong Thing: Shortcut-Resistant Adversarial Training Against Physically Realizable Attacks

AI-enabled visual perception systems are increasingly deployed in intelligent transportation infrastructure an

Use: Detection
Difficulty: Hard
Cost: High

Out of Sight, Still in Mind: Token Compression for Omni-LLMs

The goal of this paper is to reduce the input token cost of Omni-modal large language models (Omni-LLMs) at in

Tags: natural language processing, large language models, image, text, audio

Use: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Decoupling Cross-Modality Manifold Discrepancy: Leveraging Visible Diffusion Priors for Infrared Super-Resolution

Infrared image super-resolution (IISR) mitigates the limitations imposed by low spatial resolution. Existing m

Tags: natural language processing, RAG, generation, image, multimodal

Use: Generation
Difficulty: Hard
Cost: High

HalluScope: Fine-grained Hallucination Diagnosis for Multimodal Large Language Models

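Multimodal large language models perform well at turning a wide range of images into text, but diagnosing the hallucinations they produce remains an open problem. To address the shortcomings of mainstream coarse-grained detection methods, this study proposes a fine-grained diagnosis method.

Tags: explainable, natural language processing, large language models, classification, detection, generation

Use: Hallucination diagnosis
Difficulty: Hard
Cost: High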

Geo3R: Mitigating Spatial Reasoning Hallucination in Multimodal Large Language Models

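Multimodal large language models hallucinate when reasoning about the 3D spatial relationships of objects, and a lack of visualization is part of the problem. This study proposes an approach to mitigate these hallucinations.

Tags: natural language processing, large language models, image, text, 3D

Use: Mitigating hallucination in 3D spatial reasoning
Difficulty: Hard
Cost: High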

Show, Don't Tell: Evaluating Spatial Cognition in Generative Pixels Rather Than LLM Text

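Spatial understanding is essential for bridging the physical world and static semantic understanding. In many spatial tasks, locations, regions, and paths are most naturally expressed in a continuous visual scene, for example by pointing or marking.

Use: Spatial understanding
Difficulty: Hard
Cost: High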

Do Pathology Vision-Language Models Truly See Pathology?

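Pathology vision-language models are now widely evaluated on pathology recognition, but this study questions whether their visual perception is actually functioning in that task.

Use: Pathology recognition
Difficulty: Hard
Cost: High

Tags: suited to small data, natural language processing, prompt engineering, classification, image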

AUCH-Net: Action Unit-Based Consistency-Aware Hypergraph Network for Cross-Domain Few-Shot Facial Expression Recognition

Recently, cross-domain few-shot facial expression recognition (CF-FER) has received considerable attention. Ho

Use: Classification
Difficulty: Hard
Cost: Low

Distribution-Alignment Bridge for Uncertainty-Aware Text-to-Video Retrieval

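Proposes the Distribution-Alignment Bridge (DAB) for matching text with video. DAB represents text and video entities as probability distributions and resolves the distributional discrepancy between the two.

Tags: natural language processing, embedding and retrieval, generation, text, video

Use: Text-to-video retrieval
Difficulty: Hard
Cost: High

Tags: quality prediction/anomaly detection, natural language processing, RAG, generation, image, unsupervised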

Unsupervised Metal Artifact Reduction in Dental CBCT using Fine-tuned Cycle-Consistent Adversarial Networks

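This study proposes using a fine-tuned cycle-consistent adversarial network (CycleGAN) to reduce metal artifacts in dental CBCT images. With CycleGAN, after the metal artifacts are removed, the CBCT

Use: Metal artifact reduction
Difficulty: Hard
Cost: Low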

Engine-Native Editable 3D World Reconstruction with Objects and Lighting

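This paper proposes a method called Lumera for engine-native, editable 3D world reconstruction that recovers objects and lighting.

Natural language processing, Large language models, Detection, Generation, Image

Use: 3D world reconstruction
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, Natural language processing, Large language models, Image, Text, Video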

ViSTR-Bench: Can MLLMs Reason from Continuous Visual Cues in Dynamic Scenes?

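This paper proposes ViSTR-Bench, a benchmark that evaluates whether MLLMs can reason from continuous visual cues in dynamic scenes.

Use: Evaluating MLLM reasoning in dynamic scenes
Difficulty: Hard
Cost: High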

Agentic Designer: Progressive Multi-Agent Collaboration for Structure-Aware Interior Layout Generation

Generating realistic interior furniture layouts that strictly adhere to architectural constraints (e.g., walls

Use: Generation
Difficulty: Hard
Cost: High

FORGE-plus: Force-Budgeted Recovery for Contact-Rich Assembly with a Frozen LLM Supervisor

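This work proposes reinforcement learning for robot control under force constraints, enabling low-cost, high-precision assembly while allowing the robot to recover safely when assembly fails.

Use: Contact-rich robotic assembly
Difficulty: Hard
Cost: High

Natural language processing, Fine-tuning, Text, 3D, Multimodal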

ZONDA: Zero-shot Object Navigation with Dynamic Avoidance in Multi-floor Environments

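This work proposes a zero-shot object navigation framework that accounts for dynamic avoidance and multi-floor environments in object-goal navigation. The framework takes moving people and multi-floor environments into account while

Use: Object-goal navigation in multi-floor environments
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, Natural language processing, RAG, Generation, Image, Text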

TableVerse: A Large-scale Tabletop Dataset with Real-world Grounded Layouts for Generalizable Manipulation

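This work presents TableVerse, a large-scale tabletop dataset aimed at robotic manipulation. The dataset includes a practical method for generating physically plausible layouts grounded in the real world

Use: Generating tabletop environments for robotic manipulation
Difficulty: Hard
Cost: Low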

Distributed Model-Based Diffusion For Scalable Multi-Robot Trajectory Optimization

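This work proposes distributed model-based diffusion for multi-robot trajectory optimization. The framework supports efficient trajectory generation while handling non-convex, nonlinear, and non-differentiable settings.

Use: Multi-robot trajectory optimization
Difficulty: Hard
Cost: High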

Deep Reinforcement-Learning-Guided Model Predictive Control for Preventing Overtakes in Autonomous Racing

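This work proposes a hybrid framework combining reinforcement learning and model predictive control to prevent overtakes in autonomous racing. In this framework, the autonomous vehicle

Use: Preventing overtakes in autonomous racing
Difficulty: Hard
Cost: Low

Hugging Face: available, 2026-07-23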

K12-KGraph: A Curriculum-Aligned Knowledge Graph for Benchmarking and Training Educational LLMs

Large language models are increasingly used in K-12 education, but existing benchmarks mainly test exam questi

Natural language processing, Large language models, QA, Image, Text

Use: QA
Difficulty: Easy
Cost: High

minimind — 🧠 Train a 64M-parameter LLM from scratch in just 2h!

A project for training a small 64M-parameter LLM entirely from scratch in about two hours.

Use: Training a small LLM from scratch
Difficulty: Easy
Cost: High

nestia — NestJS Helper + AI Chatbot Development

A NestJS-based tool for AI chatbot development.

Use: Building AI chatbots
Difficulty: Easy
Cost: High

AgentsMeetRL — Awesome List for Agentic RL

An awesome list of resources on agentic RL.

Use: Agentic RL
Difficulty: Easy
Cost: High

awesome-llm-unlearning — A resource repository for machine unlearning in large language models

This repository collects resources on machine unlearning in large language models.

Use: Machine unlearning for large language models
Difficulty: Easy
Cost: High

FinGPT — FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.

This repository provides open-source financial large language models; the trained model is released on Hugging Face.

Use: Open-source financial LLMs
Difficulty: Easy
Cost: High

xtuner — A Next-Generation Training Engine Built for Ultra-Large MoE Models

xtuner is a training engine for fast training of ultra-large MoE models.

Natural language processing, Large language models, Generation, Multimodal

Use: Fast training of MoE models
Difficulty: Easy
Cost: High

giskard-oss — 🐢 Open-Source Evaluation & Testing library for LLM Agents

giskard-oss provides an evaluation and testing library for LLM agents.

Use: Evaluating and testing LLM agents
Difficulty: Easy
Cost: High

remove-ai-watermarks — AI watermark remover. CLI and Python library to strip visible and invisible AI watermarks (Gemini / Nano Banana sparkle, SynthID) and provenance metadata (C2PA, EXIF, IPTC) from images.

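A CLI and Python library for stripping visible and invisible AI watermarks and provenance metadata from images.

Natural language processing, Large language models, Generation, Image

Use: Removing AI watermarks from images
Difficulty: Easy
Cost: High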

SimpleTuner — A general fine-tuning kit geared toward image/video/audio diffusion models.

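A general-purpose fine-tuning kit for image, video, and audio diffusion models.

Natural language processing, Fine-tuning, Image, Audio, Video

Use: Fine-tuning diffusion models
Difficulty: Easy
Cost: High

Suited to tabular data, Natural language processing, Large language models, Image, Text, Tabular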

unstructured — Convert documents to structured data effortlessly. Unstructured is open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise grade Platform product for production grade workflows, partitioning, enrichments, chunking and embedding.

An open-source ETL solution for turning documents into structured data.

Use: Structuring documents
Difficulty: Easy
Cost: High

modelscope — ModelScope: bring the notion of Model-as-a-Service to life.

ModelScope is a platform for offering models as a service. It lets you create, host, manage, and serve models.

Natural language processing, Audio

Use: Offering models as a service
Difficulty: Easy
Cost: Medium

External Clustering Validation by the Homogeneity-Parsimony Trade-off

Scalar metrics are often used to evaluate clusterings against known classes, but they can obscure a fundamenta

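Use: Classification
Difficulty: Hard
Cost: Low

Explainable, Suited to MI, Quality prediction/anomaly detection, Natural language processing, Fine-tuning, Prediction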

Toward Mechanistic Interpretability of an AI Foundation Model Fine-Tuned for Atmospheric Chemistry

Weather forecasting foundation models (FMs) are increasingly fine-tuned to predict air quality, offering fast

Use: Prediction
Difficulty: Hard
Cost: High

Are Diversity Metrics Measuring Diversity? A Capability-Controlled Audit of Majority-Vote Gain in LLM Ensembles

Majority voting over LLMs is widely assumed to benefit from diversity, and diversity measures are used to choo

Natural language processing, Large language models, Regression

Use: Regression
Difficulty: Hard
Cost: High

Pipelined Gradient Coding

In large-scale machine learning, distributed training commonly involves multiple workers evaluating the gradie

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

LLMs Get Lost in Evolving User Intent

As LLMs become more capable, they are increasingly deployed as collaborative agents, taking on user-delegated

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

End-to-End Learning of Safe Optimal Feedback Control in High Dimensions with Control Barrier Function Layers

We consider the problem of learning high-dimensional semi-global feedback controllers under hard safety constr

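Natural language processing, Embedding/retrieval

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, Natural language processing, Fine-tuning, Classification, Embedding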

Cross-Domain Generalization in Optical Networks via Joint Contrastive and Classification Learning

The robustness of machine learning techniques across heterogeneous network domains remains an open challenge i

Use: Classification
Difficulty: Hard
Cost: Low

PhysCoRe: Physics-Corrected Residual World Models for Material-Aware Deformable Dynamics

Predicting how deformable objects evolve under robotic manipulation is a longstanding challenge. Existing appr

Natural language processing, Fine-tuning, Image

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: Medium

Frontier Financial Judgement: Can agents tell what might move a stock?

We introduce Frontier Financial Judgement, a challenging new benchmark developed in collaboration with profess

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

Towards Miniature Humanoid Tele-Loco-Manipulation Using Virtual Reality and Reinforcement Learning

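This study proposes combining virtual reality and reinforcement learning to enable human teleoperation of a miniature humanoid. Following the operator's input, the robot can manipulate objects and move around.

Use: Humanoid teleoperation
Difficulty: Hard
Cost: Low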

Are Single-Token Sparse Autoencoder Features Causally Necessary? Layer-Depth and SAE-Family Effects

Sparse autoencoder (SAE) features are used to interpret and steer large language models, yet whether a feature

Explainable, Natural language processing, Large language models, Text

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

Online Variance Reduction for Domain Adaptation on Streaming Data

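This study develops subsampling methods suited to online variance reduction for two distribution-matching frameworks: maximum mean discrepancy (MMD) and correlation alignment (CORAL).

Use: Variance reduction for domain adaptation on streaming data
Difficulty: Hard
Cost: Low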

Self-supervision drives representational convergence in medical foundation models more than clinical supervision

Medical image encoders from different groups are increasingly treated as interchangeable, on the assumption th

Natural language processing, RAG, Classification, Image, Text

Use: Classification
Difficulty: Hard
Cost: High

STeMP: Spatio-Temporal Modelling Protocol

Spatio-temporal machine-learning modelling is an important tool in environmental research. However, machine-le

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

Adaptive Bayesian Online Learning via Expert Aggregation

Bayesian online learning promises uncertainty-aware prediction on data streams, but its performance hinges on

Suited to small data, Natural language processing, RAG, Regression

Use: Regression
Difficulty: Hard
Cost: Low

PIER: Physics-Informed Environmental Retrieval for Time-Series Modeling

Accurate modeling of environmental systems is fundamental to scientific understanding and decision-making, yet

Sensor/time series, Natural language processing, RAG, Time series

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: Low

Plausibility-Driven Prioritization of Candidate Biomedical Annotations

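A study on candidate biomedical annotations that uses plausibility-driven prioritization to rank them.

Natural language processing, RAG, Classification, Text

Use: Biomedical annotation
Difficulty: Hard
Cost: Low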

SPECTRA: State-Space Exogenous Context and Temporal-Frequency Resolution Architecture for Probabilistic Energy Forecasting

Modern power systems increasingly require probabilistic forecasts amid interacting uncertainties from renewabl

Natural language processing, RAG, Generation, Prediction, Text

Use: Generation
Difficulty: Hard
Cost: Low

Active Inference as a Convex Markov Decision Process

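A study on epistemic objectives that treats active inference as a convex Markov decision process.

Use: Epistemic objectives
Difficulty: Hard
Cost: Low

Sensor/time series, Natural language processing, Fine-tuning, Classification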

Autonomous Collaborative Learning Among an Ensemble of Tsetlin Machines with Consensus-Based Inference

A study on collaborative learning that proposes autonomous collaborative learning among an ensemble of Tsetlin machines.

Use: Collaborative learning
Difficulty: Hard
Cost: Low

Directional Kernel Mean Difference: A Fast Signed Statistic for Univariate Distribution Comparison

A study on distribution comparison that proposes the Directional Kernel Mean Difference statistic.

Natural language processing, Embedding/retrieval

Use: Distribution comparison
Difficulty: Hard
Cost: Low

Machine-Learned Compact Subspace Generation for Quantum Selected Configuration Interaction within Density Matrix Embedding Framework

Sample-based Quantum Diagonalization (SQD), an extension of Quantum Selected Configuration Interaction (QSCI),

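Natural language processing, RAG, Generation, Text

Use: Generation
Difficulty: Hard
Cost: Low

Quality prediction/anomaly detection, Natural language processing, Large language models, Text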

Co-Evolving LLM Evaluators and Policies via DynamicRubric

Post-training with evaluator feedback on policy-induced samples serves as a major mechanism for improving larg

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

Evaluating and Mitigating Gender Bias in Pre-trained Embeddings for ML-based Recruitment

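AI-based recruitment systems train machine learning models on historical data and therefore risk perpetuating or amplifying social bias; this study aims to evaluate and mitigate that risk.

Natural language processing, Embedding/retrieval, Text

Use: Reducing bias in recruitment
Difficulty: Hard
Cost: Low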

Antigen-specific Antibody Multi-modal Foundation Model for Functional Antibody Design

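This study proposes an antigen-specific antibody multi-modal foundation model (AAMFM) that accounts for the need for epitope-level pairing between antigen and antibody when designing antigen-specific antibodies.

Natural language processing, RAG, Classification, Generation, Text

Use: Antigen-specific antibody design
Difficulty: Hard
Cost: High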

Test Case Prioritization for DNNs via Neural Collapse Instability

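To reduce the cost of testing deep neural networks (DNNs), this study proposes a test case prioritization method based on neural collapse instability.

Use: Reducing DNN testing cost
Difficulty: Hard
Cost: High

Sensor/time series, Natural language processing, RAG, Prediction, Text, Time series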

Zero-Shot Heart Rate Variability Forecasting from Consumer Wearables Using Time Series Foundation Models

This study evaluates time series foundation models for short-term heart rate variability (HRV) forecasting from consumer wearables

Use: HRV forecasting
Difficulty: Hard
Cost: Low

HijackKV: New Threat in Position-Independent KV Cache Reuse

This study identifies a new threat in position-independent key-value (KV) cache reuse.

Use: Security analysis of position-independent KV cache reuse
Difficulty: Hard
Cost: High

Diffusion ReRoll: Revisable Denoising for Robotic Sequential Prediction

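This study proposes a diffusion-based framework for sequential prediction on real-world robots.

Natural language processing, RAG, Generation, Anomaly detection, Text

Use: Sequential prediction for real-world robots
Difficulty: Hard
Cost: High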

Nonlinear Bias-Compensated Adaptive Filter and Its Application for Time-Series Prediction

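This study proposes a new adaptive filter with nonlinear bias compensation for time-series prediction.

Sensor/time series, Natural language processing, RAG, Prediction, Time series

Use: Nonlinear bias compensation in time-series prediction
Difficulty: Hard
Cost: Low

Suited to tabular data, Quality prediction/anomaly detection, Natural language processing, Fine-tuning, Text, Tabular, Reinforcement learning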

Asymptotically Optimal Regret for Reinforcement Learning without Horizon Dependence

We study horizon-free regret minimization for finite-horizon time-homogeneous tabular Markov decision processe

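Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: Low

Suited to tabular data, Natural language processing, Large language models, Text, Tabular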

Auto-Fill: Learning to Predict Missing Values Accurately with Specialist Language Models

Predicting missing cell values in tabular data is a fundamental problem in data cleaning. While state-of-the-a

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

Know Your Agent: Reconnaissance-Driven Pentesting of AI Agents

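The paper advocates penetration testing that targets AI agents, using reconnaissance to uncover weaknesses that are easily overlooked and to carry out stronger attack strategies.

Use: Finding and addressing agent vulnerabilities
Difficulty: Hard
Cost: Low

Suited to MI, Natural language processing, Fine-tuning, Generation, Text, Multimodal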

Hypothesis-and-Refinement Learning of Organic Structures from Multimodal Spectroscopic Data

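The paper proposes a method for automated structure elucidation from spectroscopic data to determine molecular structures. The method determines the structure by iterating between hypotheses and refinements based on the spectra, and the broad space of possible molecular structures

Use: Molecular structure elucidation
Difficulty: Hard
Cost: High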

AlphaRoute: Large Language Models as Semantic Optimizers for Multi-Objective Routing

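VLSI global routing aims to assign signal nets on a 3D grid, with signal delay and

Explainable, Natural language processing, Large language models, Text, 3D

Use: Multi-objective routing
Difficulty: Hard
Cost: High

Explainable, Natural language processing, Prompt engineering, Reinforcement learning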

SLPO: Scaling Latent Reasoning via a Surrogate Policy

This study proposes SLPO, an approach to scaling latent reasoning via a surrogate policy.

Use: Scaling latent reasoning
Difficulty: Hard
Cost: Medium

HARP: The Human-AI Research Platform

Large language models (LLMs) have shifted human-computer interaction from "traditional" interface journeys t

Suited to MI, Natural language processing, Large language models, Text

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

IssueTrojanBench: Benchmarking AI Coding Agents Against Malicious Issue Requests

AI coding agents powered by LLMs are increasingly integrated into real-world software development, where they

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

GPE: Evaluating Robust Evidence Aggregation for Fact Verification under Controllable GEO-Style Poisoning

Large language models increasingly use search tools to retrieve up-to-date information, introducing a new atta

Use: Generation
Difficulty: Hard
Cost: High

NVIDIA-labs OO Agents: Native Python Object-Oriented Agents

Traditional agent development is split across prompt templates, tool schemas, callback code, and workflow grap

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

DS@GT ARC at ImageCLEFmed GANs 2026: Geometric Filtering for Privacy-Preserving CT Slice Generation

We present a privacy-preserving framework for synthetic lung CT slice generation developed for the Image-CLEFm

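Natural language processing, Embedding/retrieval, Generation, Image

Use: Generation
Difficulty: Hard
Cost: High

Explainable, Sensor/time series, Quality prediction/anomaly detection, Natural language processing, RAG, Classification, Audio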

Spatially Grounded Concept Bottleneck Models for Trustworthy Breast Ultrasound Diagnosis

Concept Bottleneck Models provide interpretable-by-design predictions by mediating diagnosis through human-und

Use: Classification
Difficulty: Hard
Cost: Low

WaveformQA: Benchmarking LLM Temporal Reasoning on Digital Waveforms

Large Language Models (LLMs) have demonstrated strong capabilities in code generation and reasoning, yet their

Use: Generation
Difficulty: Hard
Cost: High

SoftReason: A Fully Differentiable Neuro-Soft-Symbolic Deductive Reasoning Architecture over High-Dimensional Perceptual Data

In many reasoning problems, the premises are not observed as discrete symbols, but must be inferred from high-

Suited to MI, Natural language processing, Embedding/retrieval, QA, Image

Use: QA
Difficulty: Hard
Cost: Low

FMRP-LEAN: A HIPAA-Compliant AI-Augmented LIMS Architecture for End-to-End Clinical Assay Workflow Optimization

Clinical biomarker workflows in translational research settings often rely on spreadsheet-driven tracking, man

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: Low

Understanding Generative AI-mediated User Engagement with Academic Library Resources

This study empirically analyzed generative AI as an emerging discovery pathway to academic library resources.

Use: Generation
Difficulty: Hard
Cost: High

Suited to tabular data, Natural language processing, Large language models, Generation, Text

PoTRE: Test-Time Reasoning inspired by Cognitive Heterogeneity

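To address model brittleness, the paper introduces PoTRE, a heterogeneous framework split into four agents. It strengthens the model's reasoning ability so that it copes with complex logical constraints and abstraction better than a single-stream approach

Use: Solving tasks that require complex reasoning
Difficulty: Hard
Cost: High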

The Ethics of Autonomous AI Agents for Offensive Security

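Unlike conventional penetration testing tools, which are deterministic, narrowly scoped, and operated by experts, LLM-driven autonomous security tools exhibit uncertainty along three dimensions: decisions that are hard to explain, open-ended impact, and

Use: Ethical considerations for autonomous security tools
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, Natural language processing, Large language models, Translation, Text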

On the Systematic Challenges of Culturally Loaded Machine Translation: Dream of the Red Chamber as the Cultural Lens

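The paper points out that translating culturally loaded expressions requires a translation system to take the cultural background of each expression into account in order to understand its meaning. Translating such expressions poses several challenges, and LLM-based translation

Use: Translating culturally loaded expressions
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, Natural language processing, Large language models, Generation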

DQAOA-GPT: AI-Accelerated Distributed Quantum Optimization for Combinatorial Problems

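The paper presents a new framework for solving combinatorial optimization problems. To obtain solutions despite the limitations of distributed quantum algorithms, it combines a distributed quantum approximate optimization approach with deep learning.

Use: Combinatorial optimization
Difficulty: Hard
Cost: High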

Small, Free, and Effective: Orchestrating Open-Weight Small Language Models to Outperform Single LLM for Malware Analysis

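Malware analysis, which calls for rapid interpretation of analysis reports, often does not use closed-weight large language models. Open-weight language models offer language capabilities suitable for malware analysis, and closed-weight large language

Use: Small language models for malware analysis
Difficulty: Hard
Cost: High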

CUSUM-Shaped Inference-Time Monitoring and Targeted Re-Decoding for Quantized Small Language Model Reasoning

Quantized small autoregressive reasoning models can enter long, repetitive, or unproductive trajectories, yet

Natural language processing, RAG, Detection, Generation, Regression

Use: Detection
Difficulty: Hard
Cost: Low

Reinforcement Learning for Large Language Model Selective Evidence Adoption from Contaminated Retrieval Results

Retrieval-augmented large language models frequently face contexts that interleave useful evidence with mislea

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

Reading and Steering Representations of Materials-Science Mechanisms in an Open-Weight Language Model

Large language models can answer scientific questions, yet a correct output does not reveal whether the model

Suited to MI, Natural language processing, Large language models, Text

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

Language-Specific versus Cross-Lingual Knowledge Graphs for Implicit Aspect Identification in Arabic: A Comparative Study of Reasoning and Adaptation Strategies

Aspect-based sentiment analysis (ABSA) in Arabic must recover both explicitly stated aspects and implicit aspe

Use: Generation
Difficulty: Hard
Cost: High

Geometric Configurations of Perturbed Jailbreak Prompts

Perturbation techniques that turn unsuccessful jailbreak prompts into successful ones are continuously evolvin

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

Drift-Aware RL-based Wavelet Denoising for Network-Traffic Anomaly Detection

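The paper implements a wavelet denoising algorithm for network traffic data that accounts for noise and drift, and points out that static wavelet denoising loses effectiveness under drift.

Quality prediction/anomaly detection, Natural language processing, RAG, Detection, Anomaly detection

Use: Improving the accuracy of network-traffic anomaly detection
Difficulty: Hard
Cost: Low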

Rushes: A Human Preference Dataset for Pluralistic Alignment

We introduce Rushes, a dataset and benchmark for studying revealed human engagement preferences in interactive

Use: Generation
Difficulty: Hard
Cost: High

REGARD: Regional Affective Differences in Large Language Models

Large language models trained and aligned within different linguistic and regional ecosystems may frame the sa

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

LKValues: Aligning Large Language Models with Sri Lankan Societal Values

Value alignment of Large Language Models (LLMs) has been shown to be culturally biased toward Western norms. T

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

Notes to Self: Can LLMs Benefit from Experiential Abstractions?

Humans distill experience into reusable abstractions, e.g., strategies and cautionary reminders, and apply the

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

Which Values Do LLMs Confuse? A Schwartz-Based Recognition Study

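The study investigates whether LLMs can recognize values from situations. LLMs were able to infer the underlying values depending on the situation.

Use: Testing whether LLMs recognize the underlying values
Difficulty: Hard
Cost: High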

Exposure is Optional: Learning Unlike Coordination in Language Models

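Against the view that only like categories can be coordinated, the study examines whether language models can learn to coordinate unlike categories.

Use: Testing whether unlike categories can be coordinated
Difficulty: Hard
Cost: High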

Evaluating the Effectiveness of Persona Simulation in Opinion Prediction with GPT-4.1

Persona simulation involves utilizing large language models (LLMs) to anticipate human choices or interactions

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

HalluTruthQA: A Fine-Grained Benchmark for Hallucination Detection, Localization, and Explanation in Arabic Question Answering

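Large language models appear to provide truthful information but often produce false information; HalluTruthQA was developed as a benchmark for detecting, localizing, and explaining such hallucinations.

Natural language processing, Large language models, Detection, QA, Text

Use: A benchmark for detecting, localizing, and explaining hallucinated answers
Difficulty: Hard
Cost: High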

Surprisal Is Not a Theory

The paper argues that surprisal, on its own, is not a theory.

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

Gotta Catch them all: the modes of Sycophancy

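Large language models tend to favor agreement with a user's beliefs over factual correctness; the paper shows that this tendency takes several distinct forms.

Use: Examining the different modes of sycophancy in large language models
Difficulty: Hard
Cost: High

Suited to MI, Natural language processing, Large language models, Generation, Image, Text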

Back to Back with a Copy: A Computational Analysis of AI-Generated Visual Contemporary Art Pastiches

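AI is highly capable of producing pastiches of contemporary artworks; the paper examines how closely these pastiches resemble the actual works.

Use: Measuring the similarity between AI-generated artworks and the originals
Difficulty: Hard
Cost: High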

OpenSkillRisk: Benchmarking Agent Safety When Using Real-World Risky Third-Party Skills

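The benchmark evaluates whether LLM agents can recognize and avoid real-world risks posed by third-party skills.

Use: Assessing the risk of unsafe behavior when using third-party skills
Difficulty: Hard
Cost: High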

Understanding the Impact of Linguistic Realization Choices on LLM Stance with Causal Tracing

The paper finds that an LLM's answers tend to vary with the form of the question or input.

Use: Examining how linguistic realization choices affect LLM stance
Difficulty: Hard
Cost: High

The Two-Process Theory of Machine Self-Report

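Large language models produce self-reports, but these reports are elicited through human-style interview questions and ambiguous prompts, so the model's capacity for self-report and its psychological meaning need to be understood.

Natural language processing, Fine-tuning, Text

Use: Understanding models' capacity for self-report
Difficulty: Hard
Cost: High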

TINY_SCHILLER: A Drop-In German Drama Corpus for Small Language Models

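The paper introduces tiny_schiller, a drop-in corpus for small language models, available as a single file so that it can easily be used for prototyping, fine-tuning, teaching, and research.

Use: A drop-in corpus for small language models
Difficulty: Hard
Cost: High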

JANUS: Foreseeing Latent Risk for Long-Horizon Agent Safety

Agent safety is moving from content moderation toward preventing operational failures before tool-using agents

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: Low

Overview of FinMMEval 2026 Task 2: Multilingual Financial Short-Answer Question Answering

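FinMMEval 2026 Task 2 targets short-answer financial questions posed in English; evidence in languages other than English is also used.

Natural language processing, RAG, Generation, QA, Search

Use: Answering financial questions
Difficulty: Hard
Cost: Low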

Overview of FinMMEval 2026 Task 1: Multilingual Financial Multiple-Choice Question Answering

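FinMMEval 2026 Task 1 evaluates multilingual financial multiple-choice question answering in English, Chinese, Arabic, and Hindi.

Natural language processing, Large language models, QA, Text

Use: Answering financial questions
Difficulty: Hard
Cost: High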

emb-diversity: A Tool for Embedding-Based Measurement of Data Diversity

There is growing evidence that data diversity is crucial for developing fair and robust NLP models. However, c

Natural language processing, RAG, Text

Use: Technical verification and paper-reading aid
Difficulty: Easy
Cost: Low

D2VBench: Benchmarking Large Language Models with Value Dilemmas in Daily Scenarios

With the wide application of large language models (LLMs) in real-world scenarios, the value implication of th

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

VizRAG: Enhancing Retrieval-Augmented Generation with Hypergraph Visualization

Hypergraph-based RAG systems surpass traditional graph-based approaches by organizing complex n-ary atomic fac

Use: Generation
Difficulty: Hard
Cost: High

Rewarding Better Thinking for LLM Preference Alignment

This study proposes a new approach to LLM preference alignment that rewards better thinking.

Use: LLM preference alignment
Difficulty: Hard
Cost: High

Beyond Relevance-Centric Retrieval: Rubric-Oriented Document Set Selection and Ranking

This study moves beyond relevance-centric retrieval to rubric-oriented selection and ranking of document sets.

Use: Document set selection and ranking
Difficulty: Hard
Cost: High

Reference-Free Evaluation of Reasoning in Open-Ended Question Answering

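This study stresses that evaluating the reasoning in AI-generated answers requires making clear how an answer arrives at its conclusion. It decomposes the generated answer and uses natural language inference to understand its logical structure,

Natural language processing, Large language models, QA, Text

Use: Evaluating the reasoning in AI-generated answers
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, Natural language processing, RAG, Generation, Text, 3D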

3D-GIMP: When 3D Gaussian Inpainting Meets PatchMatch

Recent advances in 3D scene editing have leveraged iterative diffusion models to update input views. However,

Use: Generation
Difficulty: Hard
Cost: High

ODeform: Learning Continuous 4D Motion for Shape Deformation with Neural ODEs

Modeling continuous object deformation is important for many computer vision and robotics tasks, such as manip

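Natural language processing, Embedding/retrieval, 3D

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, Natural language processing, Large language models, Generation, Video, Reinforcement learning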

PercepCap: Video Captioner with Structured Spatio-Temporal Perception

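Spatial and temporal understanding is important for video captioning. By decomposing the video input into structured spatio-temporal perception, PercepCap improves the quality of the generated captions and detects spatio-temporal errors more accurately,

Use: Structured spatio-temporal understanding for video captioning
Difficulty: Hard
Cost: High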

Diverse-Intent Multi-Turn Fashion Image Retrieval

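Multi-turn fashion image retrieval is an important task in real-world fashion search. Diverse-Intent Multi-Turn Fashion Image Retrieval handles differing search intents

Use: Multi-turn fashion image retrieval
Difficulty: Hard
Cost: High

Sensor/time series, Quality prediction/anomaly detection, Natural language processing, Large language models, Generation, Image, Text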

RS-RIE-Bench: Benchmarking Reasoning-Guided Remote Sensing Image Editing

Remote sensing image editing aims to modify remote sensing images according to natural language instructions w

Use: Generation
Difficulty: Hard
Cost: High

Development of an automated, reliable, and clinically meaningful artificial intelligence (AI) tool for diagnosing cardiac disease from conventional cardiovascular magnetic resonance (CMR) images

Aims: Cardiovascular magnetic resonance (CMR) imaging enables non-invasive assessment of myocardial structure,

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

Natural language processing, Prompt engineering, Detection, Image, Text

OffNadirLoc: Benchmark and Framework for Challenging UAV-to-Satellite Geo-Localization under Large Off-Nadir Views

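OffNadirLoc proposes a benchmark for cross-view geo-localization. It makes it possible to focus on structural scene understanding and inter-domain relational constraints, which are important when geo-localizing between UAV and satellite imagery.

Use: Improving UAV-to-satellite geo-localization
Difficulty: Hard
Cost: High

Suited to MI, Quality prediction/anomaly detection, Natural language processing, Large language models, Generation, Image, Text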

ETPDesigner: Multi-Agent Orchestration for Interactive Multimodal Electronic Theater Program

ETPDesigner proposes a framework that automates the design of multimodal electronic theater programs.

Use: Generation
Difficulty: Hard
Cost: High

WearWow: Native 2K Multi-Garment Virtual Try-On via Adaptive Token Packing and Preference Alignment

Synthesizing native 2K multi-garment virtual try-on is a formidable frontier in digital fashion, critically bo

Quality prediction/anomaly detection, Natural language processing, RAG, Generation, Text

Use: Generation
Difficulty: Hard
Cost: High

LoRFT: Benchmarking Long-Range Vehicle Trajectory Reconstruction from Fixed Highway Cameras

Long-range vehicle trajectories provide important spatio-temporal evidence for traffic safety analysis, autono

Natural language processing, RAG, Detection, Video

Use: Detection
Difficulty: Hard
Cost: High

MV-Bench: Benchmarking Multimodal Large Language Models for Coordinated Multi-View Interface Construction

Multimodal large language models (MLLMs) are increasingly expected to automate visualization development by ge

Use: Generation
Difficulty: Hard
Cost: High

LAVIFT: Latent-Action-Guided Vision Fine-Tuning for Surgical Interaction Recognition

Understanding instrument-tissue interactions is essential for context-aware surgical AI and autonomous robotic

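Natural language processing, Fine-tuning, Classification, Detection, Image

Use: Classification
Difficulty: Hard
Cost: High

Natural language processing, Fine-tuning, Image, Video, Multimodal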

EA-Nav: Learning Safe Visual Navigation Policies with Embodiment Awareness

Cross-embodiment navigation is a key challenge in embodied intelligence. Due to differences in embodiment, the

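Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

Suited to small data, Easy to try on CPU, Condition optimization, Natural language processing, Fine-tuning, Detection, Generation, Image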

PRISM-DR: Per-lesion Retinal Inference with Specialist Models for Diabetic Retinopathy

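This study develops PRISM-DR, a system for detecting diabetic retinopathy lesions. It helps find small, low-contrast lesions that clinicians may miss.

Use: Detecting diabetic retinopathy lesions
Difficulty: Hard
Cost: High

Natural language processing, Large language models, Segmentation, Image, Text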

Memory-Augmented Multimodal Large Language Models for Small Object Understanding in Streaming Aerial Videos

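This study develops a memory-augmented multimodal large language model for recognizing small objects in drone footage. The model can identify objects according to user instructions in complex aerial scenes.

Use: Object recognition from drones
Difficulty: Hard
Cost: High

Suited to small data, Quality prediction/anomaly detection, Natural language processing, Prompt engineering, Video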

MoAKE: Toward Unified All-in-One Action Quality Assessment via Mixture of Action Knowledge Experts

Action Quality Assessment (AQA) aims to objectively evaluate performance quality from action videos. Most exis

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

Trace: A Taxonomy-Guided Environment for Multidomain Visual Reasoning

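This study presents Trace, a taxonomy-guided environment for multidomain visual reasoning.

Natural language processing, RAG, Image, Text, Multimodal

Use: Multidomain visual reasoning
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, Natural language processing, Fine-tuning, Generation, Segmentation, Image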

Extending a Large View Synthesis Model for Multi-view Panoptic Segmentation

This study extends a large view synthesis model to multi-view panoptic segmentation.

Use: Multi-view panoptic segmentation
Difficulty: Hard
Cost: High

SafeGen: Goal-Conditioned Video Diffusion of Safety-Critical Scenarios for VLM-Based Autonomous Driving

VLMs are increasingly deployed in AD systems, creating an urgent need for rigorous safety evaluation under rar

Natural language processing, RAG, Generation, Image, Text

Use: Generation
Difficulty: Hard
Cost: High

NavVerse: Benchmarking Indoor-to-Outdoor Embodied Navigation in Continuous Robot Simulation

Robots deployed in delivery, campus, and emergency-response settings often need to navigate from buildings to

Natural language processing, Prompt engineering, Text

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: Medium

Decentralized UAV Swarms for Ground Target Protection in GPS- and Communication-Denied Environments

The presence of UAVs in military operations has recently increased, also increasing the demand for defense sys

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: Low

Distributed Motion Planning with Safety Guarantees for Self-Reconfiguring Robotic Boats

Aquatic self-reconfigurable robots must assemble into desired shapes while ensuring safe interactions among mu

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: Low

DINS-IO: Learned Inertial Odometry via Differentiable INS Consistency

The training of learned inertial odometry depends on dense, high-precision position ground truth from motion c

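Natural language processing, Fine-tuning, Image, Self-supervised

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

Natural language processing, Prompt engineering, Detection, Image, Text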

ReferTrack: Referring Then Tracking for Embodied Visual Tracking

ReferTrack is a system for following a target specified in natural language. It first identifies the referred target and then tracks it.

Use: Embodied visual tracking of a referred target
Difficulty: Hard
Cost: High

Robots Acquire Manipulation Skills in Seconds from a Single Human Video

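HOST is a system that lets robots acquire skills quickly from human demonstrations. The robot learns a skill from a single human video while retaining the skills it has already acquired.

Natural language processing, RAG, Video

Use: Rapid robot skill acquisition from human video
Difficulty: Hard
Cost: High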

Optimal Placement of Docking Stations and Resident AUVs for Subsea Pipeline Inspection

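The system is used to inspect subsea pipelines. It is a mixed-integer linear programming framework covering both the placement of subsea docking plates (SDPs) and the assignment of resident autonomous underwater vehicles (AUVs) to potential leak locations

Use: Inspection planning for subsea pipelines
Difficulty: Hard
Cost: Low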

What Matters in Humanoid General Motion Tracking? An Empirical Study

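This study develops policies for general motion tracking on humanoid robots. The policies let a humanoid follow whole-body reference motions while keeping its balance.

Natural language processing, Prompt engineering

Use: Policy development for general humanoid motion tracking
Difficulty: Hard
Cost: High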

V2F: Vision-Informed Grasp Force Prediction for Damage-Aware Robotic Handling of Date Fruits

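V2F is a system for robotic handling of date fruits. It predicts the grasp force the robot needs so that it avoids damaging the fruit or failing the grasp.

Natural language processing, RAG, Segmentation, Image

Use: Robotic handling of date fruits
Difficulty: Hard
Cost: Low

Hugging Face: available, 2026-07-22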

NexForge: Scaling Agent Capabilities through Requirement-Driven Task Synthesis for LLMs

Scaling executable agent training data for LLM post-training is bottlenecked by substrate-bound methods that t

Use: Generation
Difficulty: Easy
Cost: High

GitHub: available, 2026-07-22

opencompass — OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2,GPT-4,LLaMa2, Qwen,GLM, Claude, etc) over 100+ datasets.

A platform for evaluating LLMs, supporting a wide range of models and datasets.

Use: LLM evaluation
Difficulty: Easy
Cost: High

GitHub: available, 2026-07-22

atomic-agents — Building AI agents, atomically

A library for building AI agents.

Use: Building AI agents
Difficulty: Easy
Cost: High

GitHub: available, 2026-07-22

Finance-LLMs — Comprehensive Compilation of Real-World LLM & AI Agent Use Cases in Financial Services

A compilation of real-world LLM and AI agent use cases in financial services.

Use: Surveying LLM use cases in financial services
Difficulty: Easy
Cost: High

Optimizing Regret

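Through the covariance between the decision function and the cost function, optimizing the loss function makes it possible to choose appropriate actions. Building on this, the paper considers how to optimize that covariance, which helps derive models with accurately predicted outcomes

Use: Optimizing loss functions for appropriate decision-making
Difficulty: Hard
Cost: High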

Towards chemistries in dynamical systems

The paper proposes a way to describe dynamical systems in chemical terms, and the system's

Use: Chemical descriptions of dynamical systems
Difficulty: Hard
Cost: Medium

Adaptive Capitulation: A Structural Failure Mode of LLM Responses in Vulnerability Contexts

Large language models operating in emotionally sensitive contexts face a structural trilemma: when users in vu

Use: Technical verification and paper-reading aid
Difficulty: Hard
Cost: High

Task Competence Is Not Instruction Following: Evaluating Instruction-Conflicting Behavior in Small Language Models

Instruction tuning is meant to make language models follow user requests, yet it is unclear whether small mode

NLP, Fine-tuning, Classification, QA, Text

Use case: Classification
Difficulty: Hard
Cost: Low

Scaling Laws for Hypernetwork-Based Knowledge Injection in Large Language Models

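Proposes a hypernetwork-based knowledge injection method and examines how to reliably inject knowledge into large language models.

NLP, LLMs, Anomaly detection, Text

Use case: Injecting knowledge into LLMs
Difficulty: Hard
Cost: High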

Twin Agent: Context Residual Compression for Privilege Separated Agents

Large language model (LLM) agents are vulnerable to security risks, such as prompt injection attacks from untr

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

When Reasoning Narrows the Move: Diversity Collapse in LLM Game Play

Supervised fine-tuning (SFT) is widely used to adapt large language models to downstream tasks, but its effect

Use case: Generation
Difficulty: Hard
Cost: High

Copy Less, Ground More: Overcoming Repetitive Copying in Long-Context Reasoning via Evidence-Aware Reinforcement Learning

Large language models that generate step-by-step reasoning traces have achieved strong performance on complex

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Agents in the Wild: Where Research Meets Deployment

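Agents that draw on large language models (LLMs) and context are being used in product development and finance. Putting agents into practice requires ensuring robustness, safety, and reliability. This tutorial

Use case: Putting agents into practice
Difficulty: Hard
Cost: High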

Two-Level Meta-Rubrics for Evaluating Open-Ended Generation: GAMUT, a Benchmark for Factual Completeness

Evaluating the factuality of long-form generations has focused predominantly on precision, measuring whether t

Use case: Generation
Difficulty: Hard
Cost: High

Suited to tabular data, NLP, LLMs, Text, Tabular

Prompt Design at Scale: How Format, Instruction Count, and Context Length Shape Instruction Adherence and Hallucination in Large Language Models

Practitioners make three prompt-design decisions with almost no controlled evidence behind them: how to format

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

MeetingToM: Evaluating Multimodal LLMs on Theory-of-Mind Reasoning in Multi-Party Meetings

Theory of Mind (ToM), the ability to infer other's beliefs, intentions, and states of knowledge, is central to

NLP, LLMs, QA, Text, Audio

Use case: QA
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, NLP, LLMs, Translation, Text, Reinforcement learning

The Price of Reasoning: Cost-Quality Tradeoffs in Reinforcement Learning for Neural Machine Translation

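This study proposes an evaluation method for student team tabletop exercises (TTX), using a TTX learning platform that can record team behavior and communication in complex, open-ended situations.

Use case: Evaluating team problem-solving skills in computing education
Difficulty: Hard
Cost: High

Explainable, Quality prediction/anomaly detection, NLP, LLMs, Generation, Text, Reinforcement learning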

Beyond Score Prediction: LLM-Based Essay Scoring and Feedback Generation via Reinforcement Learning with Rubric Rewards

Large language models (LLMs) have been widely applied to automated essay scoring (AES) and automated feedback

Use case: Generation
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, NLP, Fine-tuning, Translation, Text, Reinforcement learning

Reasoning Before Translation: Enhancing Legal Machine Translation with Structured Reasoning

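This study proposes PINNs (physics-informed neural networks) with elliptic boundary conditions for computing mean escape time, using PINNs that satisfy the balance equations, and compares the PINN computations with laboratory data.

Use case: PINNs with elliptic boundary conditions for computing mean escape time
Difficulty: Hard
Cost: High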

Automated Extraction of Techno-Economic Data from 76,000 Energy System Studies

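The reliability of energy system modeling depends on quantitative assumptions, but sources for these assumptions are scarce. Problems such as slow database updates lead researchers to duplicate effort. This work targets data from more than 76,000 energy system studies with automated

Use case: Automated extraction of energy system research data
Difficulty: Hard
Cost: Medium

Quality prediction/anomaly detection, NLP, Embedding and retrieval, Classification, Generation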

Supra Cognitive Modes: A Routed Architecture for Agent Memory

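Agent memory workloads combine direct fact lookup, reasoning over relation chains and current state, and synthesis across long histories; this study develops Supra Cognitive Modes for such workloads. With this architecture

Use case: Memory architecture design
Difficulty: Hard
Cost: Low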

Benchmarking Human and Automatic Speech Recognition of Diverse Speech: Initial Results

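Human hearing is regarded as the gold standard, and speech recognition still awaits a system that surpasses human listeners. No such system exists yet, and humans serve as the reference for benchmarking speech recognition systems

Sensor/time series, NLP, Classification, Audio

Use case: Benchmarking recognition of diverse speech
Difficulty: Hard
Cost: Low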

Computational Humor with Multimodal LLMs: Methods, Datasets, Evaluation, and Challenges

Multimodal humor in memes, cartoons, and comics remains difficult for AI systems because intended meaning depe

NLP, LLMs, Classification, Generation, Image

Use case: Classification
Difficulty: Hard
Cost: High

MedDDC-Eval: Diagnosis-Decoupled Evaluation of Multi-Turn Medical Consultation Agents

Multi-turn medical consultation agents must decide what to ask, adapt to patient responses, and determine when

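Explainable, Quality prediction/anomaly detection, NLP, RAG, Generation

Use case: Generation
Difficulty: Hard
Cost: Low

Quality prediction/anomaly detection, NLP, LLMs, Classification, Detection, Generation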

AutoJourn: Multi-Perspective Summarisation, Bias Detection and Bias Neutralisation for LLM-Generated News in Automated Journalism

We present AutoJourn, a demonstration system for multi-perspective news generation and bias-aware evaluation u

Use case: Classification
Difficulty: Hard
Cost: High

Measuring Reward-Seeking via Contrastive Belief Updates

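This study proposes a new method for quantifying reward-seeking in reinforcement learning. The method can show how a model behaves in trying to obtain reward.

Use case: Measuring reward-seeking in reinforcement learning
Difficulty: Hard
Cost: High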

Reasoning Error from Known Fact: Step-Level Self-Consistency Group Relative Policy Optimization for LLM

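People use large language models for long chains of logical reasoning, but the results of such reasoning may be incorrect. This work proposes a method for verifying these outputs.

Use case: Verifying logical reasoning in large language models
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, NLP, LLMs, Generation, QA, Text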

AILQA: Evaluating AI-Driven Legal Question Answering Systems for the Indian Legal System

This comprehensive study introduces an advanced Artificial Intelligence for Indian Legal Question Answering (A

Use case: Generation
Difficulty: Hard
Cost: High

CASE: Causal Alignment and Structural Enforcement for Improving Chain-of-Thought Faithfulness

Chain-of-thought (CoT) reasoning is widely used to improve both the performance and interpretability of large

Explainable, NLP, LLMs, Generation, Text

Use case: Generation
Difficulty: Hard
Cost: High

AI Tour Meeting: Group Travel Planning by LLM Agents

This paper proposes AI Tour Meeting, a group travel planning framework powered by multiple Large Language Mode

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Measuring AI innovation with trademark data

Researchers, managers and policymakers are exploring different approaches and data sources to map the developm

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

BaseRT: Advancing Best-in-Class LLM Inference with Apple M5 Neural Accelerators

Apple's M5 generation introduces a redesigned GPU architecture in which every core carries a dedicated Neural

Use case: Generation
Difficulty: Hard
Cost: High

AgentDebugX: An Open-Source Toolkit for Failure Observability, Attribution, and Recovery in LLM Agents

LLM agent failures are difficult to debug because the step where an error surfaces is often not the one that c

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Find Before You Fine-Tune: A Diagnostic Study of Small LLMs for Cybersecurity QA

Large Language Models (LLMs) are increasingly fine-tuned for critical-domain Question-Answering (QA), yet choo

Use case: Classification
Difficulty: Hard
Cost: High

Semantic Primes as Explanans for Emotion in Large Language Models

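Studies the interpretation of emotion in large language models (LLMs), asking how emotion expressions can be explained by underlying subjective variables.

Use case: Emotion interpretation
Difficulty: Hard
Cost: High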

Fusion Embedding: A Unified Embedding Space for Text, Image, Video, and Audio

A single embedding space that covers text, images, video, and audio lets one index serve every query a user ca

Use case: Generation
Difficulty: Hard
Cost: High

Stochastic Meta-Unlearning: Bridging Language Backbone and Multimodal Unlearning

Machine unlearning for vision-language models (VLMs) remains underexplored. Unlike language models, VLMs combi

NLP, RAG, Image, Text, Multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

AutoIndex: Learning Representation Programs for Retrieval

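Proposes a framework for learning programs for retrieval and uses those programs to build a search system that attaches labels to documents.

Quality prediction/anomaly detection, NLP, RAG, Text

Use case: Learning programs for retrieval
Difficulty: Easy
Cost: Low

Sensor/time series, NLP, LLMs, Image, Text, Video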

D3VL: Understanding Driving Scenes from 3D Time Series Data and Video with Language Models

Recent advances in Multimodal Large Language Models (MLLMs) have triggered the development of end-to-end MLLMs

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Crowd4D: Scene-Aware Monocular 4D Crowd Reconstruction

Recovering scene-consistent 4D crowd motion from monocular video in large-scale scenes remains challenging due

NLP, RAG, Image, Video, 3D

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Suited to MI, NLP, Fine-tuning, Generation, Image, Text

ExpertVerse: A General-Purpose Benchmark for Expert-Level Reasoning in Knowledge-Intensive Visual Synthesis

Recent advances in multimodal generative models have enabled instruction-based image generation to move beyond

Use case: Generation
Difficulty: Hard
Cost: High

Suited to MI, Quality prediction/anomaly detection, NLP, LLMs, Image, Audio, Video

OmniReasoner: Thinking with Long Audio-Video via Native Tool Use

Proposes OmniReasoner, a method that combines the original data with a zoom-in tool. This improves reasoning over long audio-video by omni-modal LLMs.

Use case: Improving reasoning over long audio-video
Difficulty: Hard
Cost: High

PathAgentBench: Benchmarking Evidence-Seeking Vision-Language Models on Whole-Slide Pathology Image

Whole-slide image (WSI) diagnosis requires identifying diagnostically relevant regions, examining them across

NLP, Fine-tuning, Detection, Generation, Image

Use case: Detection
Difficulty: Hard
Cost: High

Anatomy-Aware 3D Mesh Refinement of Pericardium Segmentations on Computed Tomography

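Pericardium segmentation is important for measuring esophageal thickening, but accurate segmentation is difficult. To address this, the work proposes a method that uses the surrounding anatomical structures to improve pericardium segmentation.

NLP, RAG, Segmentation, Image, Text

Use case: Accurately segmenting the pericardium from cardiac CT images
Difficulty: Hard
Cost: High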

CRB-Driven Beamforming and Trajectory Optimization for UAV-assisted ISAC System

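Builds a UAV-assisted ISAC system and proposes CRB-based beamforming and trajectory optimization methods to optimize the system's operation.

Sensor/time series, NLP, RAG, Reinforcement learning

Use case: UAV-assisted ISAC system
Difficulty: Hard
Cost: Low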

ModPack: An Extensible Teleoperation Interface for Bimanual Mobile Manipulation

Existing teleoperation systems are often tailored to specific robot hardware and task domains, limiting their

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

NLP, Prompt engineering, Image, Text, Video

WorldScape Policy 2.0: Empowering Steerable World Action Modeling with Reasoning-Augmented Memory

World Action Models (WAMs) are a paradigm for modeling robot manipulation. WAMs jointly model visual state transitions and robot actions. However, in existing WAMs, a fixed time

Use case: Multi-purpose manipulation
Difficulty: Hard
Cost: High

Correct-by-Construction Behavior Tree Synthesis from Signal Temporal Logic Specifications with Application to Robotic Missions

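Behavior trees are widely adopted for executing complex robot tasks and provide modular, reactive control. However, existing correct-by-construction synthesis methods are limited to linear temporal logic (LTL) and so cannot express quantitative timing constraints. This paper

Use case: Correct-by-construction synthesis of behavior trees
Difficulty: Hard
Cost: Low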

RoboInter1.5: A Holistic Intermediate Representation Suite for Embodied World Modeling and Robotic Manipulation

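Many existing robot datasets are costly, highly embodiment-specific, and lacking in fine-grained structure. To address these problems, RoboInter1.5 is proposed, building on RoboInter1.0. RoboIn

Explainable, NLP, RAG, Segmentation

Use case: Embodied world modeling
Difficulty: Hard
Cost: Low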

How defensive driving enhances driving safety: A driving simulator study on drivers' defensive driving behaviors

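"Defensive driving" is a type of safe-driving skill. However, research on how defensive driving affects driving safety, and through what mechanisms, is lacking. This study covers the behavioral characteristics of defensive driving and safe driving

Use case: Driving that prioritizes occupant safety
Difficulty: Hard
Cost: High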

On the Limits of Sampling-Based Reachability: Geometry, Dynamics, and Sample Complexity

Reachability analysis is central to safety-critical control, robotics, and neural network verification, but cl

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

Quality prediction/anomaly detection, NLP, LLMs, Detection

LLM Detection as an Intervention: Downstream Impact under Strategic User Behavior

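As LLMs become widely used, tools for detecting LLM use are being developed. However, detection systems influence user behavior. That is, when a detection system fails, users may turn to another system, and the final

Use case: LLM detection
Difficulty: Hard
Cost: High

huggingface, Hugging Face available, 2026-07-21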

FinanceComplexQA: Benchmarking Agentic Reasoning on Industrial-grade Financial Documents

Agentic Reasoning has become a transformative force in financial analysis due to its ability to integrate larg

Quality prediction/anomaly detection, NLP, RAG, Generation, Summarization, Text

Use case: Generation
Difficulty: Easy
Cost: Low

huggingface, Hugging Face available, 2026-07-21

Moving Alphabet: A Controlled Study of Training Data for Text-to-Video Generation

Text-to-video generation has advanced significantly over the past five years through scaling of model size, da

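Quality prediction/anomaly detection, NLP, Fine-tuning, Classification, Generation, Text

Use case: Classification
Difficulty: Easy
Cost: High

huggingface, GitHub available, Hugging Face available, 2026-07-21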

Delineate Anything v2: A Global Foundation Model for Field Delineation

Accurate agricultural field boundary delineation at large scale is a foundational task for food security, supp

NLP, RAG, Image, Text

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: Low

github, GitHub available, 2026-07-21

agent-starter-pack — Ship AI Agents to Google Cloud in minutes, not months. Production-ready templates with built-in CI/CD, evaluation, and observability.

Deploys AI agents to Google Cloud, with production-ready templates that include built-in CI/CD, evaluation, and observability.

Use case: Deploying AI agents to Google Cloud
Difficulty: Easy
Cost: High

github, GitHub available, 2026-07-21

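BettaFish — 微舆 (Weiyu): a multi-agent public-opinion analysis assistant anyone can use. It breaks information cocoons, restores the full picture of public opinion, predicts future trends, and supports decision-making. Built from scratch with no framework dependencies.

Weiyu is a multi-agent public-opinion analysis assistant that anyone can use; it breaks information cocoons, restores the full picture of public opinion, predicts future trends, and supports decision-making.

Use case: Public-opinion analysis
Difficulty: Easy
Cost: High

github, GitHub available, 2026-07-21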

TextBlob — Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.

A library for text analysis, including sentiment analysis and tokenization.

NLP, Text, Audio

Use case: Text analysis
Difficulty: Easy
Cost: Medium

Unveiling Invariant and Transferable Latent Factors Across Heterogeneous Environments via ATLAS

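Aims to propose a method that models common factors while accounting for environment-dependent factors, in order to analyze factors shared across different environments.

NLP, RAG, Regression

Use case: Analyzing common factors across different environments
Difficulty: Hard
Cost: Low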

Calibrating Semantic Uncertainty from Observable Language-Model Probabilities

Language models produce probabilities over words, but professional decisions require uncertainty over meaningf

NLP, RAG, Classification, Text

Use case: Classification
Difficulty: Hard
Cost: Low

Calibrated Alzheimer's Conversion Risk in Mild Cognitive Impairment: Persistent Homology of Clinical Trajectories with Conformal Guarantees

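This paper proposes an approach to Alzheimer's disease prediction based on persistent homology and conformal guarantees. To predict Alzheimer's disease, the approach analyzes temporal trajectories

Explainable, NLP, RAG, 3D

Use case: Predicting Alzheimer's disease
Difficulty: Hard
Cost: High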

The Story Shapes the Agent: Narrative Priors in LLM Behavior

Persona prompting is widely used to steer LLM agent behavior, yet the narrative framing of a task can matter m

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Search-on-Graph-R1: Training Large Language Models to Search Knowledge Graphs with Reinforcement Learning

Knowledge graph question answering (KGQA) requires navigating from topic entities to an answer several relatio

NLP, LLMs, QA, Text, Reinforcement learning

Use case: QA
Difficulty: Hard
Cost: High

Suited to tabular data, Suited to MI, NLP, LLMs, Text

Structured Output Collapses Answer Diversity Across 44 Language Models

When a language model must choose one answer from a large space of equally valid options, a format clause -- "

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Explainable, Quality prediction/anomaly detection, NLP, RAG, Generation, Image, Text

PathReportEval: A Systematic Benchmark for Pathology Report Generation

Pathology report generation from whole-slide images (WSIs) is a rapidly growing multimodal learning problem, y

Use case: Generation
Difficulty: Hard
Cost: High

Using Fine-Tuned LLMs to Identify Indicators of Vulnerability in UK Police Incident Logs

Purpose: Understanding how much of routine policing involves vulnerable people could inform resourcing, traini

NLP, LLMs, Classification

Use case: Classification
Difficulty: Hard
Cost: High

Computational models of pragmatic reasoning with flexible generation of meaning and expression alternatives

Pragmatic language use requires reasoning about alternatives: the alternative expressions a speaker might have

NLP, RAG, Generation, Text

Use case: Generation
Difficulty: Hard
Cost: Low

Relay-Bench: Evaluating LLMs on Multi-Domain Reasoning Chains

Introducing Relay-Bench, an unsaturated, holistic, text-only benchmark that measures LLMs' ability to complete

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Building a European Multilingual Evaluation Dataset: The MMLU Localisation Project within the EMT Network

This paper reports on a collaboration between the Directorate-General for Translation (DGT) and the European M

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Enabling Multilingual Privacy Policy Audits: Large-Scale Analysis of Spanish Mobile Apps

Automated analyses of privacy policies enable large-scale assessments of transparency in digital ecosystems, y

Use case: Classification
Difficulty: Hard
Cost: High

Automated Discovery Has No Universally Superior Harness

Autonomous discovery systems such as OpenEvolve and TTT-Discover are often used as general-purpose harnesses.

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

It's Not What You Say, It's How You Say It: Evaluating LLM Responses to Expressions of Belief

Users frequently express their beliefs to large language models (LLMs). In some situations, the LLM should acc

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Suited to MI, Sensor/time series, NLP, LLMs

VEHBench: A Stage-Local Diagnostic Benchmark for LLM-Assisted Vibration Energy Harvester Design

Battery-free Internet of Things (IoT) requires iterative design of vibration energy harvesters (VEHs) under co

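Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Explainable, Quality prediction/anomaly detection, NLP, Fine-tuning, Detection, Anomaly detection, Text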

O-VAD: Industrial Video Anomaly Detection through Object-Centric Tracking and Reasoning

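Proposes a machine learning model designed to detect anomalies in factories. Conventional methods consider all content in a video and struggle with complex problems. The proposed approach detects objects and

Use case: Anomaly detection in industrial video
Difficulty: Hard
Cost: High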

How Does Alignment Tuning Shape Representations of Sycophancy and Related Cue-Induced Biases in LLMs?

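The study investigates the root cause of incorrect answers in LLMs. Examining models across 5 families and 7 types of BCT bias revealed specific patterns inside the models. These patterns are the root cause of the incorrect answers.

Suited to small data, NLP, LLMs

Use case: Identifying the root cause of incorrect answers in LLMs
Difficulty: Hard
Cost: High

Explainable, NLP, RAG, Generation, Text, Multimodal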

STeP: Signal Temporal Logic for Precise Specifications for Action Generation with Vision Language Models

Vision-language-action (VLA) models have shown impressive generalization, but often lack interpretability and

Use case: Generation
Difficulty: Hard
Cost: High

arxiv, GitHub available, 2026-07-20

UniETP: Unifying Environments for Generalizable Embodied Task Planning

This paper focuses on the problem of Embodied Task Planning, where an agent is required to execute a sequence

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

RoboHarness: Memory-Driven Orchestration of Heterogeneous Robot Policies for Long-Horizon Planning

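Proposes RoboHarness, a memory-driven orchestration method that addresses the limitations of existing robot control methods and enables long-horizon planning.

NLP, Prompt engineering, Anomaly detection

Use case: Long-horizon robot planning
Difficulty: Hard
Cost: High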

Remote Awareness of Seafloor Images Collected by AUVs over Low-Bandwidth Communication Links

This paper introduces a method for real-time processing and transmission of autonomous underwater vehicle (AUV

NLP, RAG, Image

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

Quality prediction/anomaly detection, NLP, LLMs, Generation, Text, Video

FARO: Feasibility-Aware Robot Motion Optimization

Fast planning of novel behaviors in unseen scenarios remains a fundamental challenge in robotics. The high-dim

Use case: Generation
Difficulty: Hard
Cost: High

Distilling Global Traversability Priors for Image-based Affordance Prediction in Off-road Environments

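Proposes a global traversability prior extraction method that addresses the limitations of existing robot navigation methods, for robot mobility in off-road environments

Sensor/time series, NLP, RAG, Image, 3D

Use case: Robot mobility in off-road environments
Difficulty: Hard
Cost: High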

Manifold-Guided Motion Planning for Tight Assemblies

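Motion planning for assembling objects under tight geometric constraints is an important challenge in robotics. Feasible assembly motions lie in (near) zero-density configurations; when objects are strongly constrained by contact, traversing the path

Use case: Tight assembly
Difficulty: Hard
Cost: Medium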

Leveraging Two Robotic Arms for Tight Assembly Performance Gains

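This study provides an end-to-end framework that improves tight-assembly performance by using two robot arms simultaneously. The robot arms, given the CAD models and placed in the desired assembly state

Quality prediction/anomaly detection, NLP, RAG, Video

Use case: Assembly with two robot arms
Difficulty: Hard
Cost: High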

Lifelong Localization in Dynamic Indoor Environments Combining Odometry with Sparse Distance Sampling

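Localization of autonomous robots is a primary task in robot navigation. Robots often face unpredictable, non-static obstacles or enter unknown environments. This study combines robot odometry with distance sampling to

Sensor/time series, NLP, RAG, Detection, 3D

Use case: Localization of autonomous robots
Difficulty: Hard
Cost: High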

Receiver-Centered Robot-to-Human Handover with Grasp-Aware Object Orientation

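Collaborative robots share a workspace with human operators, and safety-critical micro-events such as robot-to-human handovers occur frequently. However, conventional static handovers, when handling asymmetric industrial tools, lead to unnatural grasps

NLP, LLMs, Classification, 3D

Use case: Tool handover
Difficulty: Hard
Cost: High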

Reasoning as a Double-Edged Sword: Architecture and Cross-Stage Robustness in Vision-Language-Action Models

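This study uses three models with the aim of adapting vision-language-action models to unperturbed targets. The three models are direct observation-to-action mapping, textual chain-of-thought, and latent iterative

NLP, RAG, Text, Multimodal

Use case: Adapting vision-language-action models to unperturbed targets
Difficulty: Hard
Cost: High

Sensor/time series, Quality prediction/anomaly detection, NLP, Fine-tuning, Detection, Image

arxiv, GitHub available, 2026-07-20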

Polar Coordinate-based Differential Evolution for Moving Target Search Using Vision Sensor on Unmanned Aerial Vehicles

In search and rescue operations, there is a period known as the "golden time" during which the probability of

Use case: Detection
Difficulty: Easy
Cost: Medium

VLN-AVP: Zero-Shot Vision-Language Navigation with Hybrid Long-Short-Term Memory for Autonomous Valet Parking

Existing methods in Autonomous Valet Parking (AVP) typically rely on pre-built maps, which severely restricts

NLP, RAG, Image, Text, Multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Generalize and Guide: Decomposing Rewards for Few-Shot Inverse Reinforcement Learning

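Inverse reinforcement learning that provides explainability across multiple tasks helps solve complex tasks; this study concerns a new approach to inverse reinforcement learning for cross-task explainability

Suited to small data, NLP, RAG, Reinforcement learning

Use case: A new inverse reinforcement learning approach for explainability across multiple tasks
Difficulty: Hard
Cost: Low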

HCPG-Flow:Hierarchical Contact-Progress Guidance for Flow-Policy Robot Manipulation

Flow policies can represent multimodal action distributions for robot manipulation, yet a robot must execute o

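NLP, Embedding and retrieval, Multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Sensor/time series, Quality prediction/anomaly detection, NLP, RAG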

Configuration-Induced Passive Self-Rotation for Perception-Enhanced Autonomous Flight

Autonomous flight in confined and cluttered environments is fundamentally limited by the restricted field of v

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

Sensor/time series, NLP, Embedding and retrieval, Image, Multimodal

COLIP-2: Olfaction-Vision-Language Embeddings

The Contrastive Olfaction-Language-Image Pre-training 2 (COLIP-2) model is a multimodal embeddings space that

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

The Shared Discovery Paradox: How a One-Answer Rule Turns Better Information into Worse Search

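Within organizations, information is shared and colleagues often act on it. The researchers note that such sharing can make search more accurate than it was before sharing. However, through such sharing, search

Use case: Search problems
Difficulty: Hard
Cost: Low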

When One Good Is Not Enough: EF1 and Pareto Optimality Are Not Compatible for Submodular Valuations

One of the central questions in discrete fair division is whether fairness and efficiency can be achieved simu

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

huggingface, GitHub available, Hugging Face available, 2026-07-20

SciForma: Structure-Faithful Generation of Scientific Diagrams

Structural fidelity is essential to scientific methodology diagrams. To communicate research logic, these diag

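Quality prediction/anomaly detection, NLP, LLMs, Generation, Image, Text

Use case: Generation
Difficulty: Easy
Cost: High

Explainable, Quality prediction/anomaly detection, NLP, LLMs, Video, Multimodal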

EduPanel: A Three-Agent LLM Judge for Teaching Videos -- Reliability, Complementarity, and Human Trust Calibration

Teaching videos are becoming a major medium for education, creating a growing need for scalable evaluation of

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

Quality prediction/anomaly detection, NLP, LLMs, Detection, Generation, Segmentation

FlowMimic: Mask-free Visual Editing and Generation with Pixel-pair Warped Flow Field for Online Video Editing Data Generation and Modality Mimicry

In line with the prevailing direction of vision research, we explore the integration of both generation and ed

Use case: Detection
Difficulty: Easy
Cost: High

Explainable, NLP, Fine-tuning, Classification, Generation, Anomaly detection

Token-Level Off-Policy Learning for Faithful Generation Under Distribution Shift

We propose Token-Level Off-Policy Labeling (TOPL), an off-policy training paradigm that reframes post-training

Use case: Classification
Difficulty: Easy
Cost: High

ShotPlan: Cinematic Video Generation with Learnable Planning Token

Current video generation models achieve impressive results in single-shot generation, yet remain limited in ci

Suited to MI, NLP, Embedding and retrieval, Generation, Video

Use case: Generation
Difficulty: Easy
Cost: High

Coercion and Deception in AI-to-AI Management: An Agentic Benchmark of Unprompted Escalation

Multi-agent systems routinely place one AI agent in authority over another. When a subordinate refuses a task,

Use case: Classification
Difficulty: Easy
Cost: High

Scrapegraph-ai — Python scraper based on AI

An AI-based web scraping tool.

Use case: Natural-language-driven web scraping
Difficulty: Easy
Cost: High

ludwig — Low-code framework for building custom LLMs, neural networks, and other AI models

Ludwig is a low-code framework for customizing and building LLMs (large language models). It makes it easier for users to build and train custom LLMs.

Use case: Low-code framework for customizing and building LLMs
Difficulty: Easy
Cost: High

OpenLLM — Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI compatible API endpoint in the cloud.

Runs open-source LLMs as OpenAI-compatible API endpoints in the cloud.

Use case: Serving LLMs as cloud APIs
Difficulty: Easy
Cost: High

BentoML — The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

A library for serving models.

NLP, LLMs, Generation, Multimodal

Use case: Model serving
Difficulty: Easy
Cost: High

Open-dLLM — Open diffusion language model for code generation — releasing pretraining, evaluation, inference, and checkpoints.

Open-dLLM releases an open diffusion language model for code generation, including pretraining, evaluation, inference, and checkpoints.

Use case: Code generation
Difficulty: Easy
Cost: High

AI-Papers-of-the-Week — 🔥Highlighting the top ML papers every week.

Highlights the latest machine learning research papers.

Use case: Latest ML research papers
Difficulty: Easy
Cost: Medium

compromise — modest natural-language processing

A lightweight natural-language processing library for JavaScript.

NLP, Classification, Audio

Use case: Simplifying natural language processing
Difficulty: Easy
Cost: Low

arxiv, Paper only, 2026-07-19

Econometrics with Pre-Trained Embeddings for Unstructured Data

Unstructured data, such as images and text, are increasingly used in empirical economics. Since training machi

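Suited to tabular data, Quality prediction/anomaly detection, NLP, RAG, Regression, Image, Text

Use case: Regression
Difficulty: Hard
Cost: High

Explainable, Quality prediction/anomaly detection, NLP, LLMs, Generation, Text

arxiv, GitHub available, 2026-07-19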

CoEvoP&R: Co-Evolving Placement Objectives with Routing Feedback via Large Language Models

Analytical placers rely on differentiable objective functions to guide placement, typically combining intermed

Use case: Generation
Difficulty: Hard
Cost: High

huggingface, Hugging Face available, 2026-07-19

TimeLens2: Generalist Video Temporal Grounding with Multimodal LLMs

Video multimodal large language models (MLLMs) can describe what happens in a video, but rarely identify when

NLP, LLMs, Detection, Text, Video

Use case: Detection
Difficulty: Easy
Cost: High

huggingface, Hugging Face available, 2026-07-19

EvolvingWorld: An Open-Schema Framework for Co-Evolving Role-Play Agents and World Model in Interactive Literary World

This paper introduces EvolvingWorld, a framework and benchmark for character and world co-evolution in interac

Use case: Generation
Difficulty: Easy
Cost: High

github, GitHub available, 2026-07-19

testtimescaling.github.io — "what, how, where, and how well? a survey on test-time scaling in large language models" repository

Repository for a survey on test-time scaling in large language models.

Use case: Test-time scaling in large language models
Difficulty: Easy
Cost: High

Semi-Supervised Conditional Generative Learning through Stochastic Interpolation and Sufficient Representations

Conditional generative modeling remains a challenging problem in semi-supervised settings where labeled data i

NLP, RAG, Generation, Supervised, Semi-supervised

Use case: Generation
Difficulty: Hard
Cost: Low

A Causal Markov Condition for Value

This paper proposes a causal independence principle for value -- the value Causal Markov Condition (v-CMC) --

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Isotonic Conformal Prediction

A point prediction that is well calibrated on average can still be systematically biased conditional on its ow

NLP, RAG, Regression

Use case: Regression
Difficulty: Hard
Cost: Low

Amortized Inference for Sampling Distributions Where the Bootstrap Fails

Efron's bootstrap is the default tool for estimating the sampling distribution of a statistic, yet it is prova

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

Quality prediction/anomaly detection, NLP, Fine-tuning

Hybrid Augmented Lagrangian Method for General Constrained Optimization via Evolutionary Algorithms

Constrained Optimization Problems are crucial in fields such as engineering, economics, and robotics, where hi

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

How to Build Marcus's Algebraic Mind: From Minsky's Emotion-Machine Viewpoint

In The Algebraic Mind, Marcus identified three cognitive components: operations over variables, recursively st

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

How to Build Marcus's Algebraic Mind: From Thagard's Brain--Mind Viewpoint

Two critiques of connectionist cognition converge on one missing capacity. In The Algebraic Mind, Marcus isola

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

ADMM-Based Safety-Critical Distributed NMPC for Cooperative Transportation by Quadrupedal Robots

This paper presents a safety-critical distributed nonlinear model predictive control (DNMPC) framework for coo

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

GLidE-SLAM: GL-Accelerated Indirect-Direct Embedded SLAM

With the growing demand for robotics, autonomous drones, and wearable extended reality systems, the deployment

Easy to try on CPU, NLP, RAG, Image

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

An Indoor Navigation System for the Visually Impaired based on UWB Positioning and D* Lite Path Planning Algorithm

This paper proposes an indoor navigation system for the visually impaired, leveraging Ultra-Wideband (UWB) pos

NLP, RAG, Detection, Image

Use case: Detection
Difficulty: Hard
Cost: Low

NLP, Prompt engineering, Classification, Detection, Text

Hazard or Anomaly? Evaluating VLMs for Understanding Dangers and Discrepancies

Modern safety-critical systems increasingly rely on human-robot interaction to reduce disaster risk and suppor

Use case: Classification
Difficulty: Hard
Cost: High

huggingface, Hugging Face available, 2026-07-18

DataFlow-Harness: A Grounded Code-Agent Platform for Constructing Editable LLM Data Pipelines

Large language models (LLMs) are increasingly used to automate data-processing workflows, yet coding agents ty

Use case: Generation
Difficulty: Easy
Cost: High

huggingface, Hugging Face available, 2026-07-18

Environment-free Synthetic Data Generation for API-Calling Agents

Training API-calling large language model (LLM) agents demands massive amounts of high-quality trajectories. H

Use case: Generation
Difficulty: Easy
Cost: High

huggingface, Hugging Face available, 2026-07-18

Can Multimodal Large Language Models Understand OCT?

Optical coherence tomography (OCT) imaging is essential for the diagnosis and treatment of retinal diseases. A

Quality prediction/anomaly detection, NLP, LLMs, Classification, QA, Image

Use case: Classification
Difficulty: Easy
Cost: High

Cluster-Aware Matching via Laplacian Optimal Transport

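This paper proposes PRISA, a framework for assessing traffic safety from the infrastructure side. The framework provides observation of road conditions and evaluation of traffic safety, and by predicting congestion and accidents, safe traffic

Explainable, Quality prediction/anomaly detection, NLP, RAG, 3D

Use case: Developing infrastructure-based monitoring technology to predict traffic congestion and accidents
Difficulty: Hard
Cost: High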

Dimension-invariant uniform consistency of the empirical spatial distribution function and its associated spatial depth estimator

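The spatial distribution function is an important tool in data analysis, but a method for evaluating it accurately is needed. To address this, the work proposes a method for evaluating the spatial distribution function.

Use case: Evaluating spatial distribution functions
Difficulty: Hard
Cost: High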

ASK-NN: An Asymmetric Nearest-Neighbor Test that detects Distribution Drifts in Natural Language

Hallucinations and artificial text in LLM-generated outputs often appear as distributional deviations between

NLP, LLMs, Detection, Text

Use case: Detection
Difficulty: Hard
Cost: High

From Optimal Policies to Individual Differences: Rethinking Reinforcement Learning for Biology

Reinforcement learning (RL) is primarily known as a computational method for optimizing control tasks, but it

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

Evolutionary Algorithm-Guided LLMs for Physics-Informed Neural Network Design

Physics-informed neural networks (PINNs) are unusually sensitive to interacting choices of architecture, activ

Use case: Generation
Difficulty: Hard
Cost: High

Vision-Language-Motion Maps: An Open-Vocabulary, Uncertainty-Aware, Queryable Motion Attribute for 3D Scene Maps

This study attaches motion attributes to a visualized map to analyze dynamic scenes, enabling analysis with motion-attribute filters driven by language queries.

NLP, LLMs, 3D, Multimodal

Use case: Analyzing dynamic scenes on a visualized map
Difficulty: Hard
Cost: High

A Morphing-Designed Hexarotor Prototype combining Practical Resilience and Efficiency

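This study proposes control for a hexarotor with improved multi-joint robot control and kinematics, and shows that it improves the robot's control and safety.

Use case: Control and kinematics of multi-joint robots
Difficulty: Hard
Cost: Low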

Data and Learning Where it Matters for Contact-Rich Manipulation

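This study proposes improved data collection and learning methods for achieving contact-rich manipulation, and the accuracy of robot control

NLP, RAG, Anomaly detection, Reinforcement learning

Use case: Data collection and learning for contact-rich manipulation
Difficulty: Hard
Cost: Low

Suited to small data, Condition optimization, NLP, RAG, Detection, Image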

Embodied Active Learning under Limited Annotation and Navigation Budget for Object Detection

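This study proposes an object detection framework that accounts for constraints on the robot's navigation time and annotation time.

Use case: Adapting object detection
Difficulty: Hard
Cost: Low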

Network-Induced Strategic Communication in Opinion Dynamics

Classical opinion dynamics typically assume a fixed mapping from private opinions to public signals, such as l

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

Strategic Persuasion Through Information Timeliness

We study a dynamic strategic communication problem in which a sender controls the timing of truthful updates f

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

An Exam for Active Observers

Human vision is a closed loop: gaze is continuously redirected by intermediate hypotheses rather than a single

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

Apple-π: Benchmarking Thinking with Video Towards Law-Grounded Physical Intelligence

Modern video generation models are increasingly hailed as emerging world models with an internalized grasp of

NLP, LLM, generation, video

Use case: Generation
Difficulty: Easy
Cost: High

Understanding Reasoning from Pretraining to Post-Training

Reinforcement learning (RL) has become central to improving large language models (LLMs) on complex reasoning

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

Suited to MI, NLP, LLM, generation, image, text

S1-Omni: A Unified Multimodal Reasoning Model for Scientific Understanding, Prediction, and Generation

We present S1-Omni, a unified multimodal reasoning model for scientific understanding, prediction, and generat

Use case: Generation
Difficulty: Easy
Cost: High

Explainable, NLP, LLM, image, text, audio

Audio-Visual Flamingo: Open Audio-Visual Intelligence for Long and Complex Videos

We present Audio-Visual Flamingo (AV-Flamingo), a fully open state-of-the-art audio-visual large language mode

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

NLP, LLM, generation, text, multimodal

generative-ai — Comprehensive resources on Generative AI, including a detailed roadmap, projects, use cases, interview preparation, and coding preparation.

A collection of resources on generative AI.

Use case: Generative AI
Difficulty: Easy
Cost: High

LLM-API-Key-Proxy — Universal LLM Gateway: One API, every LLM. OpenAI/Anthropic-compatible endpoints with multi-provider translation and intelligent load-balancing.

A library that serves as a gateway to a wide range of LLMs.

Use case: LLM gateway
Difficulty: Easy
Cost: High

open_clip — An open source implementation of CLIP.

An open-source implementation of CLIP (Contrastive Language-Image Pre-training).

NLP, prompt engineering, classification

Use case: Image recognition
Difficulty: Easy
Cost: Low

clearml — ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution

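ClearML is an MLOps/LLMOps solution that covers experiment management, data management, pipelines, orchestration, scheduling, and serving.

Use case: MLOps/LLMOps workflow management
Difficulty: Easy
Cost: High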

Awesome-Model-Merging-Methods-Theories-Applications — Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities. ACM Computing Surveys, 2026.

A resource list on model merging for LLMs, outlining methods, theories, and applications.

Use case: LLM model merging
Difficulty: Easy
Cost: High

Design-Based Supervised Learning with Noisy Human Labels

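Proposes and tests a learning approach for handling noisy labels.

Suited to tabular data, NLP, RAG, classification, tabular, supervised

Use case: Design-based learning with noisy labels
Difficulty: Hard
Cost: Low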

Proactive Inpatient Bed Requests for Emergency Department Admissions

Emergency department (ED) boarding occurs when admitted patients remain in the ED while awaiting inpatient bed

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

Data Driven Block Replacement Scheduling

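This study develops a data-driven method by which an operator who maintains $N$ independent, identical machines under a block replacement policy can determine the optimal period $K^*$. The method, based on the period the operator selects

NLP, RAG, text

Use case: Determining replacement schedules for manufacturers
Difficulty: Hard
Cost: Low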

cGAP: Generalized Association Plots with HOMALS-Guided Heatmaps for Visualization of High-Dimensional Categorical Data

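Uses homogeneity analysis by means of alternating least squares (HOMALS) to visualize high-dimensional categorical data, with association tables that aid visualization

Explainable, NLP, fine-tuning, classification, image

Use case: Visualizing high-dimensional categorical data
Difficulty: Hard
Cost: Low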

Tamed Stochastic Gradient Hamiltonian Monte Carlo

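Optimizers play an important role in fitting machine learning models. Building on optimizer development, this study proposes a way to speed up the optimization of machine learning models.

Use case: Developing optimizers for machine learning
Difficulty: Hard
Cost: Low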

Moment-Resolved Readout and Reservoir Diversity in Nonequilibrium Langevin Computing

Nonlinear thermodynamic computers based on Langevin dynamics exploit thermal fluctuations as a physical substr

Use case: Classification
Difficulty: Hard
Cost: High

huggingface, Hugging Face available, 2026-07-16

RESOURCE2SKILL: Distilling Executable Agent Skills from Human-Created Multimodal Resources

Skills are a useful abstraction for software agents, turning human and agent experience into reusable procedur

NLP, RAG, image, text, video

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

github, GitHub available, 2026-07-16

agent-lightning — The absolute trainer to light up AI agents.

A tool for training AI agents efficiently. With Agent Lightning you can set up a trainer and train a model on your data.

Use case: Setting up a trainer for AI agents with little effort
Difficulty: Easy
Cost: High

Heavy-Tailed Flow Matching via Random Clocks

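Proposes Random Clocks to address the shortcomings of standard diffusion and flow matching models on heavy-tailed distributions.

Quality prediction/anomaly detection, NLP, RAG, image

Use case: Modeling heavy-tailed distributions
Difficulty: Hard
Cost: High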

Analogical Deep Research: Retrieving and Integrating Historical Analogies for Foresight Analysis

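Proposes a new task, Analogical Deep Research, for retrieving and evaluating historical analogies in foresight analysis, and that historical analogies play an important

Use case: Historical analogies for foresight analysis
Difficulty: Hard
Cost: High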

Price of Fairness in Bandits: A Tight Minimax Characterization

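Brings a minimax characterization to the analysis of the trade-off between fairness and reward in bandits. The result resolves the trade-off by guaranteeing a minimum level of fairness.

NLP, fine-tuning, text

Use case: Minimax analysis of the fairness-reward trade-off in bandits
Difficulty: Hard
Cost: Medium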

How to Guide LLM Generation: Dual-Surrogate Guided Search for Automated Heuristic Design

Large language models (LLMs) have made automated heuristic design (AHD) increasingly practical by generating e

Explainable, NLP, LLM, generation, text

Use case: Generation
Difficulty: Hard
Cost: High

S-CARD-CMSA: A Score-Aware Candidate Archive with Density-Filtered Reporting for Multimodal Optimization

Multimodal optimization aims to locate multiple globally optimal or near-optimal solutions in a single run. Th

NLP, RAG, multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Reveal, Correct, Then Pay: Encrypted Mempools and Perpetual Funding Security

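A study that uses encrypted mempools to prevent attacks that detect the different possible outcomes of specific transactions.

Use case: Securing perpetual funding with encrypted mempools
Difficulty: Hard
Cost: Low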

Auctions with Contract Design

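Evaluates auction participants based on the characteristics of the qualifications gained through work, and describes a framework for complex auctions that includes contract design.

Use case: Auctions that include contract design
Difficulty: Hard
Cost: Low

huggingface, Hugging Face available, 2026-07-15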

Cura 1T: Specialized Model for Agentic Healthcare

Healthcare spans high-stakes communication, expert reasoning, and workflow execution, yet specialized LLMs tha

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

huggingface, Hugging Face available, 2026-07-15

Partially Correlated Verifier Cascades in LLM Harnesses: Concave Log-Odds, Polynomial Reliability, and Blind-Spot Ceilings

Serial verification gates are a core reliability primitive in LLM harnesses: a candidate answer is returned on

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

github, GitHub available, 2026-07-15

ai-engineering-hub — In-depth tutorials on LLMs, RAGs and real-world AI agent applications.

This repository holds in-depth tutorials on LLMs, RAG, and real-world AI agent applications.

Use case: Tutorials on LLMs, RAG, and AI agents
Difficulty: Easy
Cost: High

arxiv, Paper only, 2026-07-14

Accelerated Mixing Time of Randomized Hamiltonian Monte Carlo

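Examines improvements to the mixing time of Randomized Hamiltonian Monte Carlo, strengthens existing theorems, and proposes a more efficient algorithm

Use case: Accelerating the Randomized Hamiltonian Monte Carlo algorithm
Difficulty: Hard
Cost: Low

arxiv, Paper only, 2026-07-14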

On Incentivized Exploration beyond Bayesianism and Full-Information

We extend Incentive Compatible Exploration beyond the Bayesian full-information setting of Kremer et al. [2014

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

github, GitHub available, 2026-07-14

Awesome-Embodied-Robotics-and-Agent — This is a curated list of "Embodied AI or robot with Large Language Models" research. Watch this repository for the latest updates! 🔥

A repository of research that combines embodied AI and robots with large language models.

Use case: Embodied AI and robotics research
Difficulty: Easy
Cost: High

github, GitHub available, 2026-07-14

agents-towards-production — End-to-end, code-first tutorials for building production-grade GenAI agents. From prototype to enterprise deployment.

End-to-end, code-first tutorials for developing and deploying AI agents.

Use case: Developing and deploying AI agents
Difficulty: Easy
Cost: High

github, GitHub available, 2026-07-14

memvid — Memory layer for AI Agents. Replace complex RAG pipelines with a serverless, single-file memory layer. Give your agents instant retrieval and long-term memory.

MemVid is a serverless, single-file memory layer that gives AI agents instant retrieval and long-term memory.

NLP, LLM, generation, text, video

Use case: Managing memory for AI agents
Difficulty: Easy
Cost: High

Falsifying Causal Graphs With Outlier Events

True causal relationships are rarely known, and inferring causal graphs from data is hard. A fundamental chall

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

Sensor/time series, quality prediction/anomaly detection, NLP, RAG

Dynamic Online Processor-Native Inference for State Estimation

Sensor-rich data-driven applications increasingly use Bayesian approaches to infer latent states of dynamic sy

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

Are we Merging the Right Models? Impact of Expert Training Duration on Model Merging for LLMs

Multi-task model merging combines separately trained expert models into a single model that handles all tasks

Quality prediction/anomaly detection, NLP, LLM

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Markov Chain Monte Carlo with Diffusion Paths

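This study proposes a new approach that improves Markov chain Monte Carlo so that it samples better from multimodal distributions. The approach uses diffusion noise paths to speed up the model's convergence, and multimodal

NLP, fine-tuning, multimodal

Use case: Markov chain Monte Carlo
Difficulty: Hard
Cost: High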

Backpropagation as a Nilpotent Linear System

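Improving the accuracy of backpropagation in neural networks calls for more efficient ways to carry out the computation. This paper proposes a method that improves backpropagation for neural networks.

Use case: Improving backpropagation for neural networks
Difficulty: Hard
Cost: Low

huggingface, GitHub available, Hugging Face available, 2026-07-13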

RAGU: A Multi-Step GraphRAG Engine with a Compact Domain-Adapted LLM

Graph retrieval-augmented generation (GraphRAG) enhances large language models with structured knowledge, yet

NLP, LLM, detection, generation, summarization

Use case: Detection
Difficulty: Easy
Cost: High

github, GitHub available, 2026-07-13

Awesome-Mixture-of-Experts — Awesome Mixture of Experts (MoE): A Curated List of Mixture of Experts (MoE) and Mixture of Multimodal Experts (MoME)

Awesome Mixture of Experts (MoE): A Curated List of Mixture of Experts (MoE) and Mixture of Multimodal Experts

Use case: Implementation and validation platform
Difficulty: Easy
Cost: High

Reinforcement Learning for Execution under Dynamic Fees in a Closed-Loop DEX Simulator

Trader-facing dynamic fees are increasingly proposed for automated market makers (AMMs), but historical data d

Suited to tabular data, NLP, RAG, tabular, reinforcement learning

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

Sticky Jump Diffusions: A Unifying View of Masked, Continuous, and Hybrid Diffusion

We introduce Sticky Jump Diffusions (SJDs), continuous-time Markov processes on $\mathbb R^d$ whose discrete a

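NLP, embedding and retrieval, classification, text

Use case: Classification
Difficulty: Hard
Cost: High

Sensor/time series, quality prediction/anomaly detection, NLP, RAG, detection, anomaly detection, time series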

Did We Actually Fix It? An Independent Adversarial Stress-Test of Post-Point-Adjustment Evaluation Metrics for Time-Series Anomaly Detection

Point-adjustment (PA), for years the default scoring protocol in time-series anomaly detection (TSAD), was sho

Use case: Detection
Difficulty: Hard
Cost: Low

Bandit PCA with Minimax Optimal Regret

We study the bandit-feedback version of online principal component analysis (Bandit PCA): in each round $t = 1

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

GNet: A scalable and flexible Gaussian process network with nonparametric neurons

We develop GNet, a scalable and flexible Gaussian process network with nonparametric activation functions mode

Suited to small data, NLP, RAG, regression

Use case: Regression
Difficulty: Hard
Cost: High

An Extreme Value Perspective on Learning Stress Laws

We introduce Self-Similar Generative Estimation (SS-GEN), a method for simulating multivariate tail events and

Use case: Generation
Difficulty: Hard
Cost: Low

Learning from Local Walks on Dynamic Graphs with Bandit Feedback

We study stochastic multi-armed bandits on dynamic graphs, where arms correspond to the vertices of a network

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

huggingface, Hugging Face available, 2026-07-11

Beyond Euclidean Clipping: Overcoming Exploration Collapse in LLM RL via Riemannian Isometric Policy Optimization

Reinforcement learning (RL) has become a dominant paradigm for enhancing LLMs' reasoning capabilities. However

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

huggingface, Hugging Face available, 2026-07-11

GigaChat Audio: Time-aware Large Audio Language Model

Temporal grounding in long recordings remains challenging for audio-conditioned LLMs. We present a time-aware

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High

github, GitHub available, 2026-07-11

awesome-nlp — :book: A curated list of resources dedicated to Natural Language Processing (NLP)

This repository is a curated collection of resources on natural language processing (NLP).

NLP, text

Use case: Collection of NLP resources
Difficulty: Easy
Cost: Medium

Manifold Constrained Conformal Prediction for Spatial Events

We introduce a new conformal prediction method that constructs calibrated prediction sets over collections of

NLP, RAG, generation, prediction, 3D

Use case: Generation
Difficulty: Hard
Cost: High

Suited to tabular data, NLP, fine-tuning, regression, tabular

A censoring-aware target interface for tabular foundation models in survival prediction

Time-to-event prediction from tabular patient data is central to prognosis and biomedical decision support, bu

Use case: Regression
Difficulty: Hard
Cost: Low

High-Dimensional Interpolators Can Be Fragile: Heavy Tails and High-Dimensional Large Deviations

High-dimensional interpolation is common in modern machine learning, but its tail risk is less understood than

NLP, RAG, regression

Use case: Regression
Difficulty: Hard
Cost: Low

Terminal Dimension Reduction for Time Series with Applications

Terminal embeddings have emerged as a powerful tool for dimension reduction. Given a set of points $P\subset \

Sensor/time series, NLP, embedding and retrieval, time series

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

Deep Learning for Dynamic Programming with Recursive Utility Using First-order Conditions

This paper proposes the certainty-equivalent first-order learning (CEFOL) algorithm, a deep learning algorithm

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Evolutionary Intelligence for Scientific Discovery: From Evolutionary Computation to Cumulative Discovery Systems

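AI has moved from task-specific workflows toward composed systems that incorporate experiments and human feedback and automatically explore vast candidate spaces. Evolutionary computation guides the search based on experimental and human feedback

Use case: Scientific discovery guided by evolutionary search
Difficulty: Hard
Cost: Medium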

A Knowledge-Based Multi-Agent Framework for Security Control Recommendation

Hardening IT on-premises environments can be a daunting task for teams without access to adequate cybersecurit

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

huggingface, Hugging Face available, 2026-07-10

OpenLongTail: Generative Scaling of Long-Tail Driving Data

Scaling robust driving policies is fundamentally bottlenecked by the scarcity of edge cases in curated dataset

NLP, RAG, generation, image, video

Use case: Generation
Difficulty: Easy
Cost: High

huggingface, Hugging Face available, 2026-07-10

REBASE: Reference-Background Subspace Elimination for Training-Free In-Context Segmentation

Training-free in-context segmentation enables new object categories to be introduced at inference time from a

Quality prediction/anomaly detection, NLP, prompt engineering, detection, segmentation, image

Use case: Detection
Difficulty: Easy
Cost: High

github, GitHub available, 2026-07-10

multimind-sdk — Your SDK solves all of this. One interface. Unified logic. Local + hosted models. Fine-tuning. Agent tools. Enterprise-ready. Hybrid RAG.Star 🌟 if you like it!

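An SDK that offers one interface and unified logic across local and hosted models, with fine-tuning, agent tools, and hybrid RAG.

Use case: Unified SDK for local and hosted models
Difficulty: Easy
Cost: High

github, GitHub available, 2026-07-09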

Awesome-Item-ID-Gen-RecSys — Updating curated list of research advancements on item identification and item tokenization in generative recommender systems. The survey is titled "A Survey of Item Identifiers in Generative Recommendation: Construction, Alignment, and Generation"

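A curated list on item identifiers in generative recommender systems, covering their construction, alignment, and generation.

Use case: Item identifiers in generative recommender systems
Difficulty: Easy
Cost: High

Condition optimization, NLP, fine-tuning, text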

Sampling on Random Subspaces under Limited Data in the Context of Exploratory Landscape Analysis

Proposes Sampling on Random Subspaces, a framework for sampling on random subspaces in exploratory landscape analysis.

Use case: Landscape analysis
Difficulty: Hard
Cost: Low

Intrinsic-Noise Consolidation: A Doob-Barrier-Conditioned Diffusion Turns Analog Device Noise into a Continual-Learning Resource

Develops a new formulation that can stabilize learned memory in computers. With it, learned memory can be consolidated accurately.

Use case: Consolidating learned memory in computers
Difficulty: Hard
Cost: High

Institutional Red-Teaming: Deployment Rules, Not Just Models, Causally Shape Multi-Agent AI Safety

Proposes a method for analyzing the behavior of multiple agents. The behavior of multiple agents

Use case: Analyzing multi-agent behavior
Difficulty: Hard
Cost: High

FedMark-FM: Auditable, Risk-Adjusted Data Markets for Federated Foundation-Model Adaptation

Federated foundation-model adaptation increasingly relies on heterogeneous private artifacts (retrieval corpor

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

github, GitHub available, 2026-07-08

nltk — NLTK Source

This repository contains the source code of the Natural Language Toolkit (NLTK).

Use case: NLTK source code
Difficulty: Easy
Cost: Medium

arxiv, Paper only, 2026-07-07

Formalizing Scarf, Brouwer, and Nash in Lean

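A paper on formalizing results in Lean, examining the logical derivation from Scarf's result to Brouwer's theorem.

NLP, embedding and retrieval

Use case: Formalization in Lean
Difficulty: Hard
Cost: Low

arxiv, Paper only, 2026-07-07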

Strategic Bargaining in Multi-Buyer Markets: Reinforcement Learning from Verifiable Rewards for LLM Negotiations

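Builds a negotiation system for markets with multiple buyers. When the market size is not fully known, the seller incurs losses. The seller needs to gauge the market size, which is difficult when there are multiple buyers.

Use case: Negotiation in multi-buyer markets
Difficulty: Hard
Cost: High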

enchanted — Enchanted is iOS and macOS app for chatting with private self hosted language models such as Llama2, Mistral or Vicuna using Ollama.

Enchanted is an iOS and macOS app for chatting with self-hosted language models such as Llama2, Mistral, or Vicuna.

Use case: iOS and macOS app for chatting with self-hosted language models
Difficulty: Easy
Cost: High

DATAGEN — DATAGEN: AI-driven multi-agent research assistant automating hypothesis generation, data analysis, and report writing.

An AI-driven multi-agent research assistant that automates hypothesis generation, data analysis, and report writing.

Use case: AI research assistant
Difficulty: Easy
Cost: High

home-llm — A Home Assistant integration & Model to control your smart home using a Local LLM

home-llm is a Home Assistant integration and model for controlling a smart home with a local LLM.

Use case: Smart home control
Difficulty: Easy
Cost: High

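Awesome-large-language-model-for-Prognostics-and-health-management — Large language models for predictive maintenance and health management (fault diagnosis; lifetime prediction)

This repository covers large language models for predictive maintenance and health management. The models target problems such as fault diagnosis and lifetime prediction.

Use case: Fault diagnosis and lifetime prediction
Difficulty: Easy
Cost: Medium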

VLM-R1 — Solve Visual Understanding with Reinforced VLMs

This work proposes VLM-R1, a reinforced vision-language model for visual understanding. The model is designed to improve image understanding.

NLP, LLM, image, multimodal

Use case: Visual understanding
Difficulty: Easy
Cost: High

arxiv, Paper only, 2026-07-06

LLM for the development of FCM

This article is about the development of a fuzzy cognitive map using a local large language model. In the ligh

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv, Paper only, 2026-07-05

Decentralized Aggregation of LLM Predictions via Wagering Mechanisms

It is increasingly common to aggregate predictions from multiple LLMs, each with domain expertise or access to

NLP, LLM, prediction

Use case: Prediction
Difficulty: Hard
Cost: High

github, GitHub available, 2026-07-05

llm-app — Ready-to-run cloud templates for RAG, AI pipelines, and enterprise search with live data. 🐳Docker-friendly.⚡Always in sync with Sharepoint, Google Drive, S3, Kafka, PostgreSQL, real-time data APIs, and more.

Use case: Building AI pipelines
Difficulty: Easy
Cost: High

arxiv, Paper only, 2026-07-04

Towards Self-Evolving Agents: A Human-Inspired Adaptive Exploration-Exploitation Framework for Genetic Network Programming

Recent advancements in agentic AI have increasingly moved toward graph-based methods, driven by the demand for

Explainable, NLP, fine-tuning

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

arxiv, Paper only, 2026-07-03

Rank-Order N-of-M Codes for Sparse Distributed Memory: Disentangling Representation and Learning Effects in Noise Robustness Against Contemporary Neuromorphic Architectures

Large language models remain limited as continual learning systems, motivating renewed interest in Sparse Dist

Suited to tabular data, NLP, LLM, embedding, text, tabular

Use case: Embedding
Difficulty: Hard
Cost: High

arxiv, Paper only, 2026-07-03

Congestion Games with Heterogeneous Valuations: An Optimal Transport Approach

In emerging urban mobility and logistics applications, such as advanced air mobility, electric vehicle chargin

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

arxiv, Paper only, 2026-07-03

Scaffolding the Strategist: Architecture-Dependent Reasoning Interventions in Hotelling Spatial Markets

We investigate whether structured reasoning interventions improve the strategic economic reasoning of large la

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv, Paper only, 2026-07-02

Hybridizing a Grouping Metaheuristic with Reinforcement Learning for the One-Dimensional Bin Packing Problem

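The one-dimensional bin packing problem (1D-BPP) is an NP-hard combinatorial optimization problem with many applications. This study takes Falkenauer's Hybrid Grouping Genetic Algorithm (

Suited to tabular data, quality prediction/anomaly detection, NLP, RAG, generation, tabular, reinforcement learning

Use case: One-dimensional bin packing
Difficulty: Hard
Cost: Low

arxiv, Paper only, 2026-07-02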

Constrained Distributed Heterogeneous Two-Facility Location Problems with Max-Variant Cost

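Presents an approach to minimizing max-variant cost in constrained distributed heterogeneous two-facility location problems.

Use case: Minimizing max-variant cost in constrained distributed heterogeneous two-facility location problems
Difficulty: Hard
Cost: Low

arxiv, Paper only, 2026-07-02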

Deep Reinforcement Learning to Master the Asymmetric Strategy of Baghchal

Baghchal is a two-player asymmetric board game with Nepali origins where four tigers are to capture goats and

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

github, GitHub available, 2026-07-02

langextract — A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.

A Python library that uses LLMs to extract structured information from unstructured text.

Use case: Information extraction from text
Difficulty: Easy
Cost: High

github, GitHub available, 2026-07-02

learning — A log of things I'm learning

A journal for organizing ideas and knowledge picked up while learning.

Use case: Learning log
Difficulty: Easy
Cost: High

From Consistency to Collaborative Discovery: MFEA-CoD for Multitask Novelty Search

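This study brings evolutionary multitasking (EMT) to multitask novelty search. EMT has focused on objective-driven optimization, but its ability to exploit shared structure and solve multiple optimization problems simultaneously

Use case: Multitask novelty search
Difficulty: Hard
Cost: Low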

Self-Organized Learning in Oscillatory Neural Networks with Memristive Signed Couplings

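Oscillatory neural networks (ONNs) compute with coupled dynamical systems and represent information through relative phase differences, making them a promising neuromorphic

Use case: Self-organized learning in oscillatory neural networks
Difficulty: Hard
Cost: High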

Fair Allocation under Conflict Constraints via Strong Colorability

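This study examines fairness in allocating the vertices of a graph. Under the conflict constraints, adjacent vertices cannot be assigned to the same agent. The study proposes graph allocations that take fairness into account.

Use case: Fair allocation on graphs
Difficulty: Hard
Cost: Medium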

Positive and Negative Determinant Strategies in Repeated Games with Behavior-Value Inconsistency

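This study examines direct interactions, in which one individual's actions can influence those of another, and proposes a game-theoretic framework that accounts for them.

Use case: Direct interactions
Difficulty: Hard
Cost: Low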

Fully Distributed Tâtonnement for Chores Markets

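This study examines price adjustment. Price

Use case: Price adjustment in chores markets
Difficulty: Hard
Cost: Low

Sensor/time series, quality prediction/anomaly detection, NLP, RAG, detection, generation, anomaly detection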

Distributed Hierarchical Temporal Memory with Shared Associative Memory for Cross-Entity Preemptive Warning

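Develops an anomaly detection system based on distributed hierarchical temporal memory. The system stores precursor behaviors of related entities in a shared memory space and uses them for anomaly detection, offering a new approach to the task.

Use case: Anomaly detection with distributed hierarchical temporal memory
Difficulty: Hard
Cost: Low

NLP, fine-tuning, classification, embedding, reinforcement learning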

Diffusing Blame: Task-Dependent Credit Assignment in Biologically Plausible Dual-Stream Networks

Biological neural circuits obey Dale's principle: each neuron's synapses are uniformly excitatory or inhibitor

Use case: Classification
Difficulty: Hard
Cost: High

Guesswork Under Linear Constraints: Exact Exponent for Coset Decoding

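A theoretical study aimed at establishing data analysis algorithms, examining the growth rate of the exponent of guesswork under linear constraints. The study lays a foundation for developing data analysis algorithms, and

Use case: Theoretical study of guesswork under linear constraints
Difficulty: Hard
Cost: Low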

A Lifecycle and Application-Stack Survey of Large Language Model Vulnerabilities: Attacks, Risks, Defenses, and Open Problems

Covers defenses against attacks on LLMs. The study lays out a framework of defenses and analyzes the associated risks.

Use case: Defending against attacks on LLMs
Difficulty: Hard
Cost: High

github, GitHub available, 2026-06-30

telegram-summary-bot — Summarize group chat with AI, LLM && query group chat, FREE to deploy your own, support img, link meta info, reply to, auto fold result, supports Chinese search.

telegram-summary-bot summarizes group chats with AI and is free to deploy on your own.

Use case: AI-powered group chat summarization bot
Difficulty: Easy
Cost: High

github, GitHub available, 2026-06-30

mxcp — Model eXecution + Context Protocol: Enterprise-Grade Data-to-AI Infrastructure

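Building infrastructure that connects data to AI can help solve business problems. This work proposes MXCP (Model eXecution + Context Protocol), which simplifies data transformation and then, for AI app

Use case: Improving business operations by building data-to-AI infrastructure
Difficulty: Easy
Cost: High

github, GitHub available, 2026-06-30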

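CV — ✅ (Completed) Comprehensive deep learning notes [Tudui PyTorch] [Li Mu, Dive into Deep Learning] [Andrew Ng, Deep Learning] [Dafei, large-model agents]

Deep learning notes. They cover Tudui's PyTorch, Li Mu's Dive into Deep Learning, Andrew Ng's Deep Learning, and Dafei's large-model agents.

Use case: Deep learning notes
Difficulty: Easy
Cost: High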

Why can genetic algorithms work in high-dimensional search spaces?

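This paper proposes a new model in which highly adaptive optimizers can be used. The model makes it possible to interpret the behavior of optimizers realized by elitist genetic algorithms.

Use case: Interpreting the behavior of highly adaptive optimizers
Difficulty: Hard
Cost: Low

Quality prediction/anomaly detection, NLP, LLM, generation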

Semantics-Aware Bilevel Co-Evolution: Towards Automated Multicomponent Algorithm Design

LLM-assisted evolutionary search (LES) has emerged as a promising paradigm for automated algorithm design. How

Use case: Generation
Difficulty: Hard
Cost: High

Optimal Auction Design for Constrained Buyers

We study single-parameter, multi-buyer auctions in which buyers are subject to constraints that affect their b

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

Rethinking Collaborative Trust for Verifiably Decentralized Blockchain Systems

Despite the promise of decentralization, measurement studies have identified a conspicuous lack of decentraliz

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

arxiv, Paper only, 2026-06-28

Travel-Oriented Reasoning Large Language Model via Domain-Specific Knowledge Graphs

Large language models (LLMs) demonstrate broad reasoning abilities but struggle with accuracy and reliability

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv, Paper only, 2026-06-28

Discovery by Dreaming: Cross-Domain Recombination in Artificial Memory

Dreams splice together people, places, and times that never met. Neuroscience suggests this recombination is n

NLP, fine-tuning, text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

arxiv, Paper only, 2026-06-28

Computational Complexity of Strong and Average Justified Representation

We study the approval-based multiwinner election problem where a set of $n$ voters cast approval-based ballots

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

github, GitHub available, 2026-06-28

awesome-japanese-llm — Overview of Japanese LLMs (日本語LLMまとめ)

A curated overview of Japanese large language models.

NLP, LLM, generation, multimodal

Use case: Surveying Japanese LLMs
Difficulty: Easy
Cost: High

arxiv, Paper only, 2026-06-27

LLM Semantic Signaling Game and Mechanism Design: Systematic Blindness, Awareness Shaping, and Mindset Dynamics

Large language models (LLMs) increasingly mediate strategic interactions through natural language, making sema

NLP, LLM, detection, text

Use case: Detection
Difficulty: Hard
Cost: High

arxiv, Paper only, 2026-06-26

Analysis of Parameter Settings for the Bat Algorithm Using Variance Evolution

Parameter settings in evolutionary algorithms and metaheuristics are important because such parameter values c

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

arxiv, Paper only, 2026-06-26

Which Nash Equilibrium? Solver-Dependent Selection on Zero-Sum Nash Polytopes

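Investigates the selection of Nash equilibria in zero-sum games, examining how different solvers select different solutions.

Suited to tabular data, NLP, RAG, tabular

Use case: Finding solutions to zero-sum games
Difficulty: Hard
Cost: Low

arxiv, Paper only, 2026-06-26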

Triadic Werewolf: A Jester Role for Multi-Hop Theory of Mind in LLMs

Extends theory-of-mind evaluation for large language models (LLMs) by adding a triadic Werewolf game.

Use case: Triadic Werewolf game
Difficulty: Hard
Cost: High

arxiv, Paper only, 2026-06-25

Surviving by Serving: Functional Relevance Drives Self-Organization in Complex Adaptive Systems

This study analyzes complex adaptive systems, examining system structure to understand how systematic mechanisms arise.

NLP, fine-tuning, generation

Use case: Analysis of complex adaptive systems
Difficulty: Hard
Cost: Medium

arxiv, Paper only, 2026-06-25

Random Walk on Bézier Curves for Global Optimization

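This study develops a global optimization algorithm, proposing a search method based on random walks on Bézier curves.

Explainable, NLP, RAG

Use case: Developing global optimization algorithms
Difficulty: Hard
Cost: Low

arxiv, Paper only, 2026-06-24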

Strategyproof Facility Location and Committee Selection with Mixed Max and Sum Agent Types

This study

Use case: Strategyproof facility location and committee selection
Difficulty: Hard
Cost: Low

arxiv, Paper only, 2026-06-24

SidConArena: An Environment Evaluating Agents in Open-Ended,Positive-Sum Bargaining Game

Evaluating LLM agents requires dynamic environments that go beyond static reasoning and zero-sum games. Real-w

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv, Paper only, 2026-06-23

Distributed Quality-Diversity Search for Toxicity in Large Language Models

This study searches for diverse toxicity test cases.

Use case: Searching for diverse toxicity test cases
Difficulty: Hard
Cost: High

arxiv, Paper only, 2026-06-23

Age of LLM: A Strategic 1v1 Benchmark for Reasoning, Diplomacy and Reliability of Large Language Models under Fog of War

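Introduces Age of LLM, a strategic 1v1 benchmark for large language models. It assumes a Minesweeper-style game and sets three stressors: fog of war, head-to-head play, and adherence to a JSON schema.

Use case: Benchmark built on a Minesweeper-style game
Difficulty: Hard
Cost: High

arxiv, Paper only, 2026-06-22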

Decomposing Financial Market Dynamics via Mechanism Analysis in an Evolutionary Multi-Agent Simulation

Evolutionary agent-based markets (ABMs) couple several mechanisms -- who reproduces, how price forms, how bias

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

arxiv, Paper only, 2026-06-22

EMAgnet: Parameter-Space EMA Regularization for Policy Gradient Self-Play in Large Games

Recent work has established that regularized policy gradient methods such as PPO, when used in self-play, can

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

arxiv, Paper only, 2026-06-22

Each Judge Its Own Yardstick: Discovering Per-VLM Taxonomies for Physical Video Evaluation

Maintaining physical consistency in video generators and world models increasingly relies on vision-language m

NLP, LLM, text, video, multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv, Paper only, 2026-06-21

Emergent Culture in Minimal LLM Systems

What happens when LLM agents operate with no context outside a turn, minimal prompting, and simple tools? Insp

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv, Paper only, 2026-06-21

Stationary Robust Mean-Field Games under Model Mismatches

Deploying multi-agent reinforcement learning (MARL) in the real world is often limited by model mismatches bet

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv, Paper only, 2026-06-21

Theorist Toolbox: Tools for Agent Based LLM-assisted economic theory Research

Empirical economists often start their projects with a toolbox. Shared packages, replication archives, and cir

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv, Paper only, 2026-06-20

Learning a Normal World Model for Few-Shot Boundary-Calibrated Abnormality Detection

Abnormality detection in complex systems faces two practical barriers: abnormal labels are scarce, and binary

Suited to small data, sensor/time series, NLP, prompt engineering, detection, text

Use case: Detection
Difficulty: Hard
Cost: Medium

arxiv, Paper only, 2026-06-19

Accelerated and Stable Convergence with Anchored Optimistic Method

We study first-order methods for solving monotone variational inequalities arising in min-max optimization. Cl

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

arxiv, Paper only, 2026-06-18

Formally Verified Code Synthesis for Structured Data Translation in a Medical Internet of Things

In this work we present a LLM powered, evolutionary code synthesis system for structured data translation in a

Suited to tabular data, NLP, LLM, generation, tabular

Use case: Generation
Difficulty: Hard
Cost: High

arxiv, Paper only, 2026-06-18

Beyond Accuracy: Measuring Logical Compliance of Predictive Models

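Proposes a method for evaluating machine learning models. Existing evaluation methods can miss cases where a model produces incorrect results, and this method allows models to be assessed accurately.

Quality prediction/anomaly detection, NLP, embedding and retrieval, classification, regression

Use case: Evaluating machine learning models
Difficulty: Hard
Cost: High

arxiv, Paper only, 2026-06-17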

Constraint-aware Optimization in Auto-Tuning

Automatic performance tuning, or auto-tuning, is a key technique in high-performance computing, enabling appli

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

arxiv, Paper only, 2026-06-15

Evolution & Foundation: AI Shares Creative Control

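Proposes a new method for evaluating ideas that AI produces in collaboration with humans, improving the assessment of creativity.

NLP, fine-tuning, generation, image, 3D

Use case: A new method for evaluating AI creativity
Difficulty: Hard
Cost: High

arxiv, Paper only, 2026-06-14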

Runtime Analysis of Cartesian Genetic Programming in Evolving Boolean Functions

Cartesian Genetic Programming (CGP) is among the practical and popular forms of Genetic Programming as it uses

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv, Paper only, 2026-06-14

MSC-CMA-ES: Structure-Aware Restarts for CMA-ES via Cyclic Nearest-Better Basin Discovery

CMA-ES behaves, per restart, primarily as a local optimizer; multimodal search relies on restart strategies su

Suited to MI, NLP, RAG, multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arxiv, Paper only, 2026-06-13

Large Language Model-Driven Cooperative Operator Ensemble Evolution for Permutation Flow Shop Scheduling

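This study aims to improve the performance of the Iterated Greedy (IG) algorithm for the PFSP through Large Language Model-Driven Cooperative Operator Ensem

Suited to small data, easy to try on CPU, quality prediction/anomaly detection, NLP, LLM, text

Use case: Improving the performance of the Iterated Greedy (IG) algorithm for the PFSP
Difficulty: Hard
Cost: High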

ZIVARI-TLBO: A Zero-Cost Inter-Group Evaluated-Elite Relay Mechanism for Teaching-Learning-Based Optimization

ZIVARI-TLBO is a grouped Teaching-Learning-Based Optimization (TLBO) method that augments an existing populati

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

Genetic Algorithm Based Coordination and Optimization Model for Generation Grid Load Storage in Active Distribution Networks

Create an optimization framework that combines fuzzy logic and genetic algorithms for risk assessment and coor

Use case: Generation
Difficulty: Hard
Cost: Low

Operator Calculus for Population-Based Optimization: A Mean-Field Convergence Theory

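Proposes a theory for the convergence analysis of population-based optimization built on an operator calculus, and introduces a generalized distributional optimizer.

Suited to MI, NLP, RAG

Use case: Convergence analysis of population-based optimization
Difficulty: Hard
Cost: Low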

MeEvo: Metacognitive Evolution Combined with Natural Evolution for Automatic Heuristic Design

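This study investigates automatic heuristic design (AHD). AHD was a research topic before machine learning made it practical, and machine learning has made it more widely usable. In this study, metacognitive

Use case: Automatic heuristic design
Difficulty: Hard
Cost: High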

Co-Evolved Spiking Neural Network Ensembles via Marginal Contribution Fitness

Evolutionary optimization of spiking neural networks (SNNs) becomes increasingly difficult as task complexity

NLP, RAG, classification, regression

Use case: Classification
Difficulty: Hard
Cost: Low

huggingface, Hugging Face available, 2026-05-07

Masked Diffusion Language Models are Strong and Steerable Text-Based World Models for Agentic RL

Recent growth in reinforcement learning (RL) has surfaced a need for diverse, specialized training environment

Use case: Technical validation and paper-reading aid
Difficulty: Easy
Cost: High