MLinfo | Machine Learning and AI Paper Summaries

Discovering Functionally Selective Brain Regions with a Deep Topographic Multimodal Model

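This study assumes that neurons within a brain region share the same response profile, infers the response profiles of neurons in neighboring brain regions, and identifies connections between regions.

NLP, RAG, Image, Multimodal

Use case: Brain-region research
Difficulty: Hard
Cost: High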

Data Synthesis and Parameter-Efficient Fine-Tuning for Low-Resource NMT: A Case Study on Q'eqchi' Mayan

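This study develops a data synthesis method for low-resource-language NMT. The synthesized corpus can then be used to fine-tune an NMT model parameter-efficiently.

Deep learning, Compression/quantization, Generation, Translation, Text

Use case: Data synthesis for low-resource NMT
Difficulty: Hard
Cost: Medium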

Algorithm for Contextual Queueing Bandits with Rate-Optimal Queue Length Regret

Contextual queueing bandits provide a framework for learning to schedule heterogeneous jobs under unknown cont

NLP, Fine-tuning, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

Muon Learns More Robust and Transferable Features than Adam

Muon has recently emerged as a state-of-the-art optimizer for pretraining Large Language Models (LLMs) and vis

Deep learning, Transformer, Classification, Image, Text

Use case: Classification
Difficulty: Hard
Cost: High

Streaming Interventions: Can Video Large Language Models Correct Mistakes as They Occur?

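Studies question answering with video large language models. It investigates the models' capabilities and limitations and proposes a method for generating answers to questions.

Deep learning, Compression/quantization, Text, Video, Multimodal

Use case: Question answering with video large language models
Difficulty: Hard
Cost: High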

Breaking the Tokenizer Barrier: On-Policy Distillation across Model Families

On-Policy Distillation (OPD) has become a core technique in the post-training of Large Language Models (LLMs)

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Benchmarking Empirical Privacy Protection for Adaptations of Large Language Models

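This study analyzes the empirical effectiveness of privacy protection when machine learning models are adapted with privacy-preserving methods.

Deep learning, Compression/quantization, Anomaly detection, Text

Use case: Privacy-protection benchmarking
Difficulty: Hard
Cost: High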

PriFT: Prior-Support Guided Supervised Fine-Tuning

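This study proposes PriFT, a Prior-Support guided supervised fine-tuning method for adapting pretrained models to low-level tasks.

Deep learning, Compression/quantization, Generation, QA, Reinforcement learning

Use case: Adaptation to low-level tasks
Difficulty: Hard
Cost: High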

Distilling Safe LLM Systems via Soft Prompts for On Device Settings

This study introduces a dual-model system that pairs a strong defensive guard model with a low-parameter LLM and can be used to deploy LLMs safely.

Use case: Deploying safe LLMs
Difficulty: Hard
Cost: High

Conan-embedding-v3: Fusing Modality-Specific Models for Omni-Modal Embedding

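This study builds an omni-modal retrieval system that integrates data from different modalities such as text, images, video, and audio.

NLP, Fine-tuning, Regression, Retrieval, Image

Use case: Omni-modal retrieval
Difficulty: Hard
Cost: High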

MI-oriented, NLP, Fine-tuning, Image, Text

Orange Lab: Lowering Barriers to Data Mining through Embedded Interactive Workflows

This paper proposes Orange Lab, a visual programming framework for data mining. It provides a web-based data analysis environment and, as a user-facing analysis tool, data anal

Use case: Data analysis workflows
Difficulty: Hard
Cost: Medium

From Shortcuts to Reasoning: Robust Post-Training of Theory of Mind with Reinforcement Learning

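Theory of Mind is considered an essential skill for modern foundation-model systems to operate safely and effectively in the real world. However, progress on Theory of Mind faces a "shortcut" problem, in which tasks reach 99% accuracy while merely

NLP, RAG, Text, Multimodal, Reinforcement learning

Use case: Strengthening Theory of Mind
Difficulty: Hard
Cost: High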

The Hidden Bias of Process Reward Models: PRISM for Rewarding the Right Reasoning

Process Reward Models (PRMs) provide segment-level feedback and improve credit assignment. However, we identify a latent bias in PRMs, and this bias

Use case: The bias problem in process reward models
Difficulty: Hard
Cost: High

Deep learning, Compression/quantization, Text, Multimodal, Reinforcement learning

Stage-1 Controls the Entropy Regime, Not the Outcome

Two-stage post-training -- a Stage-1 warm-start (supervised fine-tuning, SFT, or on-policy distillation, OPD)

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Families of Control-Cost-Parametrized Inverse-Optimal Universal Stabilizers

A classical universal stabilization formula offers the practitioner no design freedom: it is a single, paramet

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

PROBE-Web: An Interactive System for Probing Evaluation Landscapes of Knowledge Graph Completion Models

Proposes a new approach for evaluating knowledge graph completion.

Explainable, NLP, Fine-tuning

Use case: Evaluating knowledge graph completion
Difficulty: Hard
Cost: Medium

Quality prediction/anomaly detection, NLP, LLM, Text

SearchSwarm: Towards Delegation Intelligence in Agentic LLMs for Long-Horizon Deep Research

This paper proposes a new framework for improving agents' delegation ability, enabling agents to decompose tasks more efficiently.

Use case: Agent delegation
Difficulty: Hard
Cost: High

From 0-to-1 to 1-to-N: Reproducible Engineering Evidence for MetaAI Recursive Self-Design

Recursive self-design refers to AI-assisted modification of the mechanisms by which an AI system is built, eva

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

Emergence of Context Characteristics Sensitivity in Large Language Models

During instruction fine-tuning (IFT), large language models (LLMs) learn to follow instructions by using the p

Deep learning, Transformer, Text, Reinforcement learning

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

A Finetuned SpeechLLM for Joint Multi-Granular L2 Assessment and Natural-Language Rationales

Proposes a SpeechLLM for automating speech assessment, evaluating speech quality and proficiency.

Explainable, NLP, LLM, Audio

Use case: L2 speech assessment
Difficulty: Hard
Cost: High

AliyunConsoleAgent: Training Web Agents in Real-World Cloud Environments via Distillation and Reinforcement Learning

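Proposes AliyunConsoleAgent for automating web agents, simplifying document verification and web-agent development.

Deep learning, Compression/quantization, Text, Reinforcement learning

Use case: Web-agent automation
Difficulty: Hard
Cost: High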

Sensor/time series, Deep learning, Transformer, Generation, Image, Audio

Physics-Guided Sequence-Based Generative Framework for Acoustic Metamaterial Inverse Design

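A new framework for the inverse design of acoustic metamaterials that accounts for variable bandwidth: Physics-Guided Sequence-Based Generative Framework for Acoustic Metama

Use case: Inverse design of acoustic metamaterials with variable bandwidth
Difficulty: Hard
Cost: High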

Small-data friendly, Deep learning, Transformer, Classification, Detection

Proposal Refinement for Few-Shot Object Detection

In few-shot object detection, the accuracy of object proposals can be improved.

Use case: Proposal refinement for few-shot object detection
Difficulty: Hard
Cost: High

Decoding Pedestrian Crossing Intention from Egocentric Vision via Vision Language Models

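Uses egocentric vision to predict whether pedestrians will cross the road. By formulating the task as a closed-ended visual question answering (VQA) problem, it uses vision-language models

Deep learning, Transformer, QA, Image, Text

Use case: Predicting pedestrian road crossing
Difficulty: Hard
Cost: High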

MI-oriented, Deep learning, Compression/quantization, Text, Multimodal, Reinforcement learning

Personalization Meets Safety: Mechanisms, Risks, and Mitigations in Personalized LLMs

Large Language Models (LLMs) have enabled increasingly personalized interactions by adapting to users' prefere

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

NutriMLLM: Multimodal Large Language Models for Dietary Micronutrient Analysis

Comprehensive estimation of dietary micronutrients from food images could improve clinical nutrition care, but

NLP, LLM, Generation, Image, Text

Use case: Generation
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, NLP, Fine-tuning, Detection, Image, Text

Failure-Aware Refinement of Vision-Language Model for Lithography Defect Detection

Semiconductor lithography inspection requires reliable detection of small pattern defects such as bridge, burr

Use case: Detection
Difficulty: Hard
Cost: High

MI-oriented, Quality prediction/anomaly detection, NLP, Fine-tuning, Classification, Generation, Text

Quality-Diversity Search in Sound Generation: Investigating Innovation Engines for Audio Exploration

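This study develops an open-source framework for promoting diversity in sound generation. To support that goal, the framework combines evolutionary processes with diversity-promoting algorithms

Use case: Promoting diversity in sound generation
Difficulty: Hard
Cost: Low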

PsychoSafe: Eliciting Psychologically-Informed Refusals in Large Language Models

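This study develops a framework for evaluating multimodal language models. The framework can assess their generative ability and controllability.

Quality prediction/anomaly detection, Deep learning, Compression/quantization, Text

Use case: Evaluating multimodal language models
Difficulty: Hard
Cost: High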

Automated IEP Generation from Traditional Chinese Parent-Teacher Interviews via Corpus-Grounded Feature Diffusion

Writing Individualized Education Programs (IEPs) is a high-labor, knowledge-intensive document burden; English

Deep learning, Transformer, Generation, Text

Use case: Generation
Difficulty: Hard
Cost: High

Sensor/time series, Deep learning, Transformer, Text, Audio

Overcoming Decoder Inconsistencies in Whisper for Dravidian and Low-Resource Languages

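To improve the speech recognition performance of multilingual ASR models such as Whisper on Dravidian languages, the authors use datasets and linguistic analysis to fine-tune the model, resolve decoder inconsistencies, and reduce recognition errors.

Use case: Improving speech recognition for Dravidian languages
Difficulty: Hard
Cost: Medium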

DECSELFMASK: Leveraging Unlabeled Text via Self-Relevance-Guided Masking for Decoder-Only Classification

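Classification becomes difficult when prior information is scarce or in specialized domains such as medicine. This study proposes DecSelfMask, a method in which the model leverages unlabeled data to improve classifier performance.

NLP, RAG, Classification, Generation, Text

Use case: Improving classification performance
Difficulty: Hard
Cost: High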

Emergent Misalignment Can Be Induced by Sycophancy and Reversed via Alignment Gating

Prior work has shown that fine-tuning large language models on malicious or incorrect outputs in narrow domain

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

CRANE: Knowledge Editing for Reasoning MLLMs

The emergence of reasoning multimodal large language models (MLLMs), which generate explicit chain-of-thought

NLP, LLM, Anomaly detection, Image, Text

Use case: Anomaly detection
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, NLP, LLM, Summarization, Text

arXiv, GitHub available, 2026-06-08

Language-Aware Token Boosting: LLM Language Confusion Reduction Without Tuning

Large language models (LLMs) sometimes exhibit language confusion when generating non-English text. Existing a

Use case: Summarization
Difficulty: Hard
Cost: High

SoccerNet 2026 Player-Centric Ball-Action Spotting: Retraining and Post-Processing Extensions to the FOOTPASS Baselines

We describe our system for the SoccerNet 2026 Player-Centric Ball-Action Spotting Challenge, which requires pr

Deep learning, GNN, Image, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Explainable, Deep learning, Transformer, Text, Video

MAVIS: Multi-Agent Video Retrieval via Structured Video Understanding

The dominant paradigm in video retrieval relies on embedding-based full-corpus scanning, which suffers from in

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, NLP, LLM, Image, Text, Video

CapRL++: Unified Reinforcement Learning with Verifiable Rewards for Dense Image and Video Captioning

Image and video captioning are fundamental tasks that bridge the visual and linguistic domains, playing a crit

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Self-supervised Learning Matters: A Simple Ensemble Solution for Micro-Gesture Recognition

In this paper, we present XInsight Lab's solution to the micro-gesture classification track of the 4th MiGA Ch

NLP, Fine-tuning, Classification, Embedding, Video

Use case: Classification
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, Deep learning, Transformer, Generation, Text, Video

LiteVSR: Lightweight Adaptation of Frozen Diffusion Transformers for Video Super-Resolution

Adapting large-scale pre-trained video generators for Video Super-Resolution (VSR) in novel domains remains co

Use case: Generation
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, Deep learning, Transformer, Generation, 3D

KPGrasp: Scalable Keypoint Flow Matching for Dexterous Grasp Generation

Generating high-quality dexterous grasps remains challenging for learning-based methods, which often depend on

Use case: Generation
Difficulty: Hard
Cost: High

Back to the Familiar Future: Failure Recovery for VLA Policies via Pre-Imagined Milestone Selection

Vision-language-action (VLA) policies can deviate from nominal trajectories during manipulation, even when tas

NLP, RAG, Image, Multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

A Comparison of SSL-Based Feature Extractors and Back-End Classifiers for Spoofing Detection: A Multi-Corpus Training and Cross-Linguistic Analysis

Voice biometric systems face growing threats from spoofing attacks, yet the evaluation of detection models rem

Deep learning, CNN, Classification, Detection, Text

Use case: Classification
Difficulty: Hard
Cost: High

FiberTune: Preserving Action-Fiber Visual Residuals in Vision-Language-Action Fine-Tuning

Action-supervised fine-tuning of vision-language-action (VLA) policies fits demonstrations effectively but con

Deep learning, Transformer, Image, Multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Deep learning, Transformer, Forecasting, Text, Reinforcement learning

Towards Long-Horizon Vessel Trajectory and Destination Forecasting with Reasoning Large Language Models

Long-horizon maritime trajectory prediction is important for shipping management, logistics planning, and mari

Use case: Forecasting
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, Deep learning, Compression/quantization, Multimodal, Reinforcement learning

Reinforcement Learning for Flow-Matching Policies with Density Transport

We present an online reinforcement learning (RL) algorithm for fine-tuning flow-matching policies in continuou

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Sensor/time series, NLP, LLM, Text, Time series

arXiv, GitHub available, 2026-06-07

Lost in the Non-convex Loss Landscape: How to Fine-tune the Large Time Series Model?

Recently, large time series models (LTSMs) have gained increasing attention due to their similarities to large

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, Deep learning, Compression/quantization, Text, Reinforcement learning

Autonomous Aerial Manipulation via Contextual Contrastive Meta Reinforcement Learning

Unmanned aerial vehicles (UAVs) are increasingly being deployed in logistics, service robotics, and other real

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

Benchmarking Vision-Language-Action Models on SO-101: Failure and Recovery Analysis

Vision-Language-Action (VLA) models have demonstrated strong generalization in robotic manipulation, yet exist

NLP, Fine-tuning, Multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Small-data friendly, Deep learning, Transformer, Generation, Image, Text

ZIPP: Zero-shot Image Personalization from Personas

Text-to-image diffusion models are increasingly deployed in open-ended creative contexts, yet their outputs re

Use case: Generation
Difficulty: Hard
Cost: High

Cross-Source Reasoning-based Correction for Author Name Disambiguation

Author name disambiguation is a critical challenge in academic search systems, often addressed through from-sc

NLP, RAG

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

arXiv, GitHub available, 2026-06-07

Multilingual Fact-Checking at Scale: Fine-Tuned Compact Models vs LLMs

We present a multilingual fact-checking system deployed at Factiverse, designed for high-throughput and low-la

Deep learning, Transformer, Classification, Detection

Use case: Classification
Difficulty: Hard
Cost: High

Detection and Interpretability Analysis of Quotation Errors by Large Language Models

Purpose - Quotation error refers to the inconsistency between cited information and its original source. This

Explainable, Deep learning, Compression/quantization, Detection, Text

Use case: Detection
Difficulty: Hard
Cost: High

Ishigaki-IDS: An Open-Weight Verifier-Aware Model for Information Delivery Specification Drafting in Building Information Modeling

Building Information Modeling (BIM) projects require information requirements to be described as machine-check

NLP, LLM, Generation, Reinforcement learning

Use case: Generation
Difficulty: Hard
Cost: High

Friend or Foe? Language as an ideological switch in open-weight LLMs under Russian disinformation stress

As Russia's war against Ukraine extends into generative AI, large language models (LLMs) adapted for local pos

MI-oriented, NLP, LLM, Generation, Text

Use case: Generation
Difficulty: Hard
Cost: High

TrustMargin: Training-Free Arbitration between Parametric Memory and Retrieved Evidence in Large Language Models

Large language models answer knowledge-intensive questions using both parametric memory and retrieved evidence

NLP, LLM, Generation, Text

Use case: Generation
Difficulty: Hard
Cost: High

CSFlow: Aligning Flow Matching with Human Contrast Sensitivity

We introduce Contrast Sensitive Flow (CSFlow), a weighting scheme that connects the human eye's Contrast Sensi

Deep learning, Transformer, Generation, Image

Use case: Generation
Difficulty: Hard
Cost: High

Shift-Dependent Asymmetry: Orthogonal Inverse Low-Rank Adaptation for Federated Medical Segmentation

Low-Rank Adaptation (LoRA) enables efficient federated fine-tuning of segmentation foundation models for medic

Deep learning, Compression/quantization, Segmentation

Use case: Segmentation
Difficulty: Hard
Cost: Medium

SSAFE: Simple and Strong AI-Generated Image Detection via Frozen Vision Encoders

The rapid advancement of generative models has blurred the boundary between synthetic and real imagery, creati

NLP, Fine-tuning, Classification, Detection, Generation

Use case: Classification
Difficulty: Hard
Cost: High

CheXanatomy: Anatomy-Aware Vision-Language Modeling for Chest Radiographs

Vision-language models (VLMs) pretrained on large-scale image-text pairs demonstrate strong image-level unders

Deep learning, CNN, Detection, Generation, Segmentation

Use case: Detection
Difficulty: Hard
Cost: High

NLP, Fine-tuning, Anomaly detection, Image, Text

Two Bridges, One Pathway: From VLMs to Generalizable VLAs with Embodied Trajectory-Coupled Data

Vision-language models (VLMs) are powerful general-purpose reasoners, yet converting them into robot control p

Use case: Anomaly detection
Difficulty: Hard
Cost: High

ActProbe: Action-Space Probe for Early Failure Detection of Generative Robot Policies

Generative robot policies fail unpredictably at deployment: they hesitate at critical moments, drift off-task,

Deep learning, RNN/LSTM, Detection, Generation, Image

Use case: Detection
Difficulty: Hard
Cost: Low

TLRD: Teaching LLMs to Reason over Tabular Data with Tri-Level Rationale Distillation

Tabular data is a primary medium for storing real-world information, driving many industrial applications of m

Tabular-oriented, Deep learning, Compression/quantization, Text, Tabular

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

AgriGov: A Structured Multilingual Dataset Curation for Indian Government Schemes for Farmers

AgriGov is a curated, trilingual (English-Hindi-Marathi) dataset designed to address the scarcity of domain-gr

Tabular-oriented, NLP, RAG, Translation, Summarization, QA

Use case: Translation
Difficulty: Hard
Cost: Low

SSR: Can Simulated Patients Learn to Stigmatize Themselves? Modeling Self-Stigma through Internal Monologue

Simulating patients with large language models (LLMs) is a promising tool for mental health training, but exis

NLP, LLM, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Small-data friendly, Quality prediction/anomaly detection, Deep learning, Compression/quantization, Generation, Text

ZAS-SQL: Distilling Rules from Failures for Zero-Shot Text-to-SQL

Text-to-SQL translates natural language into executable SQL queries. Few-shot in-context learning methods buil

Use case: Generation
Difficulty: Hard
Cost: High

Building Comparative Motivation Profiles with Instrumental Interventions

Safety evaluations often infer latent motivations from behavioral patterns, but the construct validity of thes

Deep learning, Transformer, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, Image inspection, Deep learning, Compression/quantization, Text

AlignFed: Alignment-Aware Asynchronous Federated Fine-Tuning for Large Language Models in Heterogeneous Edge Environments

Large Language Models (LLMs) have significantly propelled the advancement of edge intelligence and have been w

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Small-data friendly, MI-oriented, Condition optimization, NLP, Fine-tuning, Text, Multimodal

CLASP: Language-Driven Robot Skill Selection and Composition using Task-Parameterized Learning

Enabling robots to understand and execute tasks from natural language commands while maintaining data efficien

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Explainable, Quality prediction/anomaly detection, NLP, LLM, Image, Text, Multimodal

arXiv, GitHub available, 2026-06-06

Robust-U1: Can MLLMs Self-Recover Corrupted Visual Content for Robust Understanding?

Multimodal Large Language Models (MLLMs) have demonstrated remarkable success in visual understanding, yet the

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

When Behavioral Safety Evaluation Fails: A Representation-Level Perspective

Large Language Model (LLM) safety has often been evaluated at the behavior level, which provides limited evide

NLP, LLM, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Explainable, Quality prediction/anomaly detection, NLP, Fine-tuning, Generation, Summarization, Image

IEA: Amateur-Friendly Conversational Image Editing Agent via Three Stages of Multitask Alignment

Current image editing software often hinges on fixed filters or expert tuning, leaving a gap between amateur u

Use case: Generation
Difficulty: Hard
Cost: Medium

Shared Latent Structures Enable Unified Backdoor Detection and Mitigation in LLMs

Backdoor attacks in large language models (LLMs) are often treated as isolated trigger-response failures, moti

Deep learning, Compression/quantization, Classification, Detection, Text

Use case: Classification
Difficulty: Hard
Cost: High

MemToolAgent overview with a simple restaurant booking scenario where the agent retrieves similar memories, receives feedback on an invalid time format, and generates a reflection to update its memory

Modern large language model (LLM) agents can use external tools to help users solve complex tasks. However, fo

NLP, LLM, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Revisiting Articulated Parts Perception in Robot Manipulation

We are surrounded by various objects with movable, articulated parts, e.g., box, handle, door. An accurate and

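Quality prediction/anomaly detection, Deep learning, Compression/quantization, Image

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium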

Uncertainty-Aware Intention Prediction for Human-to-Robot Assembly Teleoperation

In assisted teleoperation for human-robot collaboration, accurate intention prediction is critical for enablin

NLP, RAG, Classification, Detection, Segmentation

Use case: Classification
Difficulty: Hard
Cost: High

Ego-Pi: VLA Fine-Tuning for Ego-Centric Human and Robot Data

Robotics faces a fundamental challenge of data scarcity. Unlike language or vision research, there is no inter

NLP, RAG

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Low

IntentNav: Learning Spatial-Visual Object Navigation from Human Demonstrations

Object navigation requires a robot to search for an unobserved target in an unknown environment by deciding wh

NLP, RAG, Image, 3D

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Small-data friendly, Deep learning, Compression/quantization, Generation, Multimodal, Reinforcement learning

Q-VGM: Q-Guided Value-Gradient Matching for Flow-Matching VLA Policies

We propose Q-Guided Value-Gradient Matching (Q-VGM), an off-policy reinforcement learning (RL) method that tac

Use case: Generation
Difficulty: Hard
Cost: High

Explainable, NLP, LLM, Classification, Image, Text

arXiv, GitHub available, 2026-06-05

LLM-Guided Evolution for Medical Decision Pipelines

Adapting large language models (LLMs) to clinical workflows often requires costly fine-tuning or manual prompt

Use case: Classification
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-06-05

MinNav: Minimalist Navigation Using Optical Flow For Active Tiny Aerial Robots

Navigation using a monocular camera is pivotal for autonomous operation on tiny aerial robots due to their per

NLP, Fine-tuning, Video

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-06-05

Robotic Policy Adaptation via Weight-Space Meta-Learning

Vision-Language-Action (VLA) models are emerging as a promising paradigm for robotic manipulation, enabling ge

NLP, Fine-tuning, Video, Multimodal

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, NLP, Fine-tuning

Effective Dimensionality as an Operator Invariant for Physics-Preserving Constraint Adaptation in Physics-Informed Neural Networks

In this paper, a constraint adapter based on physical constraints

Use case: Physics-based constraint adapter
Difficulty: Hard
Cost: High

IDDMBSE: Integrating Data-Driven and Model-Based Systems Engineering for Trusted Autonomous Cyber-Physical Systems

Autonomous cyber-physical systems (CPS) sit at the intersection of Model-Based Systems Engineering (MBSE) and

Sensor/time series, NLP, Fine-tuning

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

HANDOFF: Humanoid Agentic Task-Space Whole-Body Control via Distilled Complementary Teachers

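HANDOFF is a framework built for controlling humanoid robots. The robot recognizes the task and generates motions. To form an agent that generates motions suited to each task, HANDOFF uses teacher and stu

Use case: Robot control for humanoid agents
Difficulty: Hard
Cost: Medium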

Meridian: Metric-Semantic Primitive Matching for Cross-View Geo-Localization Beyond Urban Environments

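To improve place recognition, this study proposes Meridian, which integrates place recognition with position estimation.

NLP, RAG, Detection, Image

Use case: Improving place recognition
Difficulty: Hard
Cost: High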

Synthetic Data Generation and Vision-based Wrinkle and Keypoint Detection for Bimanual Cloth Manipulation

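Develops a learning system for cloth manipulation. With this system, humans can learn cloth manipulation.

Quality prediction/anomaly detection, Deep learning, CNN, Detection, Generation, Image

Use case: Learning cloth manipulation
Difficulty: Hard
Cost: Medium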

TAM: Torque Adaptation Module for Robust Motion Transfer in Manipulation

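Policies differ across robots, and because of the sim-to-real gap, unknown payloads, and the differing dynamics of individual instances of the same robot, contact-rich, dynamic manipulation pol

Use case: Torque adaptation of robotic motion policies for robust motion transfer
Difficulty: Hard
Cost: High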

ActiveMimic: Egocentric Video Pretraining with Active Perception

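Recognizing the importance of learning from how people shift their viewpoint, and thereby produce camera motion, while actually performing manipulation, this study proposes ActiveMimic, a pretraining framework.

NLP, Fine-tuning, Video

Use case: Egocentric video
Difficulty: Hard
Cost: High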

Learn to Match: Two-Sided Matching with Temporally Extended Feedback

Two-sided matching markets often involve information that unfolds over time through interviews, repeated inter

NLP, Fine-tuning, Generation

Use case: Generation
Difficulty: Hard
Cost: Medium

arXiv, Paper only, 2026-06-03

Offline-to-Online Learning in Linear Bandits

Proposes a linear bandit algorithm that makes offline data usable in online learning.

Use case: Using offline data in online learning
Difficulty: Hard
Cost: Medium

Sensor/time series, Quality prediction/anomaly detection, NLP, Fine-tuning, Detection, Anomaly detection, Text

TPA-AD: A Two-Stage Pseudo Anomaly-Guided Method for Bearing Time-Series Anomaly Detection

This paper proposes a two-stage pseudo anomaly-guided anomaly detection method (Two-stage Ps

Use case: Detection
Difficulty: Hard
Cost: High

Quality prediction/anomaly detection, NLP, Fine-tuning

Resource-Constrained Adaptive Inference for Sequential Pricing

Resource-constrained pricing controllers can make fixed-price inference impossible: the controller's resource

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

Small-data friendly, CPU-friendly, Deep learning, RNN/LSTM

Few-Shot Prediction for Pulsar Noise with Long Short-Term Memory Network

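Proposes using a long short-term memory (LSTM) network to predict pulsar timing residuals. This LSTM is best suited to small numbers of timing residuals. However, training the LSTM takes time, and where pulsars are plentiful

Use case: LSTM for predicting pulsar timing residuals
Difficulty: Hard
Cost: Medium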

Do Real-World Datasets Contain Natural Experiments? An Empirical Study Using Causal Feature Selection

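Discusses identifying and analyzing natural experiments within real-world datasets, and the experimental results obtained using them.

Use case: Identifying and analyzing natural experiments in real-world datasets
Difficulty: Hard
Cost: Medium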

PrimeSVT: An Automated Memory-aware Pruning Framework with Prioritized Compression Policy for Spiking Vision Transformers

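Proposes a physically grounded pruning scheme for reducing the size of spiking vision transformers. To correct the imbalance that arises when the model is physically pruned, the scheme applies model-specific pre- and post-processing.

Deep learning, Transformer

Use case: Pruning spiking vision transformers
Difficulty: Hard
Cost: Low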

PSViT: A Methodology for Structurally Pruning Spiking Vision Transformers

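Discusses the development of a pruning method for compressing spiking vision transformers, and the experimental results obtained with it.

Deep learning, Transformer, Image

Use case: Developing a pruning method for compressing spiking vision transformers
Difficulty: Hard
Cost: Medium

arXiv, Paper only, 2026-06-01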

When Tabular Foundation Models Transfer Across Modalities: A Systematic Evaluation Across 95 Datasets, 7 Modalities, and Two Regimes

We present a single classification pipeline that combines an Equiangular Tight Frame (ETF) preprocessing stage

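Tabular-oriented, Sensor/time series, Quality prediction/anomaly detection, Deep learning, Compression/quantization, Classification, Text, Audio

Use case: Classification
Difficulty: Hard
Cost: High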

Tabular-oriented, NLP, Fine-tuning, Regression, Tabular

arXiv, GitHub available, 2026-05-31

On the Uncertainty Quantification Ability of Tabular Foundation Models

Foundation models (FMs) have achieved substantial success in generalizing across tasks without problem-specific

Use case: Regression
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-30

Meta-Black-Box Optimization with Ensemble Surrogate Modeling for Robustness-Accuracy Trade-off within SAEA

Surrogate-assisted evolutionary algorithms (SAEAs) have been widely used for expensive black-box optimization

Explainable, Condition optimization, NLP, Fine-tuning, Reinforcement learning

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-29

Used Car Salesbots? Honesty and Credulity of LLMs as Bargaining Agents under Partial Information

In this work we study agents in simulated bargaining scenarios, where a buyer and a seller communicate through

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-24

Anarchy in the swarm: Testing informed and uninformed diversity-enhancing mechanisms within PSO framework

Particle Swarm Optimization (PSO) frequently suffers from premature convergence. This paper introduces a famil

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

arXiv, Paper only, 2026-05-22

Prudent-Banker: No Extra Fees for Baseline Safety in Adversarial Bandits With and Without Delays

We study adversarial multi-armed bandits with and without delayed feedback under a safety-aware goal: achievin

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

arXiv, Paper only, 2026-05-21

Quantum Genetic Optimization for Negative Selection Algorithms in Anomaly Detection

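This paper introduces a quantum genetic algorithm to optimize negative selection algorithms for anomaly detection. This improves the efficiency of detector generation and also improves accuracy.

Quality prediction/anomaly detection, NLP, Fine-tuning, Detection, Generation, Anomaly detection

Use case: Optimizing negative selection algorithms for anomaly detection
Difficulty: Hard
Cost: Medium

arXiv, Paper only, 2026-05-15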

Towards Code-Oriented LM Embeddings for Surrogate-Assisted Neural Architecture Search

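It verifies the application of Perforated Neural Networks to a keyword detection task in order to shrink model size while keeping performance high, and Edge Impulse-based keyword detection sys

Explainable, Quality prediction/anomaly detection, Deep learning, Compression/quantization, Regression, Text

Use case: Keyword detection
Difficulty: Hard
Cost: Low

arXiv, Paper only, 2026-05-15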

Misspecified Estimate-then-Optimize Leads to Supra-Competitive Prices

We study whether simple algorithmic pricing systems can systematically produce collusive-like prices in multi-

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

arXiv, Paper only, 2026-05-14

Learning to Persuade a Biased Receiver

We study a repeated information design setting in which the receiver, who is also the decision-maker, updates

NLP, Fine-tuning, Text

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium

arXiv, Paper only, 2026-05-10

Parameter-Efficient Neuroevolution for Diverse LLM Generation: Quality-Diversity Optimization via Prompt Embedding Evolution

Large Language Models exhibit mode collapse, producing homogeneous outputs that fail to explore valid solution

Quality prediction/anomaly detection, Deep learning, Compression/quantization, Generation, Text

Use case: Generation
Difficulty: Hard
Cost: High

arXiv, Paper only, 2026-05-10

Neuromorphic Reinforcement Learning for Quadruped Locomotion Control on Uneven Terrain

Reinforcement learning (RL) has enabled robust quadruped locomotion over complex terrain, but most learned con

NLP, Fine-tuning, Reinforcement learning

Use case: Technical validation and paper-reading aid
Difficulty: Hard
Cost: Medium