BrainSurgery: Reproducible and Reliable Declarative Weight Manipulations for Model Editing and Upcycling
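Proposes BrainSurgery, a tool for manipulating the weights of deep learning models, and demonstrates layer restructuring, precision casting, low-rank decomposition, and structural debugging.
- Use case
- Model modification
- Difficulty
- Hard
- Cost
- Low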
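Search results for "SHAP"
98 results
Proposes BrainSurgery, a tool for manipulating the weights of deep learning models, and demonstrates layer restructuring, precision casting, low-rank decomposition, and structural debugging.
This study establishes a baseline for zero-shot semantic re-identification and automates the semantic identification of images.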
Understanding tactical organisation of association football, hereafter referred to as football, requires ident
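Building machine learning models for molecular design enables efficient synthesis and can contribute substantially to fields such as drug development.
To improve agent safety, proposes a new approach that builds a hacker "fake auto" (フェイクオート) in order to evaluate risk.
Massive AI data centers pose major structural challenges for power system planning and operation. Using a spatially explicit optimization model of Europe covering 21 AI growth scenarios, the additional electricity demand of DCs, capacity requirements
Transporting objects with multi-robot systems is an essential task in many domains, from industry to the home. A single transport task is split into transport tasks for several robots, and each of those tasks is solved individually. In practice, objects have non-uniform shapes and mass distributions,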
During instruction fine-tuning (IFT), large language models (LLMs) learn to follow instructions by using the p
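Proposes Context-Aware Deep Learning for the non-destructive testing of materials, used to detect defects in airlocks.
Proposes a framework that lets AI agents built for autonomous driving be used for autonomous superbike riding, and demonstrates autonomous superbike riding with it.
Proposes a framework for reducing the communication overhead incurred when training ML workloads on multiple GPUs.
Reinforcement learning (RL) often aims to find the correct action for a given problem, but when learning from human feedback, the need to build a decision-making framework for human choices means that accept-or-reject
There is a growing need to understand and advance the role AI plays in sustainability. The aim is to map the intersection of AI and sustainability research and to identify the intersections, with both challenges and opportunities, that matter to both fields.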
This report summarizes the CHIIR 2026 Workshop on Generative AI and Academic Search (GAI&AS), which examined
This study develops PsychoSafe, a framework for evaluating the safety of large language models and mitigating their potential risks.
As AI assistants serve millions of users daily, evaluating user experience (UX) beyond general model capabilit
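The behavior of LLM-based agents is shaped by the design of the harness that connects them to their environment, yet these harnesses are currently designed almost entirely by humans. In this study, a mechanism by which LLM-based agents can improve their harnesses themselves, S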
End-to-end co-optimization of optical front-ends (e.g. metasurfaces) and neural network back-ends has been wid
3D semantic scene generation is crucial for autonomous driving applications, yet most methods rely on complex
With the growing demand for realistic virtual humans, parametric body models have become a cornerstone of mode
Diffusion models have advanced 3D shape generation, yet most methods still denoise in high-cardinality spaces
Autonomous mobile robots operating in tight environments require motion planning frameworks that account for t
Humanoid robots can fall on slopes, gravel, and uneven ground in unstructured environments. We target integrat
We study a capability the dominant paradigm in synthetic tabular data does not provide: exact satisfaction of
Deep learning EEG denoising architectures have scaled from tens of thousands to tens of millions of parameters
Tensor networks provide efficient representations for compressing large neural networks. By carefully designin
Quality-diversity reinforcement learning (QD-RL) aims to construct policy repertoires that contain both high-p
Large language models are rapidly replacing search engines as the primary interface between people and informa
Purpose - Quotation error refers to the inconsistency between cited information and its original source. This
Reinforcement Learning with Verifiable Rewards (RLVR) has become an effective paradigm for improving the reaso
Diffusion-based visuomotor policies operating directly in raw action spaces conflate scene comprehension with
Reinforcement learning (RL) has become a powerful paradigm for robot learning, particularly in sim-to-real set
Autonomous Underwater Vehicles (AUVs) traditionally rely on complex, heavily engineered pipelines for percepti
Reinforcement Learning from Verifiable Rewards (RLVR) has recently become a key paradigm for improving the rea
Symbolic benchmarks have emerged as a key approach to assess model robustness under minor modifications to STE
Humans increasingly turn to Language Models (LMs) in ways that shape beliefs and drive decisions, including di
Neural fields parameterize data as functions from coordinates to values, providing a unified framework for rep
Facial rigging - creating FACS-based blendshapes together with inner-mouth geometry (teeth, gums, and tongue)
Soft-bodied organisms such as octopuses and elephant trunks exhibit remarkable morphological adaptability, dyn
Variational autoencoders (VAEs) learn low-dimensional latent representations of high-dimensional data. When th
The transformer's emergent ability to perform in-context learning (ICL) has sparked a wide range of studies de
Navigation using a monocular camera is pivotal for autonomous operation on tiny aerial robots due to their per
Reinforcement learning has become the prevailing approach to humanoid locomotion control: policies transfer re
Assistive robots operating under shared autonomy must balance user control with autonomous assistance. Because
Deep research agents have demonstrated remarkable capabilities in complex information-seeking tasks, yet this
Understanding what generative models retain from training data remains challenging, with implications for copy
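Estimating the local mean curvature of high-dimensional datasets is important for geometry-aware algorithms such as geometry-aware alignment, and in particular for the Mean Curvature Boundary Points (MCBP) method. Nai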
Follow-the-regularized-leader framework has shown effectiveness and flexibility in online learning problems, w
Neural network (NN)-based nonlinear causal discovery methods recover DAG structure but leave each causal mecha
This study proposes an algorithm that enables scenario-based energy consumption forecasting.
The ratio of voting power between a permanent member and a non-permanent member of the United Nations Security
Latent visual reasoning (LVR) inserts supervised latent tokens between perception and answer generation in vis
Evaluating LLM mediators remains challenging, as mediation unfolds as a real-time trajectory shaped by disputa
Multimodal Large Language Models (MLLMs) excel at 2D semantic understanding but lack intrinsic 3D awareness, r
Autonomous driving requires reasoning about how ego actions shape the evolution of the surrounding world. Howe
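A library that takes a game-theoretic approach to explaining the output of machine learning models.
Proposes Tri-SfSVD, a new method for detecting subtypes of inflammatory bowel disease from omics data.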
Muon improves training efficiency over Adam in large language-model training by about two times, but the local
Learning representations of CAD models is a largely open problem. While 3D representation learning has flouris
Autoregressive mesh generation has gained attention by tokenizing meshes into sequences and training models in
Koopman theory turns nonlinear dynamics into a linear spectral problem. In computation, however, everything de
Selecting a clustering algorithm and its hyperparameters without labels is a common difficulty in engineering
Existing analyses of the edge of stability (EoS) treat it as a global property of optimization. We show that i
3D vision has rapidly evolved, driven by increasingly diverse data representations, learning paradigms, and mo
Few-step distillation has become an effective strategy for accelerating advanced visual generative models, yet
Large language models improve final-answer accuracy through extended chain-of-thought reasoning, but often spe
Shapley values are a principled attribution measure widely used in interpretable machine learning, but their e
Implicit-process priors define distributions over functions through flexible generative mechanisms, making the
Laws and institutions shape individual outcomes through complex interactions with citizens' diverse circumstan
We study local pure coordination games on finite social networks, continuing the framework of Hutchcroft, Rosp
Estimating the economic contribution of a single patent inside a product that embodies tens of thousands of pa
How can a population of agents self-orchestrate and self-adapt into stronger collective intelligence without c
The GDP of a country is modelled as the relative interaction between two agents - working hours, reflecting th
Recent publications have suggested using the Shapley value for sensor anomaly/attack localization. We study
Biological and neuromorphic recurrent neural networks (RNNs) are subject to spatial and temporal locality cons
Large Language Models exhibit paradoxical fragility in fundamental arithmetic, implying a disconnect between i
Reinforcement Learning with Verifiable Rewards (RLVR) has recently emerged as the cornerstone for shaping the
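Investigates the factors that affect cooperativeness in next-generation LLMs. ChatGPT-4o and Claude 3.5 Sonnet showed similar cooperativeness even though they come from different providers.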
This paper studies preference-shaped expected improvement criteria for Bayesian multiobjective optimization. W
In large-scale benchmarking of stochastic optimization algorithms, the key challenge is no longer whether repe
Humans are the bottleneck in building and improving AI. Both the models and the agents that wrap them are writ
Cooperative multi-agent systems require robust mechanisms for credit assignment under uncertainty. Here we int
Spatial and temporal resource constraints are critical for both biological and artificial intelligent systems.
The dominant artificial intelligence paradigm trains neural architectures via gradient descent against proxy o
We investigate how internal representations emerge across hierarchical processing systems by introducing a neu
This study develops Dynamic Shapley Computation, which enables fast estimation of data value.
This work presents a novel variant of the Firefly Algorithm (FA) for data clustering, addressing limitations o
Human-generated randomness is constrained by cognitive, motor, and strategic biases. This study examines how t
We introduce a class of cooperative games induced by weighted directed graphs. Specifically, the coalitional v
Strategic multi-agent systems are fundamentally characterized by decentralization, uncertainty, and ambiguity.
The space L of linear value maps on a finite-player cooperative game G^N is finite-dimensional, and admits a c
Spiking Neural Networks (SNNs) are a promising framework for event-driven temporal processing. Prior work has
Costly cooperation and costly signaling are both difficult to reconcile with simple fitness maximization, yet
Negotiation is a central mechanism of economic exchange, shaping markets, procurement, labor agreements, and r
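Recurrent networks are hard to optimize because they involve complex processing. When computational resources are limited, the allocation of parameters has to be balanced.
This study develops a framework in which SNN neuron models can make use of gradients while exhibiting rich dynamics and sparse activity.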
Architectural choices inside the Transformer feedforward network (FFN) block do not merely affect the block it
Distributed computational substrates rely on two elementary operations: bundling, the act of populating a shar