LabBuilder: Protocol-Grounded 3D Layout Generation for Interactable and Safe Laboratory

Jianbao Cao*1,2, Zhangrui Zhao*1,3, Bohan Feng*1, Zixuan Hu4, Rui Li1, Haiyuan Wan5, Chenxi Li1,
Jingyuan Li6, Wenzhe Cai1, Lei Bai1, Wanli Ouyang1,7, Lingyu Duan4, Di Huang3, Minting Pan1, Sha Zhang7,
Xinzhu Ma3, Shixiang Tang†,7, Dongzhan Zhou†,1
1Shanghai Artificial Intelligence Laboratory, 2Wuhan University, 3Beihang University, 4Peking University,
5Tsinghua University, 6Shanghai Jiao Tong University, 7The Chinese University of Hong Kong
*Equal Contribution    †Equal Advising
ICML 2026
LabBuilder Overview

Figure 1: LabBuilder system overview. Given a free-form experimental description, LabBuilder generates 3D laboratory layouts satisfying navigation feasibility, geometric compliance, and chemical safety.

Abstract

Automated laboratories hold the promise of significantly accelerating scientific discovery, yet the design of virtual experimental environments remains a core bottleneck limiting their scalable deployment. Existing 3D scene generation methods, designed primarily for household scenarios, fall short in laboratory settings due to three fundamental gaps: lack of protocol-level semantic understanding, neglect of chemical safety constraints, and absence of robot navigability guarantees.

We introduce LabBuilder, an end-to-end system that generates and verifies 3D laboratory layouts from concise textual specifications. LabBuilder operates through three tightly coupled components: LabForge curates a meta-dataset of annotated assets and chemical knowledge, translating natural language into structured protocols; LabGen synthesizes layouts via hierarchical initialization, geometric & chemical optimization, and navigation-aware refinement; LabTouchstone evaluates results across geometric compliance, feasibility, chemical safety, and semantic plausibility. On a benchmark of 30 chemical experiments, LabBuilder achieves a navigation success rate of 0.966 and substantially outperforms all baselines across all evaluation dimensions.

Highlights

  • End-to-End Pipeline: From a single-sentence experimental description to a fully realized, safe, navigable 3D laboratory layout, with no manual intervention required.
  • LabForge: Integrates an Asset Knowledge Base (176 annotated lab entities) and a Chemical Knowledge Base covering 6 reaction types, enabling protocol-grounded layout generation.
  • LabGen: A hierarchical generation framework with iterative optimization (FastRepair + LLMAdjust) and A*-based navigation verification, ensuring geometric, chemical, and navigability constraints are satisfied.
  • LabTouchstone: A standardized four-dimensional evaluation benchmark (geometric compliance, feasibility, chemical safety, semantic plausibility) enabling rigorous cross-method comparison.
  • State-of-the-Art Results: A navigation success rate of 0.966 and chemical safety scores 3× higher than baselines, with no sacrifice in semantic quality.

Method

LabBuilder Framework

Figure 2: LabBuilder framework. LabForge constructs knowledge bases and synthesizes structured protocols. LabGen performs hierarchical initialization, geometric & chemical optimization, and navigation-aware refinement.

LabForge: Protocol Synthesis

LabForge integrates two complementary knowledge sources: an Asset Knowledge Base cataloging 176 laboratory entities with geometry, semantics, and chemical hazard annotations, and a Chemical Knowledge Base covering substitution, cyclization, redox, functional group transformation, condensation, and alkylation reactions. An LLM compiler translates free-form descriptions into structured protocols with operation sequences, required assets, and safety constraints.
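As a rough illustration of what such a structured protocol might carry, the sketch below models the three parts named above (operation sequence, required assets, safety constraints) as Python dataclasses. The class names, field names, and example values are assumptions made for illustration only; the actual LabForge schema is not specified on this page.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class SafetyConstraint:
    # Hypothetical record: keep two assets at least `min_distance_m` apart.
    asset_a: str
    asset_b: str
    min_distance_m: float
    reason: str


@dataclass
class Protocol:
    # Hypothetical container mirroring the three parts named in the text.
    reaction_type: str
    operations: List[str] = field(default_factory=list)
    required_assets: List[str] = field(default_factory=list)
    safety_constraints: List[SafetyConstraint] = field(default_factory=list)

    def missing_assets(self, knowledge_base: Set[str]) -> List[str]:
        """Return required assets that are absent from the asset knowledge base."""
        return [a for a in self.required_assets if a not in knowledge_base]


# Illustrative values only; not taken from the LabBuilder benchmark.
protocol = Protocol(
    reaction_type="condensation",
    operations=["weigh reagents", "heat under reflux", "cool and filter"],
    required_assets=["fume_hood", "hot_plate", "round_bottom_flask"],
    safety_constraints=[
        SafetyConstraint("hot_plate", "solvent_cabinet", 2.0, "flammable isolation")
    ],
)
```

Calling `protocol.missing_assets(...)` with a set of known asset names gives a simple availability check before any layout is attempted.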

LabGen: Hierarchical Layout Generation

LabGen operates in three stages: (1) Hierarchical Initialization with room-level zoning and desktop-level organization; (2) Geometric & Chemical Optimization using FastRepair for boundary/collision fixes and LLMAdjust for safety refinement; (3) Navigation-Aware Refinement projecting the 3D scene into 2D occupancy grids and verifying path feasibility with A* search.
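The navigation check in stage (3) can be sketched as follows: object footprints are rasterized onto a 2D occupancy grid, and A* searches for a path between two free cells. This is a minimal sketch under stated assumptions, not LabGen's implementation: the grid resolution, the 4-connected neighborhood, the Manhattan heuristic, and the function names `rasterize` and `astar` are all hypothetical choices.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]  # (x, y) grid coordinates


def rasterize(width: int, height: int,
              footprints: List[Tuple[int, int, int, int]]) -> List[List[int]]:
    """Mark object footprints (x0, y0, x1, y1; end-exclusive cells) as occupied."""
    grid = [[0] * width for _ in range(height)]
    for x0, y0, x1, y1 in footprints:
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                grid[y][x] = 1  # 1 = occupied, 0 = free
    return grid


def astar(grid: List[List[int]], start: Cell, goal: Cell) -> Optional[List[Cell]]:
    """4-connected A* with a Manhattan heuristic; returns a cell path or None."""
    height, width = len(grid), len(grid[0])
    if grid[start[1]][start[0]] or grid[goal[1]][goal[0]]:
        return None  # start or goal sits inside an obstacle

    def heuristic(cell: Cell) -> int:
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    g_cost: Dict[Cell, int] = {start: 0}
    came_from: Dict[Cell, Cell] = {}
    while open_heap:
        _, g, current = heapq.heappop(open_heap)
        if current == goal:
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            return path[::-1]
        if g > g_cost[current]:
            continue  # stale heap entry; a cheaper route was already found
        x, y = current
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            if 0 <= nx < width and 0 <= ny < height and grid[ny][nx] == 0:
                new_g = g + 1
                if new_g < g_cost.get(nxt, float("inf")):
                    g_cost[nxt] = new_g
                    came_from[nxt] = current
                    heapq.heappush(open_heap, (new_g + heuristic(nxt), new_g, nxt))
    return None


# A 5x5 room with a wall in column 2 that leaves a gap in the bottom row.
grid = rasterize(5, 5, [(2, 0, 3, 4)])
path = astar(grid, (0, 0), (4, 0))

# Closing the gap makes the goal unreachable.
blocked = astar(rasterize(5, 5, [(2, 0, 3, 5)]), (0, 0), (4, 0))
```

A layout would count as navigable in this sketch when `astar` returns a path for every required start and goal pair; a `None` result would flag the layout for further refinement.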

LabTouchstone: Four-Dimensional Evaluation

LabTouchstone quantifies layout quality along four dimensions: Geometric Compliance (boundary violations, collisions), Feasibility Success Rate (asset availability, navigation), Chemical Safety (flammable isolation, incompatibility separation, storage), and Semantic Plausibility (realism, layout rationality, and completeness, scored by an LLM).
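To make the Geometric Compliance dimension concrete, the sketch below counts boundary violations and pairwise collisions for axis-aligned 2D footprints. It is an illustrative simplification, not LabTouchstone's metric: the `Box` type, the 2D treatment, the function names, and the example room are all assumptions.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, List


@dataclass
class Box:
    # Hypothetical axis-aligned footprint: (x, y) is the minimum corner, in meters.
    name: str
    x: float
    y: float
    w: float
    d: float


def out_of_bounds(box: Box, room_w: float, room_d: float) -> bool:
    """True if any part of the footprint lies outside the room rectangle."""
    return (box.x < 0 or box.y < 0
            or box.x + box.w > room_w or box.y + box.d > room_d)


def overlaps(a: Box, b: Box) -> bool:
    """True if the two footprints share interior area (touching edges do not count)."""
    return (a.x < b.x + b.w and b.x < a.x + a.w
            and a.y < b.y + b.d and b.y < a.y + a.d)


def geometric_report(boxes: List[Box], room_w: float, room_d: float) -> Dict[str, list]:
    """List the assets that leave the room and the pairs that collide."""
    boundary = [b.name for b in boxes if out_of_bounds(b, room_w, room_d)]
    collisions = [(a.name, b.name) for a, b in combinations(boxes, 2) if overlaps(a, b)]
    return {"boundary_violations": boundary, "collisions": collisions}


# Illustrative 6 m x 4 m room; values are made up for the example.
layout = [
    Box("bench", 0.0, 0.0, 2.0, 1.0),
    Box("hood", 1.5, 0.5, 1.0, 1.0),     # overlaps the bench
    Box("cabinet", 5.5, 3.0, 1.0, 1.0),  # extends past the east wall
]
report = geometric_report(layout, room_w=6.0, room_d=4.0)
```

A check of this kind is cheap enough to run after every layout adjustment, which is why a fast geometric pass is a natural companion to slower LLM-based scoring.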

Results

Qualitative Comparison

Figure 3: Qualitative comparison with existing methods on the same chemical reaction. Green marks indicate passed checks; red marks indicate failures.

Quantitative Results

Table 1: Quantitative comparison on 30 chemical experiment protocols across geometric, feasibility, chemical safety, and semantic dimensions.

BibTeX

@inproceedings{cao2026labbuilder,
  title={LabBuilder: Protocol-Grounded 3D Layout Generation for Interactable and Safe Laboratory},
  author={Cao, Jianbao and Zhao, Zhangrui and Feng, Bohan and Hu, Zixuan and Li, Rui and Wan, Haiyuan and Li, Chenxi and Li, Jingyuan and Cai, Wenzhe and Bai, Lei and Ouyang, Wanli and Duan, Lingyu and Huang, Di and Pan, Minting and Zhang, Sha and Ma, Xinzhu and Tang, Shixiang and Zhou, Dongzhan},
  booktitle={Proceedings of the 43rd International Conference on Machine Learning (ICML)},
  year={2026}
}