Convergent Functions, Divergent Forms

Hyeonseong Jeon*¹,², Ainaz Eftekhar*¹,³, Aaron Walsman⁴, Kuo-Hao Zeng³, Ali Farhadi¹,³, Ranjay Krishna¹,³

* Equal Contribution
¹University of Washington, ²Seoul National University, ³Allen Institute for AI, ⁴Kempner Institute at Harvard University

LOKI (Locally Optimized Kinematic Instantiations) is a compute-efficient co-design framework that discovers diverse, high-performing robot morphologies (divergent forms) using shared control policies (convergent functions) and dynamic local search [1] instead of mutation.

The policy reuse allows us to explore $\sim780\times$ more designs using 78% fewer simulation steps and 40% less compute per design (see Efficiency and Coverage Comparison).

Beyond being substantially more efficient, LOKI discovers genetically and behaviorally diverse, high-performing morphologies that generalize effectively across various unseen tasks (see Qualitative Results).

(A) UNIMAL [2], an expressive design space encompassing approximately 10¹⁸ unique morphologies with fewer than 10 limbs, is partitioned into 40 clusters based on morphological similarity in a learned latent space.

(B) Within each cluster, a shared control policy is trained on a dynamic pool of elite morphologies for a simple flat-terrain locomotion task.

(C) Morphologies co-evolve with the shared policy as elites are iteratively refined through dynamic local search. The top 2–3 agents from the final training pool of each cluster are selected to form the final population.
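Steps (B) and (C) can be illustrated with a toy elite-pool local search. This is a minimal sketch under stated assumptions, not the LOKI implementation: designs are hypothetical integer tuples rather than UNIMAL morphologies, `toy_score` is a made-up stand-in for evaluating a design with the cluster's shared policy, and the neighborhood and replacement rules are illustrative choices rather than the dynamic local search used in the paper.

```python
import random

def toy_score(design):
    # Hypothetical stand-in for rolling out the cluster's shared policy on `design`.
    return -sum((x - 3) ** 2 for x in design)

def neighbors(design, low=0, high=6):
    # Designs that differ from `design` by +/-1 in exactly one coordinate.
    out = []
    for i, x in enumerate(design):
        for step in (-1, 1):
            if low <= x + step <= high:
                out.append(design[:i] + (x + step,) + design[i + 1:])
    return out

def refine_pool(pool, score, rounds, rng):
    # Repeatedly sample a neighbor of a random elite; it replaces the
    # current worst elite only if it scores strictly higher.
    pool = list(pool)
    for _ in range(rounds):
        parent = rng.choice(pool)
        candidate = rng.choice(neighbors(parent))
        if candidate in pool:
            continue
        worst = min(pool, key=score)
        if score(candidate) > score(worst):
            pool[pool.index(worst)] = candidate
    return sorted(pool, key=score, reverse=True)

rng = random.Random(0)
initial_pool = [tuple(rng.randint(0, 6) for _ in range(4)) for _ in range(8)]
final_pool = refine_pool(initial_pool, toy_score, rounds=500, rng=rng)
top_agents = final_pool[:3]  # mirrors "top 2-3 agents per cluster"
print(top_agents)
```

In this toy version the score function is fixed; in the framework described above the shared policy keeps training as the pool changes, so the scores themselves move over time.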

Qualitative Results

Evolved Agents for Flat-Terrain Locomotion

LOKI discovers high-performing morphologies with diverse locomotion behaviors (e.g., spinners, crawlers, and bipeds).
This diversity comes from localizing competition within each cluster while exploring each cluster more broadly through dynamic local search (see Quality-Diversity Metrics).

Transitioning to Unseen Tasks

LOKI produces a diverse set of solutions that are better suited for adaptation to various unseen tasks and environments, reducing the need to re-evolve agents for each setting. The final set of evolved designs is trained on a suite of downstream tasks across three domains: agility (bump, obstacle, exploration), stability (incline), and manipulation (push box incline, manipulation ball) (see Morphology-level Task Adaptability).

Bump

Obstacle (n=150)

Exploration

Incline

Push box incline

Manipulation ball

Multi-design Control Policies

We leverage structural commonalities within the design space by training multi-design policies for clusters of morphologies that share structural and behavioral similarities. These cluster-specific policies serve as surrogate scoring functions, enabling efficient evaluation of a large number of designs within each cluster without retraining, thereby significantly improving search efficiency.
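The surrogate-scoring idea can be sketched as follows: each candidate is routed to its nearest cluster and ranked by that cluster's policy, with no per-design training. Everything here is a hypothetical placeholder: the 2-D latent codes, the three centroids (the framework above uses 40 clusters), and `make_scorer`, which stands in for a rollout of a trained cluster policy.

```python
import math

# Hypothetical cluster centroids in a 2-D latent space.
centroids = {"a": (0.0, 0.0), "b": (5.0, 0.0), "c": (0.0, 5.0)}

def make_scorer(target):
    # Hypothetical surrogate: higher score the closer a design is to `target`.
    return lambda z: -math.dist(z, target)

# One scorer per cluster, standing in for each cluster's shared policy.
cluster_scorers = {
    "a": make_scorer((1.0, 1.0)),
    "b": make_scorer((4.0, 1.0)),
    "c": make_scorer((1.0, 4.0)),
}

def assign_cluster(z):
    # Route a design to the cluster with the nearest centroid.
    return min(centroids, key=lambda name: math.dist(z, centroids[name]))

def rank_within_clusters(candidates):
    # Score every candidate with its own cluster's scorer, best first.
    ranked = {}
    for z in candidates:
        cluster = assign_cluster(z)
        ranked.setdefault(cluster, []).append((cluster_scorers[cluster](z), z))
    for entries in ranked.values():
        entries.sort(reverse=True)
    return ranked

candidates = [(0.5, 0.5), (1.0, 1.0), (4.5, 0.5), (4.0, 1.0), (0.5, 4.5)]
ranking = rank_within_clusters(candidates)
for cluster, entries in sorted(ranking.items()):
    print(cluster, [z for _, z in entries])
```

Because candidates only compete against others in the same cluster, a strong design in one cluster never displaces a weaker but structurally different design in another.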

LOKI explores a much larger design space while requiring significantly fewer simulation steps and fewer training FLOPs per evaluated design.

Diverse High-performing Designs (Quality-Diversity Metrics)

LOKI maintains both quality and diversity. It achieves the highest QD-Score [3] among all baselines, demonstrating its ability to find high-performing solutions across a wide range of niches. Furthermore, LOKI evolves significantly sparser solutions (measured by the average distance to the k nearest neighbors). This is because designs are sampled with dynamic local search rather than mutation.
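Both metrics are simple to compute. The sketch below assumes the common reading of QD-Score as the sum of the best fitness in each filled niche (with non-negative fitness) and measures sparseness as the mean Euclidean distance to the k nearest neighbors. The niche labels, fitness values, and 2-D descriptors are made up for illustration and are not results from the paper.

```python
import math

def qd_score(solutions):
    # Sum of the best fitness found in each filled niche.
    best = {}
    for niche, fitness in solutions:
        best[niche] = max(fitness, best.get(niche, float("-inf")))
    return sum(best.values())

def mean_knn_distance(points, k):
    # Average, over all points, of the mean distance to the k nearest neighbors.
    total = 0.0
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        total += sum(dists[:k]) / k
    return total / len(points)

# Hypothetical (niche, fitness) pairs.
solutions = [("crawler", 3.0), ("crawler", 5.0), ("spinner", 2.0)]
print(qd_score(solutions))

# Hypothetical design descriptors: a tight group versus a spread-out one.
tight = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
spread = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
print(mean_knn_distance(tight, k=2), mean_knn_distance(spread, k=2))
```

A population can score well on one metric and poorly on the other, which is why the two are reported together.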

Morphology-level Task Adaptability

We evolve diverse morphologies on flat terrain that generalize more effectively to unseen tasks requiring varied skills. DERL [2] morphologies are overfitted to flat terrain and perform best on obstacle (n=50) and incline, which are structurally similar. In contrast, LOKI shows significantly better adaptability on bump (981 $\rightarrow$ 1908), push box incline (1519 $\rightarrow$ 3148), and manipulation ball (142 $\rightarrow$ 172), enabled by its morphological diversity (e.g., crawlers, crabs) and the emergence of more complex behaviors (e.g., spinning, rolling).
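To put the three score pairs on a common scale, the snippet below converts each to a relative gain. It assumes each arrow reads as baseline score to LOKI score, which is our interpretation of the notation above and is not stated explicitly.

```python
# Score pairs quoted above, read as (before, after); the direction is an assumption.
scores = {
    "bump": (981, 1908),
    "push box incline": (1519, 3148),
    "manipulation ball": (142, 172),
}

# Relative gain in percent, rounded to one decimal place.
gains = {
    task: round(100 * (after - before) / before, 1)
    for task, (before, after) in scores.items()
}

for task, gain in gains.items():
    print(f"{task}: +{gain}%")
```

Under that reading, the scores roughly double on bump and push box incline and rise by about a fifth on manipulation ball.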

Policy-level Task Adaptability

Our co-evolution framework not only produces a diverse set of morphologies but also cluster-specific policies that generalize more effectively to unseen tasks. Each cluster captures distinct morphologies and behaviors, enabling its policy to better adapt to tasks aligned with those traits.

References

[1]: Holger H. Hoos and Thomas Stützle. Stochastic Local Search Algorithms: An Overview, pages 1085–1105. Springer Berlin Heidelberg, Berlin, Heidelberg, 2015.

[2]: Agrim Gupta, Silvio Savarese, Surya Ganguli, and Li Fei-Fei. Embodied intelligence via learning and evolution. Nature Communications, 12(1):5721, 2021.

[3]: Justin K Pugh, Lisa B Soros, and Kenneth O Stanley. Quality diversity: A new frontier for evolutionary computation. Frontiers in Robotics and AI, 3:40, 2016.