Convergent Functions, Divergent Forms

* Equal Contribution
1 University of Washington, 2 Seoul National University, 3 Allen Institute for AI, 4 Kempner Institute at Harvard University


LOKI (Locally Optimized Kinematic Instantiations) is a compute-efficient co-design framework that discovers diverse, high-performing robot morphologies (divergent forms) using shared control policies (convergent functions) and dynamic local search [1] instead of mutation.

Policy reuse allows us to explore $\sim780\times$ more designs using 78% fewer simulation steps and 40% less compute per design (see Efficiency and Coverage Table).

Beyond being substantially more efficient, LOKI discovers both genetically and behaviorally diverse high-performing morphologies that generalize effectively across various unseen tasks (see Qualitative Results).


(A) UNIMAL [2], an expressive design space encompassing approximately $10^{18}$ unique morphologies with fewer than 10 limbs, is partitioned into 40 clusters based on morphological similarity in a learned latent space.

(B) A shared control policy is trained within each cluster on a dynamic pool of elite morphologies, for a simple flat-terrain locomotion task.

(C) Morphologies co-evolve with the shared policy as elites are iteratively refined through dynamic local search. The top 2–3 agents from the final training pool of each cluster are selected to form the final population.
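The elite-pool refinement in (B) and (C) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `refine_elites` and `select_final`, the pool and batch sizes, and the toy fitness function are all hypothetical, and the `score` callable merely stands in for evaluating a design with the cluster's shared policy (the policy training itself is not modeled here).

```python
import random


def refine_elites(cluster, score, pool_size=4, iters=50, batch=8, seed=0):
    """Toy elite-pool refinement within a single cluster (illustrative only).

    cluster: list of hashable candidate designs that belong to one cluster
    score:   callable mapping a design to a scalar fitness; a stand-in for
             evaluating the design with the cluster's shared policy
    """
    rng = random.Random(seed)
    # Start from a random pool of elites drawn from the cluster.
    pool = rng.sample(cluster, pool_size)
    for _ in range(iters):
        # Sample fresh candidates from the same cluster; no mutation operator.
        candidates = rng.sample(cluster, min(batch, len(cluster)))
        # Merge elites and candidates, dropping duplicates but keeping order.
        merged = list(dict.fromkeys(pool + candidates))
        # Keep only the best `pool_size` designs as the new elite pool.
        merged.sort(key=score, reverse=True)
        pool = merged[:pool_size]
    return pool


def select_final(pool, score, top_k=2):
    """Pick the top-k agents of the final pool for the output population."""
    return sorted(pool, key=score, reverse=True)[:top_k]


# Hypothetical cluster: designs are plain integers, fitness peaks at 37.
cluster = list(range(100))
fitness = lambda d: -abs(d - 37)
final_pool = refine_elites(cluster, fitness)
print(select_final(final_pool, fitness))
```

Because the pool only ever keeps the best designs seen so far, its best score never decreases across iterations. In the actual framework the scoring policy changes as it is trained alongside the pool, which this sketch deliberately leaves out.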


Qualitative Results

Evolved Agents for Flat-Terrain Locomotion

LOKI discovers high-performing morphologies with diverse locomotion behaviors (e.g., spinners, crawlers, and bipeds).
This diversity arises because competition is localized within each cluster while Dynamic Local Search explores broadly inside it (see Quality-Diversity Metrics).



Transitioning to Unseen Tasks

LOKI produces a diverse set of solutions that are better suited for adaptation to various unseen tasks and environments, reducing the need to re-evolve agents for each setting. The final set of evolved designs is trained on a suite of downstream tasks across three domains: agility (bump, obstacle, exploration), stability (incline), and manipulation (push box incline, manipulation ball) (see Morphology-level Task Adaptability).


Bump


Obstacle (n=150)


Exploration


Incline


Push box incline


Manipulation ball

Multi-design Control Policies


We leverage structural commonalities within the design space by training multi-design policies for clusters of morphologies that share structural and behavioral similarities. These cluster-specific policies serve as surrogate scoring functions, enabling efficient evaluation of a large number of designs within each cluster without retraining, thereby significantly improving search efficiency.
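The idea of using a frozen cluster policy as a surrogate scoring function can be sketched as follows. This is an assumption-laden illustration rather than the framework's code: `surrogate_score`, `rank_designs`, the `simulate` callback, and the toy policy and simulator are hypothetical names introduced here, and the real system would run physics rollouts instead of the closed-form toy return.

```python
import random
from statistics import mean


def surrogate_score(design, policy, simulate, episodes=3, seed=0):
    """Mean episode return of `design` under a frozen shared policy."""
    rng = random.Random(seed)
    return mean(simulate(design, policy, rng) for _ in range(episodes))


def rank_designs(designs, policy, simulate, **kwargs):
    """Rank candidate designs by surrogate score, best first.

    No training happens here: the same policy is reused for every design.
    """
    scored = [(surrogate_score(d, policy, simulate, **kwargs), d) for d in designs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def toy_policy(design):
    """Hypothetical design-conditioned policy: a gain that grows with limb count."""
    return min(1.0, 0.5 + 0.1 * design["limbs"])


def toy_simulate(design, policy, rng):
    """Hypothetical simulator: a closed-form 'return' plus small noise."""
    speed = policy(design) * design["limb_length"]
    return speed - 0.05 * design["limbs"] + rng.gauss(0.0, 0.01)


candidates = [
    {"limbs": 2, "limb_length": 1.0},
    {"limbs": 4, "limb_length": 1.5},
    {"limbs": 8, "limb_length": 0.5},
]
for score, design in rank_designs(candidates, toy_policy, toy_simulate):
    print(round(score, 3), design)
```

The cost of scoring each extra design is a handful of rollouts rather than a full training run, which is the source of the efficiency gain described above.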


Efficiency and Coverage Table

LOKI explores a much larger design space while requiring significantly fewer simulation steps and fewer training FLOPs per evaluated design.

Diverse High-performing Designs (Quality-Diversity Metrics)


LOKI maintains both quality and diversity. It achieves the highest QD-Score [3] among all baselines, demonstrating its ability to find high-performing solutions across a wide range of niches. Furthermore, LOKI evolves significantly sparser solutions (measured by the average distance to the k-nearest neighbors). This is due to using Dynamic Local Search, rather than mutation, to sample designs.
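As a rough illustration of the two metrics mentioned above, the sketch below computes a QD-Score as the sum of the best fitness found in each occupied niche, and sparsity as the mean distance to the k nearest neighbors. The function names, the niche labels, the Euclidean distance, and the sample data are assumptions made for this example; how niches and the embedding space are defined in the actual evaluation is not specified here.

```python
import math


def qd_score(population):
    """Sum of the best fitness in each occupied niche.

    population: iterable of (niche_id, fitness) pairs. Fitness is assumed
    to be non-negative so that filling more niches never lowers the score.
    """
    best = {}
    for niche, fit in population:
        if niche not in best or fit > best[niche]:
            best[niche] = fit
    return sum(best.values())


def knn_sparsity(points, k=1):
    """Mean, over all points, of the average distance to the k nearest neighbors.

    points: list of equal-length coordinate tuples; requires len(points) > k.
    """
    total = 0.0
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        total += sum(dists[:k]) / k
    return total / len(points)


# Hypothetical data: three designs in two niches, three points on a line.
print(qd_score([("a", 1.0), ("a", 3.0), ("b", 2.0)]))   # 5.0
print(knn_sparsity([(0, 0), (3, 4), (6, 8)], k=1))      # 5.0
```

A higher sparsity value means the evolved designs sit farther apart in the chosen embedding, i.e. the population covers the space more thinly and more widely.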

Morphology-level Task Adaptability


We evolve diverse morphologies on flat terrain that generalize more effectively to unseen tasks requiring varied skills. DERL [2] morphologies are overfitted to flat terrain and perform best on obstacle (n=50) and incline, which are structurally similar to it. In contrast, LOKI shows significantly better adaptability on bump (981 $\rightarrow$ 1908), push box incline (1519 $\rightarrow$ 3148), and manipulation ball (142 $\rightarrow$ 172), enabled by its morphological diversity (e.g., crawlers, crabs) and the emergence of more complex behaviors (e.g., spinning, rolling).
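To put the reported score changes on a common scale, the short calculation below converts each pair into a relative gain. It assumes each arrow reads as baseline score to LOKI score on the same task; the helper name `relative_gain` is introduced only for this example.

```python
# Reported (before, after) score pairs from the paragraph above.
reported = {
    "bump": (981, 1908),
    "push box incline": (1519, 3148),
    "manipulation ball": (142, 172),
}


def relative_gain(before, after):
    """Fractional improvement of `after` over `before`."""
    return (after - before) / before


for task, (before, after) in reported.items():
    print(f"{task}: {relative_gain(before, after):+.1%}")
# bump: +94.5%
# push box incline: +107.2%
# manipulation ball: +21.1%
```

Under that reading, the scores roughly double on bump and push box incline, while the gain on manipulation ball is more modest.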

Policy-level Task Adaptability


Our co-evolution framework produces not only a diverse set of morphologies but also cluster-specific policies that generalize more effectively to unseen tasks. Each cluster captures distinct morphologies and behaviors, enabling its policy to better adapt to tasks aligned with those traits.

References

[1]: Holger H. Hoos and Thomas Stützle. Stochastic Local Search Algorithms: An Overview, pages 1085–1105. Springer Berlin Heidelberg, Berlin, Heidelberg, 2015.

[2]: Agrim Gupta, Silvio Savarese, Surya Ganguli, and Li Fei-Fei. Embodied intelligence via learning and evolution. Nature Communications, 12(1):5721, 2021.

[3]: Justin K. Pugh, Lisa B. Soros, and Kenneth O. Stanley. Quality diversity: A new frontier for evolutionary computation. Frontiers in Robotics and AI, 3:40, 2016.