This is the second post in a loose series of obscure/ambitious machine learning ideas that began with a proposal to decipher the Voynich manuscript. In this post, I’ll talk about octopus perception and how reinforcement learning might help us to understand it.
A couple of years ago, some combination of watching My Octopus Teacher and reading Other Minds got me interested in cephalopods. Their intelligence despite their short lifespans, their stunning colour changes, and their sense of alien embodiment all make them appealing.
The question that most stuck with me was the reported paradox: despite being able to almost perfectly match their skin to the colour of their surroundings, and seemingly using skin colour as a communication channel, cephalopods are supposed to be colourblind. A beautiful mystery to open up a world of daydreaming.
To my knowledge, there have been two proposed solutions to this paradox. One possibility is that cephalopods reshape their pupils and use the resulting wavelength-dependent chromatic blur to infer colour information. The other is that cephalopods are to some extent able to see with their skin, which contains light-sensitive proteins. The idea is that cephalopods can use their skin's colour-changing machinery (see e.g. here) to manipulate the light reaching these sensors and thereby sense colour.
It is the second possibility which intrigues me most, since it wraps up together the questions of perception and communication. The use of their skin as a canvas gives cephalopods a huge scope for exchanging messages with their fellows and other surrounding animals. How they make use of these channels is still not understood in detail. If skin colour changes are also involved in perception, this would result in an elegant but daunting two-way communication channel.
Either or both of these possibilities might be true. Or neither! It’s also been suggested that cephalopod camouflage might rely solely on disruptive patterning, which hides the animal without requiring exact colour matching. In any event, it seems difficult to reach any firm conclusions about how these perception channels are used by real cephalopods beyond raising the possibility.
I think it would be interesting to use reinforcement learning as a different angle on these questions. For example, to learn more about how skin patterning might be related to colour perception, we might develop a simulated octopus arm, design an environment and task where colour perception should provide an advantage (such as reaching for food while avoiding threats), and see what kinds of patterns the agent learns to produce.
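To make the idea concrete, here is a minimal sketch of what the core of such an environment might look like. Everything here is invented for illustration: the environment, its dynamics, and all names are assumptions, not an existing simulator. It captures only the skin-as-filter hypothesis in its simplest form: the agent has a single colourblind (luminance-only) sensor behind a tunable skin filter, and can recover the hidden background colour only by probing through different filter settings.

```python
import random

class SkinColourEnv:
    """Toy environment (hypothetical): a colourblind sensor behind a
    tunable skin filter.

    The hidden background colour is an RGB triple. On each probe the
    agent sets the skin filter to pass one channel (0=R, 1=G, 2=B) and
    observes only the scalar luminance of the background through that
    filter. After its probes, the agent submits a colour guess and is
    rewarded by how closely it matches the hidden background.
    """

    def __init__(self, n_probes=3, seed=None):
        self.n_probes = n_probes
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Sample a new hidden background colour in [0, 1]^3.
        self.background = [self.rng.random() for _ in range(3)]
        self.probes_left = self.n_probes
        return 0.0  # initial, uninformative luminance reading

    def probe(self, channel):
        """Set the skin filter to one channel and read the luminance."""
        assert 0 <= channel < 3 and self.probes_left > 0
        self.probes_left -= 1
        return self.background[channel]  # luminance through the filter

    def guess(self, colour):
        """Submit a colour estimate; reward is negative squared error."""
        return -sum((g - b) ** 2 for g, b in zip(colour, self.background))

# An idealised agent that probes each channel once reconstructs the
# colour exactly and earns the maximum reward of 0.
env = SkinColourEnv(seed=0)
env.reset()
estimate = [env.probe(c) for c in range(3)]
print(env.guess(estimate))  # → 0.0
```

A real experiment would of course replace the exhaustive probing policy with a learned one, and the point-sensor with a patterned arm, but even this toy version shows the structure of the task: colour information is only available through actively changing the skin.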
These kinds of experiments are limited, in the sense that the patterns produced will depend on the simulation. But this itself can be seen as a feature, making it possible to ask questions about how the simulation alters what the agent learns. How do the patterns learned by a two-dimensional arm differ from those learned in three dimensions? What is the impact of including or omitting specialised reflector cells? How simple can the simulation be made before it is unable to learn useful patterns at all? This kind of exploration could be helpful, I think, in the journey to understand the mysteries of octopus perception.