Fruity or fermented? Algorithm predicts how molecules smell

28 October 2016

People in lab coats sniff chemicals on test strips — Eric Feferberg/AFP/Getty Images

It’s not something to be sniffed at. Computers have cracked a problem that has stumped chemists for centuries: predicting a molecule’s odour from its structure. The feat may allow perfumers and flavour specialists to create new products with much less trial and error.

Unlike vision and hearing, where what we perceive can be predicted by analysing the wavelengths of light or sound, our sense of smell has long remained inscrutable. Olfactory chemists have never been able to predict how a given molecule will smell, except in a few special cases, because so many aspects of a molecule’s structure could be important in determining its odour.

Andreas Keller and Leslie Vosshall at Rockefeller University in New York City decided to crowdsource the power of machine learning to address the problem. First, they had 49 volunteers rate the odour of 476 chemicals according to how intense and how pleasant the smell was, and how well it matched 19 other descriptors, such as garlic, spice or fruit.

Then they released the data for 407 of the chemicals, along with 4884 different variables measuring chemical structure, and invited anyone to develop machine-learning algorithms that would make sense of the patterns. They used the remaining 69 chemicals to evaluate the accuracy of the algorithms of the 22 teams that took up the challenge.
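
The held-back test set described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ratings and the single structural variable are synthetic, the one-variable straight-line model stands in for whatever the teams actually built, and scoring by correlation is an assumption rather than the contest's documented metric. Only the counts (476, 407 and 69) come from the article.

```python
import random

N_CHEMICALS = 476  # chemicals rated by the volunteers (from the article)
N_TRAIN = 407      # chemicals whose data were released (from the article)


def split_holdout(ids, n_train, seed=0):
    """Shuffle the chemical ids and split them into released and held-back sets."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def fit_line(xs, ys):
    """Ordinary least squares for one variable: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


# Synthetic stand-ins (assumption): one structural variable per chemical and
# a noisy rating that depends on it. The real data had 4884 variables.
rng = random.Random(1)
feature = [rng.random() for _ in range(N_CHEMICALS)]
rating = [100 * f + rng.gauss(0, 10) for f in feature]

train_ids, test_ids = split_holdout(range(N_CHEMICALS), N_TRAIN)

# Fit on the released chemicals only.
slope, intercept = fit_line(
    [feature[i] for i in train_ids],
    [rating[i] for i in train_ids],
)

# Score on the held-back chemicals, which the model never saw.
predicted = [slope * feature[i] + intercept for i in test_ids]
observed = [rating[i] for i in test_ids]
score = pearson(predicted, observed)

print(len(train_ids), len(test_ids), round(score, 2))
```

Holding back 69 chemicals means an algorithm is judged on odours it has never seen, so it cannot score well simply by memorising the released data.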

The best algorithms proved far more accurate than any previous efforts in predicting the volunteers’ descriptions of the test chemicals. They were not perfect, partly because people rarely rate the same odour identically when tested a second time.

“If you ask someone how burnt a smell is and they give it a 17, and then you come back half an hour later and ask again and they give it a 10,” says contest winner Rick Gerkin, a neuroscientist at Arizona State University in Tempe. “The best a model can do is be a little bit wrong in both cases.” Even so, Gerkin’s algorithm predicted the volunteers’ scores nearly as well as their previous ratings of a given odour did.
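
Gerkin's example can be worked through numerically. The sketch below assumes that error is measured as root-mean-square error, which the article does not specify; the two ratings, 17 and 10, are the ones from his quote.

```python
def best_single_prediction(ratings):
    """The single value that minimises squared error against all ratings: their mean."""
    return sum(ratings) / len(ratings)


def rmse(prediction, ratings):
    """Root-mean-square error of one prediction against repeated ratings."""
    return (sum((prediction - r) ** 2 for r in ratings) / len(ratings)) ** 0.5


ratings = [17, 10]  # the same person rating the same smell twice
best = best_single_prediction(ratings)
floor = rmse(best, ratings)

print(best, floor)  # 13.5 3.5
```

Under that assumption, even an ideal model is off by 3.5 points on each rating. The gap comes from the rater changing their answer, which no model that makes a single prediction can remove.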

Real odours span many more than just 21 descriptors, of course, but Gerkin thinks it would be straightforward, though time-consuming, to tackle a wider set of descriptors. This could help perfumers and flavour specialists sort through the billions of scented molecules to find ones with a particular, desired odour, says Robert Sobel, vice president for research at FONA International, a flavour company in Geneva, Illinois.

Even if the predictions aren’t perfect, they can help narrow the field when you’re after a particular scent or flavour, says Gerkin. “Eventually, you can use a database like that and say OK, pick out the top 100 hits out of a billion molecules. A hundred molecules are easier to test than a billion.”
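
The shortlisting Gerkin describes amounts to keeping the highest-scoring candidates. A minimal sketch, assuming each molecule already has a predicted score for the desired scent: the molecule names and scores here are made up, and 10,000 candidates stand in for the billion so the example runs quickly.

```python
import heapq
import random


def top_hits(scored_molecules, k=100):
    """Return the k (name, score) pairs with the highest predicted score."""
    return heapq.nlargest(k, scored_molecules, key=lambda pair: pair[1])


# Hypothetical candidates: random scores in place of a model's predictions.
rng = random.Random(0)
candidates = ((f"molecule_{i}", rng.random()) for i in range(10_000))

shortlist = top_hits(candidates, k=100)
print(len(shortlist), shortlist[0])
```

Because `heapq.nlargest` only ever holds `k` items at a time, the candidates can be streamed through it rather than loaded and sorted all at once, which matters as the pool grows towards a billion.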

The next challenge is working out what scents will arise from mixtures of chemicals. “What you’re doing here is rating individual molecules,” says Avery Gilbert at Synesthetics, a sensory consultancy in Fort Collins, Colorado. “What’s more useful is knowing which ingredients play nicely together.”

bioRxiv DOI: 10.1101/082495
