When the cells inside your body lay down new tracks of DNA, they work hard not to deviate from the original blueprint—too many mistakes could lead to cancer or other diseases. Biologists synthesizing DNA in a lab are no different, only the custom-designed genetic materials their machines piece together arguably have more street value. They often contain the recipes for making new medicines and novel materials. Like, say, lab-grown pot. Or life-saving antivenoms. Or the kinds of crystals that can bend a smartphone screen.
These products represent cutting-edge academic research, and the exact formulas are often corporate secrets. Which is why operators usually keep DNA synthesizers offline, to prevent a cyber-heist of those precious strings of As and Ts and Cs and Gs that spell out instructions for lucrative new biological functions. But one group of biohackers has demonstrated for the first time that it’s possible to steal and reverse-engineer the genetic code stitched together by DNA synthesizers by simply recording the sounds they make.
In new work they presented at last week’s Network & Distributed System Security Symposium, a team of researchers from UC Irvine and UC Riverside unveiled a so-called acoustic side-channel attack on a popular DNA-making machine, a vulnerability they say could imperil the up-and-coming synthetic biology and DNA-based data storage industries. It could also have important potential counterterrorism applications—for monitoring suspect machines to see if they’re manufacturing deadly pathogens or other biological weapons.
DNA synthesizers string together genetic sequences by facilitating a complicated set of chemical reactions. Using an audio recorder, the researchers collected the noise involved in the process—valves opening, liquids being injected, plastic pipes vibrating—and then used machine learning models to pick out unique acoustic features for each A, G, T, and C as those nucleotides are added to the sequence.
Two days’ worth of recordings was enough to train algorithms that could surmise unknown strings of DNA with 86 percent accuracy. By combining them with off-the-shelf DNA sequencing software, the researchers boosted the accuracy to almost 100 percent, especially for longer sequences. Some members of the team tested the hack, which they call Oligo Snoop, on DNA sequences chosen by the other members. They included genetic instructions for making human insulin, a binding peptide commonly used in drug development, and conotoxin, a lethal protein found in the venom of cone snails.
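To make the idea concrete, here is a toy sketch of the general approach: learn an acoustic "signature" for each base from labeled recordings, then label the sounds of an unknown synthesis run one cycle at a time. This is not the researchers' Oligo Snoop code. The two-number feature vectors, the `SIGNATURES` table, and the nearest-centroid classifier are all assumptions made up for illustration, and real recordings would be far noisier and need real audio feature extraction.

```python
import random
from statistics import fmean

BASES = "ACGT"

# Hypothetical 2-D acoustic signatures per base, invented for this sketch.
# A real attack would derive features from recorded audio instead.
SIGNATURES = {"A": (1.0, 0.2), "C": (0.3, 1.1), "G": (1.4, 1.3), "T": (0.1, 0.1)}


def simulate_features(base, rng, noise=0.05):
    """Return a noisy feature vector standing in for one synthesis cycle."""
    return tuple(v + rng.gauss(0, noise) for v in SIGNATURES[base])


def train_centroids(labeled):
    """Average the feature vectors observed for each base."""
    centroids = {}
    for base in BASES:
        vectors = [features for b, features in labeled if b == base]
        centroids[base] = tuple(fmean(column) for column in zip(*vectors))
    return centroids


def classify(features, centroids):
    """Pick the base whose centroid is closest (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda b: sum((x - c) ** 2 for x, c in zip(features, centroids[b])),
    )


def recover_sequence(recording, centroids):
    """Label each cycle of a recording and join the guesses into a sequence."""
    return "".join(classify(features, centroids) for features in recording)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# "Training": recordings of runs whose sequence the attacker already knows.
training = [(b, simulate_features(b, rng)) for b in BASES for _ in range(50)]
centroids = train_centroids(training)

# "Attack": a recording of a run whose sequence the attacker wants to learn.
secret = "GATTACA"
recording = [simulate_features(b, rng) for b in secret]
guess = recover_sequence(recording, centroids)
accuracy = sum(g == s for g, s in zip(guess, secret)) / len(secret)
print(guess, accuracy)
```

Because the made-up signatures are well separated and the simulated noise is small, this toy recovers its sequence easily. The hard part in practice is the step the sketch skips: turning hours of valve clicks and pump noise into features that reliably distinguish one base from another.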
While the eavesdropping attack is far from practical for any run-of-the-mill corporate spy or would-be bioterrorist, it’s one the researchers warn could become more likely over time, as biology emerges as a powerful computing platform, and hackable listening devices like Nest cams and voice assistants become increasingly pervasive in automated lab settings. And perhaps more to the point, it’s a provocative demonstration of the ways in which the walls between the physical biological world and the digital one are crumbling.
“Over the last century, whether it was from computers or mobile phones, stealing data was all about directly stealing zeros and ones,” says Mohammad Abdullah Al Faruque, a computer scientist at UC Irvine whose lab led the latest eavesdropping efforts. But as better sensing technologies and machine learning make it increasingly possible to glean cyber information from its physical manifestation, Al Faruque says, the old ways of building digital fortresses around sensitive data are no longer enough. “If I can measure the acoustics, or a heat signature, or any kind of physical emission of a system, I can use that to steal its computing data, whether it’s zeros and ones or strings of DNA.”
You may be asking why anyone would go to all that trouble for a few lousy base pairs. But reading and writing DNA is big business. Yeast and bacteria and algae are increasingly put to work as microscopic factories—programmed with synthetic DNA to spit out everything from rocket fuel to prescription opioids. Companies spend tens of millions of dollars to develop these products, and it can take decades for them to get to market. Firms that engineer microbes to produce novel molecules using proprietary genetic code hauled in close to $2 billion in VC funding within the past year. When I attempted to visit one of the largest of these—California-based Zymergen—last month, my scheduled lab tour was canceled when I declined to sign a nondisclosure agreement. Zymergen’s CEO Joshua Hoffman explained the abundance of caution to me this way: “We work on a bunch of stuff that is material nonpublic information for clients. And they take security incredibly strongly. I may be the only person who can go into every room in the entire company.” Hoffman says that’s because some of those rooms are filled with microscopic, self-replicating pieces of intellectual property, each worth somewhere between $250 million and $1 billion.
The researchers’ findings also have important consequences for the nascent DNA-based data storage industry. Philip Brisk, a computer engineer at UC Riverside, first got the idea for testing the hackability of DNA synthesizers in 2016 after Microsoft announced it was exploring the feasibility of replacing silicon with DNA for some of its long-term archival storage. Later that year, at a National Science Foundation meeting, he happened to be presenting posters next to Al Faruque, who had most recently used the sounds of a 3D printer in action to determine its design schematics. “We got to talking about, in a world where there’s DNA storage, what does it mean to keep data safe?” Brisk says.
Brisk and Al Faruque decided to carry out their attacks on a DNA synthesizer made by Applied Biosystems because of its popularity as a workhorse of the synthetic biology world. According to the researchers, it and another model made by the same company account for 90 percent of the industry’s DNA-making machines. Thermo Fisher, which owns Applied Biosystems, did not respond to a request for comment.
While DNA synthesizer snooping is a problem, or could be in the future, newer DNA machines might be less hackable. In a pilot of its DNA data storage project, Microsoft has most visibly conscripted the services of Bay Area DNA-maker Twist Bioscience, which uses a newer type of printing technology that is much faster than older techniques, piping out hundreds or thousands of bases at the same time. Other startups are investigating ways to make DNA without the harsh chemicals that are used now, relying instead on enzymes, just as your own body does. The researchers say acoustic side-channel attacks would likely be more difficult with either of these manufacturing systems, but not necessarily impossible. For anyone pursuing DNA-based computer storage or engineering designer microbes to burp out the next wonder drug, the message is clear: before you write anything in genetic code, check and see who else might be listening.