From the what-could-possibly-go-wrong department: Scientists have now managed to write executable code into DNA that is theoretically capable of infecting any computer that reads it.
It’s not quite accurate to call it a virus, though this could be the closest to a virus software has ever come.
It consists of replication instructions, encoded in a snippet of DNA that can deliver a payload capable of assuming control of any computer that reads the strand. It has to integrate itself into the host system to propagate. All it needs is a capsid, although the file metadata and header might qualify.
To write executable code into a DNA strand, the researchers had to decide on an exploit. It wasn’t by mere chance that the scientists picked C for their exploit.
C has a well-known set of vulnerabilities in some functions that leave systems open to a classic buffer-overflow attack.
Then, they encoded their snippet in a simple cipher, mapping each nucleobase to a pair of bits: A = 00, C = 01, G = 10, T = 11.
The two-bit mapping works because computers run on a binary stream of electrical impulses that alternates between OFF and ON: 0 and 1. As a consequence, executable code has to pass through a binary representation at some level.
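To make the cipher concrete, here is a minimal Python sketch of the two-bit mapping. The function names are hypothetical, and the bit order (first base of each group in the high bits of the byte) is an assumption; the article does not say how the researchers' program ordered the bits.

```python
# Two-bit code from the article: A = 00, C = 01, G = 10, T = 11.
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = "ACGT"  # index 0..3 maps back to a base


def pack_bases(seq: str) -> bytes:
    """Pack a DNA string into bytes, four bases per byte.

    Assumption: the first base of each group lands in the high bits.
    """
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASE_TO_BITS[base]
        out.append(byte)
    return bytes(out)


def unpack_bases(data: bytes) -> str:
    """Inverse of pack_bases: expand each byte back into four bases."""
    return "".join(
        BITS_TO_BASE[(b >> shift) & 0b11]
        for b in data
        for shift in (6, 4, 2, 0)
    )
```

Under these assumptions, the eight bases CAGACGGC pack into the two ASCII bytes for "Hi", which is the sense in which a strand of DNA can carry ordinary text or commands.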
When the computer read the DNA sequence, the malicious code embedded itself in the machine that was analyzing it. From there, it took advantage of a buffer overflow and got loose in the system to grab for privileges.
“The conversion from ASCII As, Ts, Gs, and Cs into a stream of bits is done in a fixed-size buffer that assumes a reasonable maximum read length,” said Karl Koscher of the University of Washington’s Molecular Information Systems Lab and Security and Privacy Research Lab.
According to Koscher, the DNA exploit was 176 bases long. “The compression program translates each base into two bits, which are packed together, resulting in a 44 byte exploit when translated.”
“Most of these bytes are used to encode an ASCII shell command,” he continued. “Four bytes are used to make the conversion function return to the system function in the C standard library, which executes shell commands, and four more bytes were used to tell the system where the command is in memory.”
In other words: feed this strand of DNA into the vulnerable analysis program and it takes over the machine, all in 176 nucleobases.
Even though the possibilities for destructive interference with law enforcement and scientific/corporate espionage clearly abound, the fact that buffer overflows are so notorious — and so common — means that programmers have been looking out for this kind of attack for a long time.
Heartbleed, for example, was a closely related memory bug: a buffer over-read. There are existing boilerplate wrappers that check for this kind of bug, and quit if the program runs into such an error.
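The kind of check described here can be sketched in a few lines. Python does not overflow buffers the way C does, so this only models the idea: a fixed-size buffer sized for a "reasonable" read, plus a length check that refuses anything larger. The names and the 100-base limit are hypothetical; the article gives no figure for the maximum read length.

```python
# Two-bit code from the article: A = 00, C = 01, G = 10, T = 11.
CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}

# Hypothetical limit: the article does not state the real maximum read length.
MAX_READ_BASES = 100
BUFFER_BYTES = MAX_READ_BASES // 4  # four bases fit in one byte


def packed_size(num_bases: int) -> int:
    """Bytes needed to hold num_bases at two bits per base, rounded up."""
    return (num_bases * 2 + 7) // 8


def convert_read(seq: str, buffer_bytes: int = BUFFER_BYTES) -> bytes:
    """Pack a read into a fixed-size buffer, refusing reads that do not fit."""
    needed = packed_size(len(seq))
    if needed > buffer_bytes:
        # The missing check: stop here instead of writing past the buffer.
        raise ValueError(f"read needs {needed} bytes, buffer holds {buffer_bytes}")
    buf = bytearray(buffer_bytes)
    for i, base in enumerate(seq):
        shift = 6 - 2 * (i % 4)  # first base of each group goes in the high bits
        buf[i // 4] |= CODE[base] << shift
    return bytes(buf[:needed])
```

With these assumed limits, `packed_size(176)` gives the 44 bytes Koscher mentions, and a 176-base read is rejected because it needs more than the 25 bytes the buffer holds. A C program without the equivalent check would keep writing past the end of the buffer.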
Furthermore, since it’s a DNA-based exploit, there are some problems in the mechanism. For example, the strand can fragment, and because DNA can be read in both directions, the code can be transcribed backwards. But no worries: the study authors remark that a clever future assailant could write the code as a palindrome.
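A short sketch shows what reading in the other direction does to a sequence. It assumes "backwards" means the reverse complement (the opposite strand, read in its own direction), which is the usual sense in molecular biology; the article does not spell this out. The function names are hypothetical.

```python
# Watson-Crick pairing: A pairs with T, C pairs with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence as it would be read from the opposite strand."""
    return seq.translate(COMPLEMENT)[::-1]


def reads_same_both_ways(seq: str) -> bool:
    """True if the sequence equals its own reverse complement."""
    return seq == reverse_complement(seq)
```

For instance, AACG read from the other strand becomes CGTT, so a payload would come out scrambled. GAATTC, by contrast, is a palindrome in this sense and reads identically from either strand, which is the property the study authors suggest an attacker could build in.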
Looking for this kind of emerging threat is extremely important. “We know that if an adversary has control over data a computer is processing, it can potentially take over that computer,” said professor Tadayoshi Kohno, who led the project.
Kohno’s background is in looking for attacks that come from left field, such as attempts to hack embedded systems like pacemakers. “That means when you’re looking at the security of computational biology systems, you’re not only thinking about the network connectivity, USB drive, and user, but also the information stored in the DNA sequenced. It’s about considering a different class of threat.”
Sources: ExtremeTech, TechCrunch