Biology is a messy, tangled, sloppy system built over a billion years under evolutionary pressure. There's no clean analogy to intelligently designed software, and any analogy you make, like DNA == source code, runs into some mechanism that destroys its predictive power for explaining biological phenomena. Take DNA: computer software doesn't create the machine it's executed on; code is 1D, while DNA is decidedly multi-dimensional, where its folding, epigenetic modifications, and other modifications matter a lot.
All the interesting biology for complex animals happens during the first few stages of development. There's no computational equivalent to this recursively constructive process. Additionally, biology has a single guiding principle through which we understand everything, evolution, and computer analogies really diminish that.
Therefore, biology is biology. It's not analogous to a von Neumann architecture machine, or any other computing device we've created. The first principles are simply different.
My career is in both fields, plus a PhD in one. Tortured analogies of biology as computers make me cringe; they're misleading at best. Sure, the ribosome superficially looks like an FSM, but that gets you basically nowhere.
Comp-sci people: if you're curious about biology, spend some quality time at Khan Academy or edX, watch some university intro-bio lectures on YouTube, read an intro-level undergrad biology textbook, etc.
All of this was just metaphors for non-experts, irrelevant contexts, and making money on pseudoscience books.
Nope. We don't really know yet what exactly the brain is "doing".
But again, in _some_ contexts it is better than nothing.
Unfortunately good metaphors stick, regardless of their relevance and accuracy :(
Whenever I think of how planets move around a star, I always think: "OK. So, I imagine that at some point the universe should know the distance between planet X and star Y, and also, probably, maybe, it should know their masses... but that data is defined nowhere (that we know of). So, is the data computed 'on the fly' every $minimal-unit-of-time?" Perhaps the universe doesn't need to know distances or masses, though.
I guess, to generalize, my point is that things are as they are as a consequence of the physics that they're subject to. A rock has its particular color, weight, and texture due to its elemental composition. A molecule doesn't need to know what it is in order to reflect or absorb certain wavelengths of light, that's what naturally occurs when those wavelengths interact with particles that have a certain composition and state. There's no conscious direction happening at any point, nor is there any data being computed.
The universe is essentially a medium that exists with certain properties and everything in it is stuff that also exists with certain properties. We call the manner in which these properties interact physics. So in summation I'd say that there are intrinsic properties and emergent phenomena based on those properties, that's where a rock gets its color, weight, and texture.
I don't like this analogy too much because it collapses "DNA" into a singular thing that's somehow related to classical computing, even though there's no indication that DNA and the multiple layers of systems interacting with it are limited in the same ways as classical computers.
I prefer to think of the DNA as a medium for memory, one of many mediums for memory.
Memory is everywhere, whether or not anyone or anything remembers it.
That one is beautiful.
This has gone on for several billion years, resulting in a largely stable model within which experiments continue to be run.
As with LLMs, DNA doesn't know anything about the data it is being trained on ("Nature"). And that model continues to change even as the experiments are run.
So DNA is a "blockchain" record of previous and current hypotheses on which traits enable an organism to live to viability. Some of these hypotheses are "dead code," as the environment no longer contains the pressure which made them critical. Some of them are essential to viability. Some of them are experiments whose value has not yet been determined.
Assuming your question is whether IaC could learn from patterns in DNA, I think that's a very interesting idea. Certainly we desire that every load balancer, database, and IAM policy be capable of self-defense, and be the hardiest, most fit version of itself possible.
Where the analogy struggles is that people writing IaC are more in the business of designing "natures" than they are designing individual organisms which would survive a chaotic and hostile "nature" being enforced on them. And people who write IaC might be unhappy to hear that getting to a "viable" database would require launching several thousand databases in an environment and, after some period of changes, seeing which one is performing best so they can clone that "best" database configuration when new databases are needed.
Referring to such deeply established foundation is normal -- I'm sure many of us remember when novel AI approaches like perceptrons and evolutionary algorithms were described by analogy to biology -- but skipping over the established literature to talk about nascent concepts makes you sound less like you're contributing to an educated discussion and more like you're courting angel investors with buzzwords.
At the very least it could be "here's the analogy to <established thing>, and to help round out the concept, <new thing> is also analogous to <established thing> with <important differences>". That makes both a stronger argument and a more educational essay than skipping straight to the new thing with no foundation.
For example, one very important difference is that DNA can be edited and lose history, unlike blockchain, but an intact fossil record could be used to infer how those edits came in over time and space, kind of like an incomplete distributed ledger.
That is a very good point.
It is true that editing blockchain completely destroys the chain, and editing DNA in very specific ways does not. I was thinking (when I wrote it) that because of the way the genes interact it can be completely destructive to take only a single piece of DNA. But, as you correctly point out, we do that all the time. So, yes, the blockchain analogy only partially fits.
Thank you.
1) DNA = source code on disk
2) RNA polymerase = disk read head
3) RNA = source code / functions loaded to memory
4) Ribosome = JIT compiler
5) Proteins = small, single-purpose executables (like Unix commands)
6) Proteins once outside the cell = execution
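The pipeline in that numbered analogy can be sketched in a few lines of code. The codon table below is a tiny hand-picked subset of the real genetic code, and the function names are my own; this is a toy illustration of the "read head" and "JIT compiler" steps, not a real bioinformatics library.

```python
# A tiny subset of the standard genetic code (RNA codon -> amino acid).
CODON_TABLE = {
    "AUG": "Met",  # start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "UAA": None,   # stop codon
}

def transcribe(dna):
    """RNA polymerase as 'disk read head': copy DNA into RNA (T -> U)."""
    return dna.replace("T", "U")

def translate(rna):
    """Ribosome as 'JIT compiler': read RNA three letters at a time."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = CODON_TABLE.get(rna[i:i + 3])
        if aa is None:  # stop codon (or codon not in our toy table)
            break
        protein.append(aa)
    return "-".join(protein)

rna = transcribe("ATGTTTGGCTAA")   # -> "AUGUUUGGCUAA"
print(translate(rna))              # -> Met-Phe-Gly
```

Of course, this is exactly the kind of clean one-way pipeline that real cells refuse to follow, which is the point several commenters make below.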
If you think of the body as the hardware, then yes, there is some merit to thinking of DNA as infrastructure as code, operating system and application software.
While you could think of it as encoding infrastructure AND code (as in IaC), you'd need to go beyond that to include the hardware for computing AND physical function (like a whole car plus its computer) in that conception, which is not what IaC means.
The hardware side of DNA is easy to overlook since we don't yet have the necessary (CAD) design tools to easily understand the shape and mechanics of proteins just from reading a DNA sequence, the way we do for macroscale 3D models. But there are hard technological reasons for this.
DNA encodes information, but instead of binary bits organized into bytes (10010110), it uses four bases (A, T, C, G) organized into 3-letter codons, each of which represents one amino acid.
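The arithmetic of that encoding is easy to check yourself (this is my back-of-envelope comparison, not from any bio reference):

```python
import math

# Each base is one of 4 symbols, so it carries log2(4) = 2 bits;
# a byte (8 bits) would hold 4 bases.
bits_per_base = math.log2(4)

# A 3-letter codon has 4**3 = 64 possible values, which map redundantly
# onto ~20 amino acids plus stop signals -- the code is degenerate.
codons = 4 ** 3

print(bits_per_base, codons)  # 2.0 64
```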
The cell assembles chains of amino acids which are then placed in an "oven" where the string of molecules folds back on itself to assemble a complicated and functional 3D shape.
When we say complicated, we really do mean complicated. Even the fastest modern supercomputers are unable to determine the shape of these proteins from the DNA sequence alone. Further, we are unable to reliably simulate the way a folded protein will interact with other molecules.
Perhaps these kinds of problems will someday be solved easily by quantum computers, but for now we are stuck with approximations of questionable accuracy.
But there are very computer-code-like elements to how cells work. Unfortunately it is all spaghetti code. One section of DNA often codes for proteins which bind to one or more other sections of DNA, either increasing or decreasing production of the proteins from those locations.
Additionally, some DNA sections code not for protein but for RNA strings which are used mechanically by themselves or as part of protein complexes like CRISPR-Cas9. RNA is always created as an intermediate step between DNA and protein, but in this case it is used directly as functional RNA (fRNA). RNA can even fold on itself and act similarly to proteins, though it is much more fragile.
The many interactions between protein, DNA and RNA perform a kind of computation but it is very obfuscated.
The following are generalized interactions that take place in a cell (perhaps analogous to machine instructions) written in a kind of pseudocode, to help illustrate the recursive functions involved.
DNA + Protein = RNA;        // transcription
RNA + Protein = Protein;    // translation
Protein = Protein++;        // a protein upregulates another protein
Protein = Protein--;        // ...or downregulates it
Protein = RNA++;            // a protein upregulates an RNA
Protein = RNA--;
RNA = RNA++;                // an RNA upregulates another RNA
RNA = RNA--;
RNA = Protein++;            // an RNA upregulates a protein
RNA = Protein--;
Protein + RNA = DNA;        // reverse transcription
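One way to read that pseudocode is as update rules over species counts. Here is a deliberately crude sketch in that spirit; the rule set and numbers are invented for illustration (real regulatory networks are stochastic and vastly more tangled):

```python
# Toy model: each step, transcription makes RNA, translation makes protein,
# and a protein degrades RNA (the "Protein = RNA--" rule above).
state = {"DNA": 1, "RNA": 0, "Protein": 0}

def step(state):
    s = dict(state)
    if s["DNA"]:                     # DNA + Protein = RNA (transcription)
        s["RNA"] += 1
    if s["RNA"]:                     # RNA + Protein = Protein (translation)
        s["Protein"] += 1
    if s["Protein"] and s["RNA"]:    # Protein = RNA-- (RNA degradation)
        s["RNA"] -= 1
    return s

for _ in range(5):
    state = step(state)
print(state)  # {'DNA': 1, 'RNA': 0, 'Protein': 5}
```

Even in this three-rule toy, the output of one rule feeds the condition of another, which is exactly where the obfuscation the parent comment describes comes from.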
Any protein or fRNA can have multiple functions in a cell and affect the production of other proteins and fRNAs by interacting with DNA, RNA, or other proteins involved in the production chain. In addition, proteins and fRNAs also physically move other proteins and molecules around and make up the structure and machinery of a cell.
Untangling it all is close to impossible currently. There is several billion years worth of tech debt and zero documentation.
BTW: protein structure prediction didn't need supercomputers (in the traditional sense), and the PSP problem wasn't solved by supercomputers applying a high-quality physics function to simulate folding. Instead, it was solved using a combination of ML hardware, a really good algorithm (transformers), and a couple of really good data sets: the known structures of proteins, and the known relationships between proteins.
Instead of running a simulation on a huge supercomputer to predict a single structure, they trained a model which approximates structure well enough to beat every competitor. From what I can tell, most of the resulting quality doesn't come from their force field but from distance constraints that are mostly derived from historical relationships between proteins and the coevolution of their sequences.