We absolutely do not have a clear idea of how E. coli works. Hell, we don't even know how almost 1/3 of the genes work in JCVI-syn3A, a minimal genome we created synthetically. We understand E. coli even less.
My side would immortalize B cells with Epstein-Barr... but that experience left me with a healthy respect for E. coli.
Even just within the subset of E. coli that causes UTIs, 25-40% of the genome varied between strains. (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5653229/)
This diversity wasn't really appreciated in 2008 when many E. coli genomes hadn't been sequenced yet.
It turns out that the Escherichia coli (to spell out its Latin binomial) that cause disease are in some sense “diseased” themselves: the genes that make them pathogenic originally came from a phage, a type of virus that infects bacteria [1]. In a manner that is not identical to, but conceptually similar to, how HIV inserts its genes into the human genome, phages insert their genes (termed the “prophage”) into the bacterial genome.
In addition, most strains of pathogenic Escherichia are also holding on to an entirely separate, small, circular “genome” called a plasmid, also of exogenous origin, that contains additional genes that make them pathogenic.
So in addition to wide genome variation within the “species” (which is not really the same thing for bacteria as for mammals, mind you) of Escherichia coli, many subtypes carry additional genetic material from exogenous sources that substantially changes their observed characteristics (phenotype).
Do you have a citation for the claim that 'most' pathogenic strains have a plasmid making them so? Some guys in our lab have been playing around with plasmid copy number lately (in a largely 'basic science' kind of way) -- this could give some nice context for that work.