Basic genomic features
In silico genotyping revealed that E. coli strain CFSAN061771 belongs to ST1485, serotype O83:H42, and phylogroup F. The complete genome comprises a chromosome of 4,908,204 bp with a G + C content of 50.6% and two plasmids, pCFSAN061771_01 (167,754 bp) and pCFSAN061771_02 (215,531 bp). Plasmid replicons IncFIA, IncFIB (AP001918), and IncFIC (FII) were detected in pCFSAN061771_01, whereas IncHI2 and IncHI2A were detected in pCFSAN061771_02.
Closest relative genomes to E. coli CFSAN061771
As illustrated in Fig. 1, GrapeTree was used to produce and visualize a minimum spanning tree (MST) based on cgMLST allelic profiles. Escherichia coli CFSAN061771 clustered with two unpublished strains, E. coli 119 (accession number: JABADS01) and E. coli 120 (accession number: JABADT01), isolated from ditch water in the Netherlands, with only 32 allelic differences. Notably, CFSAN061771 was also similar to two Chinese strains isolated from chicken, YH17143 (accession number: PTNO01) and YH17174 (accession number: PTMO01), with 39 and 43 allelic differences, respectively. The remaining two strains in the same cluster, DF376 and M160133, showed 44 and 90 allelic differences, respectively.

Phylogeny of closest relatives of CFSAN061771 based on cgMLST. The closest relative strains were identified by a core genome MLST (cgMLST) allele threshold of 500 alleles. CFSAN061771 is the white circle highlighted in yellow and the individual isolates are marked with different colors. The numbers on the branches represent the allelic differences.
On the other hand, the closest reference and representative genomes to our strain identified by Mash/MinHash27 were YH17143, YH17174, and M160133 (accession number: CP022164) with distances of 0.00105797, 0.00115999, and 0.00216198, respectively.
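To put the Mash distances reported above in context, the following is a minimal sketch of how a Mash distance relates to the Jaccard index of shared k-mers, D = -(1/k) ln(2j / (1 + j)), and how 1 - D is commonly read as an approximate nucleotide identity. The k-mer size of 21 is Mash's default and is an assumption here, since the text does not state the sketching parameters; the function names are illustrative only.

```python
import math

def mash_distance(jaccard: float, k: int = 21) -> float:
    """Mash distance from a Jaccard estimate j of shared k-mers.

    k = 21 is the Mash default and is assumed here; the text does not
    report the k-mer size that was actually used.
    """
    if jaccard <= 0.0:
        return 1.0  # no shared k-mers: maximal distance
    return min(1.0, -math.log(2.0 * jaccard / (1.0 + jaccard)) / k)

def approx_identity(distance: float) -> float:
    """Rough nucleotide identity implied by a Mash distance (1 - D)."""
    return 1.0 - distance

# Distances reported in the text for the three closest genomes.
reported = {
    "YH17143": 0.00105797,
    "YH17174": 0.00115999,
    "M160133": 0.00216198,
}

for strain, dist in reported.items():
    print(f"{strain}: approx. {100.0 * approx_identity(dist):.2f}% identity")
```

Read this way, all three reported distances correspond to genomes that are more than 99.7% identical to CFSAN061771 at the k-mer level.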
Pan-genome analysis
We conducted a pan-genome analysis to investigate possible differences in gene repertoires among the 85 E. coli ST1485 strains used in this study. The overall pan-genome consisted of 13,376 genes, represented by 3998 core genes (3527 hard + 471 soft core genes) and 9378 accessory genes (1324 shell + 8054 cloud genes). Strikingly, only 12 strains carried pCFSAN061771_02-like plasmid genes, as illustrated in Fig. 2 (outlined in red). The closest match to CFSAN061771 is M160133, a strain isolated from a human patient in the United States23. Pan-genome analysis revealed that the two strains share a core genome of 4751 genes, accounting for 91.1% (4751/5215) of their pan-genome. A more extensive comparison of CFSAN061771 with the M160133 strain indicated that they share 94.03% (4367/4644) of their chromosomal genes. In terms of plasmids, pCFSAN061771_01 and pM160133_p2 share 166 core plasmid genes, accounting for 86.91% (166/191) of their pan plasmid genes, whereas pCFSAN061771_02 and pM160133_p1 share 221 core plasmid genes, accounting for 78.36% (221/282) of their pan plasmid genes. Notably, the M160133 strain has an additional plasmid, pM160133_p3, of 113,428 bp, which is not present in CFSAN061771.
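The arithmetic behind these pan-genome figures can be sketched as follows. The category cut-offs used in classify_gene (hard core at 99% of strains or more, soft core from 95%, shell from 15%, cloud below 15%) are the conventional Roary-style thresholds and are an assumption, because the text does not state which cut-offs were applied; the function names are illustrative. Percentages computed by direct division may differ from the reported values in the last decimal place, depending on whether one rounds or truncates.

```python
# Gene counts reported in the text for the 85-strain ST1485 pan-genome.
REPORTED = {"hard core": 3527, "soft core": 471, "shell": 1324, "cloud": 8054}

def classify_gene(n_present: int, n_strains: int) -> str:
    """Assign a gene to a pan-genome category by its prevalence.

    Cut-offs are assumed (Roary-style conventions), not taken from the text.
    """
    frac = n_present / n_strains
    if frac >= 0.99:
        return "hard core"
    if frac >= 0.95:
        return "soft core"
    if frac >= 0.15:
        return "shell"
    return "cloud"

def shared_percent(core_genes: int, pan_genes: int) -> float:
    """Percentage of a pairwise pan-genome made up by shared (core) genes."""
    return 100.0 * core_genes / pan_genes

# Core + accessory genes should add up to the reported pan-genome size.
print("pan-genome size:", sum(REPORTED.values()))

# A gene found in only 12 of 85 strains, as for the pCFSAN061771_02-like
# plasmid genes, falls into the rarest category under these cut-offs.
print("12/85 strains ->", classify_gene(12, 85))

# Pairwise comparisons between CFSAN061771 and M160133 from the text.
pairs = [
    ("whole genome", 4751, 5215),
    ("chromosome", 4367, 4644),
    ("pCFSAN061771_01 vs pM160133_p2", 166, 191),
    ("pCFSAN061771_02 vs pM160133_p1", 221, 282),
]
for label, core, pan in pairs:
    print(f"{label}: {shared_percent(core, pan):.2f}% shared")
```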

Pan-genome comparison of 85 E. coli ST1485 strains. (A) Phylogenetic tree constructed by FastTree 2 version 2.1.953 based on accessory genes. Source niches are denoted with different colors. All strains in the clade containing the sequenced reference strain CFSAN061771 are given in blue. (B) Matrix plot showing the presence and absence of each gene across all strains in blue and white, respectively. A complete list of annotated genes is provided in Supplementary Table 1. The red arrow refers to the pCFSAN061771_02-like plasmid genes, which are outlined by red squares, while the orange arrow refers to the pCFSAN061771_01-like plasmid genes, which are outlined by an orange square.
Virulome analysis
Chromosomal virulence genes
The chromosome of CFSAN061771 carries several virulence genes that are commonly involved in urinary tract infections, including yfcV (encodes the major subunit of a putative chaperone-usher fimbria), chuA (encodes a heme-binding protein)28, the heat-resistant agglutinin gene (hra)29, the outer membrane protein gene (ompT)30, and capsular genes (kpsE, kpsMII)31. Interestingly, it also carries the enteroaggregative E. coli (EAEC) virulence gene regulator (eilA), the air gene, which encodes the enteroaggregative immunoglobulin repeat protein32, and the long polar fimbriae gene (lpfA), which has been detected in EAEC and other pathogenic E. coli strains33. Furthermore, it carries the ompA, ibeB, ibeC, and aslA genes, which encode structures critical for neonatal meningitis E. coli (NMEC) to cross the blood–brain barrier and subsequently invade brain endothelial cells34,35. It also harbors two bacteriocin-related genes, mchF and mcmA, which encode the ABC transporter MchF and microcin M, respectively, and are associated with antibacterial activity against closely related species36.
Plasmid virulence genes
The genetic organization of the virulence genes in pCFSAN061771_01 is presented in Fig. 3. The plasmid carries the genes encoding: colicin V (ColV operon) (cvaABC, cvi); a core region of ColV plasmids, which includes three different iron uptake and utilization systems (ferric aerobactin system (iutA/iucABCD), iron and manganese ABC transport system (sitABCD), and salmochelin siderophore system (iroBCDEN)); an outer membrane protein T-encoding gene, ompT; the ABC transport system etsABC; the increased serum survival gene involved in complement resistance, iss; and a hemolysin-encoding gene, hlyF. The putative virulence region of pCFSAN061771_01 also harbors the tsh gene, which encodes a temperature-sensitive hemagglutinin that has been confirmed to be associated with the virulence of APEC37. Additionally, several maintenance systems associated with virulence plasmids were identified, including a plasmid partitioning system (parABS) and toxin-antitoxin-based addiction systems. The toxin-antitoxin-based addiction systems comprised: the postsegregational killing (PSK) system, ccdA/ccdB; the virulence-associated genes, vagC and vagD; and the host killing gene, hok38. Further, the entire transfer (tra) region (traN, traF, traQ, traH, traG, traT, traD, traI, traX, traJ, traY, traA, traL, traE, traK, traB, traP, traR, traC, traW, and traU), which encodes the plasmid transfer components, was detected. In contrast, pCFSAN061771_02 did not carry any virulence genes.

BRIG comparison of pCFSAN061771_01-like plasmids in Escherichia coli ST1485 strains. The pCFSAN061771_01 plasmid from the CFSAN061771 strain was used as the reference for alignment and gene annotation and is shown in the outermost black circle. Query genomes are color-coded according to source, and the order plotted in the circle reflects their similarities to pCFSAN061771_01. Gene inventories are colored as follows: red, virulence genes; fuchsia, antibiotic resistance genes; blue, insertion sequences; green, maintenance genes; black, replication genes; navy, tra locus. Strains are arranged from the inside as follows: environmental strains (water), A F F-6, 119, A DUS F-11, J DD Zu-12, I DD RUB-3, 120, 20-MO00076-0, B DD F-4; animal strains (livestock, wild, and companion animals), FSIS12107454, ECOL-20-VL-SD-OK-003, 45,950, ECOL-20-VL-OH-WA-0028, AG19-0146, LD67-1, INT007782; human strains, ME160675, MER-90, S.18.21.Ec, 34 68, M160133, LREC_23, 1U3417eb, 364,151, 260,022, R3-EC181; poultry strains, 20MD12GT08, NC_STEC162, PNUSAE068782, CVM N20EC4096, CVM N19EC1210, FSIS11808976, PNUSAE076599, FSIS12106091, PSU-3943, DF376, M28CTX1, YH17174, YH17143, ampC_0104, 2835-26, MA_120, 103,003,012, CVM N17EC0616, 101,403,016, Ec4, PP743, FSIS12209249, LREC_201, AC12187, LREC_200, AC12076, LREC_213, LREC_192, LREC_224, LREC_212, LREC_194, LREC_191, LREC_228, LREC_189.
The global spread of pCFSAN061771_01- and pCFSAN061771_02-like plasmids in E. coli ST1485
The alignment of contigs from 84 genomes retrieved from public databases against the pCFSAN061771_01 sequence further confirmed the presence of pCFSAN061771_01-like plasmids in most E. coli ST1485 strains, regardless of country and source of isolation (Fig. 3). In contrast, when we used pCFSAN061771_02 as a BLAST query sequence against the 84 E. coli ST1485 genome sequences, only 11 genomes were returned. These genomes originated from the following sources: animal (LD67-1 (pLD67-1-157 kb)); food (NIFDS_EC2017_2); human (M160133 (pM160133_p1), MER_90, 3468); and poultry (PP743, LREC_201, ampC_0104, 20,151,021, YH17174, YH17143) (Fig. 4).
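One way such a plasmid screen could be summarized is by computing, for each genome, the fraction of the plasmid query covered by BLAST hits. The sketch below assumes the standard 12-column tabular BLAST output (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore); the 90% identity cut-off, the function names, and the toy hit lines are hypothetical, as the text does not specify the criteria used to call a genome positive.

```python
from typing import Dict, Iterable, List, Tuple

PLASMID_LEN = 215_531  # length of pCFSAN061771_02 reported in the text

def parse_tabular_hits(
    lines: Iterable[str], min_identity: float = 90.0
) -> Dict[str, List[Tuple[int, int]]]:
    """Collect query intervals per subject from 12-column tabular BLAST rows.

    min_identity is a hypothetical filter, not a value from the text.
    """
    per_subject: Dict[str, List[Tuple[int, int]]] = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        pident, qstart, qend = float(fields[2]), int(fields[6]), int(fields[7])
        if pident >= min_identity:
            per_subject.setdefault(fields[1], []).append((qstart, qend))
    return per_subject

def query_coverage(hits: Iterable[Tuple[int, int]], query_len: int) -> float:
    """Fraction of the query covered by hits, counting overlaps only once."""
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted((min(s, e), max(s, e)) for s, e in hits):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return covered / query_len

# Toy rows for illustration only; contig names and values are made up.
toy_rows = [
    "plasmid\tcontig_1\t99.5\t100000\t0\t0\t1\t100000\t1\t100000\t0.0\t1000",
    "plasmid\tcontig_1\t99.0\t60000\t0\t0\t90001\t150000\t1\t60000\t0.0\t900",
    "plasmid\tcontig_2\t80.0\t5000\t0\t0\t1\t5000\t1\t5000\t1e-50\t300",
]
for subject, intervals in parse_tabular_hits(toy_rows).items():
    print(subject, f"{100.0 * query_coverage(intervals, PLASMID_LEN):.1f}% of query covered")
```

Merging overlapping intervals before summing avoids counting the same stretch of the plasmid twice when a draft genome splits it across several hits.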

BRIG comparison of pCFSAN061771_02-like plasmids in Escherichia coli ST1485 strains. The pCFSAN061771_02 plasmid from the CFSAN061771 strain was used as the reference for alignment and gene annotation and is shown in the outermost black circle. Query genomes are color-coded according to source, and the order plotted in the circle reflects their similarities to pCFSAN061771_02. Gene inventories are colored as follows: red, antibiotic resistance genes; blue, mobile elements; fuchsia, quaternary ammonium compound; maroon, tar locus. Strains are arranged from the inside as follows: animal strain, LD67-1 (pLD67-1-157 kb); food strain, NIFDS_EC2017_2; human strains, M160133 (pM160133_p1), MER_90, 3468; poultry strains, PP743, LREC_201, ampC_0104, 20,151,021, YH17174, YH17143.
Clustering of globally disseminated E. coli ST1485 strains based on their virulence profiles
Figure 5 shows that the 85 E. coli ST1485 strains, including CFSAN061771, harbored seven or more virulence genes from a panel of 43 genes and displayed 56 virulence patterns. It is worth noting that the majority of strains from diverse sources had eight or more genes from a panel of 12 virulence genes (ompT, sitA, iroN, etsC, traT, cvaC, hlyF, iss, tsh, mchF, iucC, iutA) grouped in cluster 2 (shown in red), which are likewise present on pCFSAN061771_01. All strains harbored at least six genes from another panel of eight genes grouped in cluster 3 (shown in green: hra, eilA, kpsE, air, yfcV, terC, chuA, lpfA). Our food strain (marked as a yellow box on the left of Fig. 5) clustered with a poultry strain (YH17143) that has the same virulence pattern.
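The notions of a "virulence pattern" and of counting genes from a panel can be illustrated with a short sketch: strains whose presence/absence profiles are identical are grouped together, and the number of distinct groups is the number of patterns. The cluster 2 panel is taken from the text; the toy strains and their gene sets are hypothetical and do not reproduce the actual data behind Fig. 5.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, List, Set

# Cluster 2 panel as listed in the text (genes also found on pCFSAN061771_01).
CLUSTER2 = (
    "ompT", "sitA", "iroN", "etsC", "traT", "cvaC",
    "hlyF", "iss", "tsh", "mchF", "iucC", "iutA",
)

def group_by_profile(profiles: Dict[str, Set[str]]) -> Dict[FrozenSet[str], List[str]]:
    """Group strains that carry exactly the same set of genes."""
    groups: Dict[FrozenSet[str], List[str]] = defaultdict(list)
    for strain, genes in profiles.items():
        groups[frozenset(genes)].append(strain)
    return dict(groups)

def panel_hits(genes: Set[str], panel: Iterable[str] = CLUSTER2) -> int:
    """Number of genes from a panel that a strain carries."""
    return sum(1 for gene in panel if gene in genes)

# Hypothetical toy profiles for illustration only.
toy_profiles = {
    "toy_strain_A": {"ompT", "sitA", "iroN", "iss", "hlyF", "chuA"},
    "toy_strain_B": {"ompT", "sitA", "iroN", "iss", "hlyF", "chuA"},
    "toy_strain_C": {"ompT", "iss", "chuA", "hra"},
}

patterns = group_by_profile(toy_profiles)
print("distinct patterns:", len(patterns))
for strain, genes in toy_profiles.items():
    print(strain, "carries", panel_hits(genes), "of", len(CLUSTER2), "cluster 2 genes")
```

Using a frozenset as the grouping key makes the comparison independent of gene order, so two strains fall into the same pattern only when their gene content matches exactly.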

Heat map demonstrating the distribution of virulence genes in E. coli ST1485 strains. Blue represents the presence and white represents the absence of a virulence gene. Strains from various origins with identical virulence profiles are denoted as red boxes on the left, whereas strains from the same source are denoted as black boxes. The yellow box denotes the sequenced strain CFSAN061771 (cluster 31). H human, L livestock, P poultry, C companion animals, W wild animals, F food, A aquatic animal.
Resistome analysis
Chromosomal antibiotic resistance genes
The chromosome of CFSAN061771 carries point mutations in the DNA gyrase gene gyrA (S83L) and the topoisomerase IV gene parC (S80I), both of which are associated with fluoroquinolone resistance.
Plasmids’ antibiotic resistance genes
pCFSAN061771_01 harbors an MDR-encoding region integrated with several mobile elements (Fig. 3). It comprises a cluster of genes encoding resistance to sulfonamide (sul2), β-lactam (blaTEM-1B), kanamycin (aph(3″)-Ib), and streptomycin (aph(6)-Id), flanked by tnpR and IS110, as well as a class 1 integron (Int191) harboring a sole gene cassette, dfrA14, encoding resistance to trimethoprim. The genetic organization of this region in our strain is compared with that of pM160133_p2, a plasmid of the closely related strain M160133, in Fig. 6. Notably, CFSAN061771 lacks the tetracycline resistance gene tetA. In contrast, as shown in Fig. 4, pCFSAN061771_02 harbors the mcr-1 gene, encoding colistin resistance, with an upstream copy of ISApl1, as well as a class 1 integron, In641, with gene cassettes encoding resistance to aminoglycosides (estX, aadA2, aadA2), chloramphenicol (cmlA1), and quaternary ammonium compounds (qacL). The sulfonamide resistance gene, sul3, which was found to be associated with In6439, was also detected. The genetic organization of the antibiotic resistance genes and the alignment of genome sequences against the pCFSAN061771_02 sequence are illustrated in Fig. 4.

Linear maps of the multidrug resistance regions of pCFSAN061771_01 and pM160133_p2. Antimicrobial resistance genes are shown in red, green and blue. Mobile elements are shown in brown. Homologous segments with ≥ 99% sequence identity are indicated by black shading, while gray shading shows inverted homologous segments.
Clustering of globally disseminated E. coli ST1485 strains based on their antibiotic resistance profiles
Figure 7 shows that the 85 E. coli ST1485 strains harbored one or more antibiotic resistance genes from a panel of 57 genes and displayed 58 resistance patterns. Antibiotic resistance genes were grouped into 30 clusters, and those in cluster 7 (blaTEM-1, aph(6)-Id, aph(3″)-Ib, sul2, dfrA14) were found to comprise a multidrug resistance (MDR) region in pCFSAN061771_01 (Fig. 6). Our strain formed a separate cluster (cluster 26) and is shown as a yellow box on the left of Fig. 7.

Heat map demonstrating the distribution of antibiotic resistance genes in E. coli ST1485 strains. Blue represents the presence and white represents the absence of an antibiotic resistance gene. Strains from various origins with identical resistance profiles are denoted as red squares on the left, whereas strains from the same source are denoted as black squares. The yellow square denotes the sequenced reference strain CFSAN061771. H human, L livestock, P poultry, C companion animals, W wild animals, F food, A aquatic animal.