Commentary : Comparative Genomic Analysis of Ten Clinical Streptococcus pneumoniae Collected From a Malaysian Hospital Reveal 31 New Unique Drug-Resistant SNPs Using Whole Genome Sequencing

Despite decades of research, S. pneumoniae remains a primary cause of infectious morbidity and mortality worldwide. Antibiotics are lifesaving medications that offer tremendous benefits to patients with infectious diseases, yet several reports have revealed that the overuse and misuse of these agents have led to antibiotic resistance. Our study used whole genome sequencing (WGS) to reveal the pattern of antibiotic resistance among ten pneumococcal isolates with varying degrees of susceptibility to antibacterial drugs. The main purpose of our study was to explore genetic variations related to drug resistance in those ten strains. The results indicated that pneumococcal strains with a resistant profile were associated with a greater number of SNPs than susceptible ones. Of all the SNPs identified, 31 were unique and had not been reported before. Our data suggest that these SNPs could play an important role in modifying the degree of sensitivity to different antibacterial drugs. In this article we comment on the methodology and results of our study, which was previously published in the Journal of Biomedical Science.


Introduction
Streptococcus pneumoniae, or the pneumococcus, is a major cause of community-acquired pneumonia and meningitis, as well as bloodstream, ear, and sinus infections [1][2][3][4]. Globally, it is estimated that this bacterial pathogen colonizes as many as 40-60% of young children. While colonization most often results in asymptomatic carriage, S. pneumoniae is still responsible for a substantial burden of disease [5]. In 2000, pneumococcus caused 14.5 million episodes of severe pneumococcal infection, resulting in 826,000 deaths in children under the age of five [6]. Excluding pneumococcal deaths in HIV-positive children, pneumococcus accounts for approximately 11% of under-five mortality [5]. Decades of overuse of antibiotics in medical and agricultural applications, as well as inappropriate prescribing of these drugs, were the primary drivers of the antibiotic resistance crisis [7][8][9]. Drug-resistant S. pneumoniae (DRSP) has become an important clinical and public health problem during the past 20 years [10]. Pneumococcal resistance to different antibiotics led to 32,398 extra outpatient visits and 19,336 additional hospitalizations, accounting for $91 million (4%) in direct medical costs and $233 million (5%) in total costs, including work and productivity losses [11]. To address this issue and to better understand how pneumococci develop resistance to different types of antibiotics, we used whole genome sequencing (WGS) in our study [12]. WGS allows researchers to study the mode of action of antibiotics and the mechanisms involved in bacterial resistance [13,14]. WGS can also be applied to investigate the molecular basis and rate of evolution of antibiotic resistance in real time under treatment regimens of single drugs or drug combinations [15]. In our study, we used WGS to reveal the patterns of resistance of 10 pneumococcal isolates with a range of susceptibility and resistance to four different antibiotics: penicillin, cefotaxime, erythromycin, and tetracycline. The aim of our study was to investigate the genetic variation among pneumococcal isolates with different susceptibility profiles to these four antibiotics in order to identify SNPs associated with virulence genes that could be targets for drug development.

Methodology
The aim of our study was to identify single nucleotide polymorphisms (SNPs) that are associated with antibiotic resistance. Association of a SNP with drug resistance implicates genes that either reside near the genomic location of the SNP or are regulated by a genetic factor located there. Ten clinical isolates of S. pneumoniae were collected previously from University of Malaya Medical Centre (UMMC) (Table 1). Genomic DNA of the S. pneumoniae clinical isolates was extracted using the DNeasy Blood & Tissue Kit (Qiagen), and the quantity and purity of the DNA were measured using Qubit (Table 2). DNA fragmentation was done using a Covaris S2. The fragmented DNA was end-repaired, dA-tailed, and ligated to Illumina indexed adapters. A standard concentration was used because, otherwise, quantification becomes less reproducible, the sequencing library becomes less stable, and a lower sequencing yield is the likely outcome. Size selection of the samples was performed using Invitrogen 2% agarose E-gels. The selected DNA fragments with adapter molecules on both ends underwent 10 cycles of PCR to amplify the prepared material. The samples were then diluted to 10 nM using hybridization buffer and combined into a single pool. The libraries were loaded onto one lane of an Illumina HiSeq 2000 flow cell v3 for sequencing.
To exclude low-quality reads, PRINSEQ version 0.20.3 was used, and the following types of reads were removed:
1. Reads having 'N' in more than 10% of the total bases of that read.
2. Reads with a Phred quality score of less than 20.
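The two filtering criteria can be illustrated with a short Python sketch. This is not PRINSEQ itself, only a hedged illustration of the logic. It assumes that the Phred threshold applies to the mean quality of each read and that qualities use the Phred+33 encoding; neither detail is stated above. The function names and the example reads are hypothetical.

```python
def fraction_n(seq: str) -> float:
    """Return the fraction of bases in a read that are 'N'."""
    return seq.upper().count("N") / len(seq) if seq else 1.0


def mean_phred(qual: str, offset: int = 33) -> float:
    """Mean Phred score of a FASTQ quality string (Phred+33 is an assumption)."""
    return sum(ord(c) - offset for c in qual) / len(qual) if qual else 0.0


def keep_read(seq: str, qual: str,
              max_n_fraction: float = 0.10,
              min_mean_quality: float = 20.0) -> bool:
    """Keep a read unless it has more than 10% 'N' or a mean quality below 20."""
    return (fraction_n(seq) <= max_n_fraction
            and mean_phred(qual) >= min_mean_quality)


# Hypothetical reads as (sequence, quality string) pairs.
reads = [
    ("ACGTACGTAC", "IIIIIIIIII"),  # no N, every base at Q40
    ("ACGTNNGTAC", "IIIIIIIIII"),  # 20% N
    ("ACGTACGTAC", "++++++++++"),  # every base at Q10
]
kept = [seq for seq, qual in reads if keep_read(seq, qual)]
```

Only the first read passes both criteria; the second fails the 'N' criterion and the third fails the quality criterion.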
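To evaluate the core genome average identities and completeness, the sequenced reads were assembled and mapped against S. pneumoniae TIGR4. The SPAdes assembler was used in our study to assemble the genomic DNA extracted from the bacterial samples. This software was initially designed to assemble small genomes from MDA single-cell and standard bacterial data sets. Assembly of single-cell data is challenging due to non-uniform read coverage, variation in insert length, high levels of sequencing errors, and chimeric reads. SPAdes addresses these issues by performing assembly in four stages, which include assembly graph construction and contig construction. In the first stage, SPAdes proposes a new approach to assembly graph construction that uses the multisized de Bruijn graph, implements new bulge/tip removal algorithms, detects and removes chimeric reads, aggregates biread information into distance histograms, and allows backtracking of the performed graph operations.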
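SPAdes is a de Bruijn graph assembler. As a minimal illustration of that underlying data structure, the Python sketch below builds a de Bruijn graph from reads for a single value of k. It is a toy example under simplifying assumptions: SPAdes combines several k values and applies graph simplification and error handling that are not shown here. The function name and the two reads are hypothetical.

```python
from collections import defaultdict


def de_bruijn_edges(reads, k):
    """Map each (k-1)-mer node to the set of (k-1)-mer nodes it points to.

    Every k-mer in a read contributes one edge, from its first k-1 bases
    to its last k-1 bases.
    """
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


# Two hypothetical overlapping reads drawn from the sequence ACGTCA.
reads = ["ACGTC", "CGTCA"]
graph = de_bruijn_edges(reads, k=3)
```

Walking the edges from node AC (AC, CG, GT, TC, CA) spells out ACGTCA, the sequence from which both reads were drawn.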
To build a phylogenetic tree based on the identified SNPs, the kSNP3 program was used. We chose this software over the available alternatives because kSNP detects SNPs and builds phylogenies for large numbers of finished and draft sequences. Unlike other methods such as Parsnp, which aligns the core genome and requires finished or assembled genomes, kSNP can use raw reads and is able to analyze hundreds of bacterial or viral genomes in only a few hours. In addition, kSNP can build maximum likelihood, neighbor joining, and parsimony phylogenetic trees based on all SNPs, only core SNPs, or SNPs present in at least a user-specified fraction of genomes. Realphy is another method to build a phylogenetic tree. This method maps raw reads to several reference genomes, thereby increasing the probability of using all of the information in the raw-read genomes for analysis. However, this method relies on accurate mapping of raw reads to the reference genomes, and if some taxa have diverged by more than 5-10%, the distances to the reference genome are underestimated, leading to incorrect topologies. kSNP overcomes this issue by not relying on a reference genome and by being able to use raw read files.
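The reference-free idea can be illustrated with a toy Python sketch. It treats a SNP locus as a k-mer of odd length whose flanking bases are identical across genomes while the central base differs. This is a simplified illustration, not kSNP3's implementation: it ignores reverse complements, works on short in-memory strings, and skips loci that are ambiguous within a single genome. All names and sequences are hypothetical.

```python
from collections import defaultdict


def central_alleles(sequence, k):
    """Map (left flank, right flank) -> set of central bases seen in one genome."""
    half = k // 2
    alleles = defaultdict(set)
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        alleles[(kmer[:half], kmer[half + 1:])].add(kmer[half])
    return alleles


def candidate_snps(genomes, k=5):
    """Return loci whose flanks are shared but whose central base differs."""
    if k % 2 == 0:
        raise ValueError("k must be odd so that the k-mer has a central base")
    per_locus = defaultdict(dict)  # flank pair -> {genome name: central base}
    for name, seq in genomes.items():
        for flanks, bases in central_alleles(seq, k).items():
            if len(bases) == 1:  # skip loci ambiguous within one genome
                per_locus[flanks][name] = next(iter(bases))
    return {flanks: calls for flanks, calls in per_locus.items()
            if len(set(calls.values())) > 1}


# Two hypothetical sequences that differ at a single position (G vs C).
genomes = {
    "isolate_A": "TTACGGATCC",
    "isolate_B": "TTACGCATCC",
}
snps = candidate_snps(genomes, k=5)
```

Here the only candidate locus has flanks CG and AT, with central base G in isolate_A and C in isolate_B. No alignment to a reference is needed, which is the property that motivated the choice of kSNP.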

Genes were Identified
Ten pneumococcal isolates with different sensitivity to four antibiotics were used in this study (Table 3). Using WGS, we found that the majority of the non-synonymous SNPs associated with pneumococcal essential genes were present in antibiotic-resistant strains [12]. Through our analysis we identified 90 non-synonymous SNPs related to the essential genes of the resistant strains. Some of them appeared in more than one resistant isolate, while none of these SNPs occurred in susceptible isolates (Table 4). In addition, we identified 31 unique SNPs associated with penicillin binding proteins, pneumolysin, PspA, sensor histidine kinase (ciaH), and capsular polysaccharide biosynthesis protein CpsA (Table 5). Phylogenetic analysis is the most commonly used tool to predict biological relationships. We used the parsimony tree to estimate the phylogenetic relationships among the clinical strains of S. pneumoniae. Our results were in agreement with the MIC profiles of the ten pneumococcal strains. The observation that pneumococcal isolates with similar MIC profiles clustered together in the phylogenetic tree suggests that these strains share mutations and probably originated from the same clone. It is possible that these strains evolved and acquired mutations in a similar manner due to selection pressures. The high phylogenetic relatedness among the clinical pneumococcal isolates with similar MIC profiles is related to the specific SNPs in the mutated genes. The presence of identical uncommon mutations, as well as certain genes, in the isolates grouped together in the phylogenetic tree is indicative of a single cluster of strains circulating in the population.
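The comparison behind the resistant-only SNPs (present in one or more resistant isolates, absent from every susceptible isolate) amounts to a set operation. The Python sketch below shows one way to express it. The isolate names and SNP labels are placeholders, not data from the study, and the function name is hypothetical.

```python
def resistant_only_snps(snps_by_isolate, resistant, min_isolates=1):
    """Count SNPs seen in resistant isolates and absent from all susceptible ones.

    Returns {snp: number of resistant isolates carrying it}, restricted to
    SNPs found in at least `min_isolates` resistant isolates.
    """
    susceptible_snps = set()
    counts = {}
    for isolate, snps in snps_by_isolate.items():
        if isolate in resistant:
            for snp in snps:
                counts[snp] = counts.get(snp, 0) + 1
        else:
            susceptible_snps |= set(snps)
    return {snp: n for snp, n in counts.items()
            if snp not in susceptible_snps and n >= min_isolates}


# Placeholder data: R1 and R2 are resistant, S1 is susceptible.
snps_by_isolate = {
    "R1": {"geneA:101", "geneB:57", "geneC:9"},
    "R2": {"geneA:101", "geneC:9"},
    "S1": {"geneC:9"},
}
resistant = {"R1", "R2"}

all_resistant_only = resistant_only_snps(snps_by_isolate, resistant)
recurrent = resistant_only_snps(snps_by_isolate, resistant, min_isolates=2)
```

In this placeholder data, geneC:9 is excluded because it also occurs in the susceptible isolate, and geneA:101 is the only SNP that recurs in more than one resistant isolate.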

Conclusion
In summary, we compared the genomic sequences of ten pneumococcal strains isolated from University of Malaya Medical Centre (UMMC) with different sensitivity to four antibiotics (penicillin, cefotaxime, erythromycin, and tetracycline) in order to identify the genetic variations within the sequences of these isolates and the SNPs that could play a significant role in conferring resistance to those antibiotics. The high level of sequence conservation and the presence of the same mutations, mainly those associated with genes involved in β-lactam resistance, in both sensitive and resistant isolates make it difficult to identify distinct mechanisms of resistance that differentiate strains with different drug sensitivities, and indicate that antibiotic resistance cannot be linked solely to the presence of certain genes. Nevertheless, through our extensive analysis we were able to identify unique SNPs associated with virulence genes that could play a key role in resistance to various antibiotics. However, the small number of clinical samples included in this study has limited our understanding of the role of these SNPs in conferring resistance to different antibiotics. Moreover, the resistance genes have yet to be subjected to individual mutational analysis. This can be achieved by introducing the identified SNPs into the resistance genes by site-directed mutagenesis, followed by expression analysis to confirm the role of these SNPs in conferring antibiotic resistance.

Table 1 :
Bacterial strains and sources used for the genomic comparison of S. pneumoniae strains.
All serotypes were identified using multiplex PCR as described before (Pai et al., 2006). Abbreviations: NA, not available; NT, non-typeable.

Table 2 :
Quantity of samples measured using Qubit.

Table 3 :
Antibiotic susceptibility profiles of S. pneumoniae isolates.

Table 4 :
Conserved non-synonymous Single Nucleotide Polymorphisms (SNPs) associated with Penicillin Binding Proteins (PBPs) and other virulence genes found in resistant isolates.

Table 5 :
Unique non-synonymous Single Nucleotide Polymorphisms (SNPs) associated with Penicillin Binding Proteins (PBPs) and other virulence genes found in all ten pneumococcal isolates.