Procalcitonin predicts, to a degree, the likelihood of a bacterial versus a viral infection. Dependence on a single variable like procalcitonin has made clinicians uneasy, particularly when that single test is used to guide continued use of antimicrobials. Hospitalized adults with pneumonia are a population that can benefit from markers that distinguish between bacterial and viral infection. If biomarkers can reliably indicate which type of infection is present, then antibiotics can be stopped when they are not useful, as in viral infections.

This study enrolled patients from Rochester General Hospital in Rochester, NY. During respiratory infection seasons, 118 patients with lower respiratory tract infections (LRTIs) and 20 healthy controls were included. Broad culture and polymerase chain reaction (PCR)-based diagnostics were used for traditional identification of bacterial and viral pathogens, and those identifications served as the gold standard. The transcriptional array analysis was based on a large set of transcriptional modules, each grouping together genes with shared expression. The K-nearest neighbor (K-NN) algorithm was used to identify the top-ranked genes that discriminated between viral and bacterial infection.

Of the 118 patients, 71% had a viral infection and 22% had a bacterial infection. One set of patients’ blood samples was used as a “training” set to identify the transcriptional signature of LRTI. These signatures were then validated in a separate “test” set of blood samples. There were 3986 “differentially expressed transcripts” showing a consistent gene expression pattern in patients with bacterial compared with viral LRTIs. In the test set, gene expression patterns were grouped correctly in 53 of 59 (90%) bacterial infections.

Bacterial LRTI showed significant overexpression of genes related to innate immunity (inflammation and neutrophil modules) and underexpression of genes related to adaptive immunity, such as B- and T-cell activation, and particularly of interferon expression. The authors went further and constructed a set of “classifier” genes. A K-NN algorithm found 10 classifier genes that differentiated between bacterial and viral LRTI. The classifier genes correctly grouped 22 of 23 new patients’ samples, and a repeat of the analysis in 23 patients gave the same findings. Patients with coinfections had transcript patterns that were functionally intermediate between those of pure bacterial and pure viral infection.
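
For readers unfamiliar with K-NN, the idea can be illustrated with a minimal sketch: a new sample is assigned the label held by the majority of its closest labeled neighbors. The example below is not the authors' pipeline. It assumes two made-up expression features (an interferon-like score and a neutrophil-like score) in place of the 10 classifier genes, and all values, names, and the choice of k = 3 are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, sample, k=3):
    """Label `sample` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the sample with its label,
    # then sort so the closest points come first.
    neighbors = sorted(
        (math.dist(x, sample), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training data: (interferon-like score, neutrophil-like score).
# Values are invented for illustration only.
train_X = [
    (2.1, 0.3), (1.8, 0.5), (2.4, 0.2),   # interferon-dominant profiles
    (0.2, 1.9), (0.4, 2.2), (0.1, 1.7),   # neutrophil-dominant profiles
]
train_y = ["viral", "viral", "viral", "bacterial", "bacterial", "bacterial"]

print(knn_predict(train_X, train_y, (2.0, 0.4)))  # interferon-dominant sample
print(knn_predict(train_X, train_y, (0.3, 2.0)))  # neutrophil-dominant sample
```

A sample with a mixed profile would sit between the two clusters, which is one way to picture why coinfections yielded intermediate transcript patterns.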


Severe pneumonia meets big data. It is hard to imagine that 10 years ago, when studies were showing the promise of procalcitonin to differentiate bacterial from viral pneumonia, anyone foresaw that just a decade later massive molecular arrays could be used economically and rapidly to make that distinction with much greater sensitivity. Surely the implication of this study is that we are at the beginning of a new revolution in microbial diagnostics. The data points are numerous and their analysis is highly technical, but the information is extremely helpful, not only in differential diagnosis but also in understanding basic pathophysiology. For example, in the current study, array technology shows that multiple interferon genes are overexpressed in viral LRTIs and not in bacterial infections. The finding itself is not so surprising in view of long-standing knowledge about the nature of interferon responses. What is surprising is that a huge array can pinpoint what the authors call “classifier genes” that can themselves be used for a very practical differentiation between bacterial and viral pneumonia and, moreover, that an expanded transcript pattern can suggest when the infection may be mixed.

We have entered a brave new molecular diagnostic world where DNA array applications are just a part of the emerging technology to facilitate microbial diagnosis and treatment.