Abstract
Objectives: Extensive multi-omic bacterial gene expression datasets are publicly available, but tools that unify these datasets for interpretation and hypothesis testing are limited. The complexity of these datasets and the need for specialized bioinformatics and programming expertise pose major barriers for researchers attempting to query and analyze them. Our objective was to develop an integrated search engine that simplifies access to publicly available gene expression data and facilitates comparison and analysis of multi-omics datasets. Methods: We developed the Centralized Access to Gene Expression Datasets (CAT-GxD) search engine to provide integrated access to, and facilitate analysis of, publicly available transcriptomics and proteomics datasets of the CDC Urgent Threat pathogen Clostridioides difficile. Manual data curation was performed to integrate and standardize all 74 nonredundant transcriptomics and quantitative proteomics datasets available in the Gene Expression Omnibus (GEO) database and the ProteomeXchange Consortium. The CAT-GxD search engine, built on the open-source R Shiny framework, is available at https://viz.datascience.arizona.edu/catgxd/. Results: CAT-GxD successfully consolidated disparate transcriptomics and proteomics datasets, supporting interpretation and hypothesis testing. CAT-GxD provides customizable visualization of gene expression data under different conditions. We demonstrate the utility of CAT-GxD in analyzing the contribution of RNA polymerase, nitrogen-limitation N (RpoN) to C. difficile biology, and highlight the RpoN-dependent regulation of genes in response to treatment with succinate and the secondary bile acid deoxycholate. Conclusions: CAT-GxD streamlines the analysis of C. difficile multi-omic data, reducing complexity and analysis time. It facilitates the generation of novel hypotheses and the identification of anti-infective targets, and can be adapted to incorporate data analysis paradigms for diverse organisms.
| Original language | English (US) |
|---|---|
| Article number | 103005 |
| Journal | Anaerobe |
| Volume | 96 |
| DOIs | |
| State | Published - Dec 2025 |
Keywords
- Bacterial gene expression datasets
- Clostridioides difficile
- Database
- Multi-omic
- Proteomics
- Transcriptomics
ASJC Scopus subject areas
- Microbiology
- Infectious Diseases