Recent evidence from the Human Microbiome Project suggests that microbes play a key role in host physiology, health, and disease. Yet microbial community interactions (especially between viruses and their bacterial hosts), their variation over space and time, and their response to environmental stress (such as antibiotic treatment) are less well understood. Perhaps one of the largest roadblocks to reaching an ecosystem-level understanding of host-microbe-environment interactions is the difficulty of interconnecting and untangling large-scale datasets from varied sources. In collaboration with Dr. David Armstrong of the UA College of Medicine, our lab is creating interoperable graph database resources to support large-scale data analysis in microbiome-related research.
Specifically, we are building scalable graph databases for storing large-scale metagenomic datasets and analyses; extending Bio4J, an open source biological graph database, to include KEGG pathway data; and interconnecting disparate biological graph databases to support functional metagenomic analyses. These development efforts will be applied to research on elucidating phage-encoded processes related to bacterial colonization in chronic diabetic foot ulcers. In particular, phage may provide a genetic reservoir for their hosts, conferring antibiotic resistance or virulence genes through horizontal gene transfer and thereby contributing to wound chronicity. This big data toolkit will be broadly applicable to any study that seeks to explore systems-level biology using large-scale -omics datasets. Further, the proposed research addresses unsolved problems in graph database theory related to graph database network interoperability and updating, which are applicable to any area of study.
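
As a rough illustration of the kind of question such a graph is meant to answer, the sketch below models phage, host, gene, and pathway records as nodes in a small in-memory property graph and asks which phage-encoded genes of a given category are associated with a given host. It is a toy model only, not the project's schema and not the Bio4J API: the class `PropertyGraph`, the function `phage_encoded_genes`, the node labels, the relationship names (`INFECTS`, `ENCODES`, `PARTICIPATES_IN`), and every identifier in the sample data are hypothetical placeholders.

```python
from collections import defaultdict


class PropertyGraph:
    """Toy in-memory property graph (hypothetical, for illustration only)."""

    def __init__(self):
        self.nodes = {}                 # node id -> dict of properties, including "label"
        self.edges = defaultdict(list)  # source id -> list of (relationship, target id)

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = {"label": label, **props}

    def add_edge(self, source, rel_type, target):
        self.edges[source].append((rel_type, target))

    def neighbors(self, node_id, rel_type):
        # Follow outgoing edges of one relationship type from node_id.
        return [t for r, t in self.edges.get(node_id, []) if r == rel_type]


def phage_encoded_genes(graph, host_id, category):
    """Return (phage, gene) pairs where the phage infects host_id and
    encodes a gene annotated with the given category."""
    hits = []
    for node_id, props in graph.nodes.items():
        if props["label"] != "Phage":
            continue
        if host_id not in graph.neighbors(node_id, "INFECTS"):
            continue
        for gene in graph.neighbors(node_id, "ENCODES"):
            if graph.nodes[gene].get("category") == category:
                hits.append((node_id, gene))
    return hits


# Placeholder data; none of these identifiers refer to real records.
graph = PropertyGraph()
graph.add_node("host_1", "Host")
graph.add_node("phage_A", "Phage")
graph.add_node("phage_B", "Phage")
graph.add_node("gene_1", "Gene", category="antibiotic_resistance")
graph.add_node("gene_2", "Gene", category="structural")
graph.add_node("pathway_X", "Pathway")

graph.add_edge("phage_A", "INFECTS", "host_1")
graph.add_edge("phage_A", "ENCODES", "gene_1")
graph.add_edge("phage_A", "ENCODES", "gene_2")
graph.add_edge("phage_B", "ENCODES", "gene_1")   # phage_B does not infect host_1
graph.add_edge("gene_1", "PARTICIPATES_IN", "pathway_X")

hits = phage_encoded_genes(graph, "host_1", "antibiotic_resistance")
print(hits)
```

In a production graph database the same question would be a traversal query over indexed relationships rather than a scan over all nodes; the linear scan here just keeps the sketch short.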