Cyberinfrastructure for Microbial Ecology

Developing cyberinfrastructure for microbial ecology to enable high-resolution comparison of microbes in dynamic ecosystems.

Our lab facilitates data-driven discovery in ecosystem genomics by integrating data from varied big data sources using advanced computational architectures and analytics. Through these efforts we promote open science and develop virtual communities (in conjunction with partners at protocols.io) to disseminate protocols and encourage the discussion and development of bioinformatics methods.

Planet Microbe

Planet Microbe reconnects ‘omics data with environmental data from oceanographic cruises to enable biological inquiry into environmental changes that affect the distribution and abundance of microbes in the sea. Funded by NSF 2017-2020.

iMicrobe

An interactive website for microbial datasets, built on top of the CyVerse cyberinfrastructure, for discovering data and performing bioinformatics analyses. Funded by NSF 2016-2019 and the Gordon and Betty Moore Foundation 2014-2016.

iVirus

Building a Cyberinfrastructure for Viral Ecology: Using the existing CyVerse cyberinfrastructure, we are developing tools, data, and metadata resources specific to viral ecology through the iVirus project.

muSCOPE

A unifying framework for SCOPE datasets that integrates ‘omics data with more traditional physical, geological, geochemical, and biological datasets collected by SCOPE researchers. Funded by the Simons Foundation 2016-2019.

Ocean Cloud Commons

The Ocean Cloud Commons (OCC) is a cloud-based resource and repository that allows researchers to query the Tara Oceans Expedition data in the cloud and makes comparative metagenomic tools available through the Ocean Treasure Box (OTB). Funded by NSF 2017-2020.

VERVE Net

An online forum to increase connectivity and collaboration among virus ecology researchers. Funded by the Gordon and Betty Moore Foundation 2015-2017.

Algorithm and Tool Development for Metagenomics

Developing novel algorithms for annotating and comparing sequence data. These algorithms run on parallel and distributed computing environments and are designed to optimize resource usage, including compute cycles, input-output (I/O), memory hierarchies, and energy consumption.

For the last two decades, Intel chip design has followed Moore’s law, under which the density of transistors doubles every two years while the cost declines. Recently, making smaller and less expensive chips has become more challenging, and we may be reaching a computational plateau. Because DNA sequencing technologies are producing ever more data and compute power may not keep pace, algorithms that optimize the use of compute cycles, input-output (I/O), memory hierarchies, and energy consumption are vital. Our lab focuses on developing novel algorithms for comparing massive metagenomic datasets based on their “genetic fingerprint”. We ground-truth our results using mock communities and carefully controlled mixtures before making inferences about metagenomic communities.
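One common way to give a sample a “genetic fingerprint” is to count its k-mers (short subsequences of length k) and compare the resulting count profiles. The sketch below illustrates that general idea in Python. It is an assumption made for illustration, not the lab's published method: the function names, the choice of k = 4, and the cosine similarity measure are all hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_profile(sequences, k=4):
    """Count every overlapping k-mer across a sample's reads (hypothetical helper)."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip k-mers containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two k-mer count profiles, in the range 0 to 1."""
    dot = sum(a[kmer] * b[kmer] for kmer in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy reads standing in for two metagenomic samples.
sample_a = ["ACGTACGTAC", "GGGTACGTAA"]
sample_b = ["ACGTACGTTT", "CCCTACGTAA"]
score = cosine_similarity(kmer_profile(sample_a), kmer_profile(sample_b))
print(f"similarity: {score:.3f}")
```

Real metagenomes contain millions of reads, so a production tool would distribute the counting step across many machines rather than hold one in-memory counter per sample.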

Libra

Libra is a highly scalable Hadoop application for similarity comparison of metagenomic samples. Publication

Fizkin

All-vs-all sequence comparison for WGS.
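To make "all-vs-all" concrete, the sketch below scores every unordered pair of samples once and collects the results in a dictionary. It does not reproduce Fizkin's actual implementation: the helper names, the Jaccard index on k-mer sets, and k = 4 are assumptions chosen only to keep the example short.

```python
from itertools import combinations


def kmer_set(sequences, k=4):
    """Return the set of distinct k-mers in a sample's reads (hypothetical helper)."""
    return {
        seq[i:i + k].upper()
        for seq in sequences
        for i in range(len(seq) - k + 1)
    }


def jaccard(a, b):
    """Jaccard index: shared k-mers divided by total distinct k-mers."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def all_vs_all(samples, k=4):
    """Score every unordered pair of samples exactly once."""
    sets = {name: kmer_set(reads, k) for name, reads in samples.items()}
    return {
        (x, y): jaccard(sets[x], sets[y])
        for x, y in combinations(sorted(sets), 2)
    }


# Toy input: sample name -> list of reads.
samples = {
    "site_1": ["ACGTACGTAC"],
    "site_2": ["ACGTACGTTT"],
    "site_3": ["GGGGCCCCAA"],
}
for pair, score in all_vs_all(samples).items():
    print(pair, round(score, 3))
```

The number of pairs grows quadratically with the number of samples, which is why all-vs-all comparisons of large collections are usually run on parallel or distributed systems.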

Singularity/Docker Containers

We develop Singularity/Docker containers for many common bioinformatic and metagenomic programs.

GitHub Repository

The Hurwitz Lab is a strong advocate for the creation and use of open-source software. As such, all of our public code is freely available on GitHub.

Exploration in Metagenomics

Using ‘omics technologies to explore microbial interactions in diverse habitats, ranging from the global ocean (for field diagnostic development) to clinical wound infections (for phage therapy).

Isogenie

Wound Microbiome

In collaboration with Dr. David Armstrong of the UA College of Medicine, our lab is using machine learning and network analytics to find patterns that distinguish healing ulcers from chronic ulcers.

Phage Therapy in Wounds

Understanding the close interplay between bacteria and their phages, and the human immune response to microbial communities.

Gut Microbiome in Colon Cancer

Interlinking large-scale ‘omics datasets with functional annotations and metabolic pathways to better understand the role of microbes in colon cancer.