Author Archives: kyclark


Ph.D. student Scott Daniel has advanced to the final round of the University of Arizona’s “Gradslam” competition (#uagradslam). The Gradslam competition is a chance for any graduate student at the university to present their research in 3 minutes or less, and the event has a grand prize of $3,000. Scott competed against 15 other grad students with his talk entitled “What can poo do for you?” While the title may be funny, the science behind it isn’t: it’s about saving colon cancer patients’ lives with the beneficial bacteria in our faecal matter. Scott will be competing in the final round on Monday, April 4th at 5:30pm in the Steve Eller Dance Theater.

UA Data Lake

Exciting times! We are collaborating with the University of Arizona high performance computing team to create the first ever University-wide Data Lake. What is a data lake, you ask? In short, a data lake is a large pool of data stored in a Hadoop “big data” architecture that allows researchers to query and compute on enormous data sets. We are creating this new data reservoir by making high-impact, large-scale datasets, such as Twitter and the human and earth microbiome projects, available in a linearly scaling Hadoop architecture. Researchers can then “buy in” to add data nodes for storing their own “big data” sets, which can be used in conjunction with persistent university data resources. This data lake will allow researchers university-wide to query relevant “big data” resources and pair them with metadata tags and their own data sets to answer ever-evolving data questions. Today at UA, researchers mine large-scale datasets from diverse areas, with projects ranging from how to staff emergency rooms based on Twitter data to understanding the role of microbes in biogeochemical cycling in the world’s oceans.

iMicrobe Data Commons is up!

The iMicrobe Data Commons (funded by the Gordon and Betty Moore Foundation) is now available as an interactive website. The project was initially funded to make the CAMERA microbial datasets available through an interactive data commons and in the iPlant Data Store for use in the iPlant cyberinfrastructure. Two months in, we have met those goals! We are now working hard on developing a queryable ontology for the data sets and making these data easily discoverable within the iPlant cyberinfrastructure. HOWTO docs are under development, and our first workshop is coming up at the Association for the Sciences of Limnology and Oceanography (ASLO) meeting in Granada, Spain. We hope to see you there!