Once you’ve finished picking your OTUs and you have a lovely BIOM formatted OTU table, you may want to leave QIIME and move to a more flexible environment (like R) for doing your analyses. Getting that table ready and importing it as a data.frame in R is fairly simple, but not necessarily straightforward, so we’ll walk through the necessary steps and tools.
First thing, your otu_table_mc2_w_tax.biom file ought to have any assigned taxonomy included for each picked OTU. If this were a standard table, you can think of OTUs as rows, and samples as columns…the 7 levels of taxonomy assigned to each OTU (those that got matches) would be 7 additional columns and are referred to as “observation metadata”. We want to go ahead and add our “sample metadata” from our mapping file to our otu table while we’re at it. First, check out the help file and then run the code, and check the summary to make sure everything stuck:
biom add-metadata --help biom add-metadata -i otu_table_mc2_w_tax.biom -o otu_table_mc2_metadata_w_tax.biom -m ~/FILEPATH/mapping_file.txt --sample-header Location,Elevation biom summarize-table -i otu_table_mc2_metadata_w_tax.biom
This will add the metadata columns named “Location” and “Elevation” from your tab-separated mapping file (Assuming you have those column names) as supplementary “rows” in your BIOM file. If it worked, you’ll see your metadata columns in the “Sample Metadata Categories:” row of the summary output. Now it’s in good shape to move it into R.
BIOM comes in two flavors: JSON, and HDF5. Older QIIME versions used JSON and the current QIIME version uses HDF5. There are a few BIOM packages in CRAN (Comprehensive R Archive Network), but getting them to read HDF5 format seems futile. Fortunately, Paul McMurdie and Joseph Paulson put together a great package that ACTUALLY WORKS! It’s not currently on CRAN, but we can still easily download and install the latest release from your R (RStudio!) console by typing:
install.packages("devtools") # if not already installed devtools::install_github("biomformat", "joey711")
If your version of R is current (I believe version 3.2.3+) then you should be all set. If you are running an older version you may get an error message saying the biomformat package is not available for your R release. Just give yourself a well-deserved upgrade using the following and try again (Mac users can upgrade using HomeBrew):
sudo apt-get update sudo apt-get dist-upgrade
Now that you have the right tool, let’s look at how to use it:
In RStudio, run the following to load the biomformat package:
Now, navigate to the “packages” tab and click on “biomformat” to view the help files. It’s good to read through these a couple of times, but if you are in a hurry to get that OTU table imported, here’s what you can do:
1) Define a file path for your OTU table with all that juicy metadata attached to it:
file_path <- "~/YOUR/FILEPATH/otu_table_mc2_metadata_w_tax.biom"
2) Read the BIOM file and load it to an object (in this case, named “dat”):
dat <- read_biom(file_path)
3) Coerce dat into a matrix (with a lowercase ‘m’) and then into a data frame:
#You can't go straight from a "sparse Matrix" to a data frame! otu_table <- as.data.frame(as.matrix(biom_data(dat)))
4) Get taxonomy:
taxonomy <- observation_metadata(dat)
5) Get sample metadata:
metadata <- sample_metadata(dat)
If we take a peek at the first 6 rows from our three data frames they should look something like what you see below (note: I limited the number of columns so it would fit reasonably)…
head(otu_table[,1:10]) Mon.WS5 PaCo.4 O5.NoLyz 69CV.5 PaTiTi.3 DS.5 PaTiTi.2 PaCo.15 PaTiTi.1 S4 SH206549.07FU_AF145321_refs 1 28 106 7 106 15 141 4 99 17 SH212843.07FU_GU721652_reps 0 0 0 0 0 0 0 0 0 0 SH190986.07FU_KJ130026_reps 7 0 0 0 0 0 0 0 11 0 SH212235.07FU_GU188414_reps 0 10 0 0 0 480 0 0 0 0 SH424854.07FU_U61681_refs 0 0 0 0 0 0 0 0 0 0 SH221495.07FU_JX064536_reps 0 0 0 0 0 0 0 0 33 0 head(metadata[,1:4]) BarcodeSequence Description Host_Taxon Location Mon.WS5 AACAACAAC Mon_WS5 Montipora_capitata K-Bay PaCo.4 AACAACAAC PaCo_4 Odontaster_validus Antarctica O5.NoLyz AACAACAAC O5_NoLyz Mycale_grandis K-Bay 69CV.5 AACAACAAC 69CV-5 Mellicope Culvert_69 PaTiTi.3 AACAACAAC PaTiTi_3 Odontaster_validus Antarctica DS.5 AACAACAAC DS-5 Myrica Dan_St-Palikea head(taxonomy) taxonomy1 taxonomy2 taxonomy3 taxonomy4 taxonomy5 taxonomy6 taxonomy7 SH206549.07FU_AF145321_refs k__Fungi p__Basidiomycota c__Tremellomycetes o__Tremellales f__Incertae sedis g__Cryptococcus s__Cryptococcus sp SH212843.07FU_GU721652_reps k__Fungi p__unidentified c__unidentified o__unidentified f__unidentified g__unidentified s__Fungi sp SH190986.07FU_KJ130026_reps k__Fungi p__Ascomycota c__Sordariomycetes o__Xylariales f__Incertae sedis g__Monographella s__Monographella cucumerina SH212235.07FU_GU188414_reps k__Fungi p__Ascomycota c__Lecanoromycetes o__Lecanorales f__Cladoniaceae g__Cladonia s__Cladonia novochlorophaea SH424854.07FU_U61681_refs k__Fungi p__Ascomycota c__Sordariomycetes o__Hypocreales f__Nectriaceae g__Fusarium s__Fusarium begoniae SH221495.07FU_JX064536_reps k__Fungi p__Ascomycota c__Lecanoromycetes o__Peltigerales f__Lobariaceae g__Sticta s__Sticta fuliginosa