Tag Archives: statistics

Snippets: guts, cancers, statistics

File under “interesting articles that I don’t have time to write about at length.”

  • Archaea and Fungi of the Human Gut Microbiome: Correlations with Diet and Bacterial Residents
  • Long ago, before metagenomics and NGS, I did a little work on detection of Archaea in human microbiomes. There’s a blog post in the pipeline about that but until then, enjoy this article in PLoS ONE.

  • Mutational heterogeneity in cancer and the search for new cancer-associated genes
  • This article is getting a lot of attention on Twitter this week. Brief summary: cancer cells are really messed up in all sorts of ways, most of which are not causal with respect to the cancer. Anyone who has ever looked at microarray data knows that it’s not uncommon for 50% or more of genes to show differential expression in a cancer/normal comparison, so this is hardly a new concept. I think we need to move away from ever-more detailed characterizations of the ways in which cancer cells are “messed up.” We know that they are and that doesn’t provide much insight, in my opinion.

  • The vast majority of statistical analysis is not performed by statisticians
  • Interesting post by Jeff Leek, summarized very well by its title. It points out that many more people are now interested in data analysis, that many of them are not professionally trained as statisticians (I’m in this category myself), and that we need to recognize and plan for that.

Bonus post doing the rounds of social media: Using Metadata to Find Paul Revere. Social network analysis, 18th-century style. Amusing, informative and topical.

A brief introduction to “apply” in R

At any R Q&A site, you’ll frequently see an exchange like this one:

Q: How can I use a loop to [...insert task here...] ?
A: Don’t. Use one of the apply functions.

So, what are these wondrous apply functions and how do they work? I think the best way to figure out anything in R is to learn by experimentation, using embarrassingly trivial data and functions.
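In that spirit, here is a minimal sketch using only base R. The matrix `m`, the list `l` and the other variable names are made up for illustration; the example compares an explicit loop with `apply()` on the same data, then shows `sapply()` on a list.

```r
# Embarrassingly trivial data: a 2 x 3 matrix, filled column by column
m <- matrix(1:6, nrow = 2)

# apply() takes the data, a margin (1 = rows, 2 = columns) and a function
row_sums <- apply(m, 1, sum)   # 9 12
col_sums <- apply(m, 2, sum)   # 3 7 11

# The explicit loop that apply(m, 1, sum) stands in for
loop_sums <- numeric(nrow(m))
for (i in seq_len(nrow(m))) {
  loop_sums[i] <- sum(m[i, ])
}

# sapply() calls a function on each element of a list and
# simplifies the result to a named vector where it can
l <- list(a = 1:3, b = 4:6)
means <- sapply(l, mean)       # a = 2, b = 5
```

The loop and the `apply()` call give the same row sums; the apply version simply says what to do to each row, without the bookkeeping of an index and a pre-allocated result.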
Read the rest…

R has a JSON package

Named rjson, appropriately. It’s quite basic just now, but contains functions for converting between R objects and JSON. Something like this:

library(rjson)
# a named list becomes a JSON object
data <- list(a=1,b=2,c=3)
json <- toJSON(data)
json
# [1] "{\"a\":1,\"b\":2,\"c\":3}"
# write the JSON string to a file
cat(json, file="data.json")

Use cases? I wonder if RApache could be used to build an API that serves R data in JSON format?
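As a rough sketch of the data side of that idea, the snippet below turns a small data frame into a JSON string and parses it back with rjson's `toJSON()` and `fromJSON()`. The helper `as_json_payload()` and the example data frame are hypothetical, invented here for illustration; nothing in it touches RApache, so how the string would actually be returned to a client is left open.

```r
library(rjson)

# Hypothetical helper (not part of rjson or RApache):
# treat a data frame as a list of columns and serialize it
as_json_payload <- function(x) {
  toJSON(as.list(x))
}

# Made-up example data
df <- data.frame(id = c(1, 2, 3), score = c(0.5, 0.25, 0.75))

# The string an API endpoint might send back
payload <- as_json_payload(df)

# What a client written in R could do with it
restored <- fromJSON(payload)
```

Serializing column by column keeps the JSON compact; an API aimed at JavaScript clients might prefer one object per row instead, which would need a different conversion step.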