I work with people who generate a lot of microarray data. One question that they often ask is: can we find the genes whose median expression differs two-fold or more between two or more conditions?

For example, let’s say that we have 3 conditions: “normal”, “adenoma” and “cancer”. That gives us 3 pairwise comparisons: normal-adenoma, normal-cancer and adenoma-cancer. Here’s a Ruby solution to the problem.

First, I installed StatArray, a Ruby gem that provides statistical methods for array objects. It’s not been updated for 3 years, but seems to work.

require 'rubygems'
require 'statarray'

Next, credit to David Burger for posting this code solution for combinations in Ruby. You give it an array of elements and the number of elements (r) that you want to see in each combination; it yields each combination to a block as an array of length r. I’ve just wrapped it in a class called Combination:

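class Combination
  # Yields every r-element combination of array, in index order.
  def generate_combinations(array, r)
    n = array.length
    return if r > n  # no combinations possible; avoids indexing errors
    indices = (0...r).to_a    # first combination: positions 0..r-1
    final = (n - r...n).to_a  # last combination: the final r positions
    while indices != final
      yield indices.map {|k| array[k]}
      # find the rightmost index that can still be advanced
      i = r - 1
      while indices[i] == n - r + i
        i -= 1
      end
      indices[i] += 1
      # reset every index to its right to the smallest valid value
      (i + 1...r).each do |j|
        indices[j] = indices[i] + j - i
      end
    end
    yield indices.map {|k| array[k]}
  end
end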
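
For comparison, Ruby has shipped a built-in Array#combination since version 1.8.7, which produces the same subsets in the same order without a helper class. A minimal sketch; the method name `subsets` is only an illustration and is not part of the original code:

```ruby
# Hypothetical helper (not from the original post): wraps the built-in
# Array#combination to list every r-element subset of an array.
def subsets(array, r)
  array.combination(r).to_a
end

conditions = ["normal", "adenoma", "cancer"]
subsets(conditions, 2).each do |pair|
  puts pair.join(" - ")
end
# prints:
# normal - adenoma
# normal - cancer
# adenoma - cancer
```

Asking for more elements than the array holds simply gives an empty list.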

So, let’s represent our covariates (normal, adenoma, cancer) as hash keys and their expression values, from several samples, as arrays which are the hash values. Then, we create a new hash with the same keys where the values are medians, calculated using methods from StatArray. Here, I’m using very silly dummy values for expression, which are supposed to be log base 2 values:

values = {"normal" => [1,2,3,4], "adenoma" => [5,6,7,8], "cancer" => [9,10,11,12]}
median = Hash.new
values.each_pair do |k, v|
  median[k] = v.to_statarray.median
end

I’m sure that there’s a more elegant way to map the values hash to the medians hash, but that’s for another day.
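
One possibly tidier mapping, assuming Ruby 2.4 or later for Hash#transform_values. To keep the sketch self-contained it uses a plain-Ruby `median` helper, which is a stand-in written for this example and not part of StatArray; with the gem loaded, the block could call `v.to_statarray.median` instead. The name `medians_by_condition` is likewise just illustrative:

```ruby
# Stand-in median helper (an assumption for this sketch, not StatArray's API):
# sorts the values and averages the two middle ones when the count is even.
# It expects a non-empty array of numbers.
def median(values)
  sorted = values.sort
  mid = sorted.length / 2
  sorted.length.odd? ? sorted[mid].to_f : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Build a new hash with the same keys, where each value is the median
# of the corresponding array.
def medians_by_condition(values)
  values.transform_values { |v| median(v) }
end

values = {"normal" => [1,2,3,4], "adenoma" => [5,6,7,8], "cancer" => [9,10,11,12]}
p medians_by_condition(values)
```

With the dummy values above, the medians come out as 2.5, 6.5 and 10.5.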

Finally, we generate all possible pairs of covariates and take the absolute value of the difference between their medians. Because the values are log base 2, a difference of one unit corresponds to a two-fold change in expression (2^1 = 2), so any pair with an absolute difference of 1 or more meets the two-fold criterion:

c = Combination.new
c.generate_combinations(median.keys, 2) do |x|
  puts "abs(#{x.join(" - ")}) = #{(median[x[0]] - median[x[1]]).abs}"
end

Result:

abs(normal - adenoma) = 4.0
abs(normal - cancer) = 8.0
abs(adenoma - cancer) = 4.0
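
To answer the original question directly, the differences can be filtered against a threshold. This is a sketch under the assumption that the medians are already log base 2 values, so an absolute difference of 1 or more means a two-fold or greater change. The method name `two_fold_pairs` and its `threshold` parameter are illustrative, and it uses Ruby's built-in Array#combination rather than the Combination class:

```ruby
# Hypothetical helper: returns [pair, abs_difference] for every pair of
# conditions whose log2 medians differ by at least `threshold`
# (1.0 log2 unit = two-fold).
def two_fold_pairs(medians, threshold = 1.0)
  medians.keys.combination(2).map { |a, b|
    [[a, b], (medians[a] - medians[b]).abs]
  }.select { |_pair, diff| diff >= threshold }
end

medians = {"normal" => 2.5, "adenoma" => 6.5, "cancer" => 10.5}
two_fold_pairs(medians).each do |(a, b), diff|
  # 2**diff converts the log2 difference back to a fold change
  puts "#{a} vs #{b}: #{diff} log2 units (#{2**diff}-fold)"
end
# prints:
# normal vs adenoma: 4.0 log2 units (16.0-fold)
# normal vs cancer: 8.0 log2 units (256.0-fold)
# adenoma vs cancer: 4.0 log2 units (16.0-fold)
```

For real data the same filter would run once per gene, with one medians hash for each.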

Fin.
