|
4. Clustering executable specification
a) Input
The clustering executable takes a single command-line argument: the name of
the parameter setting file, which can be assumed to be in the same directory
as the executable. The parameter setting file contains a list of
"[name] = [value]" entries, one per line. Three types of parameters are
accepted in "params.txt":
- double values, such as "percentile = 0.10"
- input data files, such as "exprFile = ../data/mouse_expr.txt" (relative
file paths should be supported)
- the output data file: a single line with the name "outputFile" and a value
such as "../result/output.txt" (relative file paths should be supported)
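The parameter file format described above can be read with a few lines of code. The following is a minimal sketch in Python, not the reference implementation: the function names `parse_params` and `load_params` are hypothetical, and it assumes that any value which parses as a number is a double parameter while every other value is a file path. The specification does not say which directory relative paths are resolved against, so the sketch leaves paths as written.

```python
import sys


def parse_params(text):
    """Parse '[name] = [value]' lines into a dict.

    Assumption: values that parse as numbers are double parameters;
    all other values are treated as file paths and kept as strings.
    """
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines (an assumption, not in the spec)
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected '[name] = [value]', got: {line!r}")
        name, value = name.strip(), value.strip()
        try:
            params[name] = float(value)
        except ValueError:
            params[name] = value  # input or output file path
    return params


def load_params(argv):
    """Read the parameter file named by the single command-line argument."""
    if len(argv) != 2:
        raise SystemExit("usage: <executable> <parameter file>")
    with open(argv[1], encoding="utf-8") as handle:
        params = parse_params(handle.read())
    if "outputFile" not in params:
        raise SystemExit("parameter file must define outputFile")
    return params
```

A caller would typically invoke `load_params(sys.argv)` at startup and then open the input files and "outputFile" named in the returned dictionary.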
An example parameter setting file for hierarchicalClustering.exe looks like
the following:

b) Output
There are several constraints on the output data file of the executable:
1). The first line of the file should have the form
"[n] [m1] ([m2] [m3] ...)", where [n] and each [mi] (i >= 1) are positive
integers, tab-delimited:
- [n]: the first [n] columns are the ids/names of genes
- [mi], where i >= 1: the number of columns of microarray data for species i.
The numbers in parentheses are optional.
2). The second line should contain the names of the experiments.
3). All following lines of the file should follow exactly the layout
described in the first line, with two exceptions:
- several lines of "NONE" are output to separate different clusters
- if additional information needs to be output, start that line with
"pseudo-probe" so that it will be ignored by our visualization module
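To make the three output constraints concrete, here is a hedged sketch in Python of how a consumer might read such a file. The function name `read_clusters` is hypothetical, and the sketch rests on several assumptions the specification leaves open: a separator line is any line whose first tab-delimited field is "NONE", consecutive separator lines count as a single cluster boundary, and the data columns are returned as unparsed strings.

```python
def read_clusters(lines):
    """Split an output file into clusters.

    Returns (n, species_cols, experiments, clusters), where each cluster
    is a list of (gene_ids, data_values) pairs.
    """
    lines = iter(lines)

    # Line 1: "[n] [m1] ([m2] [m3] ...)", tab-delimited positive integers.
    header = next(lines).rstrip("\n").split("\t")
    n = int(header[0])
    species_cols = [int(field) for field in header[1:]]
    if n < 1 or not species_cols or any(m < 1 for m in species_cols):
        raise ValueError("header needs positive [n] and at least one [m1]")

    # Line 2: names of the experiments.
    experiments = next(lines).rstrip("\n").split("\t")

    width = n + sum(species_cols)
    clusters, current = [], []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue  # tolerate blank lines (an assumption)
        if line.startswith("pseudo-probe"):
            continue  # additional information, ignored like the visualizer does
        fields = line.split("\t")
        if fields[0] == "NONE":
            # Several NONE lines in a row mark one boundary between clusters.
            if current:
                clusters.append(current)
                current = []
            continue
        if len(fields) != width:
            raise ValueError(f"expected {width} columns, got {len(fields)}")
        current.append((fields[:n], fields[n:]))
    if current:
        clusters.append(current)
    return n, species_cols, experiments, clusters
```

Checking the column count of every data row against the first line catches the most likely formatting mistake early, before the file reaches the visualization module.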
The example output file for hierarchicalClustering.exe looks like the
following:

|