1

According to Wikipedia's definition, "a gene signature is a group of genes in a cell whose combined expression pattern is uniquely characteristic of a biological phenotype or medical condition."

While the above definition is very useful for understanding the concept, I am looking for a "more practical" definition or, rather, a mathematical representation (e.g., a vector) of gene signatures. Hence my question.

mrb
  • 161
  • 4
  • 2
  • 1
    I think the WGCNA package in R is an interesting and relevant take, the related paper is here. – CKM Oct 02 '17 at 16:01
  • Let's say you have a list of genes that you derived from an RNA-Seq experiment and a differential expression analysis. You can rank the genes by p-value or fold change, or convert those values to membership values between 0 and 1. This ranked list and any of its subsets are gene signatures. – Maxim Kuleshov Oct 02 '17 at 16:52
  • @MaximKuleshov thanks for your reply. Could you, please, provide a full answer listing the three methods you just highlighted (attaching a short discussion of applications would be great) – mrb Oct 02 '17 at 17:38
  • @mrb If you could comment on whether or not you think your post is a duplicate of the one I marked as possible duplicate. If it is not a duplicate, could you please highlight why? That would help to know what you're after. – Remi.b Oct 02 '17 at 20:52
  • Hi @Remi.b, the other post provides a definition of the concept, which is very useful, but my question is about a common mathematical representation (eg, the comment above by @MaximKuleshov). – mrb Oct 03 '17 at 11:41
  • @mrb You edited your post from genetic signature to gene signature after the question had been answered. You should rather roll this edit back, accept or comment on the given answer and open a new post for your new question. Also, it will bring more attention to your new question to open a new post than to change a previous one. – Remi.b Oct 04 '17 at 19:54
  • No, I haven't. I am sorry about the confusion, but I edited the post immediately after the first comment a few days ago. Dunno why you could not see the edited version until now. – mrb Oct 05 '17 at 07:42
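The ranked-list representation suggested in the comments above can be sketched as follows. This is only an illustration: the gene names and numbers in `de_results` are made up, the function names are hypothetical, and min-max scaling of the absolute log2 fold change is just one assumed way of getting membership values between 0 and 1.

```python
# Hypothetical differential-expression results: gene -> (log2 fold change, p-value).
# The gene names and numbers are invented for illustration only.
de_results = {
    "GENE_A": (2.5, 0.001),
    "GENE_B": (-1.8, 0.0005),
    "GENE_C": (0.3, 0.60),
    "GENE_D": (1.1, 0.03),
}


def rank_by_pvalue(results):
    """Return gene names ordered from smallest to largest p-value."""
    return sorted(results, key=lambda gene: results[gene][1])


def rank_by_fold_change(results):
    """Return gene names ordered from largest to smallest |log2 fold change|."""
    return sorted(results, key=lambda gene: abs(results[gene][0]), reverse=True)


def membership_values(results):
    """Map each gene to a value in [0, 1] by min-max scaling |log2 fold change|.

    Assumption: this scaling is one possible choice, not a standard definition.
    """
    abs_fc = {gene: abs(fc) for gene, (fc, _) in results.items()}
    lo, hi = min(abs_fc.values()), max(abs_fc.values())
    if hi == lo:
        # All genes change by the same amount; give them equal membership.
        return {gene: 1.0 for gene in abs_fc}
    return {gene: (value - lo) / (hi - lo) for gene, value in abs_fc.items()}


def top_signature(results, n):
    """A signature as a subset: the n genes with the smallest p-values."""
    return rank_by_pvalue(results)[:n]


if __name__ == "__main__":
    print(rank_by_pvalue(de_results))
    print(rank_by_fold_change(de_results))
    print(membership_values(de_results))
    print(top_signature(de_results, 2))
```

In this reading, the signature is either an ordered list of genes, a subset of that list, or a vector of per-gene weights indexed by gene name.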

2 Answers

1

There are mathematical definitions of "gene signature". Please have a look at the supplemental material of Subramanian et al. PNAS 2005, one of the first papers on "gene signatures", which describes a method and tool that is still commonly used in basic research.

Note that Subramanian et al. have multiple definitions and that copying them here would exceed the scope of this page.
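As a rough illustration only (not a transcription of the supplement), the running-sum enrichment score used by gene set enrichment analysis can be sketched as below. Here the "signature" is a set of genes, and the score measures how concentrated that set is at the top or bottom of a ranked gene list. The function name, argument names and toy data are hypothetical, and the weighting exponent `p` is taken as a free parameter.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Running-sum enrichment score of gene_set within a ranked gene list.

    ranked_genes: gene names, already sorted by their ranking metric.
    scores: the ranking metric for each gene, in the same order.
    gene_set: the genes that make up the signature.
    p: exponent weighting each hit by |score| ** p (p=0 gives equal weights).
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(in_set)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene_set must contain some, but not all, ranked genes")

    # Normalising constant so that the hit increments sum to 1.
    norm = sum(abs(s) ** p for s, hit in zip(scores, in_set) if hit)
    if norm == 0:
        raise ValueError("all hit weights are zero; cannot normalise")

    running = 0.0
    best = 0.0  # largest deviation from zero seen so far (signed)
    for s, hit in zip(scores, in_set):
        if hit:
            running += abs(s) ** p / norm
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


if __name__ == "__main__":
    genes = ["g1", "g2", "g3", "g4"]
    metric = [3.0, 2.0, -1.0, -2.0]
    print(enrichment_score(genes, metric, {"g1", "g2"}))  # set at the top
    print(enrichment_score(genes, metric, {"g3", "g4"}))  # set at the bottom
```

A positive score means the gene set sits mostly near the top of the ranking, a negative score means it sits mostly near the bottom. Significance testing by permutation is left out of this sketch.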

tsttst
  • 1,597
  • 9
  • 25
-1

There is no mathematical definition of genetic signature. Please first have a look at Is there a formal definition of signature of natural selection?. As explained there, the term genetic signature is used in a very broad sense.

There are a number of processes for which one might look for a signature. Such a signature can take the form of various statistics. For example, a historically important statistic is Tajima's D, defined as

$$D = \frac{\Pi - S/a}{SE}$$

where $\Pi$ is the average number of pairwise differences between two randomly sampled haplotypes, $S$ is the number of segregating sites, $a=\sum_{i=1}^{k-1} 1/i$, and $k$ is the number of sampled haplotypes. $SE$ is the standard error of the numerator, which has a complicated form (have a look at Tajima 1989 for more information).
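For readers who prefer code, here is a minimal sketch of the statistic above, assuming haplotypes are given as equal-length strings with no missing data. The function names are made up for this example, and the constants used for $SE$ are the ones usually quoted from Tajima 1989, so please check them against the paper before relying on this.

```python
import math
from itertools import combinations


def pairwise_diversity(haplotypes):
    """Average number of differences over all pairs of haplotypes (Pi)."""
    pairs = list(combinations(haplotypes, 2))
    total = sum(sum(x != y for x, y in zip(h1, h2)) for h1, h2 in pairs)
    return total / len(pairs)


def segregating_sites(haplotypes):
    """Number of positions at which more than one allele is observed (S)."""
    return sum(len(set(column)) > 1 for column in zip(*haplotypes))


def tajimas_d(haplotypes):
    """Tajima's D for a list of equal-length haplotype strings.

    Returns NaN when there are no segregating sites (D is undefined then).
    """
    k = len(haplotypes)
    if k < 2:
        raise ValueError("need at least two haplotypes")
    s = segregating_sites(haplotypes)
    if s == 0:
        return float("nan")
    pi = pairwise_diversity(haplotypes)

    # Constants as usually quoted from Tajima (1989); treat as an assumption.
    a1 = sum(1.0 / i for i in range(1, k))
    a2 = sum(1.0 / i ** 2 for i in range(1, k))
    b1 = (k + 1) / (3.0 * (k - 1))
    b2 = 2.0 * (k ** 2 + k + 3) / (9.0 * k * (k - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (k + 2) / (a1 * k) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    se = math.sqrt(e1 * s + e2 * s * (s - 1))
    return (pi - s / a1) / se


if __name__ == "__main__":
    # Toy sample: every variant is a singleton, so D should be negative.
    sample = ["AAAA", "AAAT", "AATA", "ATAA"]
    print(pairwise_diversity(sample), segregating_sites(sample), tajimas_d(sample))
```

An excess of rare variants, as in the toy sample, pulls $\Pi$ below $S/a$ and gives a negative $D$.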

Remi.b
  • 68,088
  • 11
  • 141
  • 234
  • Why would anyone use pi as a variable?! – canadianer Dec 03 '17 at 02:31
  • @canadianer What do you mean? $\pi$ is very commonly used to represent the expected heterozygosity. Note that, to avoid confusion, I changed $\pi$ to $\Pi$ in my answer. – Remi.b Dec 03 '17 at 04:21