3

Assume that I have a data frame with two columns and 19 rows (see below). The left column holds the cell line names and the right column holds the expression of the gene ZEB1 in the corresponding cell line.

    CellLines   ZEB1
    600MPE  2.8186
    AU565   2.783
    BT20    2.7817
    BT474   2.6433
    BT483   2.4994
    BT549   3.035
    CAMA1   2.718
    DU4475  2.8005
    HBL100  2.6745
    HCC38   3.2884
    HCC70   2.597
    HCC202  2.8557
    HCC1007 2.7794
    HCC1008 2.4513
    HCC1143 2.8159
    HCC1187 2.6372
    HCC1428 2.7327
    HCC1500 2.7564
    HCC1569 2.8093

I've drawn a histogram of this data using the simple code below:

    hist(Heiser$ZEB1[1:19], breaks=50, col="grey")

This gives me a histogram whose x axis is the gene expression level and whose y axis is the frequency of that expression level among the cell lines. However, I would like to add the cell line names at their corresponding positions on the histogram. How can I do that?

Thanks in advance for your time on answering this :-) Best.

Momeneh Foroutan

2 Answers

2

One alternative is to use `text()` to insert labels into the plot:

    hist(Heiser$ZEB1[1:19], breaks=50, col="grey")
    # Draw each cell line name at its expression value, rotated 90 degrees, at height y = 2
    text(Heiser$ZEB1, 2, labels=Heiser$CellLines, srt=90)


Edit:

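Positioning labels that fall in the same bin one above another:

    library(dplyr)

    # Keep the hist object so its breaks can be reused for binning
    Heiser_hist <- hist(Heiser$ZEB1[1:19], breaks=50, col="grey")
    # Assign each value to its bin; include.lowest=TRUE matches hist()'s default
    Heiser$cut <- cut(Heiser$ZEB1, breaks=Heiser_hist$breaks, include.lowest=TRUE)
    # Within each bin, spread the label heights evenly between y = 1 and y = 2
    Heiser <- Heiser %>% group_by(cut) %>% mutate(pos = seq(from=1, to=2, length.out=length(ZEB1)))
    with(Heiser, text(ZEB1, pos, labels=CellLines, srt=45, cex=0.9))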


You could draw the text without rotation by changing `srt`, but the overplotting is worse in that case. You could also adjust the x axis to reduce overplotting.
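If horizontal labels stacked above each bar are preferred, something along these lines might work in base R. This is only a rough sketch: the data frame is rebuilt from the values in the question, `breaks = 10` (wider bins than the original `breaks=50`) is my own assumption to leave room for the names, and the 0.5 vertical spacing and `cex` value are arbitrary choices that will probably need tuning.

```r
# Data copied from the question (in the question it is already stored in `Heiser`).
Heiser <- data.frame(
  CellLines = c("600MPE", "AU565", "BT20", "BT474", "BT483", "BT549", "CAMA1",
                "DU4475", "HBL100", "HCC38", "HCC70", "HCC202", "HCC1007",
                "HCC1008", "HCC1143", "HCC1187", "HCC1428", "HCC1500", "HCC1569"),
  ZEB1 = c(2.8186, 2.783, 2.7817, 2.6433, 2.4994, 3.035, 2.718, 2.8005,
           2.6745, 3.2884, 2.597, 2.8557, 2.7794, 2.4513, 2.8159, 2.6372,
           2.7327, 2.7564, 2.8093),
  stringsAsFactors = FALSE
)

# Compute the bins without drawing, so the y axis can be sized for the labels.
# breaks = 10 is an assumption; it is only a suggestion to hist().
h <- hist(Heiser$ZEB1, breaks = 10, plot = FALSE)

# Integer bin index of every observation, using hist()'s own breaks.
bin <- cut(Heiser$ZEB1, breaks = h$breaks, include.lowest = TRUE, labels = FALSE)

# Running index (1, 2, 3, ...) of each observation within its bin.
rank_in_bin <- ave(Heiser$ZEB1, bin, FUN = seq_along)

# Stack labels above the bar: start at the bar height, one step of 0.5 per label.
y <- h$counts[bin] + 0.5 * rank_in_bin

# Draw the histogram with enough head room, then add the horizontal labels
# centred on the midpoint of each bar.
plot(h, col = "grey", ylim = c(0, max(y) + 0.5),
     main = "ZEB1 expression", xlab = "ZEB1")
text(h$mids[bin], y, labels = Heiser$CellLines, cex = 0.6)
```

Because the labels are centred on `h$mids` instead of the individual values, every name sits over the bar it was counted in, at the cost of no longer showing the exact expression value.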

Carlos Cinelli
  • Thanks Carlos! I tried this command but it creates a lot of overlaps and does not make it clearer! I was wondering if there is a way to write the name of cell lines horizontally above each bar and list several names below each other? – Momeneh Foroutan Sep 30 '14 at 05:23
  • @MomenehForoutan yes, but you have to create the positions manually. I have sketched a solution above. – Carlos Cinelli Sep 30 '14 at 05:41
  • Thanks Carlos :-) This is much better than vertical form! – Momeneh Foroutan Sep 30 '14 at 05:58
  • I do not see "ok" sign! there is only an option to vote for an answer but because I am very new here I do not have enough reputation to up-vote that :-( – Momeneh Foroutan Oct 01 '14 at 09:07
  • @MomenehForoutan it is the sign right below the votes options. – Carlos Cinelli Oct 01 '14 at 17:24
  • I was wondering if you also know any way to color only one or two of the labels in R?! or maybe circle them! Thanks in advance for any hint on this :-) – Momeneh Foroutan Oct 07 '14 at 05:58
  • @MomenehForoutan yes, it is possible, using the `col` parameter for example. I think you should ask a different question for this! That will make it easier for me and other people to answer! – Carlos Cinelli Oct 07 '14 at 14:02
  • Thanks Carlos; I asked this here "http://stackoverflow.com/questions/26249385/how-to-color-only-one-of-the-labels-of-a-histogram-in-r" – Momeneh Foroutan Oct 08 '14 at 04:51
0

You are going to have a problem with overlapping labels (not sure what you want to do there), but I think the following gives you what you're after based on the description:

    # Suppress the default x axis, then put one tick per cell line at its expression value
    hist(Heiser$ZEB1[1:19], breaks=50, col="grey", xaxt="n")
    axis(1, at=Heiser$ZEB1, labels=Heiser$CellLines)

Are you sure you don't want a bar plot instead? Because with a histogram, one bar does not represent one observation. The histogram is an attempt to estimate the underlying probability density function for continuous variables.
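For comparison, a bar plot with one bar per cell line could be sketched as below. The data frame is rebuilt from the values in the question; sorting the bars and adding a dashed line at the mean are my own additions for illustration, not something the question asked for.

```r
# Data copied from the question (in the question it is already stored in `Heiser`).
Heiser <- data.frame(
  CellLines = c("600MPE", "AU565", "BT20", "BT474", "BT483", "BT549", "CAMA1",
                "DU4475", "HBL100", "HCC38", "HCC70", "HCC202", "HCC1007",
                "HCC1008", "HCC1143", "HCC1187", "HCC1428", "HCC1500", "HCC1569"),
  ZEB1 = c(2.8186, 2.783, 2.7817, 2.6433, 2.4994, 3.035, 2.718, 2.8005,
           2.6745, 3.2884, 2.597, 2.8557, 2.7794, 2.4513, 2.8159, 2.6372,
           2.7327, 2.7564, 2.8093),
  stringsAsFactors = FALSE
)

# Order the cell lines by expression so outliers end up at the edges.
ord <- order(Heiser$ZEB1)

# Widen the bottom margin so the rotated names fit; restore it afterwards.
op <- par(mar = c(6, 4, 2, 1))

# One bar per cell line; las = 2 writes the names perpendicular to the axis.
bp <- barplot(Heiser$ZEB1[ord], names.arg = Heiser$CellLines[ord],
              las = 2, cex.names = 0.7, col = "grey",
              ylab = "ZEB1 expression")

# Dashed reference line at the mean expression across all cell lines.
abline(h = mean(Heiser$ZEB1), lty = 2)

par(op)
```

Here every observation gets its own labelled bar, so there is no overlap to resolve, but the plot no longer shows the distribution of values the way a histogram does.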

MrFlick
  • Thanks for your comment MrFlick. Yes, I am sure that I need a histogram because I want to estimate which cell lines significantly differentially express ZEB1 from the mean value! and Yes, it gives me an overlapped names... that is why I thought it would be a good idea to ask that here! I need something somewhere on the hist to list the name of the cell lines related to each bar!! I also tried doing that by text()... but was not successful :-( – Momeneh Foroutan Sep 30 '14 at 05:10