How can I add already calculated standard error values to each bar in a bar plot (ggplot)?

jaria20

I have 6 genes for which I want to compare the effect size (from linear mixed models) between two comparisons (control vs. Crohn's and control vs. ulcerative colitis). The bars will be both positive and negative.

Here is my data:

structure(list(Gene1 = c(-0.017207751, 
-0.00939068, 0.007440363, 0.004574254), Gene2 = c(0.025987401, 
 0.025625772, 0.010239336, 0.00695056), Gene3 = c(0.018122943, 0.012997113, 
 0.008892864, 0.006541982), Gene4 = c(-0.022694115, 
-0.009823328, 0.007286011, 0.004776522), Gene5 = c(0.031315514, 
 0.013967722, 0.008674407, 0.00674662), Gene6 = c(-0.016374358, 
-0.009660298, 0.007140279, 0.004536602)), class = "data.frame", row.names = c("Control_Crohns", 
 "Control_UC", "Std.error_controlcrohns", "Std.errorr_controluc"
))

I extracted this data from a bigger set, so I would like to keep the standard errors from the larger data set rather than recompute them. I can plot the bars for each gene with the following (after removing the last two rows above, which hold the std.error for each group).

library(reshape2)  # assumed source of melt(); the original post does not say
library(ggplot2)

# data: the data frame above with the two std.error rows removed
datframe2 <- data.frame(Group = rownames(data), data)
datframe.m <- melt(datframe2, id.vars = "Group")

graph <- ggplot(datframe.m, aes(x = variable, y = value, fill = Group)) +
  geom_bar(stat = "identity", width = 0.8, position = "dodge")

graph + theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1)) +
  xlab("Gene") + ylab("Estimate")

However, I do not know how to add the pre-calculated std.error values to each bar with geom_errorbar using the original data above. Could somebody direct me to an example? I haven't been able to find one that uses pre-existing values, and a similar question on here did not help. Thank you.

dc37

I think you need to reshape your data frame to make it simpler to use in ggplot2.

When reshaping data into a longer format with several output columns, I prefer the melt function from the data.table package, but you can get a similar result with the pivot_longer function from tidyr.

After reshaping, your dataset should look like this (DF is the data frame from the question):

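library(data.table)
# Transpose so that each gene is a row and each estimate / std. error is a column
DF <- as.data.frame(t(DF))
DF$Gene <- rownames(DF)

# Melt two column groups at once: the estimates ("Control") and their standard errors ("SD")
DF.m <- melt(setDT(DF),
             measure.vars = list(grep("Control_", colnames(DF)), grep("Std.error", colnames(DF))),
             value.name = c("Control", "SD"))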

     Gene variable      Control          SD
 1: Gene1        1 -0.017207751 0.007440363
 2: Gene2        1  0.025987401 0.010239336
 3: Gene3        1  0.018122943 0.008892864
 4: Gene4        1 -0.022694115 0.007286011
 5: Gene5        1  0.031315514 0.008674407
 6: Gene6        1 -0.016374358 0.007140279
 7: Gene1        2 -0.009390680 0.004574254
 8: Gene2        2  0.025625772 0.006950560
 9: Gene3        2  0.012997113 0.006541982
10: Gene4        2 -0.009823328 0.004776522
11: Gene5        2  0.013967722 0.006746620
12: Gene6        2 -0.009660298 0.004536602
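
The tidyr route mentioned above could be sketched as follows. This is only an illustration, not part of the original answer: the column names Estimate_Crohns, Estimate_UC, SE_Crohns and SE_UC, and the object names wide and long, are chosen for this sketch so that pivot_longer can split each name at the underscore into a value column and a disease label.

```r
library(tidyr)

# The data from the question
DF <- structure(list(
  Gene1 = c(-0.017207751, -0.00939068, 0.007440363, 0.004574254),
  Gene2 = c(0.025987401, 0.025625772, 0.010239336, 0.00695056),
  Gene3 = c(0.018122943, 0.012997113, 0.008892864, 0.006541982),
  Gene4 = c(-0.022694115, -0.009823328, 0.007286011, 0.004776522),
  Gene5 = c(0.031315514, 0.013967722, 0.008674407, 0.00674662),
  Gene6 = c(-0.016374358, -0.009660298, 0.007140279, 0.004536602)),
  class = "data.frame",
  row.names = c("Control_Crohns", "Control_UC",
                "Std.error_controlcrohns", "Std.errorr_controluc"))

# One row per gene, one column per estimate / standard error
wide <- as.data.frame(t(DF))

# Rename to a consistent <measure>_<disease> pattern (names are this sketch's own)
colnames(wide) <- c("Estimate_Crohns", "Estimate_UC", "SE_Crohns", "SE_UC")
wide$Gene <- rownames(wide)

# ".value" sends the part before "_" to output column names (Estimate, SE);
# the part after "_" becomes the Disease column
long <- pivot_longer(wide, cols = -Gene,
                     names_to = c(".value", "Disease"),
                     names_sep = "_")
long
```

The result has one row per gene and disease, with the estimate and its standard error side by side, which is the same shape as the data.table output above but with readable group labels instead of 1 and 2.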

Then you can easily plot with ggplot2, using geom_errorbar to draw the standard error of each gene (held in the SD column).

library(ggplot2)

ggplot(DF.m, aes(x = Gene, y= Control, fill = as.factor(variable)))+
  geom_col(position = position_dodge())+
  geom_errorbar(aes(ymin = Control-SD,ymax = Control+SD), position = position_dodge(0.9), width = 0.2)+
  scale_fill_discrete(name = "Disease", labels = c("Crohns", "UC"))
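
One detail worth noting in the plot above: the error bars only sit on top of their bars when both layers are dodged by the same width. The bars are dodged by their own default width of 0.9, while the error bars are narrower (0.2), so their dodge width has to be set to 0.9 explicitly. A minimal sketch that makes this explicit by sharing one position object between the layers; the data frame long and its column names are made up for this illustration, with the values typed in from the question:

```r
library(ggplot2)

# Long-format data typed in by hand from the question's values
long <- data.frame(
  Gene = rep(paste0("Gene", 1:6), times = 2),
  Disease = rep(c("Crohns", "UC"), each = 6),
  Estimate = c(-0.017207751, 0.025987401, 0.018122943,
               -0.022694115, 0.031315514, -0.016374358,
               -0.009390680, 0.025625772, 0.012997113,
               -0.009823328, 0.013967722, -0.009660298),
  SE = c(0.007440363, 0.010239336, 0.008892864,
         0.007286011, 0.008674407, 0.007140279,
         0.004574254, 0.006950560, 0.006541982,
         0.004776522, 0.006746620, 0.004536602))

# One dodge specification shared by both layers keeps bars and error bars aligned
pd <- position_dodge(width = 0.9)

p <- ggplot(long, aes(x = Gene, y = Estimate, fill = Disease)) +
  geom_col(position = pd) +
  geom_errorbar(aes(ymin = Estimate - SE, ymax = Estimate + SE),
                position = pd, width = 0.2) +
  labs(x = "Gene", y = "Estimate")
p
```

Because the estimates are signed, Estimate - SE and Estimate + SE give the correct limits for negative bars as well; no special handling is needed.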

Does this answer your question?

Edited at 2020-12-13