I am trying to reproduce a complex data view, like the one in the picture below, but with R and ggplot2.
As you can observe:
I am trying to achieve the same result with two of my datasets. For India, for example, I want one row containing a chart for symptoms and a second chart for comorbidities, and the same for the UK and Pakistan. Here are some fake datasets I created:
I tried creating a small dataset for each country, building two plots from it (one for symptoms, one for comorbidities), and then adding them together. But this is heavy work and it raises many other issues. Here is one example:
library(dplyr)    # provides %>%
library(viridis)
india_count_symptoms <- count_symptoms %>%
  dplyr::filter(Country == "India")
india_count_symptoms$symptoms <- as.factor(india_count_symptoms$symptoms)
india_count_symptoms$Count <- as.numeric(india_count_symptoms$Count)
india_sympt_plot <- ggplot2::ggplot(india_count_symptoms, ggplot2::aes(x = age_band, y = Count, group = symptoms, fill = symptoms)) +
  ggplot2::geom_area(position = "fill", color = "white") +
  ggplot2::scale_x_discrete(limits = c("0-19", "20-39", "40-59", "60+"), expand = c(0, 0)) +
  ggplot2::scale_y_continuous(expand = ggplot2::expansion(mult = c(0, 0.1))) +
  viridis::scale_fill_viridis(discrete = TRUE)
india_sympt_plot
This is what I got:
And as you can see:
a. the age bands aren't nicely aligned
b. I end up with a separate legend for each plot for each country if I take this approach
c. the y axis does not show the counts; it only goes up to 1, which is not intuitive
d. I would have to do the same for comorbidities and would hit the same three problems
Thus, I want an easier approach that gives a plot similar to the first picture, meeting the conditions in points 1 to 5, but for my 3 countries and for both symptoms and comorbidities. My real dataset is bigger, with 5 countries, but the plotting is the same: symptoms and comorbidities.
Is there a better way of achieving this with ggplot2 in RStudio?
This is a good start - I'm not clear on some of your goals, but this answer should get you over the immediate obstacles.
## load the packages the code below relies on
library(ggplot2)
library(dplyr)
## read in your data
count_symptoms <- readr::read_csv("https://github.com/gabrielburcea/stackoverflow_fake_data/raw/master/fake_symptoms.csv")
## as mentioned in comments, removing `position = 'fill'` lets your chart show counts.
## (I'm skipping the unnecessary data conversions)
## And I'm removing the `ggplot2::` to make the code more readable...
## No other changes are made
india_count_symptoms <- count_symptoms %>%
  dplyr::filter(Country == "India")
india_sympt_plot <- ggplot(india_count_symptoms, aes(x = age_band, y = Count, group = symptoms, fill = symptoms)) +
  geom_area(color = "white") +
  scale_x_discrete(limits = c("0-19", "20-39", "40-59", "60+"), expand = c(0, 0)) +
  scale_y_continuous(expand = expansion(mult = c(0, 0.1))) +
  viridis::scale_fill_viridis(discrete = TRUE)
Now, let's use facets:
## same plot code as above, but we give it the whole data set
## and add the `facet_grid` on
ggplot(count_symptoms, aes(x = age_band, y = Count, group = symptoms, fill = symptoms)) +
  geom_area(color = "white") +
  scale_x_discrete(limits = c("0-19", "20-39", "40-59", "60+"), expand = c(0, 0)) +
  scale_y_continuous(expand = expansion(mult = c(0, 0.1))) +
  viridis::scale_fill_viridis(discrete = TRUE) +
  facet_grid(Country ~ .)
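Notice that we now have only one legend, and you can easily reposition it. The next change I would probably make is adding the argument labels = scales::comma_format() to your scale_y_continuous. I don't see what is wrong with your x-axis labels.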
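To make those two tweaks concrete, here is a minimal sketch rather than a definitive version. It assumes a made-up data frame, toy_symptoms, whose column names mirror the ones used above so that it runs without the download. theme(legend.position = "bottom") is just one way to move the legend, and ggplot2's own scale_fill_viridis_d() is swapped in for viridis::scale_fill_viridis() to avoid an extra package.

```r
library(ggplot2)

# Hypothetical stand-in for count_symptoms: same column names as above,
# but the values are invented purely so this sketch is self-contained.
toy_symptoms <- expand.grid(
  Country  = c("India", "Pakistan", "UK"),
  age_band = c("0-19", "20-39", "40-59", "60+"),
  symptoms = c("cough", "fever", "fatigue"),
  stringsAsFactors = FALSE
)
toy_symptoms$Count <- seq_len(nrow(toy_symptoms)) * 250

p_symptoms <- ggplot(toy_symptoms,
                     aes(x = age_band, y = Count, group = symptoms, fill = symptoms)) +
  geom_area(color = "white") +
  scale_x_discrete(limits = c("0-19", "20-39", "40-59", "60+"), expand = c(0, 0)) +
  # comma_format() returns a labelling function, so large counts print as 1,000
  scale_y_continuous(labels = scales::comma_format(),
                     expand = expansion(mult = c(0, 0.1))) +
  scale_fill_viridis_d() +
  facet_grid(Country ~ .) +
  # one shared legend, moved below the panels
  theme(legend.position = "bottom")
```

Printing p_symptoms draws one panel per country with a single legend underneath.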
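For the complete figure, I would suggest using facet_grid to build one plot for each column and then using the patchwork package to combine them into a single image. See how far you can get with that, and if you still run into problems, ask a new question focused on the next step.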
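As a rough illustration of that suggestion, the sketch below builds one faceted column for symptoms and one for comorbidities and joins them with patchwork. Everything in it is assumed for the example: make_toy_counts() and facet_area() are hypothetical helpers, the generic category column stands in for your real symptom and comorbidity columns, and the counts are made up.

```r
library(ggplot2)
library(patchwork)

# Hypothetical helper: invented counts for every country / age band / category.
make_toy_counts <- function(categories) {
  d <- expand.grid(
    Country  = c("India", "Pakistan", "UK"),
    age_band = c("0-19", "20-39", "40-59", "60+"),
    category = categories,
    stringsAsFactors = FALSE
  )
  d$Count <- seq_len(nrow(d)) * 100
  d
}

# Hypothetical helper: the faceted area plot from above, one row per country.
facet_area <- function(data, title) {
  ggplot(data, aes(x = age_band, y = Count, group = category, fill = category)) +
    geom_area(color = "white") +
    scale_x_discrete(limits = c("0-19", "20-39", "40-59", "60+"), expand = c(0, 0)) +
    scale_y_continuous(expand = expansion(mult = c(0, 0.1))) +
    scale_fill_viridis_d() +
    facet_grid(Country ~ .) +
    labs(title = title, fill = NULL)
}

symptom_col     <- facet_area(make_toy_counts(c("cough", "fever", "fatigue")), "Symptoms")
comorbidity_col <- facet_area(make_toy_counts(c("asthma", "diabetes")), "Comorbidities")

# `|` places the two columns side by side; guides = "collect" gathers the
# legends in one place instead of attaching one to each column.
combined <- (symptom_col | comorbidity_col) + plot_layout(guides = "collect")
```

Printing combined renders the grid, with countries as rows and symptoms and comorbidities as the two columns. Because the two columns use different fill categories, their legends stay separate but are placed together.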