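I am trying to create a bar chart showing incoming and outgoing calls for each month. In the stacked bar chart, I would like to show separate stacks for the top 3 contacts (those with the longest call duration) and group everyone else as "Other". Is there a way to automate this in R?
My current chart looks like this:
My data frame (callsummary) can be downloaded from the link below (3 KB):
https://dl.dropboxusercontent.com/u/4077057/callsummary.csv
My ggplot code is as follows: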
ggplot(callsummary) +
  aes(x = Bill, y = Duration) +
  geom_bar(stat = "identity", aes(fill = Contact)) +
  facet_grid(~ Direction) +
  # labs() takes named arguments; wrapping them in list() no longer works in current ggplot2
  labs(title = "Monthly Call Summary", x = "Month", y = "Total Call duration in Minutes") +
  geom_hline(data = subset(monthlysummary, Direction == "In"), aes(yintercept = mean(Duration))) +
  geom_hline(data = subset(monthlysummary, Direction == "Out"), aes(yintercept = mean(Duration)))
Note that I created a separate data frame, monthlysummary, so that I could add a horizontal line at the average call duration in both the "In" and the "Out" panels.
The monthlysummary data frame looks like this:
  Direction   Bill Duration Amount
     <fctr> <fctr>    <dbl>  <dbl>
1       Out  April      9.3   1.40
2        In  April     55.3   0.00
3       Out    May     32.5   4.89
4        In    May     76.9   0.00
5       Out   June     17.4   2.62
6        In   June    114.3   0.00
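A summary of this shape can be computed from the per-contact data rather than maintained by hand. The following is a minimal sketch, assuming a data frame with Direction, Bill and Duration columns. The `calls` data frame, its values, and the names `monthly` and `monthly_mean` are made up for illustration and are not part of the question's data.

```r
library(dplyr)

# Hypothetical per-contact data; the values are invented for illustration only.
calls <- data.frame(
  Direction = c("In", "In", "In", "Out", "Out", "Out"),
  Bill      = c("April", "April", "May", "April", "May", "May"),
  Contact   = c("A", "B", "A", "A", "B", "C"),
  Duration  = c(10, 5, 20, 3, 4, 6)
)

# One row per Direction and month, holding the total duration.
monthly <- calls %>%
  group_by(Direction, Bill) %>%
  summarise(Duration = sum(Duration), .groups = "drop")

# One row per Direction: the mean of the monthly totals.
monthly_mean <- monthly %>%
  group_by(Direction) %>%
  summarise(Y = mean(Duration), .groups = "drop")
```

Because `monthly_mean` keeps the Direction column, a single `geom_hline(data = monthly_mean, aes(yintercept = Y))` layer should be enough: `facet_grid(~ Direction)` assigns each row to its own panel, so the two `subset()` calls become unnecessary.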
# Read the CSV file
callsummary <- read.csv("callsummary.csv", header = TRUE)
# Remove the first column, which holds the row numbers
callsummary <- callsummary[, -1]

library(dplyr)
library(ggplot2)

# ifelse() below needs character values rather than factor codes
callsummary$Contact <- as.character(callsummary$Contact)

# Within each Direction and month, keep the three largest rows
# and relabel every other contact as "Other"
df <- callsummary %>%
  group_by(Direction, Bill) %>%
  arrange(desc(Amount)) %>%
  mutate(Index = row_number(), Contact = ifelse(Index > 3, "Other", Contact))

# Mean of Amount over the rows of each Direction, used for the horizontal line
df2 <- df %>% group_by(Direction) %>% summarise(Y = mean(Amount))

ggplot(df, aes(x = Bill, y = Amount)) +
  geom_bar(stat = "identity", aes(fill = Contact)) +
  facet_grid(~ Direction) +
  labs(title = "Monthly Call Summary", x = "Month", y = "Total Call duration in Minutes") +
  geom_hline(data = df2, aes(yintercept = Y))
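Two details of the plot above may be worth adjusting. Bill is a factor with alphabetical levels (April, June, May), so the months do not appear in calendar order, and Y is the mean over individual rows rather than over monthly totals, which is what the question's monthlysummary lines showed. The sketch below is one possible way to handle both, and it also collapses the "Other" rows into a single segment placed last in the legend. It runs on a small invented data frame (`calls`) that mimics the columns of callsummary; the names `top3`, `avg` and `p` are likewise only illustrative.

```r
library(dplyr)
library(ggplot2)

# Hypothetical data with the same columns as callsummary; values are invented.
calls <- data.frame(
  Direction = rep(c("In", "Out"), each = 10),
  Bill      = rep(rep(c("April", "May"), each = 5), times = 2),
  Contact   = rep(c("D1", "D2", "D3", "D4", "D5"), times = 4),
  Amount    = c(9, 7, 5, 2, 1,  8, 6, 4, 3, 1,
                5, 4, 3, 1, 1,  6, 5, 2, 2, 1),
  stringsAsFactors = FALSE
)

top3 <- calls %>%
  group_by(Direction, Bill) %>%
  # Keep the three largest rows per panel and month; relabel the rest.
  mutate(Contact = ifelse(rank(-Amount, ties.method = "first") > 3,
                          "Other", Contact)) %>%
  # Collapse the relabelled rows so "Other" becomes a single segment.
  group_by(Direction, Bill, Contact) %>%
  summarise(Amount = sum(Amount), .groups = "drop") %>%
  mutate(
    # Calendar order instead of alphabetical order on the x axis.
    Bill = factor(Bill, levels = c("April", "May")),
    # Put "Other" last in the factor levels (and therefore in the legend).
    Contact = factor(Contact,
                     levels = c(setdiff(sort(unique(Contact)), "Other"), "Other"))
  )

# Mean of the monthly totals per Direction, for the reference line.
avg <- calls %>%
  group_by(Direction, Bill) %>%
  summarise(Total = sum(Amount), .groups = "drop") %>%
  group_by(Direction) %>%
  summarise(Y = mean(Total), .groups = "drop")

p <- ggplot(top3, aes(x = Bill, y = Amount, fill = Contact)) +
  geom_col() +
  facet_grid(~ Direction) +
  geom_hline(data = avg, aes(yintercept = Y))
```

With the real data, the `levels` vector for Bill would need to list all three months, for example `c("April", "May", "June")`. Ties at the third place are broken by row order here; a different `ties.method` could be chosen if tied contacts should be treated alike.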
Data (after removing the first column):
structure(list(Direction = structure(c(1L, 1L, 2L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 1L,
2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("In",
"Out"), class = "factor"), Contact = c("D28", "D10", "D18", "D3",
"D10", "D18", "D3", "D18", "D10", "D18", "D21", "D27", "D13",
"D3", "D10", "D3", "D21", "D22", "D23", "D17", "D13", "D15",
"D18", "D21", "D2", "D8", "D1", "D15", "D23", "D23", "D18", "D11",
"D16", "D21", "D24", "D3", "D25", "D15", "D10", "D9", "D22",
"D19", "D10", "D3", "D8", "D12", "D13", "D15", "D17", "D19",
"D19", "D20", "D4", "D5", "D6", "D7", "D11", "D13", "D14", "D15",
"D17", "D19", "D20", "D21", "D22", "D22", "D26", "D27", "DNA"
), Bill = structure(c(2L, 3L, 3L, 3L, 2L, 1L, 2L, 3L, 1L, 2L,
2L, 1L, 2L, 1L, 2L, 2L, 3L, 3L, 2L, 3L, 2L, 3L, 1L, 1L, 3L, 3L,
1L, 1L, 1L, 3L, 2L, 1L, 3L, 3L, 3L, 3L, 1L, 2L, 1L, 1L, 3L, 3L,
3L, 1L, 3L, 3L, 1L, 2L, 3L, 3L, 2L, 3L, 3L, 3L, 1L, 2L, 2L, 1L,
3L, 3L, 1L, 1L, 3L, 1L, 1L, 2L, 1L, 3L, 2L), .Label = c("April",
"June", "May"), class = "factor"), Amount = c(56.3, 20.6, 16.3,
16, 15.9, 14.3, 11.2, 10.8, 9.1, 8, 7.4, 6.9, 6.4, 5.3, 5.1,
5, 4.6, 3.9, 3.7, 3.4, 3, 3, 3, 3, 3, 3, 2.8, 2.4, 2.4, 2.4,
2.3, 2.3, 2, 2, 2, 1.9, 1.5, 1.4, 1.3, 1.3, 1.2, 1.2, 1.1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1), NA. = c(0, 0, 2.45, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0.77, 0.75, 0, 0, 0, 0, 0.45, 0.45, 0.45, 0.45, 0, 0, 0, 0,
0, 0, 0.35, 0, 0.3, 0.3, 0, 0.29, 0, 0, 0.2, 0, 0.18, 0, 0.17,
0.15, 0.15, 0.15, 0.15, 0.15, 0.15, 0.15, 0.15, 0.15, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)), .Names = c("Direction",
"Contact", "Bill", "Amount", "NA."), row.names = c(NA, -69L), class = "data.frame")