Displaying the correlation between two time series on a multi-line time series plot

mlachans

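I have many graphics, each showing two time series.

That is, each one is a plot of y_1 and y_2 against a common set of dates.

For each plot, I would like to show the correlation between the pair of series it displays. In other words, I want to compute cor(y_1, y_2) and include the resulting number on each plot.

This is surprisingly hard to do in a principled way in ggplot2. So far I have not found a simple way to do it with stat_cor.

I have already looked at other functions recommended for this task, but they are all designed to report the correlation of y_1 and y_2 when y_1 is plotted against y_2, not when y_1 and y_2 are both plotted against time.

I would prefer a ggplot2 way of doing this, but I am open to any graphics software within R. Here is a minimal working example and the code I tried.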

library(reprex); library(ggplot2); library(ggpubr)
n <- 6; 
Q=sample(18:30, n, replace=TRUE)

# make sample data
dat <- data.frame(id=1:n, 
                  date=seq.Date(as.Date("2020-12-26"), as.Date("2020-12-31"), "day"),
                  group=rep(LETTERS[1:2], n/2),
                  quantity= Q,
                  price= 100 - 2*Q + rnorm(n))
dat
#>   id       date group quantity    price
#> 1  1 2020-12-26     A       19 63.02628
#> 2  2 2020-12-27     B       26 49.66597
#> 3  3 2020-12-28     A       27 44.98031
#> 4  4 2020-12-29     B       24 51.11224
#> 5  5 2020-12-30     A       29 41.11129
#> 6  6 2020-12-31     B       28 43.04494

tseriesplot <- ggplot(dat, aes(x = date)) + ggtitle("Oil: Daily Quantity and Price") +
                  geom_line(aes(y = quantity, color = "Quantity (thousands of barrels)")) +
                  geom_line(aes(y = price, color = "Price"))
  
tseriesplot


# naive attempt fails
tseriesplot + stat_cor(data = dat, aes(x=quantity, y=price),method="pearson")
#> Error: Invalid input: date_trans works with objects of class Date only

Created on 2021-01-05 by the reprex package (v0.3.0)

This is similar to more complicated questions asked elsewhere, such as https://stat.ethz.ch/pipermail/r-help/2020-July/467805.html, but it is a much more basic question, so I thought it was worth asking.

G. Grothendieck

1) annotate Create the text to be plotted, txt, and then use annotate:

txt <- with(dat, sprintf("cor: %.2f", cor(quantity, price)))
tseriesplot + 
  annotate("text", label = txt, x = min(dat$date), y = max(dat$quantity, dat$price), 
    hjust = -0.1)
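
Since the question mentions many such plots, the annotate approach could be wrapped in a small helper. The following is only a sketch: the function names cor_label and plot_with_cor, their arguments, and the example data are hypothetical and not part of the original answer.

```r
library(ggplot2)

# Hypothetical helper: format the Pearson correlation of two numeric vectors.
cor_label <- function(x, y) sprintf("cor: %.2f", cor(x, y))

# Hypothetical helper: plot two columns against a date column and
# annotate the top-left corner with their correlation.
plot_with_cor <- function(df, date_col, y1, y2) {
  ggplot(df, aes(x = .data[[date_col]])) +
    geom_line(aes(y = .data[[y1]], color = y1)) +
    geom_line(aes(y = .data[[y2]], color = y2)) +
    annotate("text", label = cor_label(df[[y1]], df[[y2]]),
             x = min(df[[date_col]]), y = max(df[[y1]], df[[y2]]),
             hjust = -0.1) +
    labs(x = NULL, y = NULL, color = NULL)
}

# Illustrative data (not from the original post).
set.seed(1)
q <- c(19, 26, 27, 24, 29, 28)
example <- data.frame(
  date     = seq.Date(as.Date("2020-12-26"), by = "day", length.out = 6),
  quantity = q,
  price    = 100 - 2 * q + rnorm(6)
)

p <- plot_with_cor(example, "date", "quantity", "price")
```

Printing p draws the plot. Applying plot_with_cor to each data frame in a list would give one annotated plot per pair of series.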

2) grid.text Another approach is to use grid graphics which allows one to specify the location independently of the data. Using txt from above:

library(grid)

tseriesplot
grid.text(txt, 0.1, 0.9)
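
When the plot is drawn inside a function or loop, or written to a file, it has to be printed explicitly before grid.text is called, because auto-printing only happens at top level. A minimal sketch, assuming a PDF device and illustrative data (the object names and the temporary file are hypothetical):

```r
library(ggplot2)
library(grid)

# Illustrative data (not from the original post).
set.seed(1)
q <- c(19, 26, 27, 24, 29, 28)
dat2 <- data.frame(
  date     = seq.Date(as.Date("2020-12-26"), by = "day", length.out = 6),
  quantity = q,
  price    = 100 - 2 * q + rnorm(6)
)

p <- ggplot(dat2, aes(x = date)) +
  geom_line(aes(y = quantity, color = "Quantity")) +
  geom_line(aes(y = price, color = "Price"))
txt <- sprintf("cor: %.2f", cor(dat2$quantity, dat2$price))

out_file <- tempfile(fileext = ".pdf")  # hypothetical output location
pdf(out_file)
print(p)                  # draw the plot explicitly on the open device
grid.text(txt, 0.1, 0.9)  # npc units: 10% from the left, 90% from the bottom
dev.off()
```

Because the coordinates are fractions of the whole page rather than data values, the label may overlap the title or legend, so the position might need adjusting per layout.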

3a) zoo This would also work:

library(zoo)

z <- read.zoo(dat[c("date", "price", "quantity")])
txt <- sprintf("cor: %.2f", cor(z)[2])
autoplot(z, facet = NULL) +
  annotate("text", label = txt, x = start(z), y = max(z), hjust = -0.1)
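
The expression cor(z)[2] relies on column-major indexing of the 2 x 2 correlation matrix. The sketch below, using a plain matrix with illustrative values rather than the zoo object, shows that it picks the off-diagonal entry and that indexing by name gives the same number:

```r
# Illustrative values (not from the original post).
m <- cbind(price    = c(63, 50, 45, 51, 41, 43),
           quantity = c(19, 26, 27, 24, 29, 28))

cm <- cor(m)                       # 2 x 2 matrix with 1s on the diagonal

r_pos  <- cm[2]                    # second element in column-major order: row 2, column 1
r_name <- cm["quantity", "price"]  # the same entry, selected by name
r_vec  <- cor(m[, "price"], m[, "quantity"])  # computed directly from the two columns

stopifnot(isTRUE(all.equal(r_pos, r_name)),
          isTRUE(all.equal(r_name, r_vec)))
```

Indexing by name is a little longer but does not depend on the column order.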

3b) scale Or you could scale the variables, as that does not affect the correlation:

z <- scale(z)
autoplot(z, facet = NULL) +
  annotate("text", label = txt, x = start(z), y = max(z), hjust = -0.1)
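
The claim that scaling leaves the correlation unchanged can be checked numerically. This is a small sketch with simulated data (the variable names are illustrative); it relies on the Pearson correlation being invariant under positive linear rescaling of either variable:

```r
set.seed(42)
x <- rnorm(20)
y <- 100 - 2 * x + rnorm(20)

r_raw    <- cor(x, y)
# scale() returns a one-column matrix, so convert back to plain vectors.
r_scaled <- cor(as.vector(scale(x)), as.vector(scale(y)))
# Any shift and positive rescaling behaves the same way.
r_linear <- cor(3 * x + 10, 0.5 * y - 7)

stopifnot(isTRUE(all.equal(r_raw, r_scaled)),
          isTRUE(all.equal(r_raw, r_linear)))
```

Multiplying one variable by a negative number would flip the sign of the correlation, but scale() divides by the standard deviation, which is positive.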

Discussion

Overall, putting together parts of the different solutions, this seems the most compact:

library(zoo)
library(grid)

z <- read.zoo(dat[c("date", "price", "quantity")])
autoplot(z, facet = NULL)
grid.text(sprintf("cor: %.2f", cor(z)[2]), 0.1, 0.9)
