A time series platform for the tidyverse

These are my study notes on time_series_platform_for_tidyverse.pdf. They walk through usage examples of the tibbletime package.

First, designate the date column of a tibble as its index to create a time tibble:

library(tibbletime)
data(FB)
FB$volume <- NULL
FB_time <- tbl_time(FB, index = date)
FB_time
#> # A time tibble: 1,008 x 7
#> # Index: date
#> symbol date open high low close adjusted
#> <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 FB 2013-01-02 27.4 28.2 27.4 28 28
#> 2 FB 2013-01-03 27.9 28.5 27.6 27.8 27.8
#> 3 FB 2013-01-04 28.0 28.9 27.8 28.8 28.8
#> 4 FB 2013-01-07 28.7 29.8 28.6 29.4 29.4
#> 5 FB 2013-01-08 29.5 29.6 28.9 29.1 29.1
#> 6 FB 2013-01-09 29.7 30.6 29.5 30.6 30.6
#> 7 FB 2013-01-10 30.6 31.5 30.3 31.3 31.3
#> 8 FB 2013-01-11 31.3 32.0 31.1 31.7 31.7
#> 9 FB 2013-01-14 32.1 32.2 30.6 31.0 31.0
#> 10 FB 2013-01-15 30.6 31.7 29.9 30.1 30.1
#> # … with 998 more rows

Compute simple returns on the adjusted price:

library(dplyr)
FB_time %>%
  mutate(
    adjusted_return = adjusted / dplyr::lag(adjusted) - 1
  )
#> # A time tibble: 1,008 x 8
#> # Index: date
#> symbol date open high low close adjusted
#> <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 FB 2013-01-02 27.4 28.2 27.4 28 28
#> 2 FB 2013-01-03 27.9 28.5 27.6 27.8 27.8
#> 3 FB 2013-01-04 28.0 28.9 27.8 28.8 28.8
#> 4 FB 2013-01-07 28.7 29.8 28.6 29.4 29.4
#> 5 FB 2013-01-08 29.5 29.6 28.9 29.1 29.1
#> 6 FB 2013-01-09 29.7 30.6 29.5 30.6 30.6
#> 7 FB 2013-01-10 30.6 31.5 30.3 31.3 31.3
#> 8 FB 2013-01-11 31.3 32.0 31.1 31.7 31.7
#> 9 FB 2013-01-14 32.1 32.2 30.6 31.0 31.0
#> 10 FB 2013-01-15 30.6 31.7 29.9 30.1 30.1
#> # … with 998 more rows, and 1 more variable: adjusted_return <dbl>
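
The `mutate()` call above applies the simple-return formula r_t = P_t / P_{t-1} - 1. A minimal base R sketch of the same arithmetic follows; the prices are made-up illustration values rather than `FB` data, and `simple_return` is a hypothetical helper name:

```r
# Hypothetical helper: simple return r_t = P_t / P_{t-1} - 1.
# The first element is NA because there is no previous price,
# which mirrors what dplyr::lag() produces in the pipeline above.
simple_return <- function(price) {
  previous <- c(NA, head(price, -1))  # shift the series by one step
  price / previous - 1
}

prices <- c(100, 102, 99.96)          # made-up prices for illustration
returns <- simple_return(prices)
print(round(returns, 4))
```

With this formula the first return is `NA`, whereas the `calculate_return()` output later in these notes shows `0` in the first row of each series.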

Select columns (as the output shows, dropping the index column returns an ordinary tibble):

FB_time %>%
  select(adjusted)
#> # A tibble: 1,008 x 1
#> adjusted
#> <dbl>
#> 1 28
#> 2 27.8
#> 3 28.8
#> 4 29.4
#> 5 29.1
#> 6 30.6
#> 7 31.3
#> 8 31.7
#> 9 31.0
#> 10 30.1
#> # … with 998 more rows

The most useful feature of tibbletime is probably filtering by time range:

FB_time %>%
  filter_time("2013-01-01" ~ "2013-01-04")

#> # A time tibble: 3 x 7
#> # Index: date
#> symbol date open high low close adjusted
#> <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 FB 2013-01-02 27.4 28.2 27.4 28 28
#> 2 FB 2013-01-03 27.9 28.5 27.6 27.8 27.8
#> 3 FB 2013-01-04 28.0 28.9 27.8 28.8 28.8
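
Judging from the output above, both endpoints of the `from ~ to` formula are included. A rough base R equivalent of that range filter is sketched below, assuming a plain data frame with a `Date` column; the data frame `prices` is made up for illustration:

```r
# Made-up daily data; only the date column matters here.
prices <- data.frame(
  date = as.Date(c("2012-12-31", "2013-01-02", "2013-01-03",
                   "2013-01-04", "2013-01-07")),
  adjusted = c(26.6, 28.0, 27.8, 28.8, 29.4)
)

# Keep rows whose date lies in the closed interval [from, to].
from <- as.Date("2013-01-01")
to   <- as.Date("2013-01-04")
in_range <- prices[prices$date >= from & prices$date <= to, ]
print(in_range)
```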

As another example, select the data for February 2013:

FB_time %>%
  filter_time(~ "2013-02")

#> # A time tibble: 19 x 7
#> # Index: date
#> symbol date open high low close adjusted
#> <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 FB 2013-02-01 31.0 31.0 29.6 29.7 29.7
#> 2 FB 2013-02-04 29.1 29.2 28.0 28.1 28.1
#> 3 FB 2013-02-05 28.3 29.0 28.0 28.6 28.6
#> 4 FB 2013-02-06 28.7 29.3 28.7 29.0 29.0
#> 5 FB 2013-02-07 29.1 29.2 28.3 28.6 28.6
#> 6 FB 2013-02-08 28.9 29.2 28.5 28.5 28.5
#> 7 FB 2013-02-11 28.6 28.7 28.0 28.3 28.3
#> 8 FB 2013-02-12 27.7 28.2 27.1 27.4 27.4
#> 9 FB 2013-02-13 27.4 28.3 27.3 27.9 27.9
#> 10 FB 2013-02-14 28.0 28.6 28.0 28.5 28.5
#> 11 FB 2013-02-15 28.5 28.8 28.1 28.3 28.3
#> 12 FB 2013-02-19 28.2 29.1 28.1 28.9 28.9
#> 13 FB 2013-02-20 28.9 29.0 28.3 28.5 28.5
#> 14 FB 2013-02-21 28.3 28.5 27.2 27.3 27.3
#> 15 FB 2013-02-22 27.6 27.6 26.8 27.1 27.1
#> 16 FB 2013-02-25 27.2 27.6 27.2 27.3 27.3
#> 17 FB 2013-02-26 27.4 27.5 26.7 27.4 27.4
#> 18 FB 2013-02-27 27.3 27.3 26.6 26.9 26.9
#> 19 FB 2013-02-28 26.8 27.3 26.3 27.2 27.2
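
A one-sided formula such as `~ "2013-02"` appears to act as shorthand for the whole period named by the partial date. The sketch below shows, in base R only, how a year-month string can be expanded to the first and last day of that month; `month_bounds` is a hypothetical helper, not a tibbletime function:

```r
# Hypothetical helper: expand "YYYY-MM" to the first and last day of that month.
month_bounds <- function(year_month) {
  first <- as.Date(paste0(year_month, "-01"))
  # The day before the first day of the next month is the last day of this one.
  next_first <- seq(first, by = "month", length.out = 2)[2]
  list(from = first, to = next_first - 1)
}

bounds <- month_bounds("2013-02")
print(bounds)
```

Filtering with `date >= bounds$from & date <= bounds$to` would then select the same calendar month.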

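collapse_by() rewrites the date index so that every observation within a period shares one date, namely the date of the last observation in that period (here, each year):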

(FB_yearly <- collapse_by(FB_time, period = "year"))

#> # A time tibble: 1,008 x 7
#> # Index: date
#> symbol date open high low close adjusted
#> <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 FB 2013-12-31 27.4 28.2 27.4 28 28
#> 2 FB 2013-12-31 27.9 28.5 27.6 27.8 27.8
#> 3 FB 2013-12-31 28.0 28.9 27.8 28.8 28.8
#> 4 FB 2013-12-31 28.7 29.8 28.6 29.4 29.4
#> 5 FB 2013-12-31 29.5 29.6 28.9 29.1 29.1
#> 6 FB 2013-12-31 29.7 30.6 29.5 30.6 30.6
#> 7 FB 2013-12-31 30.6 31.5 30.3 31.3 31.3
#> 8 FB 2013-12-31 31.3 32.0 31.1 31.7 31.7
#> 9 FB 2013-12-31 32.1 32.2 30.6 31.0 31.0
#> 10 FB 2013-12-31 30.6 31.7 29.9 30.1 30.1
#> # … with 998 more rows

FB_yearly %>%
  select(date) %>%
  unique()

#> # A time tibble: 4 x 1
#> # Index: date
#> date
#> <date>
#> 1 2013-12-31
#> 2 2014-12-31
#> 3 2015-12-31
#> 4 2016-12-30
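
The last row is 2016-12-30 rather than 2016-12-31 because the collapsed index is the last date actually observed in each period, not the calendar end of the period. A base R sketch of that idea follows; the dates are made up, and this is an illustration rather than how tibbletime implements `collapse_by()`:

```r
# Made-up trading dates spanning two years.
dates <- as.Date(c("2015-12-30", "2015-12-31",
                   "2016-01-04", "2016-12-29", "2016-12-30"))

# Replace every date by the latest observed date within the same year.
year <- format(dates, "%Y")
collapsed <- as.Date(ave(as.numeric(dates), year, FUN = max),
                     origin = "1970-01-01")
print(unique(collapsed))
```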

This makes it possible to aggregate high-frequency data to a lower frequency as follows:

FB_time %>%
  collapse_by("year") %>%
  group_by(date) %>%
  summarise(adjusted_mean = mean(adjusted))

#> # A time tibble: 4 x 2
#> # Index: date
#> date adjusted_mean
#> <date> <dbl>
#> 1 2013-12-31 35.5
#> 2 2014-12-31 68.8
#> 3 2015-12-31 88.8
#> 4 2016-12-30 117.

As another example, compute the half-yearly mean of every numeric variable:

FB_time %>%
  collapse_by("2 quarter") %>%
  group_by(date) %>%
  summarise_if(is.numeric, mean)

#> # A time tibble: 8 x 6
#> # Index: date
#> date open high low close adjusted
#> <date> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2013-06-28 27.0 27.4 26.6 27.0 27.0
#> 2 2013-12-31 43.6 44.4 43.0 43.7 43.7
#> 3 2014-06-30 62.5 63.4 61.4 62.4 62.4
#> 4 2014-12-31 74.8 75.7 74.0 74.9 74.9
#> 5 2015-06-30 80.0 80.8 79.3 80.0 80.0
#> 6 2015-12-31 97.2 98.3 96.0 97.2 97.2
#> 7 2016-06-30 111. 112. 109. 110. 110.
#> 8 2016-12-30 124. 124. 122. 123. 123.
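
The `"2 quarter"` period groups the data into half-years. A rough base R sketch of the same bucketing and averaging is shown below, assuming half-years that start in January and July; the values and the `half` column name are made up for illustration:

```r
# Made-up observations across three half-years.
df <- data.frame(
  date  = as.Date(c("2013-02-15", "2013-06-28", "2013-09-10",
                    "2013-12-31", "2014-03-03")),
  close = c(10, 20, 30, 50, 70)
)

# Label each row with its half-year: H1 = Jan-Jun, H2 = Jul-Dec.
month <- as.integer(format(df$date, "%m"))
df$half <- paste0(format(df$date, "%Y"), ifelse(month <= 6, "-H1", "-H2"))

# Mean of close within each half-year.
half_means <- aggregate(close ~ half, data = df, FUN = mean)
print(half_means)
```

Unlike `collapse_by()`, this sketch labels each bucket with a string instead of the last observed date in the bucket.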

It also works together with group_by():

data("FANG")
FANG_time <- FANG %>%
  group_by(symbol) %>%
  as_tbl_time(date)

FANG_time %>%
  collapse_by("year") %>%
  group_by(symbol, date) %>%
  summarise_all(median) %>%
  print(n = 12)

#> # A time tibble: 16 x 8
#> # Index: date
#> # Groups: symbol [4]
#> symbol date open high low close volume adjusted
#> <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 AMZN 2013-12-31 282. 285. 280. 282. 2556250 282.
#> 2 AMZN 2014-12-31 328. 331. 323. 326. 3498200 326.
#> 3 AMZN 2015-12-31 440. 445. 436. 440. 3245950 440.
#> 4 AMZN 2016-12-30 727. 731. 721. 728. 3594800 728.
#> 5 FB 2013-12-31 29.7 30.4 29.4 29.8 49838850 29.8
#> 6 FB 2014-12-31 69.3 70.0 68.4 69.1 42264250 69.1
#> 7 FB 2015-12-31 86.8 87.8 85.6 86.8 23898750 86.8
#> 8 FB 2016-12-30 118. 119. 117. 118. 20993100 118.
#> 9 GOOG 2013-12-31 876. 880. 871. 876. 3838400 438.
#> 10 GOOG 2014-12-31 569. 574. 563. 568. 1877200 565.
#> 11 GOOG 2015-12-31 564. 571. 559. 563. 1817850 563.
#> 12 GOOG 2016-12-30 743. 747. 737. 743. 1587300 743.
#> # … with 4 more rows

tibbletime can also be used together with tidyfinance.

Compute daily returns:

# devtools::install_github("DavisVaughan/tidyfinance")
library(tidyfinance)
FANG_time %>%
  calculate_return(adjusted, period = "daily") %>%
  select(symbol, date, adjusted, adjusted_return)

#> # A time tibble: 4,032 x 4
#> # Index: date
#> # Groups: symbol [4]
#> symbol date adjusted adjusted_return
#> <chr> <date> <dbl> <dbl>
#> 1 FB 2013-01-02 28 0
#> 2 FB 2013-01-03 27.8 -0.00821
#> 3 FB 2013-01-04 28.8 0.0356
#> 4 FB 2013-01-07 29.4 0.0229
#> 5 FB 2013-01-08 29.1 -0.0122
#> 6 FB 2013-01-09 30.6 0.0526
#> 7 FB 2013-01-10 31.3 0.0232
#> 8 FB 2013-01-11 31.7 0.0134
#> 9 FB 2013-01-14 31.0 -0.0243
#> 10 FB 2013-01-15 30.1 -0.0275
#> # … with 4,022 more rows

Compute yearly returns:

FANG_time %>%
  calculate_return(open:low, adjusted, period = "year") %>%
  select(symbol, date, contains('return'))

#> # A time tibble: 20 x 6
#> # Index: date
#> # Groups: symbol [4]
#> symbol date open_return high_return low_return
#> <chr> <date> <dbl> <dbl> <dbl>
#> 1 FB 2013-01-02 0 0 0
#> 2 FB 2013-12-31 0.972 0.947 0.966
#> 3 FB 2014-12-31 0.470 0.455 0.444
#> 4 FB 2015-12-31 0.333 0.330 0.344
#> 5 FB 2016-12-30 0.1000 0.100 0.0970
#> 6 AMZN 2013-01-02 0 0 0
#> 7 AMZN 2013-12-31 0.541 0.545 0.555
#> 8 AMZN 2014-12-31 -0.210 -0.215 -0.213
#> 9 AMZN 2015-12-31 1.20 1.20 1.18
#> 10 AMZN 2016-12-30 0.117 0.116 0.107
#> 11 NFLX 2013-01-02 0 0 0
#> 12 NFLX 2013-12-31 2.84 2.85 3.01
#> 13 NFLX 2014-12-31 -0.0609 -0.0634 -0.0608
#> 14 NFLX 2015-12-31 -0.661 -0.660 -0.665
#> 15 NFLX 2016-12-30 0.0863 0.0776 0.0816
#> 16 GOOG 2013-01-02 0 0 0
#> 17 GOOG 2013-12-31 0.546 0.542 0.544
#> 18 GOOG 2014-12-31 -0.522 -0.525 -0.525
#> 19 GOOG 2015-12-31 0.448 0.445 0.442
#> 20 GOOG 2016-12-30 0.0172 0.0173 0.0159
#> # … with 1 more variable: adjusted_return <dbl>

Compute cumulative returns and drawdowns:

FANG_return <- FANG_time %>%
  select(symbol, date, adjusted) %>%
  calculate_return(adjusted, period = "daily") %>%
  mutate(
    drawdown = drawdown(adjusted_return),
    cum_ret = cumulative_return(adjusted_return)
  )
FANG_return

#> # A time tibble: 4,032 x 6
#> # Index: date
#> # Groups: symbol [4]
#> symbol date adjusted adjusted_return drawdown cum_ret
#> <chr> <date> <dbl> <dbl> <dbl> <dbl>
#> 1 FB 2013-01-02 28 0 0 0
#> 2 FB 2013-01-03 27.8 -0.00821 -0.00821 -0.00821
#> 3 FB 2013-01-04 28.8 0.0356 0 0.0271
#> 4 FB 2013-01-07 29.4 0.0229 0 0.0507
#> 5 FB 2013-01-08 29.1 -0.0122 -0.0122 0.0379
#> 6 FB 2013-01-09 30.6 0.0526 0 0.0925
#> 7 FB 2013-01-10 31.3 0.0232 0 0.118
#> 8 FB 2013-01-11 31.7 0.0134 0 0.133
#> 9 FB 2013-01-14 31.0 -0.0243 -0.0243 0.105
#> 10 FB 2013-01-15 30.1 -0.0275 -0.0511 0.0750
#> # … with 4,022 more rows

Compute total monthly returns:

FANG_return_monthly <- FANG_return %>%
  collapse_by("month") %>%
  group_by(symbol, date) %>%
  summarise(monthly_return = total_return(adjusted_return))
FANG_return_monthly

#> # A time tibble: 192 x 3
#> # Index: date
#> # Groups: symbol [4]
#> symbol date monthly_return
#> <chr> <date> <dbl>
#> 1 AMZN 2013-01-31 0.0318
#> 2 AMZN 2013-02-28 -0.00463
#> 3 AMZN 2013-03-28 0.00840
#> 4 AMZN 2013-04-30 -0.0476
#> 5 AMZN 2013-05-31 0.0606
#> 6 AMZN 2013-06-28 0.0315
#> 7 AMZN 2013-07-31 0.0847
#> 8 AMZN 2013-08-30 -0.0672
#> 9 AMZN 2013-09-30 0.113
#> 10 AMZN 2013-10-31 0.164
#> # … with 182 more rows
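
Assuming `total_return()` compounds the returns within each group (an assumption, since these notes do not show its definition), the monthly figure is prod(1 + r) - 1. That quantity equals the last price of the period divided by the price just before the period started, minus 1. A base R check of the identity on made-up prices:

```r
# Made-up prices: one starting price followed by three daily closes.
prices  <- c(100, 110, 99, 104.94)
returns <- prices[-1] / head(prices, -1) - 1   # three daily simple returns

# Compounded return over the three days ...
compounded <- prod(1 + returns) - 1
# ... matches the return computed from the first and last price alone.
endpoint <- tail(prices, 1) / prices[1] - 1

print(c(compounded = compounded, endpoint = endpoint))
```

Summing the daily returns instead would give a slightly different number, which is why compounding is the natural choice for a period total.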

Plot the results:

library(ggplot2)

# Assumption: `enfont` is a font family name that the original notes define
# elsewhere; "sans" is only a placeholder so that the code runs.
enfont <- "sans"

# Cumulative return
plot_cum_ret <- FANG_return %>%
  ggplot(aes(x = date, y = cum_ret, color = symbol)) +
  geom_line() +
  theme_minimal(base_family = enfont) +
  theme(axis.title.x = element_blank()) +
  theme(axis.text.x = element_blank()) +
  theme(axis.ticks.x = element_blank()) +
  labs(y = "Cumulative Return",
       title = "Performance summary: Facebook, Amazon, Netflix, Google") +
  scale_color_brewer(palette = "Dark2")

# Monthly return
plot_month_ret <- FANG_return %>%
  calculate_return(adjusted, period = "monthly") %>%
  ggplot(aes(x = date, y = adjusted_return, fill = symbol)) +
  geom_col(width = 15, position = position_dodge()) +
  theme_minimal(base_family = enfont) +
  theme(axis.title.x = element_blank(),
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank()) +
  labs(y = "Monthly Return") +
  theme(legend.position = "none") +
  scale_fill_brewer(palette = "Dark2")

# Drawdown
plot_drawdown <- FANG_return %>%
  ggplot(aes(x = date, y = drawdown, fill = symbol)) +
  geom_area(position = position_identity(), alpha = 0.5) +
  theme_minimal(base_family = enfont) +
  scale_x_date(
    date_breaks = "3 months",
    date_labels = "%b %Y"
  ) +
  labs(x = "", y = "Drawdown") +
  scale_fill_brewer(palette = "Dark2") +
  theme(axis.text.x = element_text(angle = 45, size = 5)) +
  theme(legend.position = "bottom", legend.margin = margin(t = -10))

# Stack the three panels in one column with patchwork
library(patchwork)
plot_cum_ret +
  plot_month_ret +
  plot_drawdown +
  plot_layout(ncol = 1, heights = c(2, 1, 1))
