hwordcloud: my first R package based on htmlwidgets

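I have finally reached a goal I had been fixated on for a long time. I first tried to learn the htmlwidgets package three months ago, and my earlier post 自制“htmlwigets”? records how I went from getting started to giving up. I tried once more in between and failed again. This was my third attempt, and this time it worked, which is how this package came about. As the name suggests, the package draws word clouds.

Here is a record of how the package was created.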

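Creating the package

Open RStudio and create a new R package project by clicking 👉 👉 👉 👉

The R package template is then created.

Next, refer to this article:

Creating a widget. To be honest, I never really understood this article. Things only started to make sense after I had studied several htmlwidgets packages.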

Setting up the htmlwidgets package structure

Run:

R
htmlwidgets::scaffoldWidget("hwordcloud")

This creates an htmlwidgets template named hwordcloud. The package structure should now look like this:

Shell
$ tree
.
├── DESCRIPTION
├── NAMESPACE
├── R
│   ├── hello.R
│   └── hwordcloud.R
├── hwordcloud.Rproj
├── inst
│   └── htmlwidgets
│       ├── hwordcloud.js
│       └── hwordcloud.yaml
└── man
    └── hello.Rd

4 directories, 8 files

Adding the JS dependencies

The hwordcloud.yaml file declares the widget's JS dependencies. The word cloud example used here is: 词云图 | Highcharts

This word cloud needs three JS files:

highcharts.js
oldie.js
wordcloud.js

So hwordcloud.yaml can be written like this:

# (uncomment to add a dependency)
dependencies:
  - name: hwordcloud
    version: 7.0.3
    src: htmlwidgets/lib/
    script:
      - highcharts.js
      - oldie.js
      - wordcloud.js
    stylesheet:

You also need to put these three JS files into the htmlwidgets/lib/ folder (under inst/). The package directory now looks like this:

Shell
$ tree
.
├── DESCRIPTION
├── NAMESPACE
├── R
│   ├── hello.R
│   └── hwordcloud.R
├── hwordcloud.Rproj
├── inst
│   └── htmlwidgets
│       ├── hwordcloud.js
│       ├── hwordcloud.yaml
│       └── lib
│           ├── highcharts.js
│           ├── oldie.js
│           └── wordcloud.js
└── man
    └── hello.Rd

5 directories, 11 files

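Binding the R function

The main purpose of the R function binding is to pass R data to the JS script. The source code of the word cloud example can be downloaded from 词云图 | JShare: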

HTML
<!DOCTYPE HTML>
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta http-equiv="x-ua-compatible" content="ie=edge">
<link rel="icon" href="https://static.jianshukeji.com/hcode/images/favicon.ico">
<script src="https://img.highcharts.com.cn/highcharts/highcharts.js"></script>
<script src="https://img.highcharts.com.cn/highcharts/modules/exporting.js"></script>
<script src="https://img.highcharts.com.cn/highcharts/modules/wordcloud.js"></script>
<script src="https://img.highcharts.com.cn/highcharts/modules/oldie.js"></script>
</head>
<body>
<!--
*************************************************************************
Generated by JShare at 2019-02-16 21:47:22
From: https://jshare.com.cn/demos/IOrsq7
*************************************************************************
-->
<div id="container"></div>

<script>
var text = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean bibendum erat ac justo sollicitudin, quis lacinia ligula fringilla. Pellentesque hendrerit, nisi vitae posuere condimentum, lectus urna accumsan libero, rutrum commodo mi lacus pretium erat. Phasellus pretium ultrices mi sed semper. Praesent ut tristique magna. Donec nisl tellus, sagittis ut tempus sit amet, consectetur eget erat. Sed ornare gravida lacinia. Curabitur iaculis metus purus, eget pretium est laoreet ut. Quisque tristique augue ac eros malesuada, vitae facilisis mauris sollicitudin. Mauris ac molestie nulla, vitae facilisis quam. Curabitur placerat ornare sem, in mattis purus posuere eget. Praesent non condimentum odio. Nunc aliquet, odio nec auctor congue, sapien justo dictum massa, nec fermentum massa sapien non tellus. Praesent luctus eros et nunc pretium hendrerit. In consequat et eros nec interdum. Ut neque dui, maximus id elit ac, consequat pretium tellus. Nullam vel accumsan lorem.';
// Note: this code only splits the sentence above into words, counts how often each
// word occurs (its weight), and builds the data the word cloud needs.
// arr.find (added in ES6) and reduce (added in ES5) may be missing in old versions of IE;
// if they raise errors there, add the corresponding polyfill yourself.
// Polyfill for array.find: https://developer.mozilla.org/zh-CN/docs/Web/JavaScript/Reference/Global_Objects/Array/find#Polyfill
// Polyfill for array.reduce: https://developer.mozilla.org/zh-CN/docs/Web/JavaScript/Reference/Global_Objects/Array/reduce#Polyfill
var data = text.split(/[,\. ]+/g)
    .reduce(function (arr, word) {
        var obj = arr.find(function (obj) {
            return obj.name === word;
        });
        if (obj) {
            obj.weight += 1;
        } else {
            obj = {
                name: word,
                weight: 1
            };
            arr.push(obj);
        }
        return arr;
    }, []);
Highcharts.chart('container', {
    series: [{
        type: 'wordcloud',
        data: data
    }],
    title: {
        text: '词云图'
    }
});
</script>
</body>
</html>

The part that actually draws the chart is this code:

JavaScript
var text = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean bibendum erat ac justo sollicitudin, quis lacinia ligula fringilla. Pellentesque hendrerit, nisi vitae posuere condimentum, lectus urna accumsan libero, rutrum commodo mi lacus pretium erat. Phasellus pretium ultrices mi sed semper. Praesent ut tristique magna. Donec nisl tellus, sagittis ut tempus sit amet, consectetur eget erat. Sed ornare gravida lacinia. Curabitur iaculis metus purus, eget pretium est laoreet ut. Quisque tristique augue ac eros malesuada, vitae facilisis mauris sollicitudin. Mauris ac molestie nulla, vitae facilisis quam. Curabitur placerat ornare sem, in mattis purus posuere eget. Praesent non condimentum odio. Nunc aliquet, odio nec auctor congue, sapien justo dictum massa, nec fermentum massa sapien non tellus. Praesent luctus eros et nunc pretium hendrerit. In consequat et eros nec interdum. Ut neque dui, maximus id elit ac, consequat pretium tellus. Nullam vel accumsan lorem.';
// Note: this code only splits the sentence above into words, counts how often each
// word occurs (its weight), and builds the data the word cloud needs.
// arr.find (added in ES6) and reduce (added in ES5) may be missing in old versions of IE;
// if they raise errors there, add the corresponding polyfill yourself.
// Polyfill for array.find: https://developer.mozilla.org/zh-CN/docs/Web/JavaScript/Reference/Global_Objects/Array/find#Polyfill
// Polyfill for array.reduce: https://developer.mozilla.org/zh-CN/docs/Web/JavaScript/Reference/Global_Objects/Array/reduce#Polyfill
var data = text.split(/[,\. ]+/g)
    .reduce(function (arr, word) {
        var obj = arr.find(function (obj) {
            return obj.name === word;
        });
        if (obj) {
            obj.weight += 1;
        } else {
            obj = {
                name: word,
                weight: 1
            };
            arr.push(obj);
        }
        return arr;
    }, []);
Highcharts.chart('container', {
    series: [{
        type: 'wordcloud',
        data: data
    }],
    title: {
        text: '词云图'
    }
});

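This example uses English text and counts word frequencies automatically, which is not well suited to practical use with Chinese. What works better in practice is to supply a vector of words and a vector of frequencies and generate the word cloud from those. So the script can be changed to something like this: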

JavaScript
// Each entry is a [word, weight] pair.
var data = [
    ["你好", 12],
    ["再见", 24],
    ["再也不见", 48]
];
Highcharts.chart('container', {
    series: [{
        type: 'wordcloud',
        data: data
    }],
    title: {
        text: '词云图'
    }
});

Opening the page again now shows a word cloud drawn from these three words.
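The idea of starting from a word vector and a frequency vector can be sketched in plain JavaScript. This is only an illustration: `toWordcloudData` is a hypothetical helper name, not something hwordcloud or Highcharts provides, and it assumes both inputs are ordinary arrays of equal length.

```javascript
// Hypothetical helper (not part of hwordcloud or Highcharts): zip two
// parallel arrays into the [word, weight] pairs used in the script above.
function toWordcloudData(words, freqs) {
  if (words.length !== freqs.length) {
    throw new Error('words and freqs must have the same length');
  }
  var data = [];
  for (var i = 0; i < words.length; i++) {
    data.push([words[i], freqs[i]]);
  }
  return data;
}

var data = toWordcloudData(['你好', '再见', '再也不见'], [12, 24, 48]);
console.log(JSON.stringify(data));
// prints [["你好",12],["再见",24],["再也不见",48]]
```

The resulting array can be passed as the `data` of a `wordcloud` series in the same way as the hand-written one.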

For now, though, we don't need to worry about any of this, because the R function binding is very formulaic:

R
#' hwordcloud: Rendering word clouds using R + Highcharts
#'
#' @description This function can create wordclouds by binding R and Highcharts.
#'
#' @import htmlwidgets
#'
#' @param text character vector;
#' @param size numeric vector;
#' @param width chart width, for example, "100\%";
#' @param height chart height, for example, "400px";
#' @param theme chart theme, you can use these themes:
#' darkgreen/darkblue/avocado/darkunica/gray/
#' gridlight/grid/sandsignika/sunset;
#' @param itermName attribute in tooltip;
#' @param title title;
#' @param titleAlign title alignment, left/center/right;
#' @param titleSize title size, like "20px";
#' @param titleColor title color, like "#333333";
#' @param subtitle subtitle;
#' @param subtitleAlign subtitle alignment, left/center/right;
#' @param subtitleSize subtitle size, like "16px";
#' @param subtitleColor subtitle color, like "#666666"
#'
#' @examples
#' library(hwordcloud)
#' library(wordcloud2)
#' df <- head(demoFreq, 50)
#' hwordcloud(text = df$word, size = df$freq)
#'
#' @export
hwordcloud <- function(text,
                       size,
                       width = "100%",
                       height = NULL,
                       theme = "sandsignika",
                       itermName = "数量",
                       title = "",
                       titleAlign = "center",
                       titleSize = "20px",
                       titleColor = "#333333",
                       subtitle = "",
                       subtitleAlign = 'center',
                       subtitleSize = "",
                       subtitleColor = "#666666") {

  # everything the JS side needs is collected in this list
  x = list(
    text = text,
    size = size,
    theme = theme,
    itermName = itermName,
    title = title,
    titleAlign = titleAlign,
    titleSize = titleSize,
    titleColor = titleColor,
    subtitle = subtitle,
    subtitleAlign = subtitleAlign,
    subtitleSize = subtitleSize,
    subtitleColor = subtitleColor
  )

  # create widget
  htmlwidgets::createWidget(
    name = 'hwordcloud',
    x,
    width = width,
    height = height,
    package = 'hwordcloud'
  )
}

#' Shiny bindings for hwordcloud
#'
#' Output and render functions for using hwordcloud within Shiny
#' applications and interactive Rmd documents.
#'
#' @param outputId output variable to read from
#' @param width,height Must be a valid CSS unit (like \code{'100\%'},
#' \code{'400px'}, \code{'auto'}) or a number, which will be coerced to a
#' string and have \code{'px'} appended.
#' @param expr An expression that generates a hwordcloud
#' @param env The environment in which to evaluate \code{expr}.
#' @param quoted Is \code{expr} a quoted expression (with \code{quote()})? This
#' is useful if you want to save an expression in a variable.
#'
#' @name hwordcloud-shiny
#'
#' @export
hwordcloudOutput <- function(outputId, width = '100%', height = '400px'){
  htmlwidgets::shinyWidgetOutput(outputId, 'hwordcloud', width, height, package = 'hwordcloud')
}

#' @rdname hwordcloud-shiny
#' @export
renderHwordcloud <- function(expr, env = parent.frame(), quoted = FALSE) {
  if (!quoted) { expr <- substitute(expr) } # force quoted
  htmlwidgets::shinyRenderWidget(expr, hwordcloudOutput, env, quoted = TRUE)
}

#' hwordcloud package
#' @description Rendering word clouds using R + Highcharts
#' @section \code{\link{hwordcloud}}: Rendering word clouds using R + Highcharts
#' @docType package
#' @name hwordcloud
NULL

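In short, whatever data you need on the JS side, put it into x. The two functions below it, hwordcloudOutput and renderHwordcloud, render the chart in Shiny and do not need to be modified.

In this function you can see that I also added some plotting parameters, which allow more customization of the word cloud. For details on them, see the Highcharts API documentation: Highcharts API 文档 | Highcharts

JS binding

The most important part, and the one I had the most trouble grasping, is the JS binding. First open hwordcloud.js: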

JavaScript
HTMLWidgets.widget({

  name: 'hwordcloud',

  type: 'output',

  factory: function(el, width, height) {

    // TODO: define shared variables for this instance

    return {

      renderValue: function(x) {

        // TODO: code to render the widget, e.g.
        el.innerText = x.message;

      },

      resize: function(width, height) {

        // TODO: code to re-render the widget with a new size

      }

    };
  }
});
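To make the roles of `factory`, `renderValue`, and `resize` in this template easier to see, here is a rough sketch that drives a definition of the same shape by hand. `FakeHTMLWidgets`, the `demo` widget, and the plain object standing in for the DOM element are all made up for illustration, and the call sequence is an assumption about the general pattern; the real htmlwidgets runtime does considerably more.

```javascript
// Simplified stand-in for the htmlwidgets runtime, for illustration only.
var definitions = {};
var FakeHTMLWidgets = {
  widget: function (definition) {
    // Remember each definition under its name.
    definitions[definition.name] = definition;
  }
};

// A widget definition with the same shape as the scaffolded template.
FakeHTMLWidgets.widget({
  name: 'demo',
  type: 'output',
  factory: function (el, width, height) {
    // Variables declared here are shared by renderValue and resize.
    return {
      renderValue: function (x) {
        el.innerText = x.message;
      },
      resize: function (width, height) {
        el.size = width + 'x' + height;
      }
    };
  }
});

// Assumed sequence for one widget on the page.
var el = { id: 'htmlwidget-1' };                       // stands in for a DOM element
var instance = definitions.demo.factory(el, 400, 300); // one instance per element
instance.renderValue({ message: 'hello' });            // x is the list built in R
instance.resize(800, 300);                             // container changed size
console.log(el.innerText, el.size);
// prints hello 800x300
```

The point of the sketch is that `factory` runs once per element and returns the object whose `renderValue` receives x.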

I changed this file to look like this:

JavaScript
HTMLWidgets.widget({

  name: 'hwordcloud',

  type: 'output',

  factory: function(el, width, height) {
    var id = el.id;
    return {
      renderValue: function(x) {
        // The theme-selection code is omitted here.
        // Pair each word with its size as [word, size].
        var listData = [];
        for (var i = 0; i < x.text.length; i++) {
          listData.push([x.text[i], x.size[i]]);
        }
        Highcharts.chart(id, {
          series: [{
            type: 'wordcloud',
            data: listData,
            name: x.itermName
          }],
          subtitle: {
            text: x.subtitle,
            align: x.subtitleAlign,
            style: {
              "color": x.subtitleColor,
              "fontSize": x.subtitleSize
            }
          },
          title: {
            text: x.title,
            align: x.titleAlign,
            style: {
              "color": x.titleColor,
              "fontSize": x.titleSize
            }
          }
        });
      },

      resize: function(width, height) {
      }

    };
  }
});

With that, the JS binding is done. Because all the data is passed into x by the R function binding, you simply take whatever you need from x.
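One way to check the mapping from x to chart options without opening a browser is to move it into a plain function. The sketch below is a hypothetical refactoring, not the package's actual code: `buildOptions` and `asArray` are names invented here. It also assumes that a length-one R vector may reach JavaScript as a scalar rather than an array, which is worth verifying against your htmlwidgets version.

```javascript
// Hypothetical refactoring: build the Highcharts options from x in a
// plain function so the mapping can be checked without a browser.

// Assumption: a length-one R vector may arrive as a scalar, so wrap it.
function asArray(value) {
  return Array.isArray(value) ? value : [value];
}

function buildOptions(x) {
  var words = asArray(x.text);
  var sizes = asArray(x.size);
  var listData = [];
  for (var i = 0; i < words.length; i++) {
    listData.push([words[i], sizes[i]]);
  }
  return {
    series: [{ type: 'wordcloud', data: listData, name: x.itermName }],
    title: {
      text: x.title,
      align: x.titleAlign,
      style: { color: x.titleColor, fontSize: x.titleSize }
    },
    subtitle: {
      text: x.subtitle,
      align: x.subtitleAlign,
      style: { color: x.subtitleColor, fontSize: x.subtitleSize }
    }
  };
}

// A single word passed as a scalar still yields one [word, size] pair.
var options = buildOptions({ text: '你好', size: 12, itermName: '数量', title: 'Demo' });
console.log(JSON.stringify(options.series[0].data));
// prints [["你好",12]]
```

Inside `renderValue`, the result would then be handed to the chart with `Highcharts.chart(el.id, buildOptions(x))`.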

Building the R package

In RStudio, click 👉 👉 to build and install the package.

Testing the function

R
library(hwordcloud)
library(wordcloud2)  # provides the demoFreq example data
df <- head(demoFreq, 50)
hwordcloud(text = df$word, size = df$freq)

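Success!

You can also use the package in R Markdown and R Shiny.

The hwordcloud package

That is roughly how the package was built. The rest of this post covers installing and using it.

Installation

First, you can install the package from GitHub: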

R
devtools::install_github('czxa/hwordcloud')
# or use git
devtools::install_git("https://github.com/czxa/hwordcloud.git")

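However, I don't recommend this installation method, because it does not install the vignettes I wrote. It is better to download my pre-built hwordcloud_0.1.0.tar.gz and install it offline.

Vignettes

I wrote three tutorials for the package (all with the same content):

使用 R 和 HighCharts 渲染词云图 (Chinese)

Rendering word clouds using R + Highcharts

Rendering word clouds using R + Highcharts(PPT)

A glimpse

Shiny examples

I wrote two Shiny examples for the package, one in Chinese and one in English.

After installing the package, you can run the following code to launch the Shiny app: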

R
dir <- system.file("examples", "hwordcloud", package = "hwordcloud")
setwd(dir)
shiny::shinyAppDir(".")

For Chinese users:

R
dir <- system.file("examples", "hwordcloudC", package = "hwordcloud")
setwd(dir)
shiny::shinyAppDir(".")
