Cost of Living Ranking by Country & City

Expatistan publishes two kinds of data: a Cost of Living Ranking by Country and a Cost of Living Ranking by City. Both are easy to retrieve. This article shows how to scrape these data and visualize them on a map.

Cost of Living Ranking by Country

First, use rvest to scrape the data:

library(rvest)
library(tidyverse)
library(stringr)
url <- "https://www.expatistan.com/cost-of-living/country/ranking"
html <- read_html(url)
# Pull the ranking table and tidy it into ranking / country / price_index
df <- html %>%
  html_nodes(xpath = '//*[@id="content"]/div/div[1]/div[1]/table') %>%
  html_table() %>%
  .[[1]] %>%
  as_tibble() %>%
  `colnames<-`(c("ranking", "country", "price_index")) %>%
  # Drop only the trailing ordinal suffix, so "1st" becomes "1"
  mutate(ranking = str_remove(ranking, "(st|nd|rd|th)$")) %>%
  # Re-guess column types so ranking becomes numeric
  type_convert()

Ranking  Country         Price Index
1        Cayman Islands  288
2        Hong Kong       234
3        Iceland         234
4        Switzerland     227
5        Bahamas         218
6        Norway          212
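
The XPath used above is tied to the page's current layout and may stop matching if the site is redesigned. As a hedged alternative, the sketch below selects a table with a CSS selector instead. It runs on a small inline HTML table built with `minimal_html()`, so the markup and the `"table"` selector are assumptions standing in for the live page, not Expatistan's actual structure. The two rows reuse values from the output above.

```r
library(rvest)

# Inline stand-in for the ranking page (assumed markup, not the real site).
page <- minimal_html('
  <table>
    <tr><th>Ranking</th><th>Country</th><th>Price Index</th></tr>
    <tr><td>1st</td><td>Cayman Islands</td><td>288</td></tr>
    <tr><td>2nd</td><td>Hong Kong</td><td>234</td></tr>
  </table>')

# html_element() returns the first match; html_table() turns it into a tibble.
tbl <- html_table(html_element(page, "table"))
names(tbl) <- c("ranking", "country", "price_index")

print(tbl)
```

On the real page a more specific selector would probably be needed; inspecting the page in the browser's developer tools is the usual way to find one.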

There are many ways to visualize this data:

tmap

tmap: Thematic maps are geographical maps in which spatial data distributions are visualized. This package offers a flexible, layer-based, and easy to use approach to create thematic maps, such as choropleths and bubble maps.

library(tmap)
data("World")  # world country polygons shipped with tmap
# Attach the scraped price index to each country polygon by name
wdf <- World %>%
  mutate(name = as.character(name)) %>%
  left_join(df, by = c("name" = "country")) %>%
  rename(`Price Index` = `price_index`)
tmap_style("classic")
tm_shape(wdf) +
  tm_polygons("Price Index")
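
Because the join above matches on country names, any country whose name is spelled differently in the two sources silently ends up with a missing Price Index. The sketch below shows one way to spot and patch such mismatches with `anti_join()`. The two small data frames are toy stand-ins for `df` and `World`, and the spelling "Czech Rep." is a hypothetical example of a mismatch, not a claim about what `World` contains.

```r
library(dplyr)

# Toy stand-ins (hypothetical): `scraped` mimics df, `map_names` mimics World.
scraped <- data.frame(
  country     = c("Switzerland", "Czech Republic", "Hong Kong"),
  price_index = c(227, 100, 234)
)
map_names <- data.frame(name = c("Switzerland", "Czech Rep.", "Norway"))

# Rows of `scraped` that have no partner in the map data.
unmatched <- anti_join(scraped, map_names, by = c("country" = "name"))

# Patch known spelling differences with a lookup vector (assumed mapping).
fixes <- c("Czech Republic" = "Czech Rep.")
fixed <- scraped %>%
  mutate(map_name = coalesce(unname(fixes[country]), country))

# Anything still listed here is genuinely absent from the map data.
still_unmatched <- anti_join(fixed, map_names, by = c("map_name" = "name"))
```

Running the same check against the real `df` and `World` before plotting would show which countries are dropped from the map.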

ggplot2 + sf

This is the most frequently used method.

library(ggplot2)
library(sf)
ggplot(wdf) +
  geom_sf(aes(geometry = geometry,
              fill = `Price Index`),
          color = "white", size = 0.05) +
  scale_fill_viridis_c() +
  theme(plot.margin = grid::unit(c(0, 0, 0, 0), "cm"))
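
Countries that have no match in the scraped table carry `NA` in the fill column after the left join. As a small sketch of how such gaps could be made explicit, the example below draws two toy squares (hypothetical stand-ins for country polygons) and sets `na.value` on the fill scale. The column name `price_index` and the grey colour are assumptions chosen for illustration.

```r
library(ggplot2)
library(sf)

# Build a unit square whose lower-left corner sits at (x0, 0).
square <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}

# Two toy "countries": one with a price index, one without (NA).
toy <- st_sf(
  name        = c("matched", "unmatched"),
  price_index = c(227, NA),
  geometry    = st_sfc(square(0), square(2))
)

p <- ggplot(toy) +
  geom_sf(aes(fill = price_index), color = "white") +
  scale_fill_viridis_c(na.value = "grey85") +  # colour for missing values
  labs(fill = "Price Index")

print(p)
```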

Unfortunately, I found that all these methods treat Taiwan as an independent “country”, but as we all know, Taiwan is a part of China, so I want to develop a new package based on htmlwidgets and Map data tooltip | Highcharts. It is coming soon …

Cost of Living Ranking by City

The city ranking can be scraped with the same method:

cdf <- "https://www.expatistan.com/cost-of-living/index" %>%
  read_html() %>%
  html_nodes(xpath = '//*[@id="ranking"]/div[1]/table') %>%
  html_table() %>%
  .[[1]] %>%
  `colnames<-`(c("Ranking", "City", "Price Index")) %>%
  as_tibble()

Ranking  City                                           Price Index
1st      Grand Cayman (Cayman Islands)                  272
2nd      Mountain View, California (United States)      261
3rd      Palo Alto, California (United States)          261
4th      New York City (United States)                  255
5th      Zurich (Switzerland)                           250
6th      San Francisco, California (United States)      245
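
In this table the ranking is still an ordinal string and the country is embedded in the city name. If numeric ranks and a separate country column are wanted, a cleaning step along the following lines could be used. This is a sketch on a hand-typed copy of three rows from the output above; the new column names `rank`, `city` and `country` are assumptions.

```r
library(readr)
library(stringr)

# Three rows copied from the scraped output above.
cities <- data.frame(
  Ranking = c("1st", "2nd", "5th"),
  City    = c("Grand Cayman (Cayman Islands)",
              "Mountain View, California (United States)",
              "Zurich (Switzerland)"),
  stringsAsFactors = FALSE
)

# parse_number() keeps the leading number and ignores the suffix: "1st" -> 1.
cities$rank <- parse_number(cities$Ranking)

# The country is the text inside the final pair of parentheses.
cities$country <- str_match(cities$City, "\\(([^()]+)\\)$")[, 2]

# The city is whatever precedes that final parenthesised part.
cities$city <- str_trim(str_remove(cities$City, "\\s*\\([^()]+\\)$"))

print(cities[, c("rank", "city", "country")])
```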

Build a shiny document

See live version: https://czxa.top/shiny/cost/

I also built a Shiny document for this project:

index.Rmd
---
title: "Cost of Living Ranking by Country & City"
author: "Zhenxing Cheng"
date: "`r Sys.Date()`"
runtime: shiny
output:
  prettydoc::html_pretty:
    theme: cayman
    highlight: github
---

## Cost of Living Ranking by Country

```{r include=FALSE}
library(rvest)
library(dplyr)
library(readr)
library(stringr)
url <- "https://www.expatistan.com/cost-of-living/country/ranking"
html <- read_html(url)
df <- html %>%
  html_nodes(xpath = '//*[@id="content"]/div/div[1]/div[1]/table') %>%
  html_table() %>%
  .[[1]] %>%
  as_tibble() %>%
  `colnames<-`(c("ranking", "country", "price_index")) %>%
  # Drop only the trailing ordinal suffix, so "1st" becomes "1"
  mutate(ranking = str_remove(ranking, "(st|nd|rd|th)$")) %>%
  type_convert()
```

```{r echo=FALSE, message=FALSE, dev='svglite', warning=FALSE}
library(ggplot2)
library(sf)
data("World", package = "tmap")
wdf <- World %>%
  mutate(name = as.character(name)) %>%
  left_join(df, by = c("name" = "country")) %>%
  rename(`Price Index` = `price_index`)
ggplot(wdf) +
  geom_sf(aes(geometry = geometry,
              fill = `Price Index`),
          color = "white", size = 0.05) +
  scale_fill_viridis_c() +
  theme(plot.margin = grid::unit(c(0, 0, 0, 0), "cm"))
```

To calculate each country's Price Index value, we start by assigning a value of 100 to a central reference country (that happens to be the Czech Republic). Once the reference point has been established, the Price Index value of every other country in the database is calculated by comparing their cost of living to the cost of living in the Czech Republic.

Therefore, if a country has a Price Index of 134, that means that living there is 34% more expensive than living in the Czech Republic.

```{r echo=FALSE}
DT::datatable(df)
```

## Cost of Living Ranking by City

```{r echo=FALSE}
"https://www.expatistan.com/cost-of-living/index" %>%
  read_html() %>%
  html_nodes(xpath = '//*[@id="ranking"]/div[1]/table') %>%
  html_table() %>%
  .[[1]] %>%
  `colnames<-`(c("Ranking", "City", "Price Index")) %>%
  as_tibble() %>%
  DT::datatable()
```

To calculate each city's Price Index value, we start by assigning a value of 100 to a central reference city (that happens to be Prague). Once the reference point has been established, the Price Index value of every other city in the database is calculated by comparing their cost of living to the cost of living in Prague.

Therefore, if a city has a Price Index of 134, that means that living there is 34% more expensive than living in Prague.
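
Outside the document itself, the Price Index arithmetic described above can be checked with a short sketch. The helper names `pct_vs_reference()` and `relative_cost()` are hypothetical. Comparing two non-reference places by taking the ratio of their indices is an assumption that follows from both being measured against the same baseline of 100; it is not something the source states explicitly.

```r
# Percentage difference from the reference place (index 100).
pct_vs_reference <- function(index) {
  index - 100
}

# Percentage by which place A is more expensive than place B,
# assuming both indices share the same reference (ratio of indices).
relative_cost <- function(index_a, index_b) {
  (index_a / index_b - 1) * 100
}

pct_vs_reference(134)              # 34: 34% above the reference
round(relative_cost(288, 227), 1)  # Cayman Islands vs Switzerland: 26.9
```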