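A Sankey diagram (Sankey plot) is a statistical chart that shows how quantities flow between categories, with the width of each band proportional to the size of the flow. It is a powerful way to show how data "move" from one state to another. Fernando Rios-Avila of the Levy Economics Institute of Bard College wrote the community-contributed Stata command sankey_plot, which makes it easy to draw Sankey diagrams in Stata.
To install sankey_plot from SSC, run:
* The all option also downloads the accompanying demo datasets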
ssc install sankey_plot, all replace
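After installation, a quick check that Stata can find the command may save some confusion later. The sketch below uses only the built-in which and help commands; it assumes the ssc install line above completed without error.

```stata
* Confirm that the sankey_plot ado-file is on the adopath
which sankey_plot
* Open the help file, which documents the syntax and options
help sankey_plot
```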
The Stata code for the figure above is as follows:
use immigration, clear
sankey_plot x0 from x1 to, /// coordinates for levels (x0 x1) and the from to data
width0(value) /// value of the flows, (migration)
xsize(4) ysize(6) /// adjusting size of plot
extra // Because we are using width, it is often recommended to use this for automatic adjustment.
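Before plotting, it can help to look at how the demo data are laid out, since the command expects one row per flow. The sketch below only inspects the variables named in the call above (x0, from, x1, to, value); it assumes immigration.dta was downloaded to the current working directory by the all option.

```stata
* Load the demo dataset and inspect the variables used by sankey_plot
use immigration, clear
describe x0 from x1 to value
* Show the first few flows: level coordinates, origin, destination, flow size
list x0 from x1 to value in 1/3
```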
Multi-level flows: the Stata code for the figure above is as follows:
use jobmarket.dta, clear
sankey_plot week0 y0 week1 y1, ///
width0(candidates) adjust extra /// Weight of # of candidates
label0(label0) label1(label1) /// labels for events
xlabel(0 "Starts" 1 "Week 1" 2 "Week 2" 3 "Week 3" 4 "Week 4" 5 "Week 5" 6 "Week 6") ///
fillcolor(gs10%40) gap(.2) /// a single color to all flows
xsize(8) ysize(5) // size change
Wide data: the Stata code for the figure above is as follows:
use dogs_and_happiness, clear
sankey_plot married pet happy , ///
wide width(freq) /// Need to indicate the data is wide, and notice its width() is not width0()
fillcolor(%50) xlabel("",nogrid) gap(0.1) tight /// Tight groups things together
title("The Secret to Happyness") ///
subtitle("Have Pets!: Nora and Bruce!") note("Nora and Bruce are my family dogs!")
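The wide layout used above has one row per combination of categories plus a frequency variable. If you start from unit-level data, Stata's built-in contract command is one way to produce such a table. The sketch below uses auto.dta, which ships with Stata, purely as a stand-in; the choice of foreign and rep78 is an illustrative assumption and is not taken from the sankey_plot documentation.

```stata
* Sketch: collapse unit-level data into a frequency table
* (auto.dta is only a placeholder for your own data)
sysuse auto, clear
drop if missing(rep78)               // keep complete cases only
contract foreign rep78, freq(freq)   // one row per combination; count stored in freq
list, sepby(foreign)
```

A table of this shape is what the wide and width(freq) options in the call above appear to expect. Whether sankey_plot accepts these particular variables unchanged has not been verified here.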