Taming Stata 2: Commonly Used Plotting Code (Including DID)

These are all snippets I use a lot myself. I will keep updating this post, and collecting them here makes them easy to grab.

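Baseline regression

We often need to show several regression results in one table, for example with different explanatory (or dependent) variables, or with and without control variables.
Use est store to keep each regression result temporarily in memory.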

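use "file1.dta", clear
reghdfe y x , absorb(year id_group) cluster(id_group) // regression 1, no control variables
est store m1 // keep the result in memory as m1

use "file2.dta", clear
reghdfe y x $X, absorb(year id_group) cluster(id_group) // regression 2, with the controls held in the global macro X
est store m2 // likewise, keep the result in memory as m2
* and so on for m3, m4, etc.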
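
The code above uses the global macro $X without showing where it is set. Below is a minimal, self-contained sketch on Stata's bundled example datasets. The control list "age grade" and the use of plain regress instead of reghdfe are my own stand-ins, not the original setup. It also checks that a stored result is still there after a different dataset has been loaded.

```stata
* Sketch only: bundled datasets and plain regress stand in for the real files and reghdfe
sysuse auto, clear
regress price mpg                 // model 1, no controls
estimates store m1

sysuse nlsw88, clear              // load a different dataset; m1 stays in memory
global X "age grade"              // hypothetical control variables
regress wage tenure $X            // model 2, with controls
estimates store m2

estimates dir                     // lists both m1 and m2
estimates table m1 m2, b(%9.4f) se
```

Defining the controls once in a global keeps every later regression line short and guarantees that all models use the same control set.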

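Note that once est store m1 has run, m1 stays available even if you then run other regressions on different files.
We can then export these two regressions into a single table.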

local m  "m1 m2"        // model names
local mt "y y"  // model titles

estfe m*, labels(id_group "Unit FE" year "Time FE")  // names must match the variables in absorb()
esttab `m' using "filename.rtf", mtitle(`mt') b(%6.4f) t(%6.3f) nogap compress  ///
   star(* 0.1 ** 0.05 *** 0.01)  ///
   ar2 scalar(N) replace         ///
   stats(N r2, fmt(%9.0f %10.4f)) ///
   indicate(`r(indicate_fe)') // export the table

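But note: the exported table still cannot go straight into a paper. It looks ugly as is, and papers need three-line tables. Even so, exporting this way saves time, makes the results easy to revisit later, and makes it easier to deal with journals that ask for the raw data and code.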
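
If the paper is written in LaTeX, one possible shortcut to a three-line table is esttab's booktabs option, which draws only a top, a middle and a bottom rule. This is a minimal sketch, assuming estout is installed. The bundled auto dataset and plain regress are stand-ins, and the output name table1.tex is made up.

```stata
* Sketch only; requires estout (ssc install estout)
sysuse auto, clear
regress price mpg
estimates store m1
regress price mpg weight
estimates store m2

* booktabs writes \toprule, \midrule and \bottomrule instead of \hline
esttab m1 m2 using "table1.tex", replace booktabs ///
    b(%6.4f) t(%6.3f) star(* 0.1 ** 0.05 *** 0.01) ///
    stats(N r2, fmt(%9.0f %10.4f) labels("Observations" "R-squared"))
```

The resulting file needs \usepackage{booktabs} in the LaTeX preamble.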

Parallel trends

A snippet of code, shared with you.jpg

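gen policy = year - Year // relative year, taking Year to be each unit's policy year
tab policy
replace policy = -4 if policy > 100 // implausibly large gaps; this also catches missing values, since Stata treats missing as larger than any number
replace policy = -4 if policy < -4 // bin everything earlier than 4 years before the policy into -4 (the original < -5 left -5 without a dummy)
replace policy = 5 if policy > 5 // bin everything later than 5 years after the policy into 5

forvalues i=4(-1)1{
	gen pre`i'=(policy==-`i')
} // dummies for up to 4 years before the policy

gen current= (policy==0)

forvalues i=1(1)5{
	gen post`i'=(policy==`i')
} // dummies for up to 5 years after the policy

drop pre1  // use the year before the policy as the base year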

reghdfe y pre* current post* $X, absorb(id_group year) vce(cluster id_group)

coefplot, baselevels ///
    keep(pre* current post*) ///
    vertical ///
    coeflabels( ///
        pre4    = "-4" ///
        pre3    = "-3" ///
        pre2    = "-2" ///
        pre1    = "-1" ///
        current = "0"  ///
        post1   = "1"  ///
        post2   = "2"  ///
        post3   = "3"  ///
        post4   = "4"  ///
        post5   = "5"  ///
    ) ///
    yline(0, lcolor(gs8) lpattern(solid) lwidth(thin)) ///
    xline(4, lcolor(gs10) lpattern(dash) lwidth(medthin)) ///
    ylabel(, labsize(2) angle(0) nogrid) ///
    xlabel(, labsize(2)) ///
    ytitle("y-axis title", size(2.5)) ///
    xtitle("x-axis title (base period t = -1)", size(2.5)) ///
    msymbol(O) msize(medsmall) ///
    mcolor(black) ///
    ciopts(recast(rcap) ///
           lpattern(dash) ///
           lcolor(gs8) ///
           lwidth(thin)) ///
    addplot(line @b @at, lcolor(black) lwidth(medthin)) ///
    plotregion(style(none)) ///
    graphregion(color(white)) ///
    scheme(s1mono)

graph export "parallel_trends1.svg", replace
graph drop _all
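
The plot is usually judged by eye. As a complement, the pre-policy coefficients can also be tested jointly. The sketch below does this on a simulated panel, so every name and number in it (200 units, a policy in 2015, a true effect of 0.5) is made up, and xtreg, fe stands in for reghdfe so that it runs without extra packages.

```stata
* Sketch only: simulated panel, not the data used above
clear
set seed 12345
set obs 200                              // 200 hypothetical units
gen id = _n
gen treated = (id <= 100)                // first 100 units get the policy
expand 11                                // 11 years per unit
bysort id: gen year = 2009 + _n          // 2010 to 2020
gen y = rnormal() + 0.5 * treated * (year >= 2015)   // simulated outcome
gen policy = year - 2015 if treated      // relative year; missing for controls
replace policy = -4 if policy < -4       // bin early years (missing is not < -4)

forvalues i = 4(-1)2 {                   // -1 is the omitted base year
    gen pre`i' = (policy == -`i')
}
gen current = (policy == 0)
forvalues i = 1/5 {
    gen post`i' = (policy == `i')
}

xtset id year
xtreg y pre* current post* i.year, fe vce(cluster id)
test pre4 pre3 pre2                      // H0: all pre-policy coefficients are zero
display "joint pre-trend p-value = " %6.4f r(p)
```

A large p-value means the pre-policy coefficients are jointly indistinguishable from zero, which is consistent with parallel trends but does not prove them.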

Placebo test

Whatever command everyone else uses, I use too :P. That's right, it's permute.

permute did beta = _b[did] se = _se[did] df = e(df_r), ///
 reps(500) rseed(123) saving("temp2.dta", replace): xtreg y did $X i.year, fe vce(cluster id_group)
 
use "temp2.dta", clear
gen t_value = beta / se
gen p_value = 2 * ttail(df, abs(t_value)) // two-sided p-value of each placebo coefficient

* xline(0.34) below presumably marks the actual DID estimate; replace it with your own coefficient
#delimit ;
dpplot beta, 
 xline(0.34, lc(black*0.5) lp(dash))
 xline(0, lc(black*0.5) lp(solid))
 xlabel(-0.3(0.1)0.35)
    xtitle("Estimator", size(*0.8)) xlabel(, format(%4.1f) labsize(small))
    ytitle("Density", size(*0.8)) ylabel(, nogrid format(%4.1f) labsize(small)) 
    note("") caption("") graphregion(fcolor(white)) ;
#delimit cr

graph export "placebo.svg", replace
graph drop _all

clear all
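
A common way to sum up a placebo exercise like this in one number is the share of placebo coefficients that are at least as large in absolute value as the real estimate. The sketch below uses simulated draws in place of temp2.dta, and it assumes that the 0.34 marked by xline(0.34) above is the actual DID estimate. Both are assumptions for illustration.

```stata
* Sketch only: simulated draws stand in for the permute output
clear
set seed 2024
set obs 500
gen beta = rnormal(0, 0.1)               // stand-in for 500 placebo coefficients
scalar b_true = 0.34                     // assumed actual estimate
count if abs(beta) >= abs(b_true)        // placebo draws at least as extreme
scalar p_emp = r(N) / _N
display "empirical two-sided p-value = " %6.4f p_emp
```

With real results, replace the two simulation lines with use "temp2.dta", clear and set b_true to your own coefficient.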

Instrumental variables

Heart pounding, hands trembling: the instrumental variable succeeds or fails right here.

ivreghdfe y (did=iv) $X , absorb(year id_group) cluster(id_group) first ///
  savefirst savefprefix(first_) // regression that also reports and saves the first stage

eststo m1 // store the second-stage result as m1
estadd scalar F=`e(widstat)' : first_did // weak-identification F statistic
estadd scalar cdf1 =  `e(cdf)': first_did // Cragg-Donald Wald F
estadd scalar sstat1 = `e(sstat)': first_did // Stock-Wright S statistic
esttab first_did m1 using "iv_test_results.rtf", replace coeflabels(_cons "con") ///
  b(4) t(2) compress nogaps scalar(F) order(iv did $X) stats(N r2 F cdf1 sstat1, ///
  fmt(0 4 4 4 4) labels("Obs" "R2" "F" "CD Wald F" "SW S stat."))  ///
  nobaselevels star(* 0.10 ** 0.05 *** 0.01)

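If you cannot find a suitable or ready-made instrument, you can try the following approach (DID settings only):

Generate a new variable, say IV, equal to the product of the DID dummy and the initial-period value of the dependent variable. Its relevance comes from the policy shock affecting units differently. Its exogeneity rests on the initial level being fixed strictly before the policy takes effect, so that, once unit and year fixed effects are controlled for, it is not in itself an independent channel affecting the dependent variable. In theory, then, this instrument can satisfy both the relevance and the exclusion assumptions.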

bysort id: egen firstyear = min(year) // earliest year in which each unit appears
bysort id: egen perf_initial = mean(cond(year==firstyear, y, .)) // value of the performance variable in firstyear

gen iv = did * perf_initial // the instrument we need
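
To see what these lines produce, here is a toy panel with made-up numbers. It takes a shorter route to the same initial value: sort by year within id and take the first observation. That matches the egen version as long as each id has one row per year.

```stata
* Sketch only: toy data, not the original dataset
clear
input id year y did
1 2010 1.0 0
1 2011 1.5 0
1 2012 2.0 1
2 2011 3.0 0
2 2012 3.5 1
end

bysort id (year): gen perf_initial = y[1]   // y in each id's earliest year
gen iv = did * perf_initial                  // nonzero only in treated periods
list, sepby(id)
```

The list shows perf_initial constant within each id, and iv equal to it only in the rows where did is 1.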