
One way to match on the propensity score: nearest neighbour

   

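Propensity score methods seem to be used more and more. Computing the propensity score is fairly simple; the hard part is matching the subjects. Four matching methods are commonly seen at present: stratification, nearest neighbour, radius, and kernel, which are also the four available in Stata.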
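To make the difference between two of these methods concrete, here is a minimal sketch in Python (not SAS or Stata). The function names, the toy scores, and the radius of 0.05 are all made up for illustration: nearest neighbour keeps the single closest control, while radius matching keeps every control whose score lies within a fixed distance.

```python
def nearest_neighbour(treated_score, control_scores):
    """Return the index of the control whose score is closest."""
    return min(range(len(control_scores)),
               key=lambda j: abs(control_scores[j] - treated_score))


def radius_match(treated_score, control_scores, radius):
    """Return the indices of all controls within `radius` of the score."""
    return [j for j, s in enumerate(control_scores)
            if abs(s - treated_score) <= radius]


# Hypothetical propensity scores, for illustration only.
controls = [0.10, 0.32, 0.35, 0.60]
nn = nearest_neighbour(0.33, controls)
within = radius_match(0.33, controls, radius=0.05)
print(nn, within)
```

Nearest neighbour always returns exactly one control, even a poor one, whereas radius matching can return several controls or none at all.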

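I wrote a SAS macro for nearest neighbour matching. It matches with replacement, so the same control subject may be matched to several treated subjects, and the two groups end up 1:1 in overall size. The macro is as follows: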

%macro compare(infile, class, logit, diff, var, N, outfile);
*----- send the log and output to files so the loop does not flood them -----*;
proc printto log="d:\pscores.log" print="d:\pscores.out"; run;
*----- split data file: t11 holds &class=0 (looped over), t12 holds &class=1 (candidates) -----*;
data t11; set &infile; if &class=0 and &logit ^= .; keep &var &class &logit; run;
proc sort data=t11; by &logit; run;
data t12; set &infile; if &class=1 and &logit ^= .; keep &var &class &logit; run;
proc sort data=t12; by &logit; run;
*----- match: for each of the first &N rows of t11 keep the t12 row with the closest &logit -----*;
%do i=1 %to &N;
  data t21; set t11; if _n_=&i; run;
  proc iml;
    /* name the columns explicitly so that &logit is always the last one */
    use t21;
    read all var {&var &class &logit} into a;
    close t21;
    use t12;
    read all var {&var &class &logit} into b;
    close t12;
    brows=nrow(b);
    bcols=ncol(b);
    /* copy b and append a column with the absolute difference in &logit */
    bb=j(brows,bcols+1,0);
    bb[,1:bcols]=b;
    bb[,bcols+1]=abs(b[,bcols]-a[1,bcols]);
    create t31 from bb[colname={&var &class &logit &diff}];
    append from bb;
    close t31;
  quit;
  proc sort data=t31; by &diff; run;
  data t32; set t31; if _n_=1; run;
  %if &i=1 %then %do;
    data &outfile; set t32; run;
  %end;
  %else %do;
    data &outfile; set &outfile t32; run;
  %end;
%end;
*----- restore the default log and output destinations -----*;
proc printto; run;
%mend compare;

Explanation of the parameters:

 %macro compare(infile, class, logit, diff, var, N, outfile);

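infile is a dataset prepared in advance, in which the logit has already been computed. class is the variable that indicates whether a subject belongs to the treated group or the control group (in the macro, the rows with class=0 are the ones looped over, and the rows with class=1 form the pool of candidate matches). logit is the variable holding the logit of the propensity score. diff is the name of the variable that stores the absolute difference in logit between a treated subject and its matched control subject.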

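var lists the variables from infile that you want to keep. The computation is done in matrices, so these variables must be numeric, and the fewer there are, the less memory is used; it is best to keep only an id so that the result can be merged back with the original variables. N is the number of subjects in the treated group. outfile is the dataset of nearest neighbour control subjects matched to the treated group.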
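For readers who want to check the logic outside SAS, the same idea can be sketched in Python. This is an illustration under assumptions, not a translation of the macro: the function names, the ids, and the propensity scores are hypothetical, and the logit is taken to be the log-odds of the propensity score.

```python
import math


def logit(p):
    """Log-odds of a propensity score p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def match_with_replacement(treated, controls):
    """For each treated unit, pick the control with the closest logit.

    treated, controls: lists of (id, logit) pairs.
    Returns a list of (treated_id, control_id, diff) tuples; a control
    may appear more than once because it is never removed from the pool.
    """
    matches = []
    for t_id, t_logit in treated:
        c_id, c_logit = min(controls, key=lambda c: abs(c[1] - t_logit))
        matches.append((t_id, c_id, abs(c_logit - t_logit)))
    return matches


# Hypothetical ids and propensity scores, for illustration only.
treated = [("t1", logit(0.70)), ("t2", logit(0.72)), ("t3", logit(0.30))]
controls = [("c1", logit(0.25)), ("c2", logit(0.69)), ("c3", logit(0.90))]

pairs = match_with_replacement(treated, controls)
for t_id, c_id, diff in pairs:
    print(t_id, c_id, round(diff, 3))
```

Unlike the macro, whose outfile holds only the matched control rows, this sketch also records which treated subject each control was matched to, which makes it easy to see a control being reused.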

If anyone tries it and finds a problem, please leave a comment!
