在这里使用不同的方法来对分段线性模型作一说明。前两个模型的区别是运用不同的参数化。第四个模型时使用非线性模型的分段方法。第三种方法是利用二次项来模拟分段函数。当然,不同的方法应该导向同样的结果。
I use different ways to fit a two-part piecewiselinear regression model. Not surprisingly, these models areidentical (for model 1, 2, and 4) or approximately the same (3).Here I assume 2-part piecewise linear models with onecutoff.
Nonlinear relationships between the response andthe explanatory viariables can be sometimes successfully modelledusing a linear model that has different slopes for certain rangesof the covariable, that is, the response is piecewise linear. Forexample, in the plot, the relatinships between children age andtheir weights are not linear
but increasing slowly before a cutoff(say 14here) and more fast afterwards.
This approach can be generalized to more than onechange point with the followingformular:
Y = b00 + b01 * x + bi * (x -ci) * (x > ci),whereci’sarethebreakpoints
Model 1:Piecewise linear model (1): Y = b0 +b1*x + b2(x-x0)*(x≥x0)
Model 1 | |||||
Variable | DF | Parameter | Standard | tValue | Pr>|t| |
Intercept | 1 | -1.38490 | 48.77311 | -0.03 | 0.9777 |
age11 | 1 | 7.30139 | 3.83035 | 1.91 | 0.0748 |
age12 | 1 | 13.26017 | 9.81718 | 1.35 | 0.1956 |
For age12: beta =20.56 se = 7.271 afterre-calculation
Model2: Piecewise linear model (2): Y = b0 +b1*(x-x0)*(x<x0)+b2*(x-x0)*(x≥x0)
Model 2 | |||||
Variable | DF | Parameter | Standard | tValue | Pr>|t| |
Intercept | 1 | 100.83458 | 6.44089 | 15.66 | <.0001 |
age21 | 1 | 7.30139 | 3.83035 | 1.91 | 0.0748 |
age22 | 1 | 20.56156 | 7.27105 | 2.83 | 0.0121 |
Model 3: using linear + quadratic: Y = b0 + b1*x+ b2*x*x
Model 3 | |||||
Variable | DF | Parameter | Standard | tValue | Pr>|t| |
Intercept | 1 | 249.74392 | 329.02657 | 0.76 | 0.4589 |
age31 | 1 | -34.18620 | 49.65665 | -0.69 | 0.5010 |
age32 | 1 | 1.70269 | 1.85630 | 0.92 | 0.3726 |
Model 4: from PROC NLIN: Y =(x<x0) *(b01 + b11*x) + (x≥x0) *(b02 + b12*x) whereb02 = b01 + (b11-b12)*x0
Parameter | Estimate | Approx | Approximate 95% ConfidenceLimits | |
beta01 | -1.3849 | 48.7731 | -104.8 | 102.0 |
beta11 | 7.3014 | 3.8303 | -0.8186 | 15.4214 |
beta12 | 20.5616 | 7.2711 | 5.1476 | 35.9755 |
SAScode:
*
Createdata
;
data demo;
set sashelp.class;
*Model 1;
w1= weight;
age11 = age;
age12 = (age-14) * (age>=14);
*Model 2;
w2= weight;
age21 = (age -14) *(age<14);
age22 = (age -14) *(age>=14);
*Model 3;
w3= weight;
age31 = age;
age32 = age*age;
run;
*
Fittingmodels
;
*Model1;
proc reg data=democovout outseb outest=ot1;
model w1 =age11 age12/covb;
output out=b1predicted=w1p;
*Model2;
proc reg covout outseb outest=ot11;
model w2 =age21 age22/covb;
output out=b2predicted=w2p;
*Model3;
proc reg covout outseb outest=ot3;
model w3 =age31 age32;
output out=b3predicted=w3p;
run;
quit;
*
Model4
;
proc nlin data =demo;
parms beta01=-.5 beta11=1beta12 =1;
x0=14;
beta02 = beta01 + (beta11 - beta12) *x0;
model weight = (age < x0 ) * (beta01 + beta11 * age)+
(age >=x0) * (beta02 + beta12 * age);
*force E[Y|x] equals at x0;
output out = b4 predicted = w4p;
run;
*
Model 1 ==Modle 2
;
dataot11;
set ot1 end=Eof;
array demo[3,2]_temporary_;
if _type_ ='COV'then do;
if upcase(_name_) ='AGE11'then do;
demo[1,1] = age11; demo[1,2]=age12;
end;
if upcase(_name_) ='AGE12'then do;
demo[2,1] = age11; demo[2,2]=age12;
end;
end;
if _type_ ='PARMS'then do;
demo[3,1] = age11; demo[3,2]=age12;
end;
if Eof then do;
beta2 = demo[3,1]+demo[3,2];
se2 =sqrt(demo[1,1]+demo[2,2]+2*demo[1,2]);
put 'beta = ' beta2 best5.2 @15'se= ' se2 best5.2;
end;
run;
datawp;
merge b1 b2 b3 b4;
run;
proc sort data =wp; by age;
run;
*
plot4models;
;
proc sgplot data=wp;
scatter x=agey=weight/markerAttrs=(symbol=star size =5);
series x=agey=w1p/markersmarkerAttrs=(symbol=CircleFilled size =5)lineAttrs=(color=red) name ='1'legendLabel ='Predictedvalue of Model 1';
series x=agey=w2p/markersmarkerAttrs=(symbol=Circle size =10)lineAttrs=(color=purple)name='2'legendLabel ='Predictedvalue of Model 2';
series x=agey=w3p/lineAttrs=(color=blue)name='3'legendLabel ='Predictedvalue of Model 3';
series x=agey=w4p/markersmarkerAttrs=(symbol=circle size =15)lineAttrs=(color=purple)name='4'legendLabel ='Predictedvalue of Model 4';
keyLegend '1' '2' '3' '4'/location =inside across =1;
refline 14/axis=xlabel ='Cutoff';
run;