Dynamic Panel Commands Made Clear: A Detailed Explanation of Stata's ado Commands

Latest 09-12

《Main Text》

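Before reading this article, take a look at "Estimation Steps for IV and GMM, and Tests for Endogeneity, Heteroskedasticity, and More". Some readers there asked for a detailed piece on the dynamic panel commands, so we settled on this tool-oriented article. It covers not only xtabond2 but also xtabond, xtdpdsys, xtdpd, and xtdpdml, the other dynamic panel commands in Stata. Because xtabond2 can reproduce what most of the others do, it gets most of the attention.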

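By design, xtabond2 can replace both xtabond (difference GMM) and xtdpdsys (system GMM): its syntax is more flexible, and somewhat more complex, and the right options make it run either of those two dynamic panel regressions. All of these estimators are especially suited to data with large N and small T. The three commands do differ, however, in how they carry out the estimation.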
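As a rough sketch of that correspondence, the calls below put the official commands next to xtabond2 calls written in the same spirit. The model (n on its own first lag) is only an illustration, and the paired results are not guaranteed to match digit for digit, since the commands differ in details such as the treatment of the constant and the default H matrix.

```stata
* xtabond2 is user-written; install it once from SSC if needed:
* ssc install xtabond2

webuse abdata, clear

* Difference GMM: official command, then an xtabond2 call in the same spirit
xtabond n, lags(1) vce(robust)
xtabond2 n L.n, gmm(L.n) noleveleq robust

* System GMM: official command, then an xtabond2 call in the same spirit
xtdpdsys n, lags(1) vce(robust)
xtabond2 n L.n, gmm(L.n) robust h(2)
```

The switch between the two estimators in xtabond2 is the noleveleq option: with it, only the differenced equation is used; without it, the levels equation is added and the estimator becomes system GMM.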

A problem with the original Arellano-Bond estimator is that lagged levels are poor instruments for first differences if the variables are close to a random walk (the instruments used by xtabond sometimes perform badly). Arellano and Bover (1995) describe how, if the original equation in levels is added to the system, additional instruments can be brought to bear to increase efficiency. In this equation, variables in levels are instrumented with suitable lags of their own first differences (this is how xtdpdsys improves the choice of instruments: it uses differences as well as levels). The assumption needed is that these differences are uncorrelated with the unobserved country effects (that is, the differenced instruments must be uncorrelated with the unobserved individual effects). Blundell and Bond show that this assumption in turn depends on a more precise one about initial conditions.

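Syntax of xtabond2:

xtabond2 depvar varlist [if] [in] [, level(#) twostep robust cluster(varlist) noconstant small noleveleq orthogonal gmmopt [gmmopt ...] ivopt [ivopt ...] pca components(#) artests(#) arlevels h(#)]

Here depvar is the dependent variable, varlist holds the regressors (predetermined, strictly exogenous, and endogenous variables), and [if] and [in] restrict the estimation sample. The options are:

level(#): confidence level for the reported confidence intervals.
twostep: compute the two-step estimator instead of the one-step estimator.
robust: with one-step estimation, report standard errors robust to heteroskedasticity and within-panel autocorrelation; with twostep, request the Windmeijer finite-sample correction. It is not syntactically required after twostep, but two-step standard errors should not be trusted without it.
cluster(varlist): the variables to cluster the standard errors on, overriding the default of clustering on the panel identifier.
noconstant: leave the constant term out of the levels equation.
small: report t and F statistics instead of z and Wald statistics.
noleveleq: drop the levels equation, so that only the differenced equation is instrumented; this is difference GMM.
orthogonal: use forward orthogonal deviations instead of first differences.
pca, components(#): replace the GMM-style instruments with their principal components; components(#) sets how many components to keep.
artests(#): the highest order of the autocorrelation test.
arlevels: apply the autocorrelation test to the residuals in levels.
h(#): the form of the matrix H, the a priori estimate of the covariance of the idiosyncratic errors used in one-step estimation; it rarely changes the overall picture.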

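The gmmopt above stands for gmmstyle(varlist [, laglimits(# #) collapse orthogonal equation({diff | level | both}) passthru split]), where:

laglimits(# #): the first and last lags of the variables to use as instruments, for the transformed or the levels equation.
collapse: create one instrument per variable and lag distance, instead of one per variable, lag distance, and time period; this reduces the instrument count.
orthogonal: build the instruments from backward orthogonal deviations; it is mainly used together with difference-type GMM and is said to be more stable and less biased than conventional difference GMM in AR(1) settings.
equation(): the equation (differenced, levels, or both) in which the instruments are used.
split: only applies to system GMM when equation() is not specified; it splits the instruments into two groups so that difference-in-Sargan/Hansen tests can be run on them.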
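To see what laglimits() and collapse do to the instrument count, a minimal sketch is to estimate the same model twice and compare the count that xtabond2 stores. The model is only illustrative, and the sketch assumes that the instrument count is saved in e(j); run ereturn list after estimation to confirm the name in your installed version.

```stata
webuse abdata, clear

* All available lags, one instrument column per period and lag
quietly xtabond2 n L.n w k, gmm(L.n) iv(w k) noleveleq robust
display "instruments, default:    " e(j)

* Lags restricted to two and the instrument set collapsed
quietly xtabond2 n L.n w k, gmm(L.n, laglimits(1 2) collapse) iv(w k) noleveleq robust
display "instruments, restricted: " e(j)
```

The second count should be much smaller than the first, which is the point of both suboptions: fewer instruments leave the overidentification tests with more power.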

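The ivopt above stands for ivstyle(varlist [, equation({diff | level | both}) passthru mz]), where:

equation(): the equation in which the listed instruments are used.
passthru: relevant together with equation(diff) or noleveleq; the instruments enter the transformed equation as they are instead of being transformed first.
mz: replace missing values in the instruments with zero.

Note that if x is a predetermined variable, it is valid as an instrument for the levels equation, but it should then not be entered through a plain ivstyle(x); use iv(x, eq(level)) instead.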

On balanced panels, GMM estimators based on the two transforms return numerically identical coefficient estimates, holding the instrument set fixed (Arellano and Bover 1995). But orthogonal deviations has the virtue of preserving sample size in panels with gaps. If some e_it is missing, for example, neither D.e_it nor D.e_i,t+1 can be computed. (When the orthogonal option is given, xtabond2's Mata code removes the individual fixed effects by forward orthogonal deviations: from each period-t value it subtracts the average of all available later observations. This differs from the familiar first difference, which cannot deliver a value for every period once the panel has gaps.)
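The two transforms can be written side by side. The notation below is an assumption made for illustration: w_{it} is any variable, T_{it} is the number of observations of w available for unit i after period t, and c_{it} is the usual scale factor that keeps the transformed errors homoskedastic when the original errors are.

```latex
% First difference: needs both w_{it} and w_{i,t-1}
\Delta w_{it} = w_{it} - w_{i,t-1}

% Forward orthogonal deviation: needs w_{it} and at least one later observation
w^{\perp}_{i,t+1} = c_{it}\left( w_{it} - \frac{1}{T_{it}} \sum_{s>t} w_{is} \right),
\qquad c_{it} = \sqrt{\frac{T_{it}}{T_{it}+1}}
```

With a single missing w_{it}, the first difference loses two observations (periods t and t+1), while the orthogonal deviation loses only the one at t, because the average over later periods is taken over whatever observations exist.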

Autocorrelation indicates that lags of the dependent variable (and any other variables used as instruments that are not strictly exogenous) are in fact endogenous, thus bad instruments (xtabond2 reports autocorrelation tests; if autocorrelation is found, the lags in question are not good instruments). For example, if there is AR(s), then y_i,t-s would be correlated with e_i,t-s, which would be correlated with D.e_i,t-s, which would be correlated with D.e_i,t.

So for one-step, robust estimation (and for all two-step estimation), xtabond2 also reports the Hansen J statistic, which is the minimized value of the two-step GMM criterion function, and is robust. xtabond2 still reports the Sargan statistic in these cases because the J test has its own problem: it can be greatly weakened by instrument proliferation (xtabond2 reports both the Hansen J statistic and the Sargan statistic as tests of the overidentifying restrictions).

To compensate, xtabond2 makes available a finite-sample correction to the two-step covariance matrix derived by Windmeijer (2005). This can make two-step robust estimations more efficient than one-step robust, especially for system GMM (in other words, it is the Windmeijer correction that lets the two-step robust results from xtabond2 be both efficient and reliable).

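xtabond2 examples:

GMM estimation comes in one-step and two-step versions. The two-step weighting matrix depends on estimated parameters, and the two-step standard errors are biased downward; without a correction, two-step estimation brings little gain in efficiency and unreliable inference, whereas the one-step estimator, although less efficient, is consistent. For this reason applied work has usually relied on one-step GMM. In theory, one-step system GMM uses more information than one-step difference GMM and can deal with endogeneity and weak-instrument problems that difference GMM cannot, so its estimates are more efficient. Blundell and Bond also confirm with Monte Carlo simulations that, in finite samples, system GMM has smaller bias and better efficiency than difference GMM.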

>use http://www.stata-press.com/data/r7/abdata.dta

>xtabond2 n l.n l(0/1).(w k) yr1980-yr1984, gmm(l.n w k) iv(yr1980-yr1984, passthru) noleveleq small

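The Arellano-Bond test for AR(1) and AR(2) in first differences checks whether the differenced errors are autocorrelated at first and second order, which is what the consistency of the GMM estimator rests on. First-order autocorrelation in the differenced errors is generally expected, because D.e_it and D.e_i,t-1 share the term e_i,t-1. What matters is the second and higher orders: if no such autocorrelation is found, the null hypothesis of no autocorrelation in the errors is not rejected.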

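Two-step GMM severely underestimates the standard errors of the coefficients; when the standard errors are that small, the significance tests will of course reject the null (for example, p …)

That result is badly distorted, however, so two-step GMM has to be corrected: with the robust option in xtabond2, or vce(robust) in xtabond, xtdpdsys, and xtdpd. The Windmeijer (2005) paper mentioned above deals with exactly this problem, and it is by now the consensus that two-step GMM must carry this correction.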

>xtabond2 n l.n l(0/1).(w k) yr1980-yr1984, gmm(l.n w k) iv(yr1980-yr1984, mz) robust twostep small h(2)

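In the output above, the Sargan test rejects the overidentifying restrictions while the Hansen test fails to reject them, possibly because the Hansen test is the more robust of the two. Under heteroskedasticity, for instance, the Sargan statistic does not follow a chi-squared distribution but the Hansen statistic does, so in that situation Sargan may reject the null incorrectly. With as many instruments as this estimation uses, though, other problems could just as well be behind the result.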
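When comparing these tests across specifications, it can help to pull them out of the stored results instead of reading them off the printed output. The sketch below reruns the two-step command from this section and assumes the stored-result names e(j), e(ar1p), e(ar2p), e(sarganp), and e(hansenp); ereturn list shows what your installed version actually saves.

```stata
webuse abdata, clear
quietly xtabond2 n L.n L(0/1).(w k) yr1980-yr1984, gmm(L.n w k) iv(yr1980-yr1984, mz) robust twostep small h(2)

* Everything xtabond2 saved after estimation
ereturn list

* The diagnostics discussed in the text
display "instruments:      " e(j)
display "AR(1) p-value:    " e(ar1p)
display "AR(2) p-value:    " e(ar2p)
display "Sargan p-value:   " e(sarganp)
display "Hansen J p-value: " e(hansenp)
```

Reading the instrument count next to the Hansen p-value is a useful habit, since a very high Hansen p-value with a large instrument count is a symptom of the instrument proliferation mentioned earlier.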

On the choice of instruments, see the combined output below: one part lists the instruments for the differenced equation, the other the instruments for the levels equation.

>xtabond2 n L.n L(0/1).(w k) yr1978-yr1984, gmm(L.(w k n), collapse) iv(yr1978-yr1984, eq(level)) h(2) robust twostep

The collapse option reduces the number of instruments, which helps with tests such as the overidentification tests.

>xtabond2 n w cap [pw=_n], iv(cap k ys, eq(level)) iv(rec, eq(level)) cluster(id year) h(1)

The cluster() option accounts for correlation within groups (here, groups defined by id and by year).

1. with cluster

2. without cluster

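By default, xtabond2 uses the variables listed in ivstyle() as instruments for both the differenced and the levels equations. xtdpdsys by default uses them for the differenced equation only; it also treats every variable not declared endogenous or predetermined as exogenous and uses its first difference as an instrument in the differenced equation.

In xtabond2, some variables can appear among the regressors without being listed in either gmmstyle() or ivstyle(); they then stay in the model but do not serve as instruments in the differenced or the levels equation (this is mostly the case for lagged terms).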

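xtdpd is about as flexible as xtabond2 but more compact: it lets you state directly and separately which variables serve, in the differenced equation and in the levels equation, as GMM-style instruments (a multi-column matrix) and as IV-style instruments (a single column vector), through the options dgmmiv(), lgmmiv(), div(), and liv().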

>webuse abdata, clear

>xtabond2 n L.n, gmm(n, laglimits(2 .)) small h(2)

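The xtabond2 call above runs the same regression as xtdpd. xtabond2 reports more tests in its output, whereas xtdpd needs a follow-up estat command for them.

The xtdpd call below gives the same estimates; compare the underlined part with the output above.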

>xtdpd n L.n, dgmm(n, lagrange(2 .)) lgmm(n, lag(1)) vce(r)
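The follow-up tests after xtdpd can be sketched as below, with the option names written out in full (dgmmiv, lgmmiv). The sketch assumes the usual behaviour of the official postestimation commands: estat abond is run after the robust one-step fit, and estat sargan after a refit with the default VCE, because the Sargan test is not reported after vce(robust).

```stata
webuse abdata, clear

* One-step system GMM with robust standard errors, as in the text
xtdpd n L.n, dgmmiv(n, lagrange(2 .)) lgmmiv(n, lag(1)) vce(robust)

* Arellano-Bond test for autocorrelation in the first-differenced errors
estat abond

* Refit with the default VCE to obtain the Sargan test
quietly xtdpd n L.n, dgmmiv(n, lagrange(2 .)) lgmmiv(n, lag(1))
estat sargan
```

This two-pass pattern is the price of xtdpd's compactness: xtabond2 prints the AR, Sargan, and Hansen tests under the coefficient table in a single run.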

One more similar dynamic panel command is worth introducing: xtdpdml, which estimates by maximum likelihood.

Paul Allison, Enrique Moral-Benito, and Richard Williams are currently working on a project entitled "Dynamic Panel Data Modeling using Maximum Likelihood." Panel data have many advantages when trying to make causal inferences but can also be difficult to work with. We show that ML provides an alternative to widely used GMM methods such as Arellano-Bond and is superior in many cases. We have prepared a Stata command called xtdpdml that greatly simplifies the process of estimating our models.

《END》

TAG: 計量經濟學圈