{smcl}
{* 12February2010}{...}
{cmd:help ivqte}
{hline}

{title:Title}

{p 4 8 2}
{bf:IV quantile treatment effects}

{title:Syntax}

{p 8 17 2}
{cmdab:ivqte} {depvar} [{it:varlist}] {cmd:(}{it:treatment} {cmd:=} {it:instrument}{cmd:)} {ifin} [{cmd:,} {it:options}]

{synoptset 25 tabbed}{...}
{synopthdr :options}
{synoptline}
{syntab :Model}
{synopt:{opth q:uantiles(numlist)}}sets the quantile(s) at which the effects will be estimated.{p_end}
{synopt:{opth c:ontinuous(varlist)}}gives the name(s) of the continuous and ordered discrete covariates.{p_end}
{synopt:{opth d:ummy(varlist)}}gives the name(s) of the binary covariates.{p_end}
{synopt:{opth u:nordered(varlist)}}gives the name(s) of the discrete unordered covariates.{p_end}
{synopt:{opt aai}}selects the Abadie, Angrist and Imbens (2002) estimator.{p_end}

{syntab:Estimation of the (instrument) propensity score}
{synopt:{opt li:near}}uses a local linear instead of local logit estimator.{p_end}
{synopt:{opt mata_opt}}uses the official Mata function {bf:{help mata optimize()}}.{p_end}
{synopt:{opt k:ernel(kernel)}}type of kernel function used to calculate the propensity score, where {it:kernel} is {opt epan2} (Epanechnikov, the default), {opt b:iweight}, {opt triw:eight}, {opt c:osine}, {opt g:aussian}, {opt p:arzen}, {opt r:ectangle}, {opt t:riangle}, {opt epanechnikov_o4}, {opt epanechnikov_o6}, {opt gaussian_o4}, {opt gaussian_o6} or {opt gaussian_o8}.
{p_end}
{synopt:{opt b:andwidth(#)}}bandwidth used to calculate the propensity score, default is infinity.{p_end}
{synopt:{opt l:ambda(#)}}parameter lambda used to calculate the propensity score, default is 1.{p_end}

{syntab:Estimation of the weights}
{synopt:{opt trim(#)}}trimming level for very small and large estimated propensity scores, default is 0.001.{p_end}
{synopt:{opt p:ositive}}estimates positive weights, for the two-step estimator of endogenous unconditional QTE.{p_end}
{synopt:{opt pb:andwidth(#)}}bandwidth used to calculate the positive weights.{p_end}
{synopt:{opt pl:ambda(#)}}parameter lambda used to calculate the positive weights.{p_end}
{synopt:{opt pk:ernel(kernel)}}type of kernel function used to calculate the positive weights.{p_end}

{syntab:Inference}
{synopt:{opt v:ariance}}provides an analytical estimate of the variance.{p_end}
{synopt:{opt vb:andwidth(#)}}bandwidth used to calculate the variance.{p_end}
{synopt:{opt vl:ambda(#)}}parameter lambda used to calculate the variance.{p_end}
{synopt:{opt vk:ernel(kernel)}}type of kernel function used to calculate the variance.{p_end}
{synopt:{opt le:vel(#)}}set confidence level; default is {cmd:level(95)}{p_end}

{syntab:Saved propensity scores and weights}
{synopt:{opt generate_p(newvarname[, replace])}}creates {it:newvarname} containing the estimated propensity score.{p_end}
{synopt:{opt generate_w(newvarname[, replace])}}creates {it:newvarname} containing the estimated weights.{p_end}
{synopt:{opt phat(varname)}}gives the name of an existing variable containing the propensity score.{p_end}
{synopt:{opt what(varname)}}gives the name of an existing variable containing the weights.{p_end}
{synoptline}

{title:Description}

{p 4 4 2}
{cmd:ivqte} computes the quantile treatment effects of a binary variable using a weighting strategy. This command can estimate both conditional and unconditional quantile treatment effects (QTE) either under exogeneity or endogeneity.
The estimator proposed by Froelich and Melly (2008) is used if unconditional QTE under endogeneity are estimated. The estimator proposed by Abadie, Angrist and Imbens (2002) is used if conditional QTE under endogeneity are estimated. The estimator proposed by Firpo (2007) is used if unconditional QTE under exogeneity are estimated. The estimator proposed by Koenker and Bassett (1978) is used if conditional QTE under exogeneity are estimated. (We use the term exogeneity as meaning exogenous conditional on the control variables X. This is also often called a selection on observables assumption or a matching assumption.) {p 4 4 2} The estimator used by {cmd:ivqte} is determined as follows. (i) If an {it:instrument} is provided and {opt aai} is not activated, the estimator proposed by Froelich and Melly (2008) is used. (ii) If an {it:instrument} is provided and {opt aai} is activated, the estimator proposed by Abadie, Angrist and Imbens (2002) is used. (iii) If there is no {it:instrument} and {it:varlist} is empty, the estimator proposed by Firpo (2007) is used. (iv) If there is no {it:instrument} and {it:varlist} contains variables, the estimator proposed by Koenker and Bassett (1978) is used. {p 4 4 2} The instrument is assumed to satisfy the exclusion restriction conditionally on the covariates given in {opt dummy}, {opt unordered}, and {opt continuous}. If the same variable is given as {it:treatment} and as {it:instrument}, then conditional exogeneity of the treatment variable is assumed. If these two variables are different, then {it:instrument} is used as an instrumental variable to estimate the effects of the endogenous treatment variable. {p 4 4 2} Variables should be given in {it:varlist} only if you want to use the Koenker and Bassett (1978) estimator. For all other estimators, {it:varlist} should be empty and covariates should be given in {opt dummy}, {opt unordered} and {opt continuous}. 
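
{p 4 4 2}
As an illustration of rules (i) and (ii), the following sketch uses hypothetical variable names that are not part of this package: {it:wage} as outcome, {it:training} as binary treatment, {it:offer} as binary instrument, and the covariates {it:female}, {it:experience} and {it:region}. The first call leaves {it:varlist} empty and omits {opt aai}, so it should select the Froelich and Melly (2008) estimator. The second call adds {opt aai} and requests a single quantile, so it should select the Abadie, Angrist and Imbens (2002) estimator.{p_end}

{p 8 12 2}{cmd:. ivqte wage (training = offer), quantiles(0.25 0.5 0.75) dummy(female) continuous(experience) unordered(region)}{p_end}
{p 8 12 2}{cmd:. ivqte wage (training = offer), quantiles(0.5) aai dummy(female) continuous(experience) unordered(region)}{p_end}

{p 4 4 2}
Following the statement above that giving the same variable as {it:treatment} and as {it:instrument} imposes conditional exogeneity, a call such as the next one would be expected to estimate unconditional QTE under exogeneity. This is a sketch based on that statement only, not a tested recipe.{p_end}

{p 8 12 2}{cmd:. ivqte wage (training = training), quantiles(0.5) dummy(female) continuous(experience)}{p_end}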
{p 4 4 2} The endogenous variable {it:treatment} must be a binary 0/1 variable. This implementation of the estimator also requires the IV, {it:instrument}, to be binary. Froelich and Melly (2008, section 2) explain how several IV or a non-binary IV can be transformed into a binary instrument. {p 4 4 2} We can distinguish between four types of covariates: continuous, ordered discrete, binary and unordered discrete. The first two types of variables are treated similarly by the kernel function and they must be given in the option {opt continuous}. Binary variables must be given in {opt dummy}. Unordered discrete variables are treated similarly to the dummy variable in the kernel function but a set of dummies is then used in the local regression plane. To give an example, {it: experience} and {it: education} should be given in {opt continuous}, {it: female} in {opt dummy} and {it: region} in {opt unordered}. For more details on the local estimator see the command {cmd:locreg}. {p 4 4 2} In a first step, the instrument propensity score, defined as Pr(instrument = 1 | continuous, dummy, unordered), is estimated. A mixed kernel suggested by Racine and Li (2004) is used to smooth over the continuous and categorical data. The more conventional approach consisting of estimating the regression plane inside each cell defined by the discrete variables can be obtained by setting {opt lambda} to 0. A local linear estimator is used if the option {opt linear} is selected. Otherwise, a local logit estimator is used. Two algorithms are available to maximize the local logistic likelihood function. The default is a simple Gauss-Newton algorithm written for this purpose. If you select the option {opt mata_opt}, the official Stata 10 optimizer {bf:{help mata optimize()}} is used. We expect the official estimator to be more stable in difficult situations. However, it can only be used if you have Stata 10 or newer. 
In all cases, the kernel is determined by the values of the options {opt kernel}, {opt bandwidth} and {opt lambda}. The cross-validation procedure implemented in {cmd:locreg} can be used to guide the choice of the bandwidth.

{p 4 4 2}
In the unconditional endogenous case, the option {opt positive} may be activated. In this case, the estimated weights are first projected onto the space defined by the treatment variable and the dependent variable. These weights are asymptotically positive. (Negative estimated weights are set to zero.) This allows us to use standard quantile regression algorithms to directly estimate the quantile treatment effects. The projection is estimated by a local linear regression separately for the treated and the non-treated. The nonparametric regression is determined by the values of the options {opt pkernel} and {opt pbandwidth}.

{p 4 4 2}
In the conditional endogenous case (that is, when the option {opt aai} is activated), positive weights are always estimated by the projection of the first weights onto the space defined by the treatment variable, the dependent variable and all covariates. These weights are also asymptotically positive. The projection is estimated by a local linear regression separately for the treated and the non-treated. The nonparametric regression is determined by the values of the options {opt pkernel}, {opt pbandwidth} and {opt plambda}.

{p 4 4 2}
The quantile treatment effects are estimated at each of the quantiles defined by the option {opt quantiles}. The conditional quantile treatment effect can only be estimated at one quantile at a time. However, the option {opt generate_w} allows the user to save the estimated weights. These weights will then be used for other quantiles if they are given in {opt what}. The same can be done for the estimated propensity score with the options {opt generate_p} and {opt phat}.
This also allows the user to use methods other than the local linear or local logit estimators to estimate the propensity score. For instance, the series estimators proposed by Abadie, Angrist and Imbens (2002) and Firpo (2007) can be used in a first step.

{p 4 4 2}
By default, no standard errors are reported, because the estimation of the variance is computationally intensive if all functions are estimated nonparametrically (even more than the estimation of the QTE). Analytical standard errors can be obtained by selecting the option {opt variance}. In the conditional exogenous case, the kernel estimator proposed by Powell (1984) has been implemented. This estimator is consistent in the presence of heteroscedasticity, in contrast to the estimator of the variance of the official command {bf:{help qreg}}. In the other cases, the variance estimators suggested by Abadie, Angrist and Imbens (2002), Firpo (2007) and Froelich and Melly (2008) have been implemented.

{p 4 4 2}
With the exception of standard quantile regression, the estimation of the variance of all estimators requires the estimation of additional nonparametric functions that are different at each quantile. Therefore, in contrast to the estimation of the QTE, the computation time for estimating the variance increases proportionally to the number of quantiles at which the effects are estimated. The options {opt vkernel}, {opt vbandwidth} and {opt vlambda} determine the specification of the nonparametric estimation for the variance. The estimation of the covariance between different QTE is currently not implemented. To avoid an error message, all covariances have been set to 0. Do NOT use these covariances to test hypotheses concerning several quantiles. The bootstrap can also be used to estimate the variance (and the covariances between the QTE). Do NOT activate the option {opt variance} when you estimate the variance by bootstrapping the results.
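
{p 4 4 2}
The two workflows described above can be sketched as follows. The variable names ({it:wage}, {it:training}, {it:offer}, {it:female}, {it:experience}) and the new variable {it:w_hat} are hypothetical and serve only as an illustration. The first pair of calls estimates the weights once for a conditional QTE and then reuses them at a second quantile through {opt what}.{p_end}

{p 8 12 2}{cmd:. ivqte wage (training = offer), quantiles(0.25) aai dummy(female) continuous(experience) generate_w(w_hat)}{p_end}
{p 8 12 2}{cmd:. ivqte wage (training = offer), quantiles(0.75) aai dummy(female) continuous(experience) what(w_hat)}{p_end}

{p 4 4 2}
The next call bootstraps unconditional QTE with the official {cmd:bootstrap} prefix. The option {opt variance} is deliberately left out, and the number of replications (200) is an arbitrary choice made for this illustration.{p_end}

{p 8 12 2}{cmd:. bootstrap, reps(200): ivqte wage (training = offer), quantiles(0.25 0.5 0.75) dummy(female) continuous(experience)}{p_end}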
{title:Options} {dlgtab:Model} {phang} {opt quantiles(numlist)} specifies the quantiles at which the effects are estimated and should contain numbers between 0 and 1. Note that the computational time needed to estimate an additional quantile is very short compared to the time needed to estimate the preliminary nonparametric regressions. Note also that, when conditional QTE are estimated, only one quantile may be specified. If one is interested in several QTE, one can save the estimated weights for later use by using the option {opt generate_w}. {phang} {opt continuous(varlist)}, {opt dummy(varlist)}, and {opt unordered(varlist)} specify the names of the covariates depending on their type. Ordered discrete variables should be treated as continuous. {phang} {opt aai} selects the Abadie, Angrist and Imbens (2002) estimator. If this option is selected all the variables given in {opt continuous(varlist)}, {opt dummy(varlist)}, and {opt unordered(varlist)} are taken as covariates. {dlgtab:Estimation of the propensity score} {phang} {opt linear} selects the method used to estimate the propensity score. If this option is not activated, the local logit estimator is used. If it is activated, the local linear estimator is used. {phang} {opt mata_opt} selects the official optimizer introduced in Stata 10 to estimate local logit, {help mata optimize()}. The default is a simple Gauss-Newton algorithm written for this purpose. (This option is only relevant when the option {opt linear} has not been selected.) {phang} {opt kernel(kernel)} specifies the kernel function. {it:kernel} may be {opt epan2} (Epanechnikov kernel function; the default), {opt biweight} (biweight kernel function), {opt triweight} (triweight kernel function), {opt cosine} (cosine trace), {opt gaussian} (Gaussian kernel function), {opt parzen} (Parzen kernel function), {opt rectangle} (rectangle kernel function) or {opt triangle} (triangle kernel function). 
In addition to these second order kernels, there are also several higher order kernels: {opt epanechnikov_o4} (Epanechnikov order 4), {opt epanechnikov_o6} (order 6), {opt gaussian_o4} (Gaussian order 4), {opt gaussian_o6} (order 6), {opt gaussian_o8} (order 8). By default, {opt epan2}, specifying the Epanechnikov kernel, is used.

{phang}
{opt bandwidth(#)} sets the bandwidth used to smooth over the continuous variables in the estimation of the propensity score. Note that the continuous regressors are first orthogonalized such that their covariance matrix is the identity matrix. The bandwidth must be strictly positive. If the bandwidth is set to the missing value ".", an infinite bandwidth is used. The default value is infinity. If the bandwidth is infinity and lambda is one, a global model (linear or logit) is estimated without any local smoothing. The cross-validation procedure implemented in {cmd:locreg} can be used to guide the choice of the bandwidth.

{phang}
{opt lambda(#)} sets the lambda used to smooth over the dummy and unordered discrete variables in the estimation of the propensity score. It must be between 0 and 1. A value of 0 implies that only observations within the cell defined by all discrete regressors are used to estimate the conditional mean. The default value is 1. If the bandwidth is infinity and lambda is one, a global model (linear or logit) is estimated without any local smoothing. The cross-validation procedure implemented in {cmd:locreg} can be used to guide the choice of lambda.

{dlgtab:Estimation of the weights}

{phang}
{opt trim(#)} controls the amount of trimming. The observations with a propensity score lower than {opt trim} or above 1-{opt trim} are trimmed and not used further by the procedure. This prevents giving very high weights to single observations. The default is 0.001.

{phang}
{opt positive} can be used only with the estimator suggested by Froelich and Melly (2008).
If it is activated, positive weights are estimated by the projection of the first weights on the dependent and treatment variable. (Negative estimated weights are set to zero.) {phang} {opt pkernel(kernel)}, {opt pbandwidth(#)} and {opt plambda(#)} are used to calculate the positive weights if the option {opt aai} has been selected. {opt pkernel(kernel)} and {opt pbandwidth(#)} are used to calculate the positive weights if the option {opt positive} has been selected. They are defined similarly to {opt kernel(kernel)}, {opt bandwidth(#)} and {opt lambda(#)}. If {opt pkernel(kernel)}, {opt pbandwidth(#)} and/or {opt plambda(#)} are not specified, the values given in {opt kernel(kernel)}, {opt bandwidth(#)} and {opt lambda(#)} are used. {dlgtab:Saved propensity scores and weights} {phang} {opt generate_p(newvarname[, replace])} creates {it:newvarname} containing the estimated propensity score. This may be useful if you want to compare the results with and without the projection of the weights. You can first estimate the QTE without the projection and save the propensity score in {it:newvarname}. In the second step you can use the estimated propensity score as an input in the option {opt phat}. The option {opt replace} allows {cmd:ivqte} to overwrite a potentially existing variable. {phang} {opt generate_w(newvarname[, replace])} creates {it:newvarname} containing the estimated weights. This may be useful if you want to estimate several conditional QTE. The weights must be estimated only once and then be given as input in the option {opt what}. The option {opt replace} allows {cmd:ivqte} to overwrite a potentially existing variable. {phang} {opt phat(varname)} gives the name of an existing variable containing the estimated instrument propensity score. The propensity score may have been estimated using {cmd:ivqte} or any other command such as a series estimator. {phang} {opt what(varname)} gives the name of an existing variable containing the estimated weights. 
The weights may have been estimated using {cmd:ivqte} or with any other command such as a series estimator.

{dlgtab:Inference}

{phang}{opt level(#)}; see {helpb estimation options##level():[R] estimation options}.

{phang}
{opt variance} activates the estimation of the variance. The asymptotic variance is estimated by plugging in estimators for all elements appearing in the formula.

{phang}
{opt vkernel(kernel)}, {opt vlambda(#)}, and {opt vbandwidth(#)} are used to calculate the variance of the estimated quantile treatment effects if the option {opt variance} has been selected. They are defined similarly to {opt kernel(kernel)}, {opt bandwidth(#)} and {opt lambda(#)}. If {opt variance} is used without the options {opt vkernel(kernel)}, {opt vbandwidth(#)} and/or {opt vlambda(#)}, the values given in {opt kernel(kernel)}, {opt bandwidth(#)} and {opt lambda(#)} are used.

{title:Saved results}

{phang}{cmd:ivqte} saves the following results in {cmd:e()}:

{phang}Scalars{p_end}
{col 10}{cmd:e(N)}{col 25}Number of observations
{col 10}{cmd:e(bandwidth)}{col 25}Bandwidth
{col 10}{cmd:e(lambda)}{col 25}Lambda
{col 10}{cmd:e(pbandwidth)}{col 25}Bandwidth for positive weights
{col 10}{cmd:e(plambda)}{col 25}Lambda for positive weights
{col 10}{cmd:e(vbandwidth)}{col 25}Bandwidth for variance estimation
{col 10}{cmd:e(vlambda)}{col 25}Lambda for variance estimation
{col 10}{cmd:e(pseudo_r2)}{col 25}Pseudo R2 of the quantile regression
{col 10}{cmd:e(compliers)}{col 25}Proportion of compliers
{col 10}{cmd:e(trimmed)}{col 25}Number of observations trimmed

{phang}Macros{p_end}
{col 10}{cmd:e(command)}{col 25}Name of the command: ivqte
{col 10}{cmd:e(depvar)}{col 25}Name of the dependent variable
{col 10}{cmd:e(treatment)}{col 25}Name of the treatment variable
{col 10}{cmd:e(instrument)}{col 25}Name of the instrumental variable
{col 10}{cmd:e(continuous)}{col 25}Names of the continuous covariates
{col 10}{cmd:e(dummy)}{col 25}Names of the binary covariates
{col 10}{cmd:e(unordered)}{col 25}Names of the unordered covariates
{col 10}{cmd:e(regressors)}{col 25}Names of the regressors (conditional QTE)
{col 10}{cmd:e(estimator)}{col 25}Name of the estimator
{col 10}{cmd:e(ps_method)}{col 25}Linear or logistic model
{col 10}{cmd:e(optimization)}{col 25}Algorithm used
{col 10}{cmd:e(kernel)}{col 25}Kernel function
{col 10}{cmd:e(pkernel)}{col 25}Kernel function for positive weights
{col 10}{cmd:e(vkernel)}{col 25}Kernel function for variance estimation

{phang}Matrices{p_end}
{col 10}{cmd:e(b)}{col 25}Row vector containing the quantile treatment effects.
{col 10}{cmd:e(quantiles)}{col 25}Row vector containing the quantiles at which QTE have been estimated.
{col 10}{cmd:e(V)}{col 25}Matrix containing the variances of the estimated QTE in the diagonal and 0 otherwise.

{phang}Functions{p_end}
{col 10}{cmd:e(sample)}{col 25}Marks estimation sample

{title:Version requirements}

{p 4 4 2}This command requires Stata 9.2. Stata 10 is required to use the option {opt mata_opt}. In addition, {cmd:locreg} is required, which itself requires the {cmd:moremata} (see Jann, 2005a) and {cmd:kdens} (see Jann, 2005b) packages. Type {net "describe moremata, from(http://fmwww.bc.edu/repec/bocode/m/)":ssc describe moremata}{txt} and {net "describe kdens, from(http://fmwww.bc.edu/repec/bocode/k/)":ssc describe kdens}{txt}.

{title:Methods and Formulas}

{p 4 6} See Froelich and Melly (2010).

{title:References}

{p 4 6} Alberto Abadie, Josh Angrist and Guido Imbens (2002): Instrumental variable estimates of the effect of subsidized training on the quantiles of trainee earnings. Econometrica, 70, 91-117.

{p 4 6} Sergio Firpo (2007): Efficient semiparametric estimation of quantile treatment effects. Econometrica, 75, 259-276.

{p 4 6} Markus Froelich and Blaise Melly (2008): Unconditional quantile treatment effects under endogeneity, IZA Discussion Paper 3288. Mimeo.

{p 4 6} Markus Froelich and Blaise Melly (2010): Estimation of quantile treatment effects with Stata, forthcoming in Stata Journal.

{p 4 6} Ben Jann (2005a): moremata: Stata module (Mata) to provide various functions. Available from {browse ideas.repec.org/c/boc/bocode/s455001.html}.

{p 4 6} Ben Jann (2005b): kdens: Stata module for univariate kernel density estimation. Available from {browse ideas.repec.org/c/boc/bocode/s456410.html}.

{p 4 6} Roger Koenker and Gilbert Bassett (1978): Regression quantiles. Econometrica, 46, 33-50.

{p 4 6} Jeff Racine and Qi Li (2004): Nonparametric estimation of regression functions with both categorical and continuous data. Journal of Econometrics, 119, 99-130.

{title:Remarks}

{p 4 4}This is a preliminary version. Comments, bug reports and suggestions for extensions are welcome.

{p 4 4}If you use this command in your work, please cite Froelich and Melly (2008) and Froelich and Melly (2010).

{title:Disclaimer}

{p 4 4 2}THIS SOFTWARE IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

{p 4 4 2}IN NO EVENT WILL THE COPYRIGHT HOLDERS OR THEIR EMPLOYERS, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THIS SOFTWARE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM.

{title:Authors}

{p 4 6}Markus Froelich and Blaise Melly{p_end}
{p 4 6}University of Mannheim and Brown University{p_end}
{p 4 6}blaise_melly@brown.edu{p_end}
{p 4 6}February 2010{p_end}