Chernozhukov, Victor
Overview
Works: 131 works in 222 publications in 1 language and 582 library holdings

Roles: Author, Editor, Other

Classifications: HB1, 330.072
Most widely held works by Victor Chernozhukov
Quantile regression under misspecification with an application to the U.S. wage structure by Joshua David Angrist (Book)
11 editions published in 2004 in English and held by 46 libraries worldwide
Quantile regression (QR) fits a linear model for conditional quantiles, just as ordinary least squares (OLS) fits a linear model for conditional means. An attractive feature of OLS is that it gives the minimum mean square error linear approximation to the conditional expectation function even when the linear model is misspecified. Empirical research using quantile regression with discrete covariates suggests that QR may have a similar property, but the exact nature of the linear approximation has remained elusive. In this paper, we show that QR can be interpreted as minimizing a weighted mean-squared error loss function for specification error. The weighting function is an average density of the dependent variable near the true conditional quantile. The weighted least squares interpretation of QR is used to derive an omitted variables bias formula and a partial quantile correlation concept, similar to the relationship between partial correlation and OLS. We also derive general asymptotic results for QR processes allowing for misspecification of the conditional quantile function, extending earlier results from a single quantile to the entire process. The approximation properties of QR are illustrated through an analysis of the wage structure and residual inequality in US Census data for 1980, 1990, and 2000. The results suggest continued residual inequality growth in the 1990s, primarily in the upper half of the wage distribution and for college graduates.
Handbook of quantile regression by Roger Koenker (Book)
6 editions published between 2017 and 2018 in English and held by 35 libraries worldwide
Quantile regression constitutes an ensemble of statistical techniques intended to estimate and draw inferences about conditional quantile functions. Median regression, as introduced in the 18th century by Boscovich and Laplace, is a special case. In contrast to conventional mean regression that minimizes sums of squared residuals, median regression minimizes sums of absolute residuals; quantile regression simply replaces symmetric absolute loss by asymmetric linear loss. Since its introduction in the 1970's by Koenker and Bassett, quantile regression has been gradually extended to a wide variety of data analytic settings including time series, survival analysis, and longitudinal data. By focusing attention on local slices of the conditional distribution of response variables it is capable of providing a more complete, more nuanced view of heterogeneous covariate effects. Applications of quantile regression can now be found throughout the sciences, including astrophysics, chemistry, ecology, economics, finance, genomics, medicine, and meteorology. Software for quantile regression is now widely available in all the major statistical computing environments. The objective of this volume is to provide a comprehensive review of recent developments of quantile regression methodology illustrating its applicability in a wide range of scientific settings. The intended audience of the volume is researchers and graduate students across a diverse set of disciplines.
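The asymmetric linear loss mentioned in this abstract can be illustrated with a minimal sketch. The function names and the toy data below are hypothetical, and the brute-force search over data points stands in for the linear-programming solvers that real quantile regression software uses.

```python
def check_loss(residual, tau):
    """Asymmetric linear loss: tau * u for u >= 0, (tau - 1) * u for u < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_loss(data, candidate, tau):
    """Sum of check losses when every observation is fitted by one constant."""
    return sum(check_loss(y - candidate, tau) for y in data)


def best_constant(data, tau):
    """Minimize the total check loss over constants.

    The objective is convex and piecewise linear, so a minimizer
    can always be found among the observed data points.
    """
    return min(sorted(set(data)), key=lambda c: total_loss(data, c, tau))


# Toy data (illustrative only) with one large outlier.
data = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_constant(data, 0.5)  # tau = 0.5 gives median regression
upper_fit = best_constant(data, 0.9)   # tau = 0.9 targets the upper tail
```

With tau = 0.5 the loss is proportional to the absolute residual, so the fit is the sample median and is unaffected by the outlier; raising tau penalizes under-prediction more heavily and moves the fit toward the upper tail.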
Bootstrap confidence sets under model misspecification by Mayya Zhilova (file)
1 edition published in 2015 in English and held by 18 libraries worldwide
Identification and Efficient Semiparametric Estimation of a Dynamic Discrete Game by Patrick L Bajari (Book)
5 editions published in 2015 in English and held by 7 libraries worldwide
In this paper, we study the identification and estimation of a dynamic discrete game allowing for discrete or continuous state variables. We first provide a general nonparametric identification result under the imposition of an exclusion restriction on agent payoffs. Next we analyze large sample statistical properties of nonparametric and semiparametric estimators for the econometric dynamic game model. We also show how to achieve semiparametric efficiency of dynamic discrete choice models using a sieve based conditional moment framework. Numerical simulations are used to demonstrate the finite sample properties of the dynamic game estimators. An empirical application to the dynamic demand of the potato chip market shows that this technique can provide a useful tool to distinguish long term demand from short term demand by heterogeneous consumers
Improving point and interval estimates of monotone functions by rearrangement by Victor Chernozhukov (Book)
2 editions published between 2007 and 2008 in English and held by 4 libraries worldwide
Suppose that a target function ... is monotonic, namely, weakly increasing, and an original estimate of this target function is available, which is not weakly increasing. Many common estimation methods used in statistics produce such estimates. We show that these estimates can always be improved with no harm using rearrangement techniques: The rearrangement methods, univariate and multivariate, transform the original estimate to a monotonic estimate, and the resulting estimate is closer to the true curve in common metrics than the original estimate. The improvement property of the rearrangement also extends to the construction of confidence bands for monotone functions. Suppose we have the lower and upper endpoint functions of a simultaneous confidence interval that covers the target function with a prespecified probability level, then the rearranged confidence interval, defined by the rearranged lower and upper endpoint functions, is shorter in length in common norms than the original interval and covers the target function with probability greater or equal to the prespecified level. We illustrate the results with a computational example and an empirical example dealing with age-height growth charts. Keywords: Monotone function, improved estimation, improved inference, multivariate rearrangement, univariate rearrangement, Lorentz inequalities, growth chart, quantile regression, mean regression, series, locally linear, kernel methods. JEL Classifications: 62G08, 46F10, 62F35, 62P10
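The improvement property described in this abstract can be sketched for the simplest case. The sketch assumes an estimate evaluated on an equally spaced grid, where univariate rearrangement amounts to sorting the fitted values; the target function, the noise pattern, and all names are hypothetical.

```python
def rearrange(values):
    """Univariate rearrangement on an equally spaced grid: sort the fitted values."""
    return sorted(values)


def l2_error(estimate, target):
    """Euclidean distance between two curves evaluated on the same grid."""
    return sum((e - t) ** 2 for e, t in zip(estimate, target)) ** 0.5


# Hypothetical weakly increasing target on [0, 1].
grid = [i / 10 for i in range(11)]
target = [x ** 2 for x in grid]

# Hypothetical non-monotone estimate: the target plus an alternating error.
wiggle = [0.15, -0.15] * 5 + [0.15]
original = [t + w for t, w in zip(target, wiggle)]

rearranged = rearrange(original)
error_before = l2_error(original, target)
error_after = l2_error(rearranged, target)
```

Here `rearranged` is weakly increasing by construction and `error_after` is smaller than `error_before`; because the target is itself sorted, sorting the estimate cannot increase the distance to it.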
Inference on parameter sets in econometric models by Victor Chernozhukov (Book)
2 editions published in 2006 in English and held by 4 libraries worldwide
This paper provides confidence regions for minima of an econometric criterion function Q([mu]). The minima form a set of parameters, [Theta]I, called the identified set. In economic applications, [Theta]I represents a class of economic models that are consistent with the data. Our inference procedures are criterion function based and so our confidence regions, which cover [Theta]I with a prespecified probability, are appropriate level sets of Qn([mu]), the sample analog of Q([mu]). When [Theta]I is a singleton, our confidence sets reduce to the conventional confidence regions based on inverting the likelihood or other criterion functions. We show that our procedure is valid under general yet simple conditions, and we provide a feasible resampling procedure for implementing the approach in practice. We then show that these general conditions hold in a wide class of parametric econometric models. In order to verify the conditions, we develop methods of analyzing the asymptotic behavior of econometric criterion functions under set identification and also characterize the rates of convergence of the confidence regions to the identified set. We apply our methods to regressions with interval data and set identified method of moments problems. We illustrate our methods in an empirical Monte Carlo study based on Current Population Survey data. Keywords: Set estimator, level sets, interval regression, subsampling bootstrap. JEL Classifications: C13, C14, C21, C41, C51, C53
Conditional extremes and near-extremes : concepts, asymptotic theory, and economic applications by Victor Chernozhukov (Archival Material)
3 editions published between 2000 and 2001 in English and held by 4 libraries worldwide
This paper develops a theory of high and low (extremal) quantile regression: the linear models, estimation, and inference. In particular, the models coherently combine the convenient, flexible linearity with the extreme-value-theoretic restrictions on tails and the general heteroscedasticity forms. Within these models, the limit laws for extremal quantile regression statistics are obtained under the rank conditions (experiments) constructed to reflect the extremal or rare nature of tail events. An inference framework is discussed. The results apply to cross-section (and possibly dependent) data. The applications, ranging from the analysis of babies' very low birth weights, (S, s) models, tail analysis in heteroscedastic regression models, outlier-robust inference in auction models, and decision-making under extreme uncertainty, provide the motivation and applications of this theory. Keywords: Quantile regression, extreme value theory, tail analysis, (S, s) models, auctions, price search, Extreme Risk. JEL Classifications: C13, C14, C21, C41, C51, C53, D21, D44, D81
Posterior inference in curved exponential families under increasing dimensions by Alexandre Belloni (Book)
2 editions published between 2007 and 2013 in English and held by 3 libraries worldwide
In this work we study the large sample properties of the posterior-based inference in the curved exponential family under increasing dimension. The curved structure arises from the imposition of various restrictions, such as moment restrictions, on the model, and plays a fundamental role in various branches of data analysis. We establish conditions under which the posterior distribution is approximately normal, which in turn implies various good properties of estimation and inference procedures based on the posterior. We also discuss the multinomial model with moment restrictions, which arises in a variety of econometric applications. In our analysis, both the parameter dimension and the number of moments are increasing with the sample size. Keywords: Bayesian Inference, Frequentist Properties. JEL Classifications: C13, C51, C53, D11, D21, D44
Conditional value-at-risk : aspects of modeling and estimation by Victor Chernozhukov (Book)
2 editions published in 2000 in English and held by 3 libraries worldwide
This paper considers flexible conditional (regression) measures of market risk. Value-at-Risk modeling is cast in terms of the quantile regression function, the inverse of the conditional distribution function. A basic specification analysis relates its functional forms to the benchmark models of returns and asset pricing. We stress important aspects of measuring very high and intermediate conditional risk. An empirical application illustrates. Keywords: Conditional Quantiles, Quantile Regression, Extreme Quantiles, Extreme Value Theory, Extreme Risk. JEL Classifications: C14, C13, C21, C51, C53, G12, G19
Local identification of nonparametric and semiparametric models (Computer File)
2 editions published in 2011 in English and held by 2 libraries worldwide
In parametric models a sufficient condition for local identification is that the vector of moment conditions is differentiable at the true parameter with full rank derivative matrix. We show that there are corresponding sufficient conditions for nonparametric models. A nonparametric rank condition and differentiability of the moment conditions with respect to a certain norm imply local identification. It turns out these conditions are slightly stronger than needed and are hard to check, so we provide weaker and more primitive conditions. We extend the results to semiparametric models. We illustrate the sufficient conditions with endogenous quantile and single index examples. We also consider a semiparametric habit-based, consumption capital asset pricing model. There we find the rank condition is implied by an integral equation of the second kind having a one-dimensional null space.
Inference on counterfactual distributions by Victor Chernozhukov (Book)
2 editions published in 2008 in English and held by 2 libraries worldwide
In this paper we develop procedures for performing inference in regression models about how potential policy interventions affect the entire marginal distribution of an outcome of interest. These policy interventions consist of either changes in the distribution of covariates related to the outcome holding the conditional distribution of the outcome given covariates fixed, or changes in the conditional distribution of the outcome given covariates holding the marginal distribution of the covariates fixed. Under either of these assumptions, we obtain uniformly consistent estimates and functional central limit theorems for the counterfactual and status quo marginal distributions of the outcome as well as other function-valued effects of the policy, including, for example, the effects of the policy on the marginal distribution function, quantile function, and other related functionals. We construct simultaneous confidence sets for these functions; these sets take into account the sampling variation in the estimation of the relationship between the outcome and covariates. Our procedures rely on, and our theory covers, all main regression approaches for modeling and estimating conditional distributions, focusing especially on classical, quantile, duration, and distribution regressions. Our procedures are general and accommodate both simple unitary changes in the values of a given covariate as well as changes in the distribution of the covariates or the conditional distribution of the outcome given covariates of general form. We apply the procedures to examine the effects of labor market institutions on the U.S. wage distribution. Keywords: Policy effects, counterfactual distribution, quantile regression, duration regression, distribution regression. JEL Classifications: C14, C21, C41, J31, J71
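The first type of intervention in this abstract (changing the covariate distribution while holding the conditional distribution of the outcome fixed) can be sketched with a single discrete covariate. The group labels, wage values, and shares below are invented for illustration, and empirical CDFs stand in for the regression-based conditional distribution estimators the abstract refers to.

```python
def ecdf(sample):
    """Empirical distribution function of a sample."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda y: sum(1 for v in ordered if v <= y) / n


def marginal_cdf(cond_cdfs, shares):
    """Mix conditional CDFs F(y | x) using covariate shares P(X = x)."""
    return lambda y: sum(shares[g] * cond_cdfs[g](y) for g in shares)


# Hypothetical wage samples for two covariate groups.
wages = {"low": [8.0, 10.0, 12.0, 14.0], "high": [12.0, 16.0, 20.0, 24.0]}
cond = {group: ecdf(sample) for group, sample in wages.items()}

# Status quo: 75% of units are in the "low" group.
status_quo = marginal_cdf(cond, {"low": 0.75, "high": 0.25})
# Counterfactual: shift the covariate distribution to 50/50, keeping F(y | x) fixed.
counterfactual = marginal_cdf(cond, {"low": 0.5, "high": 0.5})

effect_at_12 = counterfactual(12.0) - status_quo(12.0)
```

In this toy setup the share of outcomes at or below 12 falls from 0.625 to 0.5, so `effect_at_12` is -0.125. The sketch covers point estimates only; the uniform inference that the abstract describes is not attempted.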
Inference on treatment effects after selection amongst high-dimensional controls by Alexandre Belloni (Book)
1 edition published in 2012 in English and held by 1 library worldwide
We propose robust methods for inference on the effect of a treatment variable on a scalar outcome in the presence of very many controls. Our setting is a partially linear model with possibly non-Gaussian and heteroscedastic disturbances where the number of controls may be much larger than the sample size. To make informative inference feasible, we require the model to be approximately sparse; that is, we require that the effect of confounding factors can be controlled for up to a small approximation error by conditioning on a relatively small number of controls whose identities are unknown. The latter condition makes it possible to estimate the treatment effect by selecting approximately the right set of controls. We develop a novel estimation and uniformly valid inference method for the treatment effect in this setting, called the "post-double-selection" method. Our results apply to Lasso-type methods used for covariate selection as well as to any other model selection method that is able to find a sparse model with good approximation properties. The main attractive feature of our method is that it allows for imperfect selection of the controls and provides confidence intervals that are valid uniformly across a large class of models. In contrast, standard post-model selection estimators fail to provide uniform inference even in simple cases with a small, fixed number of controls. Thus our method resolves the problem of uniform inference after model selection for a large, interesting class of models. We illustrate the use of the developed methods with numerical simulations and an application to the effect of abortion on crime rates. Keywords: treatment effects, partially linear model, high-dimensional-sparse regression, inference under imperfect model selection, uniformly valid inference after model selection. JEL Classification: C10, C51
Conditional quantile processes based on series or many regressors by Alexandre Belloni (Book)
1 edition published in 2011 in English and held by 1 library worldwide
Quantile regression (QR) is a principal regression method for analyzing the impact of covariates on outcomes. The impact is described by the conditional quantile function and its functionals. In this paper we develop the nonparametric QR series framework, covering many regressors as a special case, for performing inference on the entire conditional quantile function and its linear functionals. In this framework, we approximate the entire conditional quantile function by a linear combination of series terms with quantile-specific coefficients and estimate the function-valued coefficients from the data. We develop large sample theory for the empirical QR coefficient process, namely we obtain uniform strong approximations to the empirical QR coefficient process by conditionally pivotal and Gaussian processes, as well as by gradient and weighted bootstrap processes. We apply these results to obtain estimation and inference methods for linear functionals of the conditional quantile function, such as the conditional quantile function itself, its partial derivatives, average partial derivatives, and conditional average partial derivatives. Specifically, we obtain uniform rates of convergence, large sample distributions, and inference methods based on strong pivotal and Gaussian approximations and on gradient and weighted bootstraps. All of the above results are for function-valued parameters, holding uniformly in both the quantile index and in the covariate value, and covering the pointwise case as a byproduct. If the function of interest is monotone, we show how to use monotonization procedures to improve estimation and inference. We demonstrate the practical utility of these results with an empirical example, where we estimate the price elasticity function of the individual demand for gasoline, as indexed by the individual unobserved propensity for gasoline consumption. Keywords: quantile regression series processes, uniform inference. JEL Classifications: C12, C13, C14
Program evaluation with high-dimensional data (file)
1 edition published in 2013 in English and held by 1 library worldwide
We consider estimation of policy relevant treatment effects in a data-rich environment where there may be many more control variables available than there are observations. In addition to allowing many control variables, the setting we consider allows heterogeneous treatment effects, endogenous receipt of treatment, and function-valued outcomes. To make informative inference possible, we assume that reduced form predictive relationships are approximately sparse. That is, we require that the relationship between the covariates and the outcome, treatment status, and instrument status can be captured up to a small approximation error using a small number of controls whose identities are unknown to the researcher. This condition allows estimation and inference for a wide variety of treatment parameters to proceed after selection of an appropriate set of control variables formed by selecting controls separately for each reduced form relationship and then appropriately combining this set of reduced form predictive models and associated selected controls. We provide conditions under which post-selection inference is uniformly valid across a wide range of models and show that a key condition underlying uniform validity of post-selection inference allowing for imperfect model selection is the use of approximately unbiased estimating equations. We illustrate the use of the proposed treatment effect estimation methods with an application to estimating the effect of 401(k) participation on accumulated assets.
Double/debiased machine learning for treatment and structural parameters by Victor Chernozhukov (file)
2 editions published in 2017 in English and held by 0 libraries worldwide
We revisit the classic semiparametric problem of inference on a low dimensional parameter [theta]_0 in the presence of high-dimensional nuisance parameters [eta]_0. We depart from the classical setting by allowing for [eta]_0 to be so high-dimensional that the traditional assumptions, such as Donsker properties, that limit complexity of the parameter space for this object break down. To estimate [eta]_0, we consider the use of statistical or machine learning (ML) methods which are particularly well-suited to estimation in modern, very high-dimensional cases. ML methods perform well by employing regularization to reduce variance and trading off regularization bias with overfitting in practice. However, both regularization bias and overfitting in estimating [eta]_0 cause a heavy bias in estimators of [theta]_0 that are obtained by naively plugging ML estimators of [eta]_0 into estimating equations for [theta]_0. This bias results in the naive estimator failing to be N^(-1/2) consistent, where N is the sample size. We show that the impact of regularization bias and overfitting on estimation of the parameter of interest [theta]_0 can be removed by using two simple, yet critical, ingredients: (1) using Neyman-orthogonal moments/scores that have reduced sensitivity with respect to nuisance parameters to estimate [theta]_0, and (2) making use of cross-fitting which provides an efficient form of data-splitting. We call the resulting set of methods double or debiased ML (DML). We verify that DML delivers point estimators that concentrate in a N^(-1/2)-neighborhood of the true parameter values and are approximately unbiased and normally distributed, which allows construction of valid confidence statements.
The generic statistical theory of DML is elementary and simultaneously relies on only weak theoretical requirements which will admit the use of a broad array of modern ML methods for estimating the nuisance parameters such as random forests, lasso, ridge, deep neural nets, boosted trees, and various hybrids and ensembles of these methods. We illustrate the general theory by applying it to provide theoretical properties of DML applied to learn the main regression parameter in a partially linear regression model, DML applied to learn the coefficient on an endogenous variable in a partially linear instrumental variables model, DML applied to learn the average treatment effect and the average treatment effect on the treated under unconfoundedness, and DML applied to learn the local average treatment effect in an instrumental variables setting. In addition to these theoretical applications, we also illustrate the use of DML in three empirical examples
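The two ingredients named in this abstract, an orthogonal score and cross-fitting, can be sketched for the partially linear regression model. Everything below is an assumption made for illustration: the data are simulated, the function names are hypothetical, and a simple binned-mean regression stands in for the ML learners (random forests, lasso, and so on) that the abstract has in mind.

```python
import math
import random


def bin_means(xs, ys, n_bins):
    """Binned-mean regression: average y within equal-width bins of x on [0, 1)."""
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for xi, yi in zip(xs, ys):
        b = min(int(xi * n_bins), n_bins - 1)
        sums[b] += yi
        counts[b] += 1
    overall = sum(ys) / len(ys)  # fallback prediction for empty bins
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda xi: means[min(int(xi * n_bins), n_bins - 1)]


def dml_plr(x, d, y, n_folds=2, n_bins=20):
    """Cross-fitted partialling-out estimate of theta in y = theta*d + g(x) + e."""
    n = len(y)
    num = den = 0.0
    for k in range(n_folds):
        hold = [i for i in range(n) if i % n_folds == k]
        train = [i for i in range(n) if i % n_folds != k]
        # Nuisance functions are fitted on the training folds only.
        ell_hat = bin_means([x[i] for i in train], [y[i] for i in train], n_bins)
        m_hat = bin_means([x[i] for i in train], [d[i] for i in train], n_bins)
        # Residualize on the held-out fold and accumulate the orthogonal score.
        for i in hold:
            v = d[i] - m_hat(x[i])    # treatment residual
            u = y[i] - ell_hat(x[i])  # outcome residual
            num += v * u
            den += v * v
    return num / den


# Simulated data (illustrative): the true coefficient theta0 is 0.5.
rng = random.Random(0)
theta0, n = 0.5, 4000
x = [rng.random() for _ in range(n)]
d = [xi ** 2 + rng.gauss(0, 1) for xi in x]
y = [theta0 * di + math.sin(3 * xi) + rng.gauss(0, 1) for xi, di in zip(x, d)]
theta_hat = dml_plr(x, d, y)
```

Each observation's residuals are computed from nuisance fits that never saw that observation, which is the role cross-fitting plays here. The residual-on-residual regression is one Neyman-orthogonal score for this model; the abstract covers several other models that this sketch does not attempt.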
Censored Quantile Instrumental Variable Estimation with Stata by Victor Chernozhukov (file)
2 editions published in 2018 in English and held by 0 libraries worldwide
Many applications involve a censored dependent variable and an endogenous independent variable. Chernozhukov, Fernandez-Val, and Kowalski (2015) introduced a censored quantile instrumental variable estimator (CQIV) for use in those applications, which has been applied by Kowalski (2016), among others. In this article, we introduce a Stata command, cqiv, that simplifies application of the CQIV estimator in Stata. We summarize the CQIV estimator and algorithm, we describe the use of the cqiv command, and we provide empirical examples.

Alternative Names
Chernozhukov, V.
Chernozhukov, Victor Victorovich
