Topics in High-Dimensional Econometrics and ML Theory

Author: Marcelo Ortiz

Emory University, Spring 2024

Marcelo Ortiz-Villavicencio

About

Designed as a Directed Study (an intensive reading in econometrics on a topic not covered in a regular course at Emory University), this course covers a variety of topics in high-dimensional econometrics and machine learning theory. The course is based on student presentations and discussions among participants (students and invited faculty).

Course Description

This course aims to be an in-depth exploration of the theoretical foundations and practical applications of high-dimensional statistics and machine learning theory at the graduate level. Since this is a topics course, the content is based on the interests of the participants and the student presenter. The primary objective of this directed reading is to provide a comprehensive and self-contained overview of high-dimensional statistics, covering topics such as concentration inequalities, empirical processes, uniform laws, reproducing kernel Hilbert spaces, semiparametric theory, and double/debiased machine learning, together with their applications in causal inference.

Content

Topic 1: Concentration Inequalities

Motivating examples
Classical bounds

Topic 2: Uniform Law of Large Numbers

Uniform convergence
Rademacher complexity

Topic 3: Sparse Linear Models in High Dimensions

Different types of sparsity
Shrinkage estimators and regularizers
Regularization bias

Topic 4: Reproducing Kernel Hilbert Spaces

Hilbert spaces
Kernels and operations
Reproducing kernel Hilbert spaces
Kernel ridge regression

Topic 5: Semiparametric Efficiency Theory

Semiparametric efficiency
Efficient influence functions
Pathwise differentiability and distributional Taylor expansion

Topic 6: Double/Debiased Machine Learning

Neyman orthogonality
Sample splitting
Cross-fitting
Applications

References

The content of the course will be based on the following references:

High-Dimensional Statistics: A Non-Asymptotic Viewpoint by Martin Wainwright.
Lecture Notes for Machine Learning Theory (CS229M/STATS214) by Tengyu Ma.
Introduction to RKHS, and some simple kernel algorithms by Arthur Gretton.
Machine Learning for Econometrics by Christophe Gaillac and Jeremy L’Hour.

Acknowledgments

I thank Prof. David Jacho-Chavez for his guidance and for serving as Faculty Sponsor in the development of this course.