Summary of Applied Empirical Modeling of Nonlinearity and Endogeneity in Regression Models SS 2018, Ronny Behrens

Empirical problems often do not fit the modeling assumptions of Ordinary Least Squares (OLS) estimation. This workshop looks at two specific scenarios that are common in applied work: (1) nonlinearities in dependent and independent variables and (2) instrumental variable techniques for dealing with endogeneity and non-random sample selection. The goal of this workshop is to provide researchers with tools that address some of the inadequacies of traditional OLS estimation in each setting.

We begin by looking at different nonlinear approaches to modeling discrete choice. We also extend the theme of nonlinearity to the independent variable side by discussing the interpretation of interaction effects in traditional OLS. Then we consider different instrumental variable strategies to deal with the problem of endogeneity. Finally, we combine the themes of nonlinearity and instrumental variables by considering selection models to deal with non-random samples.

By the end of the workshop, you should be able to understand and (equally important) run Stata code to model dichotomous dependent variables with logit and probit estimations, perform instrumental variable estimation and accompanying tests of instrument exogeneity and relevance, and estimate models controlling for selection bias.

To achieve these goals, the course is structured around the following questions and topics:

Thursday Morning: Dichotomous Dependent Variables and Probit/Logit Estimation – Part 1

What are dichotomous dependent variables and why can’t we use regular OLS?
What is a "link" function? How do probit and logit link to probability?
How do probit/logit improve on OLS? How do you interpret probit/logit coefficients?
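
As a rough illustration of the link-function idea raised in these questions, the sketch below maps the same linear index to a probability in three ways: directly (the linear probability model that OLS implies), through the logistic CDF (logit), and through the standard normal CDF (probit). It is written in plain Python rather than Stata so that it runs without extra software. The coefficients b0 and b1 and the x values are made up for illustration and do not come from the workshop materials.

```python
import math

def linear_probability(index):
    """Linear probability model: the index itself is read as a probability."""
    return index

def logit_probability(index):
    """Inverse logit link: logistic CDF evaluated at the index."""
    return 1.0 / (1.0 + math.exp(-index))

def probit_probability(index):
    """Inverse probit link: standard normal CDF evaluated at the index."""
    return 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))

# Hypothetical coefficients, chosen only to show the contrast.
b0, b1 = 0.5, 0.3

for x in (-3.0, 0.0, 3.0):
    index = b0 + b1 * x
    print(
        f"x={x:+.1f}  linear={linear_probability(index):.3f}  "
        f"logit={logit_probability(index):.3f}  "
        f"probit={probit_probability(index):.3f}"
    )
```

With these made-up numbers the linear index falls below 0 at x = -3 and above 1 at x = 3, so it cannot be read as a probability there, while both link functions always return values strictly between 0 and 1.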

Thursday Afternoon: Dichotomous Dependent Variables and Probit/Logit Estimation – Part 2

How do probit/logit coefficients compare to each other and to OLS? How do you determine marginal effects in probit/logit?
What’s the major difference between probit and logit?
How do you test hypotheses in probit/logit? How do you determine which model is "better"?
(time permitting) What about heteroskedasticity?
Review probit and/or logit in action (Geyskens et al 2015; Keller et al 2016; Liu et al 2016; Mantin and Eran 2016).
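
To make the marginal-effects question concrete, here is a minimal sketch in plain Python of the standard formulas for a continuous regressor: in logit the marginal effect is the coefficient times p(1 - p), and in probit it is the coefficient times the standard normal density at the linear index. The coefficient and index values are hypothetical.

```python
import math

def logit_marginal_effect(coef, index):
    """dP/dx in a logit model: coef * p * (1 - p), with p the logistic CDF."""
    p = 1.0 / (1.0 + math.exp(-index))
    return coef * p * (1.0 - p)

def probit_marginal_effect(coef, index):
    """dP/dx in a probit model: coef * standard normal density at the index."""
    density = math.exp(-0.5 * index ** 2) / math.sqrt(2.0 * math.pi)
    return coef * density

# Hypothetical coefficient; the same value is used in both models on purpose.
coef = 1.0

for index in (0.0, 1.0, 2.0):
    print(
        f"index={index:.1f}  logit ME={logit_marginal_effect(coef, index):.4f}  "
        f"probit ME={probit_marginal_effect(coef, index):.4f}"
    )
```

Unlike an OLS slope, the effect depends on where it is evaluated and shrinks as the index moves away from zero. At an index of zero the logit effect is coef/4 and the probit effect is roughly 0.4 times coef, which is one reason raw logit coefficients tend to be larger than probit coefficients fitted to the same data.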

Friday Morning: Nonlinearities on the Right Hand Side and Possible Multicollinearity

How do we incorporate and interpret interaction effects in our regression models?
What is the main effect vs. simple effect? How can Stata help us to determine marginal effects when interactions are present?
What is multicollinearity? How will it impact our estimations? What are some strategies to measure and deal with possible multicollinearity?
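
The interaction questions can be sketched with a small example. Assuming a model of the form y = b0 + b1*x + b2*z + b3*x*z, the effect of x is not the single number b1 but b1 + b3*z, so it has to be evaluated at chosen values of z. The coefficients below are hypothetical and the code is plain Python, not the Stata code used in the workshop.

```python
def predict(b0, b1, b2, b3, x, z):
    """Fitted value from a linear model with an x-by-z interaction."""
    return b0 + b1 * x + b2 * z + b3 * x * z

def simple_effect_of_x(b1, b3, z):
    """Effect of a one-unit change in x at a given value of the moderator z."""
    return b1 + b3 * z

# Hypothetical coefficients, for illustration only.
b0, b1, b2, b3 = 1.0, 2.0, 0.5, -0.5

for z in (0.0, 2.0, 4.0):
    print(f"z={z:.1f}  effect of x={simple_effect_of_x(b1, b3, z):.2f}")
```

Here b1 is the effect of x only when z equals zero; with these numbers the effect falls to zero at z = 4. In Stata, a comparable calculation is typically obtained by running margins with the dydx() and at() options after a regression that uses factor-variable notation such as c.x##c.z.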

Friday Afternoon: Endogeneity and Instrumental Variables – Part 1

What is ‘endogeneity’? What are some possible reasons why an independent variable could be endogenous? What impact does endogeneity have on OLS estimates?
What is an ‘instrumental variable’? What are the two important requirements for an instrumental variable? Given these requirements, how can instrumental variables address the problem of endogeneity?
What is Two-Stage Least Squares (2SLS) and how does it compare to OLS?
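
A minimal sketch of the 2SLS idea, assuming one endogenous regressor x and one instrument z: the first stage regresses x on z, and the second stage regresses y on the fitted values from the first stage. The tiny data set is constructed by hand (it is not from the workshop) so that the error u is correlated with x but exactly uncorrelated with z, and the true slope is 2.

```python
def ols(x, y):
    """Bivariate OLS of y on x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def two_stage_least_squares(z, x, y):
    """2SLS with a single instrument z for a single endogenous regressor x."""
    a0, a1 = ols(z, x)                     # first stage: x on z
    x_hat = [a0 + a1 * zi for zi in z]     # part of x explained by z
    return ols(x_hat, y)                   # second stage: y on fitted x

# Constructed example: u is the unobserved error.
z = [-1.0, -1.0, 1.0, 1.0]                 # instrument, uncorrelated with u
u = [-1.0, 1.0, -1.0, 1.0]                 # error term
x = [zi + ui for zi, ui in zip(z, u)]      # x depends on u, so it is endogenous
y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]  # true intercept 1, slope 2

print("OLS  (intercept, slope):", ols(x, y))
print("2SLS (intercept, slope):", two_stage_least_squares(z, x, y))
```

In this constructed sample OLS overstates the slope because x and u move together, while 2SLS recovers the true value. Running the second stage by hand like this gives the right point estimate but not the right standard errors, which is one reason to rely on a dedicated IV routine in practice.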

Monday Morning: Endogeneity and Instrumental Variables – Part 2

How do you test for ‘endogeneity’? How do you test the ‘overidentifying restrictions’? How are overidentifying restrictions interpreted?
How do you test for instrument relevance? What happens when instruments are ‘weak’?
2SLS vs. Generalized Method of Moments (GMM) estimation
(time permitting) Endogeneity and systems of equations – what is three-stage least squares (3SLS)?
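
One common way to look at instrument relevance is the first-stage F-statistic. The sketch below computes it for the simplest case of one instrument and one endogenous regressor, where it equals the squared t-statistic on the instrument in the first-stage regression. Both data sets are invented for illustration, and the threshold of 10 is only a widely cited rule of thumb, not a formal test.

```python
def first_stage_f(z, x):
    """F-statistic for the instrument in a first-stage regression of x on z."""
    n = len(z)
    mean_z = sum(z) / n
    mean_x = sum(x) / n
    szz = sum((zi - mean_z) ** 2 for zi in z)
    szx = sum((zi - mean_z) * (xi - mean_x) for zi, xi in zip(z, x))
    slope = szx / szz
    intercept = mean_x - slope * mean_z
    ssr = sum((xi - intercept - slope * zi) ** 2 for zi, xi in zip(z, x))
    residual_variance = ssr / (n - 2)      # two estimated parameters
    return slope ** 2 * szz / residual_variance

# Hypothetical data: the same instrument with a noisy and a tight first stage.
z = [-1.0, -1.0, 1.0, 1.0]
x_noisy = [-2.0, 0.0, 0.0, 2.0]
x_tight = [-2.1, -1.9, 1.9, 2.1]

for label, x in (("noisy", x_noisy), ("tight", x_tight)):
    f_stat = first_stage_f(z, x)
    verdict = "below" if f_stat < 10 else "above"
    print(f"{label}: F={f_stat:.1f} ({verdict} the rule-of-thumb value of 10)")
```

With several instruments the relevant statistic is the joint F-test on all excluded instruments, which this single-instrument sketch does not cover.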

Monday Afternoon: Endogeneity and Instrumental Variables – Part 3

Instrumental variables in practice. Two big problems: weak instruments and exogeneity (Murray 2006)
The state of instrumental variables in marketing (Rossi 2014)
The hunt for good instruments (will definitely talk about: Elberse 2010; Germann et al 2015; Levitt 1996; Levitt 1997; Petersen et al 2015) (possibly talk about: Geyskens et al 2015; Keller et al 2016; Liu et al 2016; Mantin and Eran 2016).
"Best Practices"

Tuesday Morning: Selection Bias and Heckman’s Correction

What is selection bias and how will it impact estimation? What is a ‘control function’?
Heckman’s correction
Selection bias in action (Germann et al 2015; Liu 2016; Allen et al 2016)
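
The control-function idea behind Heckman's correction can be sketched as follows: a probit selection equation yields a linear index for each observation, the inverse Mills ratio of that index is computed, and this ratio is added as an extra regressor in the outcome equation for the selected sample. The code below shows only the middle step, the inverse Mills ratio itself, in plain Python with hypothetical index values.

```python
import math

def normal_pdf(c):
    """Standard normal density."""
    return math.exp(-0.5 * c ** 2) / math.sqrt(2.0 * math.pi)

def normal_cdf(c):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(c / math.sqrt(2.0)))

def inverse_mills_ratio(index):
    """pdf(index) / cdf(index), the selection-correction term for selected units."""
    return normal_pdf(index) / normal_cdf(index)

# Hypothetical selection-equation indices (low to high chance of selection).
for index in (-1.0, 0.0, 1.0):
    print(f"index={index:+.1f}  inverse Mills ratio={inverse_mills_ratio(index):.4f}")
```

The ratio is largest for observations that were unlikely to be selected, so the correction term matters most for exactly those cases.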

Course in HIS-LSF

Instructor: Ronny Behrens

Semester: ST 2018

ePortfolio: No