Welcome to localprojections’s documentation!
- class localprojections.LP(df, maxlags=1, sample=None, endogvars=None, responsevars=None, shocks=None, ci=0.9, use_t=True, use_correction=True, interaction=None, timelevel=None, cluster=None, identification='cholesky', trend=0)
Bases: object
The LP class implements the local projections methodology of Jorda (2005) and the R package lpirfs.
- df : pandas.DataFrame
DataFrame containing the time series or panel data to be used in the estimation.
- maxlags : int, optional
Maximum number of lags to include in the model. The default is 1.
- sample : function, optional
Function that takes a pandas.DataFrame and returns a boolean array of the same length indicating whether to include each observation in the estimation. By default, all observations are included. The filter is supplied as a function because the user may want to exclude observations only after lags have been computed.
- endogvars : dict or list, optional
Lagged variables to include in the model. The default is None, in which case all variables are included. If a dict is supplied, the keys are the names of the variables and the values are the lags to include. Lags can be an integer n, in which case lags will be 1,...,n. Lags can also be a tuple (n,m), in which case lags will be n,...,m. Or they can be an exact list of lags. If a list is supplied, maxlags is used to determine the lags.
- responsevars : list or string, optional
List of variables to use as response variables. The default is None, in which case all endogvars are used.
- shocks : list or string, optional
List of variables to use as shocks. The default is None, in which case endogvars are used. If shocks are not supplied, identification must be supplied to indicate how reduced-form innovations in endogvars are transformed into structural shocks.
- ci : float, optional
Confidence level of the intervals shown in the impulse response plots. The default is 0.9 (90%).
- use_t : bool, optional
In the case of time series data, whether to use t-statistics instead of z-statistics. The default is True. This is ignored for panel data, which always uses t-statistics.
- use_correction : bool, optional
In the case of time series data, whether to use small sample correction for the standard errors. The default is True. This is ignored for panel data, which always uses small sample correction.
- interaction : string, optional
Name of the variable to use for interaction terms. The default is None, in which case no interaction terms are used. If a categorical variable is supplied, a separate IRF path is computed for each level. If a continuous variable is supplied, the base effect and interaction effect are computed.
- timelevel : string, optional
Name of the time index in the DataFrame. The default is None, in which case the last index level is assumed to be the time index.
- cluster : bool or string, optional
Whether to use clustered standard errors in panel estimation. If True, all index levels other than the time index are used for clustering. If a string, that variable is used. If None, Driscoll-Kraay standard errors with a Bartlett kernel are used with a bandwidth of horizon + 1. cluster is ignored for time series data, which always uses Newey-West standard errors with a bandwidth of horizon + 1.
- identification : string or 2D ndarray, optional
Method to use for identification if no shocks are supplied. The default is 'cholesky', in which case the Cholesky decomposition of the reduced-form covariance matrix is used, with the ordering given by endogvars. If a 2D ndarray is supplied, it is used as the identification matrix.
- trend : int, optional
Order of polynomial trend to include in the model. The default is 0, in which case no trend is included. If 1, a linear trend is included. If 2, a quadratic trend is included, etc.
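The lag specifications accepted by endogvars (integer, tuple, exact list, or a plain list of names combined with maxlags) can be illustrated with a small sketch. The helper names expand_lags and normalize_endogvars are hypothetical and are not part of localprojections; the code only mirrors the rules stated in the parameter description.

```python
def expand_lags(spec):
    """Expand one lag specification into an explicit list of lags."""
    if isinstance(spec, int):
        # Integer n means lags 1,...,n.
        return list(range(1, spec + 1))
    if isinstance(spec, tuple):
        # Tuple (n, m) means lags n,...,m.
        first, last = spec
        return list(range(first, last + 1))
    # Anything else is taken to be an exact list of lags.
    return list(spec)


def normalize_endogvars(endogvars, maxlags=1):
    """Return a dict mapping each variable name to its explicit lags."""
    if isinstance(endogvars, dict):
        return {name: expand_lags(spec) for name, spec in endogvars.items()}
    # A plain list of names: every variable gets lags 1,...,maxlags.
    return {name: expand_lags(maxlags) for name in endogvars}


from_dict = normalize_endogvars({"gdp": 2, "cpi": (2, 4), "rate": [1, 4]})
from_list = normalize_endogvars(["gdp", "cpi"], maxlags=3)
```

Here the variable names gdp, cpi and rate are made up for illustration.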
Jorda (2005) defines local projections as a series of separately estimated regressions where a shock at time \(t\) is used to predict the response variable at time \(t+h\) for \(h=0,1,...,H\).
Without exogenously identified shocks, the model is written as:
\[y_{t+h} = \alpha_{h} + \sum_{s=1}^p \beta_{h,s}' x_{t-s} + \epsilon_{t+h}, \qquad h=0,1,...,H \]
where \(y_{t+h}\) is the response variable at time \(t+h\) and \(x_{t-s}\) is the vector of endogenous variables at time \(t-s\) (potentially including \(y_{t-s}\)). The impulse response to the reduced-form shock is \(\hat{\beta}_{h,1}\). It is transformed into the response to a structural shock using the identification matrix \(B\) such that
\[I(h) = B \hat{\beta}_{h,1}\]
\(B\) is computed separately. By default, it will be estimated using the Cholesky decomposition of the residual covariance matrix of the equivalent VAR(p) model. Alternatively, the user can supply an identification matrix.
With exogenously identified shocks, the model is written as:
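\[y_{t+h} = \alpha_{h} + \beta_h' z_t + \sum_{s=1}^p \gamma_{h,s}' x_{t-s} + \epsilon_{t+h}, \qquad h=0,1,...,H \]
where \(z_t\) is the vector of exogenously identified shocks, so that \(\hat{\beta}_h\) is the impulse response at horizon \(h\). A panel version estimates the effect \(\beta_h\) of \(x_{i,t-1}\) on \(y_{i,t+h}\) using fixed effects \(\alpha_{i,h}\).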
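The regression with an exogenously identified shock can be sketched with plain NumPy for a single time series. This is an illustration of the equation only, not the package's implementation: the function local_projection_irf is hypothetical, it uses ordinary least squares without standard errors, and the simulated AR(1) data are an assumption chosen so that the true response at horizon h is known to be rho ** h.

```python
import numpy as np


def local_projection_irf(y, z, p, max_horizon):
    """OLS estimate of beta_h for h = 0..max_horizon, one regression per horizon."""
    T = len(y)
    irf = np.empty(max_horizon + 1)
    for h in range(max_horizon + 1):
        # Periods t that have p lags behind them and h leads ahead of them.
        t = np.arange(p, T - h)
        # Regressors: constant, shock z_t, and lags y_{t-1}, ..., y_{t-p}.
        X = np.column_stack(
            [np.ones(len(t)), z[t]] + [y[t - s] for s in range(1, p + 1)]
        )
        beta, *_ = np.linalg.lstsq(X, y[t + h], rcond=None)
        irf[h] = beta[1]  # coefficient on the shock z_t
    return irf


# Simulated data: y_t = rho * y_{t-1} + z_t with an observed shock z_t.
rng = np.random.default_rng(0)
T, rho = 20000, 0.5
z = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = rho * y[t - 1] + z[t]

irf = local_projection_irf(y, z, p=1, max_horizon=4)
```

Each horizon is a separate regression, so the estimated path is not constrained to decay geometrically even though the simulated process does.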
- design_matrices(rhs, responsevar)
This function generates the design matrices for the regression. It is used internally by the estimate and orthogonalize methods. It will always return the design matrices for a contemporaneous regression. Subsequent horizons are handled by the shift_lhs method. This is done for efficiency, since creating the design matrices can be quite slow.
- rhs : string
Right-hand side of the regression.
- responsevar : string
Response variable of the regression (LHS).
This function uses the patsy package to generate the design matrices. A future version may switch to formulaic, which is faster.
- dflhs : DataFrame
LHS design matrix
- dfrhs : DataFrame
RHS design matrix
- estimate(max_horizon, shock_size=None)
This method estimates the impulse response functions using local projections.
- max_horizon : int
Maximum horizon for which to compute the impulse response functions.
- shock_size : float, optional
Size of the shock. The default is 1 for identified shocks and one standard deviation for orthogonalized shocks.
- coefs : DataFrame
DataFrame containing the impulse response functions. The indices are:
  - response: response variable
  - impulse: impulse variable
  - horizon: horizon
  - interaction: interaction variable (if applicable)
The columns are:
  - coef: impulse response function
  - lb: lower bound of the confidence interval
  - ub: upper bound of the confidence interval
- regresults : dict
Nested dictionary containing the regression results. The keys are the horizons and the values are dictionaries containing the regression results for each spec.
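How the lb and ub columns relate to coef and the ci level can be sketched as follows. This is an assumption about the general construction (a symmetric interval around the point estimate), not a statement of the package's exact formula. The helper confidence_band is hypothetical, and it uses a normal critical value, which corresponds to z-statistics; with t-statistics a Student t quantile would replace it.

```python
from statistics import NormalDist


def confidence_band(coef, se, ci=0.9):
    """Symmetric interval coef +/- crit * se using a normal critical value."""
    # For ci = 0.9 the two-sided critical value is the 95th percentile.
    crit = NormalDist().inv_cdf(0.5 + ci / 2)
    return coef - crit * se, coef + crit * se


# Hypothetical point estimate and standard error at one horizon.
lb, ub = confidence_band(coef=0.8, se=0.1, ci=0.9)
```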
- estimate_var()
This function estimates a VAR model. It is used internally by the orthogonalize method to estimate the covariance matrix of the residuals, which is used to orthogonalize the shocks.
- const : DataFrame
DataFrame containing the constant term of the VAR model.
- outcoefs : dict
Dictionary containing the coefficients of the VAR model. The keys are:
  - coef: coefficients
  - lb: lower bound of the confidence interval
  - ub: upper bound of the confidence interval
Each dictionary value is a DataFrame with a row for each response variable and a column for each impulse variable and lag. df[y,(x,s)] represents the (y,x)th element of the sth lag of the VAR.
- Sigma : DataFrame
DataFrame containing the covariance matrix of the residuals.
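A minimal NumPy sketch of what estimating a VAR(p) by OLS involves is given below. It is not the package's code: estimate_var_sketch is a hypothetical name, it works on a plain array rather than a DataFrame, it returns no confidence bounds, and the degrees-of-freedom divisor used for Sigma is an assumption.

```python
import numpy as np


def estimate_var_sketch(X, p):
    """OLS VAR(p) with a constant. X has shape (T, k)."""
    T, k = X.shape
    Y = X[p:]
    # Regressors: constant, then the k variables at lag 1, lag 2, ..., lag p.
    Z = np.column_stack([np.ones(T - p)] + [X[p - s:T - s] for s in range(1, p + 1)])
    coefs, *_ = np.linalg.lstsq(Z, Y, rcond=None)  # shape (1 + k*p, k)
    resid = Y - Z @ coefs
    dof = (T - p) - Z.shape[1]
    Sigma = resid.T @ resid / dof  # residual covariance matrix
    const = coefs[0]
    # Row = response variable; column (s-1)*k + x = impulse variable x at lag s.
    A = coefs[1:].T
    return const, A, Sigma


# Simulated VAR(1) with a known coefficient matrix and identity shock covariance.
rng = np.random.default_rng(1)
T = 20000
A_true = np.array([[0.5, 0.1], [0.0, 0.3]])
X = np.zeros((T, 2))
for t in range(1, T):
    X[t] = A_true @ X[t - 1] + rng.standard_normal(2)

const, A_hat, Sigma = estimate_var_sketch(X, p=1)
```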
- gen_rhs(endogvars=None, shocks=None, responsevars=None, use_interaction=True)
This function generates the right-hand side of the model. It is used internally by the estimate and orthogonalize methods.
- endogvars : dict, optional
Lagged variables to include in the model. The default is None, in which case the object instance’s endogvars attribute is used.
- shocks : list, optional
List of variables to use as shocks. The default is None, in which case the object instance’s shocks attribute is used.
- responsevars : list, optional
List of variables to use as response variables. The default is None, in which case the object instance’s responsevars attribute is used.
- use_interaction : bool, optional
Whether to include interaction terms. The default is True, which means the instance’s interaction attribute is used. False means no interaction terms are used, whatever the instance’s interaction attribute is. This is useful for the orthogonalize method, which estimates a VAR model and isn’t compatible with interaction terms.
rhs : string
- orthogonalize(identification)
This function orthogonalizes the shocks using the supplied identification strategy.
- identification : string or ndarray
If identification is a string, it must be one of the following:
  - 'cholesky': use the Cholesky decomposition of the covariance matrix of the residuals.
Otherwise, identification must be a square matrix with the same number of rows as endogenous variables. The matrix must be invertible.
- ortho : function
Function that takes a vector of shocks and returns a vector of orthogonalized shocks.
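The Cholesky case can be sketched in NumPy as follows. The name make_ortho and the example covariance matrix are hypothetical, and the sketch assumes that orthogonalizing means mapping reduced-form innovations u to e = B^{-1} u, where B is the lower-triangular Cholesky factor of Sigma. How the package scales or orients its own ortho function is not specified here.

```python
import numpy as np


def make_ortho(Sigma):
    """Return the Cholesky factor B and a function mapping u to B^{-1} u."""
    B = np.linalg.cholesky(Sigma)  # lower triangular, B @ B.T equals Sigma

    def ortho(u):
        # Solve B e = u instead of forming the inverse explicitly.
        return np.linalg.solve(B, u)

    return B, ortho


# Hypothetical residual covariance matrix for two endogenous variables.
Sigma = np.array([[1.0, 0.5], [0.5, 2.0]])
B, ortho = make_ortho(Sigma)

# Applying ortho to the identity matrix recovers B^{-1}, so the covariance
# of the orthogonalized shocks is B^{-1} Sigma B^{-T}.
B_inv = ortho(np.eye(2))
cov_e = B_inv @ Sigma @ B_inv.T
```

Because B is lower triangular, the first variable in the ordering responds only to its own shock on impact, which is why the ordering of endogvars matters.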
- run_regression(dflhs, dfrhs, nwlags=0)
This function runs the regression for a given horizon. It is used internally by the estimate and orthogonalize methods.
- dflhs : DataFrame
LHS design matrix
- dfrhs : DataFrame
RHS design matrix
- nwlags : int, optional
Number of Newey-West lags to use. The default is 0, which means no Newey-West correction.
- out : DataFrame
DataFrame containing the regression results. The columns are:
  - params: regression coefficients
  - lb: lower bound of the confidence interval
  - ub: upper bound of the confidence interval
- resids : Series
Series containing the regression residuals.
fit : RegressionResults from OLS or PanelOLSResults from PanelOLS
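For readers unfamiliar with the Newey-West correction that nwlags controls, the textbook estimator with Bartlett weights can be sketched in NumPy. This is a generic illustration, not the package's code path: newey_west_cov is a hypothetical name, the data are random placeholders, and no small-sample correction is applied.

```python
import numpy as np


def newey_west_cov(X, resid, nlags):
    """HAC covariance of OLS coefficients with Bartlett kernel weights."""
    X = np.asarray(X, dtype=float)
    # Score contributions x_t * e_t, one row per observation.
    u = X * np.asarray(resid, dtype=float)[:, None]
    S = u.T @ u  # lag-0 term (heteroskedasticity-robust "meat")
    for lag in range(1, nlags + 1):
        w = 1.0 - lag / (nlags + 1.0)  # Bartlett weight
        G = u[lag:].T @ u[:-lag]       # sum of u_t u_{t-lag}'
        S += w * (G + G.T)
    XtX_inv = np.linalg.inv(X.T @ X)
    return XtX_inv @ S @ XtX_inv


# Placeholder regressors and residuals, purely for illustration.
rng = np.random.default_rng(2)
X = np.column_stack([np.ones(200), rng.standard_normal(200)])
resid = rng.standard_normal(200)

V0 = newey_west_cov(X, resid, nlags=0)  # reduces to the White (HC0) estimator
V3 = newey_west_cov(X, resid, nlags=3)
```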
- shift_lhs(dflhs, dfrhs, horizon=0)
This function shifts the LHS design matrix by the horizon. It is used internally by the estimate and orthogonalize methods to compute the impulse response functions at different horizons. After shifting, the resulting missing rows are removed.
- dflhs : DataFrame
LHS design matrix
- dfrhs : DataFrame
RHS design matrix
- horizon : int, optional
Horizon to shift the LHS design matrix. The default is 0, which means no shift.
This function creates copies of the design matrices because the caller may need to use the original matrices for other horizons. The copies could be avoided in a later refactoring if horizons are guaranteed to be incremented sequentially.
- dflhs : DataFrame
LHS design matrix shifted by the horizon.
- dfrhs : DataFrame
RHS design matrix shifted by the horizon.
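The shift-and-drop step can be sketched with pandas for a single time series. The function shift_lhs_sketch is hypothetical and simplifies the real method: it assumes both frames share the same index and ignores the panel case, where the shift would have to happen within each entity.

```python
import pandas as pd


def shift_lhs_sketch(dflhs, dfrhs, horizon=0):
    """Align y_{t+h} with the regressors at time t and drop incomplete rows."""
    # shift(-horizon) moves the value at t + horizon up to row t.
    lhs = dflhs.shift(-horizon)
    keep = lhs.notna().all(axis=1) & dfrhs.notna().all(axis=1)
    # Return copies so the caller's original matrices stay untouched.
    return lhs.loc[keep].copy(), dfrhs.loc[keep].copy()


dflhs = pd.DataFrame({"y": [1.0, 2.0, 3.0, 4.0, 5.0]})
dfrhs = pd.DataFrame({"x": [10.0, 20.0, 30.0, 40.0, 50.0]})
lhs2, rhs2 = shift_lhs_sketch(dflhs, dfrhs, horizon=2)
```

With horizon=2 the last two rows have no future value of y and are removed, so each remaining row pairs x at time t with y at time t+2.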
- localprojections.drop_singletons(df, level)
Find levels of a MultiIndex that have only one value and drop them.
- localprojections.fill_index_level(df, level=0)
Fill in missing values in a dataframe index. For MultiIndexes, this doesn’t necessarily generate a balanced panel. It just fills in missing times within the entity-specific range of times.
- df : DataFrame
DataFrame with a MultiIndex to fill.
- level : int or str, optional
Level of the index to fill. The default is 0.
- df : DataFrame
DataFrame with filled index.
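The distinction between filling gaps within each entity's own range and building a balanced panel can be illustrated with a pandas sketch. The function fill_times_within_entity is hypothetical, and it assumes a two-level (entity, time) index with integer time values; the library function's handling of dates or other index types is not covered.

```python
import pandas as pd


def fill_times_within_entity(df, entity="id", time="t"):
    """Insert missing integer time periods between each entity's first and last period."""
    pieces = []
    for key, group in df.groupby(level=entity):
        times = group.index.get_level_values(time)
        full = pd.MultiIndex.from_product(
            [[key], range(times.min(), times.max() + 1)], names=[entity, time]
        )
        # Rows that did not exist before are filled with NaN.
        pieces.append(group.reindex(full))
    return pd.concat(pieces)


idx = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 3), ("b", 5), ("b", 6)], names=["id", "t"]
)
df = pd.DataFrame({"v": [1.0, 3.0, 5.0, 6.0]}, index=idx)
filled = fill_times_within_entity(df)
```

Entity a gains the missing period 2, but entity b is not extended back to period 1, so the result is still unbalanced.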
- localprojections.flatten(nested_list)
Recursively flatten a nested list
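A minimal sketch of recursive flattening, assuming that only list objects count as nested containers. The name flatten_sketch is hypothetical, and the library function may treat tuples or other iterables differently.

```python
def flatten_sketch(nested_list):
    """Return a flat list containing the leaves of an arbitrarily nested list."""
    flat = []
    for item in nested_list:
        if isinstance(item, list):
            # Recurse into sublists and splice their leaves in order.
            flat.extend(flatten_sketch(item))
        else:
            flat.append(item)
    return flat


result = flatten_sketch([1, [2, [3, 4]], [], 5])
```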
- localprojections.lag(x, n=0, cxlevel=None)
Lag by n periods. If cxlevel is supplied, lag within each cross-section.
- x : Series, DataFrame, or ndarray
Object to lag.
- n : int, optional
Number of periods to lag. The default is 0. Negative numbers shift forward.
- cxlevel : str or list, optional
Name of the cross-section level(s) of the index. The default is None.
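The difference between a pooled lag and a lag within each cross-section can be sketched with pandas. The function lag_sketch is hypothetical and handles only Series and DataFrame inputs with a MultiIndex, not ndarrays.

```python
import pandas as pd


def lag_sketch(x, n=0, cxlevel=None):
    """Shift x by n periods, optionally within each cross-section."""
    if cxlevel is None:
        return x.shift(n)
    # Grouping by the cross-section level keeps values from leaking
    # across entities at the boundaries.
    return x.groupby(level=cxlevel).shift(n)


idx = pd.MultiIndex.from_product([["a", "b"], [1, 2, 3]], names=["id", "t"])
s = pd.Series([1.0, 2.0, 3.0, 10.0, 20.0, 30.0], index=idx)

pooled = lag_sketch(s, 1)                # the last value of "a" leaks into "b"
within = lag_sketch(s, 1, cxlevel="id")  # the first period of "b" is missing
lead = lag_sketch(s, -1, cxlevel="id")   # negative n shifts forward
```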
- localprojections.make_iterable(x, n=1)
Make an object iterable. If x is already iterable, return it. Otherwise:
  - if n is an integer, return a list of length n with each element equal to x
  - if n is a list, return a dictionary with keys n and values x
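The three cases can be sketched as follows. The name make_iterable_sketch is hypothetical, and treating strings as scalars rather than as iterables is an assumption of this sketch; the description does not say how the library handles them.

```python
from collections.abc import Iterable


def make_iterable_sketch(x, n=1):
    """Return x unchanged if iterable, else repeat it as a list or a dict."""
    # Assumption: strings are treated as scalars even though Python can iterate them.
    if isinstance(x, Iterable) and not isinstance(x, str):
        return x
    if isinstance(n, int):
        return [x] * n
    # n is a list of keys: map every key to the same value x.
    return {key: x for key in n}


already = make_iterable_sketch([1, 2])
repeated = make_iterable_sketch(3, n=2)
keyed = make_iterable_sketch(3, n=["a", "b"])
```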
- localprojections.make_polynomial(var, n, pre='', post='')
Create a formula for a polynomial of degree n in var, prepending or appending additional strings if needed.
- localprojections.plot_irf(dftmp, impulse=None, response=None, interaction=None, colorlevel='interaction', colorvalues=None, ax=None, colormap=None, legend_here=True, title_fcn=None)
Plot a single impulse response function. This function can be called directly by the user or by the plot_irfs function that plots a grid of IRFs.
- dftmp : DataFrame
DataFrame containing the impulse responses. The expected contents of the DataFrame depend on how the function is called.
  - If impulse and response are both supplied, then dftmp can contain multiple IRFs, but its indices must be, in order, impulse, response, horizon.
  - If impulse and response are not supplied, then dftmp must contain a single IRF.
  - In either case, the DataFrame must contain the following columns:
    - coef: impulse response function
    - lb: lower bound of the confidence interval
    - ub: upper bound of the confidence interval
If colorlevel is supplied, then the DataFrame must also contain a column with the name given by colorlevel.
- impulse : str, optional
Name of the impulse variable. The default is None, in which case the DataFrame must contain a single IRF.
- response : str, optional
Name of the response variable. The default is None, in which case the DataFrame must contain a single IRF.
- interaction : str, optional
Name of the interaction variable. The default is None.
- colorlevel : str, optional
Name of the column in dftmp that indexes the dimension that will be represented by color. The default is ‘interaction’.
- colorvalues : list, optional
List of values of colorlevel to plot. The default is None, in which case all values of the colorlevel column are plotted.
- ax : matplotlib Axes, optional
Axes on which to plot. The default is None, in which case a new figure and axes are created.
- colormap : list, optional
List of colors to use for each value of colorvalues. The default is None, in which case the default matplotlib color cycle is used.
- legend_here : bool, optional
Whether to plot the legend on the axes. The default is True.
- title_fcn : function, optional
Function that takes impulse, response, and interaction as arguments and returns a string to use as the title. The default is None, in which case the title is impulse on response.
- localprojections.plot_irfs(dfirf, impulses=None, responses=None, interactions=None, rows='impulse', columns='response', color='interaction', colormap=None, styles=None)
Plot a grid of impulse response functions.
- dfirf : DataFrame
DataFrame containing the impulse responses. It must contain the following indices:
  - impulse: name of the impulse variable
  - response: name of the response variable
  - horizon: forecast horizon
  - interaction: optional, name of the interaction variable
It must also contain the following columns:
  - coef: impulse response function
  - lb: lower bound of the confidence interval
  - ub: upper bound of the confidence interval
The first value returned by estimate() is a suitable DataFrame.
- impulses : list, optional
List of impulse variables to plot. The default is None, in which case all impulse variables in dfirf are plotted.
- responses : list, optional
List of response variables to plot. The default is None, in which case all response variables in dfirf are plotted.
- interactions : list, optional
List of interaction variables to plot. The default is None, in which case all interaction variables in dfirf are plotted.
- rows : str, optional
Name of the index level to use for the rows of the grid. The default is ‘impulse’.
- columns : str, optional
Name of the index level to use for the columns of the grid. The default is ‘response’.
- color : str, optional
Name of the index level to use for the color of the lines. The default is ‘interaction’.
- colormap : list, optional
List of colors to use for each value of colorvalues. The default is None, in which case the default matplotlib color cycle is used.
- styles : list, optional
NOT IMPLEMENTED YET: List of line styles to use for each value of colorvalues. The default is None, in which case the default matplotlib line styles are used.
- localprojections.set_lags(endogvars_list, maxlags)
Assign the same maximum lag to each endogenous variable. Returns a dict whose keys are the elements of endogvars_list and whose values all equal maxlags.
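The behaviour described above amounts to a one-line dictionary comprehension. The sketch below uses the hypothetical name set_lags_sketch and made-up variable names, and is meant only to show the shape of the returned dict.

```python
def set_lags_sketch(endogvars_list, maxlags):
    """Map every variable name to the same maximum lag."""
    return {name: maxlags for name in endogvars_list}


lags = set_lags_sketch(["gdp", "cpi", "rate"], maxlags=4)
```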