Understanding the MDFA-Package: a Step-by-Step (Line by Line) Introduction

The tutorial proposed in 1 replicated chapters 1-4 in MDFA. I'd like to briefly review the various functions available in the MDFA-package and discuss their handling (parametrization, data-feed).

Objectives

The R-files provided in 1 are: MDFA_Legacy_MSE.r and control_default.r. I'm now assuming that the files have been installed according to the instructions in 1 and that the examples are running. I now briefly go sequentially, from top to bottom, along the (code-) lines of the first file in order to discuss the various functions available in the MDFA-package as well as their parametrization and the (particular) way the data is fed to the estimation routines (the peculiarity has some deeply rooted explanation/motivation).

To be clear: the following R-code is copied from MDFA_Legacy_MSE.r. It is partitioned into thematic blocks which are the subject(s) of discussion in this (blog-)post.

Installation

No additional comments here, see 1 for reference.

### R code from vignette source 'C:/wia_desktop/2018/Projekte/MDFA-Legacy/Sweave/Rnw/MDFA_Legacy.Rnw'
### Encoding: ISO8859-1

###################################################
### code chunk number 1: init
###################################################
rm(list=ls())

###################################################
### code chunk number 2: init
###################################################
# Load packages: time series and xts
#library(tseries)
library(xts)
# Library for tables
library(Hmisc)
require(xtable)
#install.packages("devtools")
library(devtools)
# Load MDFA package from github
devtools::install_github("wiaidp/MDFA")
# MDFA package
library(MDFA)

path.pgm <- paste(getwd(),"/",sep="")

DFA: Univariate

The following functions in the MDFA-package correspond to the DFA (univariate filtering) extensively discussed in DFA (part of a series of lectures intended for engineering classes: the students receive the R-code generating the script in the Sweave environment). The available DFA-functions are:

per(): computes the periodogram of a time series
dfa_ms: computes (univariate) real-time (concurrent) filters based on the MSE-criterion
dfa_analytic: replicates dfa_ms (MSE) and extends the estimation to customized filters (ATS-trilemma discussed in The Trilemma Between Accuracy, Timeliness and Smoothness in Real-Time Signal Extraction)

###################################################
### code chunk number 6: dft
###################################################
head(per,100)

###################################################
### code chunk number 7: dfa_ms
###################################################
# This function computes MSE DFA solutions
# L is the length of the MA filter,
# periodogram is the frequency weighting function in the DFA
# Gamma is the transfer function of the symmetric filter (target) and
# Lag is the lag-parameter: Lag=0 implies real-time filtering, Lag=L/2
# implies symmetric filter
# The function returns optimal coefficients as well as the transfer
# function of the optimized real-time filter
head(dfa_ms,100)

###################################################
### code chunk number 8: dfa_ms
###################################################
head(dfa_analytic)

I skip further discussion of these functions because

The implementation is partially 'ad hoc': (too) short coding
The functions are replicated by the general multivariate MDFA-function(s) see consistency-check (ad hoc simplifications are now replaced by bomb-proof formal expressions)
The DFA functions cannot tackle overfitting (regularization), multivariate problems or unequally-sampled data, for example

The inclusion of the DFA (above functions) in the MDFA-package is not necessary but useful. The main reason for including them is teaching: the functions are much simpler (much shorter code) to introduce and to explain; and the main concepts of a 'Direct Filter Approach' (see what-is-a-direct-filter-approach) can be illustrated without undesirable (multivariate) distractions. Also, I know of some users outside who are working with these (or similar?) functions and the package now provides a reference for benchmarking. Be warned: I won't up-date this material in the future (which might be considered a blessing).

Feeding the Data

At this (early) stage, I have to distinguish time-domain and frequency-domain:

Mappings from one domain to the other are provided by Fourier and by inverse Fourier transforms, respectively, see DFA section 2.2 for R-coded examples.
The latter (frequency-domain) has some deeply-rooted 'structural advantage' over the former (time-domain) which allows MDFA to tackle a series of practically relevant estimation problems which cannot be addressed by classic time series approaches (immanent impossibility: search for 'forecast dilemma' in Advances in Signal Extraction and Forecasting and look for section 2.5 in The Trilemma Between Accuracy, Timeliness and Smoothness in Real-Time Signal Extraction).

Data (Time Domain)

The data must have the following (matrix-) structure:

The first column always corresponds to the target series

The concept of a 'target' typically supposes a time series that is not observable at current time (today, at the end of the sample). So we have to estimate it (the very purpose of DFA/MDFA).
As an example, the target could be tomorrow's realization (one-step or one-day ahead forecast). Or it could be a k-day ahead forecast, k>1. Or it could be a weighted combination of one- and k-days ahead forecasts (typical for signal extraction problems)
To be clear (avoid confusions): the series in the target-column (first column of data matrix) must be observable since we want to feed this data to DFA/MDFA. The effective target (forecast/signal extraction) will be computed at a later stage, withing the DFA/MDFA functions (after having fed the data). As an example, the parameter Lag in the function-calls can be used to shift the series in the target-column (Lag=-1 would correspond to a one-step ahead forecast, for example).

The other columns (columns 2,3,...) correspond to the explanatory data: this is used for estimating the target.

The MDFA-filter will be applied to the explanatory data, only

The series in the target column (first column) could also belong to the explanatory columns (duplicated). In such a case, it is assumed that the series appears in the second column of the data-matrix: the first column (target) is duplicated in the second column (the first explanatory column), see example below. There is some internal coding which relies on this assumption.
Note, however, that the series in the target column (first column) does not have to be part of the set of explanatory series. As an example, when tracking recession draw-downs in (US-) GDP I use classic (monthly) indicators as explanatory series but in general I won't include GDP as an explanatory series (because it is subject to revisions and to publication lags; not because it is quarterly: that can be fixed by the generic mixed-frequency extension...).
In a univariate setting (DFA) the data-matrix is always 2-dimensional and (it is assumed that) the target-series is also the explanatory series (duplication).

Hereunder is an example for organizing the data in a bivariate estimation problem: the first column is the target series, the second column is a duplicate of the first one and the third column is additional data, related to the target (bivariate design). If we skipped the last (third) column, then the corresponding T*2-dim matrix (with duplicated columns) would correspond to a univariate estimation problem.

###################################################
### code chunk number 9: dfa_ms
###################################################
set.seed(1)
len <- 100
target <- arima.sim(list(ar=0.9),n=len)
explanatory_2 <- target+rnorm(len)
explanatory <- cbind(target,explanatory_2)
x <- cbind(target,explanatory)
dimnames(x)[[2]] <- c("target","explanatory 1","explanatory 2")
head(x)

Note that we could shift the explanatory series relative to the target (or vice versa) as done next in the code:

###################################################
### code chunk number 10: dfa_ms
###################################################
x<-cbind(x[,1],lag(x[,2:3],-1))
dimnames(x)[[2]]<-c("target","lagged explanatory 1","lagged explanatory 2")
head(x)

This would correspond to a one-step ahead forecast framework. Note, however, that we would lose one data point (the last observations of the explanatory data). Moreover, typical signal extraction problems make use of the 'far away' future and we would loose lots of relevant information by shifting the data accordingly.

So the point is: Don't shift the data!

If shifting is required, it will be done at a later stage, in the MDFA functions, by suitable parameter settings, and no observations will be lost, whatever the target specification (because the data will be 'rotated' instead of 'shifted'...).

Data (Frequency Domain)

Now... the above data-matrix is not fed directly to MDFA: it has to be transformed into the frequency-domain before!

Here's some background on the topic:

The criteria are implemented in the frequency-domain

because tracking the filter error 'directly' is much simpler in the frequency-domain and, more importantly, because the ATS-trilemma could not be derived in the time-domain
the key 'structural' advantage of the frequency-domain (over the time-domain) is that convolutions (of filters) can be disentangled into products (of transfer functions)
that's the secret of DFA: it makes use of a structural elegance...

The data enters the criteria in a 'frequency-domain'-form (not in the original time-domain form). We are talking about a spectrum (for example DFT, periodogram, hilbert-spectrum, model-based spectrum,...)

Search for "(MSE-) Criterion" in Advances in Signal Extraction and Forecasting or look at section 4.1 in DFA)

The spectral estimate (the transformation of the time series into the frequency-domain) is not uniquely/unambiguously determined: there's no single overall best frequency-domain transformation.

Some users (count me in) are fine with the periodogram (Discrete Fourier Transform, DFT)
Some others (I know quite a few) would prefer a model-based spectrum estimate, instead
And still some others want to do it 'differently'

I wanted the user to be free to use any preferred spectrum. Therefore, the transformation of the data in the frequency-domain must be accomplished beforehand supplying the data to the MDFA-functions

All MDFA functions assume that the data has already been transformed (frequency-domain form)
The matrix that collects the transformed data is called 'weight_func' (the criterion is a weighted approximation in the frequency-domain). The matrix weight_func is the data 'as seen' by the MDFA-criteria. Its entries are generally complex valued (not real numbers).
Each column of the matrix weight_func corresponds to a column of the original data matrix.
I personally use the DFT: when feeding data to any MDFA functions I plug the DFT into weight_func (but you could plug a model-based spectrum, for example)

Here's my workhorse function for transforming the original data matrix into a spectrum (DFT actually):

###################################################
### code chunk number 11: dfa_ms
###################################################
spec_comp

Generic MDFA Function

The following is (the head of) the generic (most general) MDFA-function. There are a whole lot of mysterious parameters involved!

###################################################
### code chunk number 12: dfa_ms
###################################################
head(mdfa_analytic)

This function replicates many (I'm inclined to think 'most'...) linear forecast and signal extraction paradigms, including classic time series analysis (ARIMA, VAR, exponential smoothing,...), state space models and/or classic filter designs. It tackles forecasting (one/multi-step ahead) and signal extraction. It replicates MSE-performances and allows for customization (ATS-trilemma). It handles univariate (replicates DFA) and multivariate problems. It tackles overfitting by relying on novel 'universal' regularization concepts. It can handle mixed-frequency problems. It allows for effective embedding of (arbitrarily complex/rich) expert knowledge. See Advances in Signal Extraction and Forecasting.

The scope (generality/flexibility) of the function is reflected by the parametric manifold which allows shaping/tailoring of the generic optimization criterion to 'problem structures' and to 'user priorities'.

Next, I consider a delimited application-field of the above function, a generic MSE-framework, which is analyzed in MDFA, chapter 4 (from which the R-code has been copied). The generic MSE-framework includes and embeds classic time series approaches (see for example chapter 9 in MDFA) and it is obtained here by a particular parametric encoding in the function-call (of mdfa_analytic).

Generic MSE-Framework: Standard Parameter Setting

What does 'generic MSE' mean, please?

The predicate 'generic' refers to the fact that the classic mean-square one-step ahead forecast criterion, prevalent in forecasting, is a simplified version of the (more general) mean-square filter error criterion of the DFA/MDFA: the latter can address a much wider class of potentially interesting estimation problems, see Optimal Real-Time Filters for Linear Prediction Problems.

In a MSE-perspective, many features of the above function (mdfa_analytic) are 'sleeping' and most parameters can be assigned default values. Let's see how this is obtained in the R-code. First we provide some (irrelevant) dummy values for the spectrum (weight_func) and for the filter-length (a filter of length L will be assigned to each explanatory series): these are mandatory entries for all MDFA-functions but here they are just arbitrary place-fillers:

###################################################
### code chunk number 13: dfa_ms
###################################################
weight_func <- matrix(rep(1:6,2),ncol=2)
L <- 2

Next we look at the default MSE-settings:

###################################################
### code chunk number 14: dfa_ms
###################################################
d<-0
lin_eta<-F
lambda<-0
Lag<-0
eta<-0
i1<-F
i2<-F
weight_constraint<-rep(1/(ncol(weight_func)-1),ncol(weight_func)-1)
lambda_cross<-lambda_smooth<-0
lambda_decay<-c(0,0)
lin_expweight<-F
shift_constraint<-rep(0,ncol(weight_func)-1)
grand_mean<-F
b0_H0<-NULL
c_eta<-F
weights_only<-F
weight_structure<-c(0,0)
white_noise<-F
synchronicity<-F
cutoff<-pi
lag_mat<-matrix(rep(0:(L-1),ncol(weight_func)),nrow=L)
troikaner<-F

These are the standard-settings for the simplest estimation problem:

lambda=eta=0 together with cutoff<-pi: MSE-norm (no customization),
d=0: stationary series (OK when working with differenced economic data),
i1=i2=F: no filter constraints, together with weight_constraint<-rep(1/(ncol(weight_func)-1),ncol(weight_func)-1) and with
shift_constraint<-rep(0,ncol(weight_func)-1)
lambda_cross=lambda_smooth=0 and lambda_decay=c(0,0): no regularization,
lag_mat<-matrix(rep(0:(L-1),ncol(weight_func)),nrow=L): equally-sampled data (no mixed-frequency),
b0_H0<-NULL: no embedding of expert knowledge
troikaner<-F: avoid numerically consuming computations useful when using regularization (which is not the case in the default MSE-setting). Setting troikaner=T would just consume more time/heat for the same result (in this standard MSE-setting).

These parameters and their usage (outside of the standard MSE-setting) are analyzed and discussed in dedicated chapters of MDFA and the specific problem-structures and user-priorities are illustrated by sample (R-) code. The left-over parameters (such as for example synchronicity or c_eta or...) in the above list are 'experimental' (not discussed in MDFA).

In the book (the Rnw-file), the above standard (MSE-) parameter-setting is sourced via

###################################################
### code chunk number 15: dfa_ms
###################################################
source(file=paste(path.pgm,"control_default.r",sep=""))

However, I supplied also a dedicated MSE-function in the MDFA-package

###################################################
### code chunk number 17: dfa_ms
###################################################
head(MDFA_mse)

together with other additional dedicated functions

head(MDFA_mse_constraint)
head(MDFA_cust)
head(MDFA_cust_constraint)
head(MDFA_reg)
head(MDFA_reg_constraint)

MDFA_mse is a wrapper which conditions the generic mdfa_analytic function for 'standard MSE' applications as described above: looking at the corresponding function-call shows that it requires much less parameters to be run. Illustrative examples are provided in the sample code (the examples in chapter 4).
MDFA_mse_constraint is a wrapper for MSE-problems with additional filter constraints (the function-call includes additionally i1 and i2, among others).
MDFA_cust is a wrapper for customized filters without regularization and without constraints (lambda, eta and cutoff appear in the corresponding call and i1,i2 vanish, among others).
MDFA_cust_constraint then adds filter constraints.
MDFA_reg allows regularization (lambda_cross, lambda_decay and lambda_smooth appear in the corresponding function call).
MDFA_reg_constraint then adds filter constraints.

The additional functions will appear in later chapters of the book and the corresponding parameters and their effects as well as the relevant empirical context are discussed at length, see MDFA. All dedicated functions call the generic mdfa_analytic, after suitable initializations of (hyper-) parameters.

The Four Steps from Input-Data to Target-Estimate (Filter Output)

How do we proceed to obtain an estimate of the target (the output of the optimal multivariate filter) from some input data? Here's the complete sequence:

Step 1: A data-matrix is set-up according to the above 'recipe'
Step 2: The data-matrix is transformed into the frequency-domain either by using the DFT (function spec_comp above) or any preferred 'spectrum' alternative (for example model-based, see chapter 9 in MDFA)

This is an important step because the user can shape the criteria by specifying a preferred way by means of which the data is 'seen' in the MDFA-function(s).
This step is crucial when embedding classic approaches into MDFA.

Step 3: The transformed data, as collected in weight_func, is fed to any MDFA function(s) which then proceeds to optimization and returns a filter

The function returns a filter which is optimal according to a particular (hyper-) parameter configuration: MSE, customization, constraints, regularization, mixed-frequency, revisions, fore/now/backcasting,stationary/non-stationary series,...
A multivariate filter is a matrix of filter coefficients: each column of the filter matrix has been optimized for the corresponding explanatory series (in the original data matrix).
Once the coefficients are returned, the MDFA-function is 'done'

Step 4: the data must be filtered in the time-domain.

For each explanatory series in the data-matrix the user must apply the corresponding filter in the filter-matrix
The final estimate (the output of the multivariate filter) is obtained by aggregating the individual filter outputs cross-sectionally (equally-weighted), see for example MDFA, exercise 5 in section 4.7.1 (two lines of code...)
So we now have the target-estimate: the output of the multivariate filter...

Why did I not include the last filtering-step in the MDFA-functions? Simple: I want a 'beautiful wall' (a clear cut) between time-domain and frequency-domain and I don't want time series to soil the MDFA functions (call me sectarian).

In a nutshell:

Estimation/optimization is performed in the frequency-domain (because of some structural advantages)
Filtering is performed in the time-domain (outside the MDFA-functions).

Maybe I'll add a filter function to the package...

Replicating the Examples

The remaining code in MDFA_Legacy_MSE.r replicates chapter 4 in MDFA (generic MSE). I will discuss some of the examples, in particular the bivariate leading indicator design, in a separate blog-post.

Search This Blog

Signal Extraction and Forecasting

Next Time