NbbTools
INTRODUCTION
The National Bank of Belgium produces many official statistics, including
financial statistics, the balance of payments, the national accounts and the
external trade statistics.
In the production and analysis of those statistics, we are continuously
confronted with time series problems:
- Outlier detection and the estimation of missing values are a constant
  concern in the production process.
- The estimation of some figures relies on complex statistical methods:
  business surveys have to be treated by seasonal adjustment procedures,
  quarterly national accounts are partly based on temporal disaggregation
  techniques, ...
- A critical analysis of the statistics often requires the modeling of the
  series.
- ...
The program Tramo-Seats of Gomez-Maravall-Caporello offers efficient
solutions to several of those problems. It has been used at the National Bank of
Belgium for a long time.
However, some of its constraints and limitations, essentially technical ones,
progressively became apparent.
Statistical algorithms must often be integrated in completely different tasks
and environments. Outlier detection, for example, can be used in batch
processing of many series; seasonal adjustment must be embedded in automated
production chains, like the business surveys; advanced graphical interfaces
should also be available for detailed analysis, while, for non-specialist
users, black-box functions integrated in Excel are the preferred solution. The
current implementations of Tramo-Seats (DOS programs or TSW) are not designed
to satisfy all those needs.
A first attempt to improve the integration of Tramo-Seats in different
programs was the encapsulation of the file-based input/output used by the
FORTRAN module in a programming interface (using the COM technology). However,
the use of file-based I/O led to substantial performance penalties and to
several shortcomings or complications in results retrieval. A better solution,
which replaced I/O operations by direct function calls into the FORTRAN
library, was then developed in collaboration with G.L. Caporello.
Despite its merits, that last solution quickly proved frustrating. It didn't
provide a true object-oriented (OO henceforth) solution: the results supplied
by the FORTRAN modules were "dead" information, and it was not possible to
enrich those structures with the methods/properties that should have been part
of their definition, as is usual in OO design. That was an annoying limitation
when building advanced interfaces or extensions. More importantly, it didn't
fit with our wish for a coherent time series framework. Tramo-Seats is designed
to provide an efficient solution to some very specific problems. Nevertheless,
some of its concepts/algorithms can be very useful for tackling other ones.
Unfortunately, the closed character of the program doesn't allow their reuse.
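The difference between "dead" results and OO results can be illustrated with a
minimal Java sketch. It is only an illustration under stated assumptions: the
class SeasonalDecomposition, its method seasonallyAdjusted and the layout of the
flat array are hypothetical and are not taken from Tramo-Seats or from the
library described here. An additive decomposition is assumed for simplicity.

```java
import java.util.Arrays;

/** Hypothetical result object: the data travel together with the operations defined on them. */
final class SeasonalDecomposition {
    private final double[] series;
    private final double[] seasonal;

    SeasonalDecomposition(double[] series, double[] seasonal) {
        if (series.length != seasonal.length) {
            throw new IllegalArgumentException("components must have the same length");
        }
        this.series = series.clone();
        this.seasonal = seasonal.clone();
    }

    /** Derived on demand (additive decomposition assumed): series minus seasonal component. */
    double[] seasonallyAdjusted() {
        double[] sa = new double[series.length];
        for (int i = 0; i < sa.length; i++) {
            sa[i] = series[i] - seasonal[i];
        }
        return sa;
    }
}

class DeadVersusLiveDemo {
    public static void main(String[] args) {
        // "Dead" result: a flat array whose layout is only known from external documentation
        // (assumed here: first half = series, second half = seasonal component).
        double[] flat = {10, 12, 11, 13, 1, -1, 1, -1};
        int n = flat.length / 2;
        double[] saFromFlat = new double[n];
        for (int i = 0; i < n; i++) {
            saFromFlat[i] = flat[i] - flat[n + i];
        }

        // "Live" result: the same numbers, but the object knows how to derive new information.
        SeasonalDecomposition d = new SeasonalDecomposition(
                new double[] {10, 12, 11, 13}, new double[] {1, -1, 1, -1});

        System.out.println(Arrays.equals(saFromFlat, d.seasonallyAdjusted()));
    }
}
```

With the flat array, every caller has to repeat the indexing logic; with the
object, that knowledge is written once and can be extended with further
methods.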
These considerations explain why we decided to build a completely new open
software library in the time series domain, using Tramo-Seats as a guideline.
As far as technology is concerned, OO components based on standard
technologies form a very interesting solution in terms of integration,
extensibility and reusability. Because their underlying technology is widely
accepted in the software industry, OO components can quite easily be embedded
in many different environments, from commercial software to a variety of tools
for in-house developments. Java is a popular solution when portability matters,
while .NET is becoming the norm for Windows applications.
We provide implementations of our time series library in both
technologies, using the same object model. It should also be stressed that,
compared to more traditional development languages like FORTRAN or C,
those technologies yield much more robust solutions.
General presentation of the library
The library presented here is not a program in the usual sense. It is a
toolbox that can be used to solve a lot of problems in the time series domain.
So it is intended more for advanced users or for programmers than for
"pure" statisticians.
It is designed to provide efficient solutions that can be easily plugged into a
variety of applications, ranging from rich graphical interfaces to batch
processing.
The library can be seen as an extension of standard languages (C#, VB or
Java) with concepts that should be meaningful to statisticians. The content of
the library has been defined as follows: the main high-level concepts/algorithms
handled by Tramo-Seats were identified in a first step. Then, the underlying
lower-level entities were recursively defined. New high-level concepts that do
not necessarily belong to the Tramo-Seats sphere were gradually added (for
instance temporal disaggregation methods, structural models or the X11
algorithm). Existing lower-level entities have been, as far as possible, reused
or adapted. So the library provides a variety of interconnected objects, ranging
from high-level entities to very basic ones. The most important concepts are
listed below:
- High-level concepts/algorithms
  - Tramo-Seats
  - Temporal disaggregation
  - Estimation of structural models
  - X11
  - ...
- Medium-level concepts
  - Time series
  - ARIMA models
  - UCARIMA models
  - Wiener-Kolmogorov filters
  - State space forms, Kalman filters/smoothers
  - GLS with ARIMA noise
  - Structural models
  - ...
- Low-level concepts
  - Mathematics
    - Complex numbers
    - Polynomials (with a special handling of unit roots)
    - Rational functions
    - Linear filters (finite, semi-infinite, infinite)
    - Matrix (including the usual decomposition algorithms: Cholesky, QR, LU,
      SVD, ...) and some common operations in time series analysis
    - Function optimization (Levenberg-Marquardt, BFGS, ...)
  - Statistics
    - Descriptive statistics
    - Random numbers
    - Statistical distributions
    - Statistical tests
    - Regression
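The layering listed above, where medium-level objects are assembled from
low-level ones, can be sketched in Java as follows. This is a minimal
illustration and not the library's actual object model: the classes Polynomial
and ArimaModel, and all of their methods, are hypothetical names introduced
here. The sketch builds the differencing operator (1 - B)(1 - B^12) of a
monthly model from a generic polynomial class and checks for a unit root at
B = 1.

```java
/** Hypothetical low-level entity: a polynomial in the backshift operator B. */
final class Polynomial {
    private final double[] c; // c[i] is the coefficient of B^i

    Polynomial(double... coefficients) {
        this.c = coefficients.clone();
    }

    int degree() {
        return c.length - 1;
    }

    double get(int i) {
        return c[i];
    }

    /** Product of two polynomials (discrete convolution of the coefficients). */
    Polynomial times(Polynomial other) {
        double[] r = new double[c.length + other.c.length - 1];
        for (int i = 0; i < c.length; i++) {
            for (int j = 0; j < other.c.length; j++) {
                r[i + j] += c[i] * other.c[j];
            }
        }
        return new Polynomial(r);
    }

    /** Value of the polynomial at x, computed with Horner's rule. */
    double evaluateAt(double x) {
        double v = 0;
        for (int i = c.length - 1; i >= 0; i--) {
            v = v * x + c[i];
        }
        return v;
    }

    /** True if B = 1 is (numerically) a root of the polynomial. */
    boolean hasUnitRootAtOne() {
        return Math.abs(evaluateAt(1.0)) < 1e-9;
    }

    /** The operator 1 - B^lag: regular (lag = 1) or seasonal differencing. */
    static Polynomial difference(int lag) {
        double[] d = new double[lag + 1];
        d[0] = 1;
        d[lag] = -1;
        return new Polynomial(d);
    }
}

/** Hypothetical medium-level entity, defined in terms of the low-level Polynomial. */
final class ArimaModel {
    final Polynomial stationaryAr;
    final Polynomial differencing;
    final Polynomial ma;

    ArimaModel(Polynomial stationaryAr, Polynomial differencing, Polynomial ma) {
        this.stationaryAr = stationaryAr;
        this.differencing = differencing;
        this.ma = ma;
    }

    /** Full (non-stationary) autoregressive polynomial. */
    Polynomial fullAr() {
        return stationaryAr.times(differencing);
    }
}

class LayersDemo {
    public static void main(String[] args) {
        // (1 - 0.5B)(1 - B)(1 - B^12) y_t = (1 - 0.6B) e_t, with illustrative coefficients
        Polynomial differencing = Polynomial.difference(1).times(Polynomial.difference(12));
        ArimaModel model = new ArimaModel(
                new Polynomial(1, -0.5), differencing, new Polynomial(1, -0.6));

        System.out.println("degree of full AR polynomial: " + model.fullAr().degree());
        System.out.println("unit root at B = 1: " + model.fullAr().hasUnitRootAtOne());
    }
}
```

The point of the sketch is the reuse: the same Polynomial class serves the AR,
MA and differencing parts of the model, and could equally serve linear filters
or rational functions.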
Different applications (stand-alone programs or MS-Excel add-ins) based on the
time series libraries are also available for end-users. Once again, they should
be considered as prototypes.