# Interpolation methods and the Hagan-West paper

Interpolation is a technique for estimating values between known data points when the available information does not come in continuous form.

From a non-technical point of view, any inference or decision process is based, sometimes subconsciously, on a kind of interpolation, best fit, or regression of the available information. As people we are normally quite good at generalising (often too quickly) from the small amount of information that we have about other people, situations, or even numerical data. This is possible because our brain can recognise patterns and see trends in any kind of data. Technically speaking, however, interpolation is more than just finding a trend.

Technically, we are often given a discrete set of data corresponding to a function that is known at specific points, or nodes, and is otherwise unknown (for example, we have run an experiment for specific input values and measured the corresponding outputs). In principle this is a multi-dimensional problem, and the interpolating hyper-surface gives an idea of the missing information. Even though such a hyper-surface can always be constructed numerically, the issue of uniqueness remains. Given the same input data, many different constructions can be engineered, all satisfying various (more or less realistic) criteria and all passing through the same input points.

This non-uniqueness is due to the large amount of freedom that one has when setting up an interpolation procedure. On the one hand, the interpolating function could be almost anything; on the other hand, the only necessary constraint is that the function passes through the nodes. The latter sounds like a very strong constraint, but in reality it is not, since it is not enough to fix the many parameters that are otherwise undetermined. What we can do is impose additional constraints that steer the outcome towards certain desirable properties. We will now see how this works in practice by considering the one-dimensional case.

In the one-dimensional case, the solution that we are looking for is a function f(x) of a single variable. The function is known only at specific values of x, and is otherwise unknown. For example, in this plot some values of an unknown function f are shown at x equal to 0.1, 1, 4, 9, 10, 20, 30 (we need not concern ourselves with what the x and y axes represent). The question is now: can we construct a curve that passes through all the points, and if so, how? The answer is yes, and there are many ways to do it.
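To make the non-uniqueness concrete, here is a minimal Python sketch that builds two different curves through the same nodes: a piecewise-linear interpolant and a single polynomial in Lagrange form. The x values are the ones quoted above; the y values are invented purely for illustration, and the function names `linear_interp` and `lagrange_interp` are hypothetical helpers, not part of any library.

```python
import bisect

# Node abscissae taken from the text; the ordinates are made-up illustrative values.
xs = [0.1, 1.0, 4.0, 9.0, 10.0, 20.0, 30.0]
ys = [1.0, 2.0, 3.5, 4.0, 4.2, 5.0, 5.5]

def linear_interp(xs, ys, x):
    """Piecewise-linear interpolant: a straight line between consecutive nodes."""
    # Locate the interval [xs[i], xs[i+1]] containing x (clamped to the data range).
    i = min(max(bisect.bisect_right(xs, x) - 1, 0), len(xs) - 2)
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])

def lagrange_interp(xs, ys, x):
    """Single polynomial of degree len(xs) - 1 through all nodes (Lagrange form)."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        basis = 1.0
        for k, xk in enumerate(xs):
            if k != j:
                basis *= (x - xk) / (xj - xk)
        total += yj * basis
    return total

# Both curves reproduce the data at every node ...
for xi, yi in zip(xs, ys):
    assert abs(linear_interp(xs, ys, xi) - yi) < 1e-9
    assert abs(lagrange_interp(xs, ys, xi) - yi) < 1e-9

# ... yet they give different answers between the nodes.
for x in (0.5, 2.5, 6.5, 15.0, 25.0):
    print(f"x={x:5.1f}  linear={linear_interp(xs, ys, x):10.4f}  "
          f"polynomial={lagrange_interp(xs, ys, x):10.4f}")
```

Both functions satisfy the only necessary constraint (passing through the nodes), so neither is "wrong"; which one is preferable depends on the additional properties we want the curve to have.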

Typical features of the interpolating function that we can play with when we perform an interpolation procedure are the following:

• Passing through the points. This is always guaranteed by the definition of interpolation.
• Continuity. It is always guaranteed by construction.
• Smoothness. Depending on the method used, the interpolating function can be more or less smooth at the nodes (away from the nodes it is of course always smooth). For instance, a piecewise-linear interpolant is in general not differentiable at the nodes, since the slope changes abruptly there, while in spline interpolation differentiability is achieved by requiring the derivatives of neighbouring pieces to match at the nodes, together with suitable boundary conditions at the end points.
• Piecewise. The interpolating function is typically defined piecewise. This means that on each interval the function is determined by using only a few of the data points (typically two or three consecutive ones) plus additional constraints enforced at the boundaries of the interval.
• Zigzag behaviour. This strongly depends on the chosen method. Some methods (e.g. polynomial interpolation) exhibit strong and unrealistic oscillations between consecutive data points; others (e.g. the monotone-preserving cubic spline) use additional constraints to guarantee some degree of monotonicity in the solution.
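The smoothness point can be checked numerically. The short Python sketch below computes the slope of the straight segment on each interval and compares the slopes on either side of every interior node; wherever the two differ, the piecewise-linear curve has a kink. As before, the x values come from the text, while the y values are invented for illustration.

```python
# Node abscissae taken from the text; the ordinates are made-up illustrative values.
xs = [0.1, 1.0, 4.0, 9.0, 10.0, 20.0, 30.0]
ys = [1.0, 2.0, 3.5, 4.0, 4.2, 5.0, 5.5]

# Slope of the straight segment on each interval [xs[i], xs[i+1]].
slopes = [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]

# At each interior node, pair the slope arriving from the left with the slope
# leaving to the right. A mismatch means the first derivative jumps there.
kinks = [(xs[i + 1], slopes[i], slopes[i + 1]) for i in range(len(slopes) - 1)]

for node, left, right in kinks:
    print(f"node x={node:5.1f}: slope from the left {left:.4f}, "
          f"slope from the right {right:.4f}")
```

Note also that each slope depends on two consecutive data points only, which is the "piecewise" property in its simplest form.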

There are many methods available in the mathematical literature. Broadly speaking, they fall into a few categories. The simplest method is linear interpolation, where one simply connects any two consecutive points with a straight line. As a result, the curve shows non-differentiable kinks at the nodes, which is not always what we are looking for. To remove this unwanted feature, there are several polynomial-like interpolation techniques (including splines), where a polynomial curve is drawn through consecutive points. The degree of smoothness is then determined by how many derivatives we require to be continuous. The drawback, however, is a strong and probably unrealistic oscillating behaviour between the nodes, which is not an intrinsic feature of the true function but a spurious effect created by the chosen interpolation method. Finally, to suppress these oscillations, some algorithms have been developed that enforce monotonicity of the interpolating function, so that the trend of the curve follows the trend of the data points.
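As an illustration of the last category, the following Python sketch builds a piecewise-cubic Hermite interpolant whose node slopes are chosen as the harmonic mean of the two neighbouring secant slopes (and set to zero where the data change direction). This is one simple way to keep the curve monotone on each interval; it is an assumption of this example and not necessarily the monotone-preserving spline referred to above. The y values are again invented, and `node_slopes` and `hermite_interp` are hypothetical helper names.

```python
import bisect

# Node abscissae from the text; the ordinates are made-up, increasing values.
xs = [0.1, 1.0, 4.0, 9.0, 10.0, 20.0, 30.0]
ys = [1.0, 2.0, 3.5, 4.0, 4.2, 5.0, 5.5]

def node_slopes(xs, ys):
    """Derivative assigned to each node (hypothetical helper for this sketch)."""
    d = [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]
    m = [d[0]]                      # end nodes: reuse the adjacent secant slope
    for left, right in zip(d, d[1:]):
        if left * right <= 0.0:
            m.append(0.0)           # trend reverses or is flat: zero slope
        else:
            m.append(2.0 * left * right / (left + right))  # harmonic mean
    m.append(d[-1])
    return m

def hermite_interp(xs, ys, m, x):
    """Evaluate the cubic Hermite piece on the interval that contains x."""
    i = min(max(bisect.bisect_right(xs, x) - 1, 0), len(xs) - 2)
    h = xs[i + 1] - xs[i]
    t = (x - xs[i]) / h
    h00 = (1.0 + 2.0 * t) * (1.0 - t) ** 2   # weight of ys[i]
    h10 = t * (1.0 - t) ** 2                 # weight of h * m[i]
    h01 = t * t * (3.0 - 2.0 * t)            # weight of ys[i + 1]
    h11 = t * t * (t - 1.0)                  # weight of h * m[i + 1]
    return h00 * ys[i] + h10 * h * m[i] + h01 * ys[i + 1] + h11 * h * m[i + 1]

m = node_slopes(xs, ys)

# Sample the curve on a fine grid and check that it never decreases.
grid = [xs[0] + (xs[-1] - xs[0]) * k / 400 for k in range(401)]
curve = [hermite_interp(xs, ys, m, x) for x in grid]
is_monotone = all(b >= a - 1e-12 for a, b in zip(curve, curve[1:]))
print("sampled curve is non-decreasing:", is_monotone)
```

The price of enforcing monotonicity this way is smoothness: the curve has a continuous first derivative at the nodes, but in general not a continuous second derivative, which is the kind of trade-off between criteria discussed above.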