It appears that we are getting a little sidetracked here, or maybe getting off on the wrong foot.
The code package I offer at jaeckel.org is a minimalist C/C++ interface to what is (mainly) machine-translated Fortran code. Anyone is at liberty to use it any way they wish and have their own frontend to it.
I, for example, have used it with C# frontends, Python frontends, and even a frontend to an in-house scripting language (specified in EBNF and generated with the compiler-compiler ANTLR). This enabled users to formulate problems in C#, Python, or that in-house language; among other things, we used it there for the calibration of mean reversion parameters to commodity futures (and swap rates, etc.) time series, e.g., maximum-likelihood-based and other non-linear variations of calibration. I have also used it for yield curve construction, parametric volatility surface calibration to market option prices, FX/credit risk correlation calibration, and a myriad of other applications.
As for an 'evolved' C++ example, kindly have a look at the function fit_polynomial() in fit_polynomial.cpp. Its purpose is to demonstrate in Excel the result of calibrating a polynomial to some scattered data. In essence, the function does just this:
fit_polynomial(const XLOper & order, const XLOper & abscissas, const XLOper & ordinates, const XLOper & use_jacobian) {
   PolynomialFit poly(abscissas.to_vector(), ordinates.to_vector(), use_jacobian | false);
   std::vector<double> coeffs(order.to_int() + 1);
   NL2SOL nl2sol;
   nl2sol.tryToSolve(poly, coeffs, poly.data_length());
   ...
The syntax 'use_jacobian | false' in this context means that the input XLOper 'use_jacobian' defaults to 'false' (if omitted in the worksheet function in Excel, the value 'false' is used). Anyone who does not like this syntax has no obligation to use it - do your own thing!
The invocation of NL2SOL is done here via the NL2SOL class member function (aka 'method')
double tryToSolve(Functor& functor, std::vector<double> &x, size_t n_f);
which, as you can see, takes a 'std::vector<double> &' rather than raw pointers, and the NL2SOL algorithm safely takes care of the allocation (and clean-up) of the function value output vector of size n_f. Obviously, the bespoke functor class PolynomialFit has to comply with the NL2SOL interface, and that is how I wrote it, but you or anyone else can write whatever code they wish. There is no need to criticize this generic interface here: it is provided 'as is', and for very good reasons (compatibility, simplicity, speed, etc.). NL2SOL itself has its own reasons for the raw pointer interface, e.g., it keeps more than one copy of the x[] vector and may invoke the functor or Jacobian on any of those copies as it sees fit. It keeps all memory in one monolithic block, which in turn is a performance benefit, so no need to criticize that either.
Btw, thank you for sharing with us that you hardly ever use or need raw pointers these days; rest assured none of us would have assumed that you do, nor, I presume, do any of us, unless, of course, there is absolutely no other way (without major compromises), such as when interfacing between run time environments or when dealing with machine-translated code. Please forgive my tongue-in-cheek there... we must retain a sense of humour!
NL2SOL has also been used at hedge funds for identification of globally optimal investment choices (the details are obviously confidential).
With the provided interface, it is straightforward to add frontends in other languages; e.g., I can see immediately how one would expose this to Java (based on my experience with C#, which is virtually identical to Java in this respect). I bet that sooner or later someone will take an interest in a Rust frontend.
All such interfacing is necessarily based on the common denominator between run time environments, and that is the raw data pointer and the raw function pointer (which is literally a code execution entry point). None of this is affected by the evolution of any language, C++ or otherwise. Ultimately, it all comes down to machine-hardware/OS-native-sized integers, and pointers are of course just that.
I am offering my 25(+)-years-curated version of the code 'as is' as a courtesy to the community. The only value I add is the few algorithm/bug fixes that are explained in detail in my notes right beneath my NL2SOL.7z download link at jaeckel.org - please also have a look at what I wrote there about the history, other usages of NL2SOL, etc. As for the meaning of NL2SOL's control parameters and return values, I refer to the original authors' documentation, available via their two mathematical articles and the detailed textual explanations in the comments of their source code (see nl2sol.cpp). Regarding the minimalist C++ frontend, e.g., the Jacobian definition, I am not sure how this description could be made any clearer (you could always look up 'column-major' on the web), but anyone is of course at liberty to do so in their own copy of the code:
// The Jacobian function may be left undefined in which case finite-differencing is used.
virtual bool have_jacobian() const { return false; }
// The return value FALSE indicates that x[] was not inside the admissible domain, i.e., that constraints are violated.
// NOTE:
// The Jacobian function must store the output matrix J in the output vector J[] in column-major order (ForTran).
// For example, if n_f is the length of f[], and n_x is the length of x[], then J(k,l) is to be stored in J[k+l*n_f].
// The element J(k,l) represents the derivative
// ∂ f[k]
// ------
// ∂ x[l] .
virtual bool jacobian(const double *x, size_t n_x, size_t n_f,
double *J /* J[k+l*n_f] = ∂f[k]/∂x[l]. The length of J[] must be at least n_f*n_x */){
return false;
}
I also offer my advice from experience that NL2SOL is the best solver I have ever worked with for the many purposes mentioned above (and I know of other teams who have done exhaustive comparisons with other algorithms and came to the conclusion that NL2SOL was superior by a wide margin to everything else they tested).
Besides all of the above advocacy of NL2SOL, I must caution and/or concede that NL2SOL is not a method that only uses a gradually collected sequence of vectors to estimate the Hessian. When NL2SOL, in the course of its evaluation, considers the Jacobian to be the best next step to make progress, it will evaluate it, either via the user-provided function or by finite-differencing approximation (which is done as a sequence of gradient vector computations). BFGS is closer to the idea of "Hessian estimation" mentioned by the originator of this thread, but I found BFGS to be much, much slower in all applications where a local gradient approximation is admissible, i.e., can be used as an alternative. Not all underlying problems allow that: a model function based on a random forest, for example, is internally a piecewise constant function with many, many different levels, meaning the local gradient is always zero! A similar issue arises in certain Monte-Carlo training stages - see the chapter on Bermudan swaption pricing in my book for an example where I did use BFGS for that very reason.
So, after all this, I point out with remorse that my posting about NL2SOL is, alas, effectively off the original topic.
Nevertheless, I am hopeful that it may be of interest and/or use to some.
I finish with the polite request to kindly honour my conclusion:
as always,
positive feedback is welcome.