lisconli
Topic Author
Posts: 0
Joined: April 2nd, 2008, 4:13 pm

Levenberg-Marquardt with penalty functions

April 7th, 2008, 5:30 pm

When a penalty term can be added directly to EACH objective term, and the objective function is the sum of those terms, the problem can be solved with Lourakis's C++ implementation of constrained LM. But what if the penalty terms are summed independently and then added to the objective function, as in John Hull's book on the calibration of the Hull-White and LMM models? There the objective function is Sum[fi^2] + Sum[Pi^2].

Please note the difference: when the penalty term can be added directly to each objective term, the problem is just another form of the unconstrained one. But when the sum of the penalty terms is added to the objective function, how can we transform it into an unconstrained one?

Hopefully I have explained it correctly. Thanks.
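For concreteness, the two cases being contrasted can be written as

[$]\text{per-term:}\;\; \sum_{i=1}^m (f_i + P_i)^2 \qquad \text{summed:}\;\; \sum_{i=1}^m f_i^2 + \sum_{j=1}^n P_j^2[$]

where the [$]f_i[$] are the calibration residuals and the [$]P_j[$] the penalty terms.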
 
Cuchulainn
Posts: 23029
Joined: July 16th, 2004, 7:38 am

Levenberg-Marquardt with penalty functions

April 7th, 2008, 7:04 pm

My take on this problem is that we add a penalty or barrier term (usually a nonlinear function of the constraints) to the objective function (f: R^n -> R^1), but the modification only comes into play when the constraints are violated, and the term increases as the violation increases. A similar technique is used for the American option PDE.

I am testing the Lourakis C (!) code but having problems, partly due, I think, to the way I use it. I have tested the Alglib C functions for LevMar for unconstrained optimisation, so it would not be too difficult to modify the source code. Again, the C code is not the easiest to read, but it is well structured in this case.

Quote (lisconli): "Please note the difference: when the penalty term can be added directly to each objective term, the problem is just another form of the unconstrained one. But when the summation of penalty terms is added to the objective function, how can we transfer it to the unconstrained one?"

The second sentence is not clear to me. Could you explain?
Last edited by Cuchulainn on April 6th, 2008, 10:00 pm, edited 1 time in total.
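A standard choice with exactly this "only active when violated" behaviour is the quadratic exterior penalty: for a constraint [$]g(x) \leq 0[$] one adds

[$]P_\mu(x) = \mu \, \max(0, g(x))^2, \quad \mu > 0[$]

which vanishes on the feasible region and grows with the square of the violation.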
 
lisconli
Topic Author
Posts: 0
Joined: April 2nd, 2008, 4:13 pm

Levenberg-Marquardt with penalty functions

April 7th, 2008, 7:38 pm

Cuchulainn, thanks for your suggestions. As you mentioned, each penalty is added to EACH objective term. The main problem here is that the penalty gets mixed up with the objective term, because the squaring happens later in the calculation (for the Alglib or Lourakis code, we can only supply each objective term, unsquared). But according to John Hull's book, the penalty is added to the objective function itself, i.e. to the sum of the squares of the objective terms. How can I modify this to fit the requirement of LM: supplying each term without the square?
 
lisconli
Topic Author
Posts: 0
Joined: April 2nd, 2008, 4:13 pm

Levenberg-Marquardt with penalty functions

April 7th, 2008, 8:06 pm

What we want to minimize is the objective function

[$]f = \sum_{i=1}^m F_i^2 + \sum_{i=1}^n P_i^2[$]

Please note that m is the number of functions [$]F_i[$] to be evaluated, n is the number of variables, and the [$]P_i[$] are the penalty terms. If we add each penalty term to each objective term, the objective function becomes [$]f = \sum_{i=1}^m (F_i + P_i)^2[$], which mixes the penalty and objective terms and differs from what we expected. I am just not sure the above will work. Thanks.
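One standard way to meet the "each term without the square" requirement is to stack the penalties as extra residuals. Define the extended residual vector [$]r = (F_1, \ldots, F_m, P_1, \ldots, P_n)^T[$]; then

[$]\|r\|^2 = \sum_{i=1}^m F_i^2 + \sum_{i=1}^n P_i^2 = f[$]

so each [$]F_i[$] and each [$]P_i[$] is handed to the LM routine unsquared and nothing gets mixed; the solver does the squaring and summing itself.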
 
street
Posts: 0
Joined: March 27th, 2008, 1:51 am

Levenberg-Marquardt with penalty functions

April 8th, 2008, 2:31 am

In either case you have an unconstrained problem. By construction the penalty terms in a sense "dualize" (using that loosely) the constraints. Also, does n have to be the number of variables? That is not entirely clear. But the method should still work, no? I mean, one can still compute the gradient and Hessian of the objective function.
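For a sum-of-squares objective [$]f(x) = r(x)^T r(x)[$] the quantities LM needs do indeed come straight from the Jacobian [$]J = \partial r / \partial x[$]:

[$]\nabla f = 2 J^T r, \qquad \nabla^2 f \approx 2 J^T J[$]

and each LM step solves [$](J^T J + \lambda I)\,\delta = -J^T r[$], so appending penalty terms to [$]r[$] simply appends rows to [$]J[$].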
 
Cuchulainn
Posts: 23029
Joined: July 16th, 2004, 7:38 am

Levenberg-Marquardt with penalty functions

April 8th, 2008, 1:45 pm

Quote (street): "In either case you have an unconstrained problem. But the method should still work, no? I mean one can still compute the gradient and Hessian of the objective function."

There is a new function consisting of the 'old' function in combination with the penalty parameter and the penalty function. Then it is a problem of solving a series of unconstrained problems. The main question is how to choose a good penalty function.

Quote (lisconli): "As mentioned in John Hull's book about the calibration of the Hull-White model and LMM model. The objective function is Sum[fi^2] + Sum[Pi^2]."

Lisc, which page number is this on?
Last edited by Cuchulainn on April 7th, 2008, 10:00 pm, edited 1 time in total.
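To make "a series of unconstrained problems" concrete, here is a minimal C++ sketch of the outer penalty loop. The driver levmarSolve is a hypothetical stand-in for whatever unconstrained LM routine is available (Alglib, MINPACK, Lourakis levmar); the penalty residuals use the sqrt(mu)*max(0, g) form so that squaring reproduces the quadratic exterior penalty mentioned earlier.

Code:
#include <algorithm>
#include <cmath>
#include <functional>
#include <vector>

using Vector = std::vector<double>;
using ResidualFn = std::function<void(const Vector&, Vector&)>;

// Hypothetical stand-in for an unconstrained LM driver:
// minimises ||r(x)||^2 over x, starting from x0.
Vector levmarSolve(const ResidualFn& residuals, Vector x0, int numResiduals);

// Solve a sequence of unconstrained problems with increasing penalty weight.
Vector penaltyMethod(const ResidualFn& modelResiduals,   // fills r[0..m-1] with F_i(x)
                     int m,
                     const std::vector<std::function<double(const Vector&)>>& g, // g_j(x) <= 0
                     Vector x)
{
    double mu = 1.0;                                     // penalty weight
    const int n = m + static_cast<int>(g.size());
    for (int outer = 0; outer < 10; ++outer)
    {
        ResidualFn augmented = [&](const Vector& p, Vector& r)
        {
            modelResiduals(p, r);                        // model residuals F_i
            for (std::size_t j = 0; j < g.size(); ++j)   // penalty residuals:
                r[m + j] = std::sqrt(mu) * std::max(0.0, g[j](p)); // squares to mu*max(0,g)^2
        };
        x = levmarSolve(augmented, x, n);                // one unconstrained solve
        mu *= 10.0;                                      // tighten the penalty
    }
    return x;
}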
 
MS5
Posts: 2
Joined: May 14th, 2010, 5:00 pm

Levenberg-Marquardt with penalty functions

July 12th, 2015, 4:43 pm

I'm reviving this thread only because I have an immediate need to implement LM with penalty functions, and if my planned approach is a bad idea in practice I'd be interested in hearing from those who have tried it or who know of better alternatives.

Since one specifies the vector function whose squared norm is the objective function, augment the dimension of this vector function by the number of penalty functions, so that it holds the scalar value of each penalty. The usual LM routines (e.g. from MINPACK, or C++-izations of them) should then work as usual. The only concern/caveat I've thought of so far is the performance cost this may entail.

One can always change the definition of the vector function so that it holds one non-trivial element -- namely, the square root of the full objective function. However, convergence likely suffers, and worsens with the size of the problem.
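The caveat about the single-element variant can be made precise: if the residual vector is collapsed to the single entry [$]r_1(x) = \sqrt{f(x)}[$], the Jacobian [$]J[$] is a single row, so [$]J^T J[$] has rank one and the LM step [$](J^T J + \lambda I)\,\delta = -J^T r[$] points along the negative gradient. LM then degenerates to a scaled gradient descent, which is consistent with convergence worsening as the problem grows.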
 
Cuchulainn
Posts: 23029
Joined: July 16th, 2004, 7:38 am

Levenberg-Marquardt with penalty functions

July 14th, 2015, 11:19 am

Quote (MS5): "I'm reviving this thread only because I have an immediate need to implement LM with penalty functions ... augment the dimension of this vector function by the number of penalty functions to hold the resulting scalar values of each penalty."

Since no one has responded, here's my 2 cents. I fear LevMar with penalty might be slow (we have to solve a sequence of LevMar problems, changing the penalty parameter each time). What about introducing new independent variables and getting a) box constraints (DEvol) or b) an unconstrained problem (LevMar)?

Example in 2d:

0 <= x1
0 <= x2 <= x1/a (a > 0)
0 <= x1 + a x2 <= 6

// (a) box constraints
X1 = x1 + a x2
X2 = x2 - x1/a

0 <= X1 <= 6
-2a <= X2 <= 0

(check!! then solve for x1, x2)

// (b) map to (y1, y2) space
X1 = 6 sin^2 y1
X2 = -2a + 2a sin^2 y2

Now y1 and y2 live in unconstrained space. That's the idea.

I am guessing that the constraints would not be too dissimilar to

[$]\theta \geq 0 \\ |\rho| \leq 1 \\ \eta \geq 0 \\ 0 < \gamma < 1 \\ \eta (1 + |\rho|) \leq 2[$]
Last edited by Cuchulainn on July 13th, 2015, 10:00 pm, edited 1 time in total.
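The sin^2 substitution in (b) is easy to wire into an unconstrained solver: the LM routine optimises over (y1, y2), and the residual function maps them back to the original variables first. A minimal sketch of that mapping, using the bounds from the example above (mapToConstrained is an illustrative name, not from any library):

Code:
#include <cmath>

struct Point2 { double x1, x2; };

// Map unconstrained (y1, y2) onto the box 0 <= X1 <= 6, -2a <= X2 <= 0
// via X = lo + (hi - lo) * sin^2(y), then invert the linear substitution
// X1 = x1 + a*x2, X2 = x2 - x1/a (determinant 2) to recover (x1, x2).
Point2 mapToConstrained(double y1, double y2, double a)
{
    const double s1 = std::sin(y1), s2 = std::sin(y2);
    const double X1 = 6.0 * s1 * s1;                    // ranges over [0, 6]
    const double X2 = -2.0 * a + 2.0 * a * s2 * s2;     // ranges over [-2a, 0]

    return { (X1 - a * X2) / 2.0,                       // x1
             (X1 / a + X2) / 2.0 };                     // x2
}

The residual function handed to an unconstrained LM routine then evaluates the model at mapToConstrained(y1, y2, a), so the solver roams freely over the whole (y1, y2) plane.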