Hi Alan, thanks for the question.
First of all, and also for anyone else reading this thread: I am not an expert on Malliavin calculus either. Technical questions on Malliavin are probably best answered by E. Alos; I believe her email is on the arXiv site. My very limited contribution to the paper was to introduce and explain the d2=0 approximation and related practical matters to them, and to propose using Malliavin calculus to generalise it to fractional models and to quantify error bounds, as I felt Malliavin was well suited to that. They have been too kind to include my name in the paper.
To go back to your question: yes you could do it that way (calibrate to varswaps and then MC) and you will get the exact volswap price. But, aside from the computational cost for pricing volswaps, you have assumed a particular model.
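To make the model dependence concrete, here is a rough sketch of the MC step only, with no calibration. It assumes a toy lognormal variance process that I picked purely for illustration (it is not the model from the paper), and the function name and all parameter values are made up. Only the variance path is simulated, since realised variance is approximated here by the time average of instantaneous variance.

```python
import math
import random

def mc_var_and_vol_swap(v0=0.04, vol_of_vol=1.0, expiry=1.0,
                        n_steps=50, n_paths=2000, seed=0):
    """Monte Carlo estimates of E[RV] and E[sqrt(RV)] under a toy model.

    Assumed (hypothetical) dynamics: v_t = v0 * exp(nu * W_t - 0.5 * nu^2 * t).
    RV is the time-averaged instantaneous variance over [0, expiry].
    """
    rng = random.Random(seed)
    dt = expiry / n_steps
    sqrt_dt = math.sqrt(dt)
    sum_rv = 0.0
    sum_sqrt_rv = 0.0
    for _ in range(n_paths):
        w = 0.0          # Brownian motion driving the variance
        integral = 0.0   # left-point Riemann sum of v_t dt
        for i in range(n_steps):
            t = i * dt
            v = v0 * math.exp(vol_of_vol * w - 0.5 * vol_of_vol ** 2 * t)
            integral += v * dt
            w += rng.gauss(0.0, sqrt_dt)
        rv = integral / expiry
        sum_rv += rv
        sum_sqrt_rv += math.sqrt(rv)
    var_swap = sum_rv / n_paths        # fair variance strike
    vol_swap = sum_sqrt_rv / n_paths   # fair volatility strike
    return var_swap, vol_swap

var_swap, vol_swap = mc_var_and_vol_swap()
print(math.sqrt(var_swap), vol_swap)
```

The volswap estimate always sits below the square root of the varswap estimate (Jensen), and the size of that gap depends on the assumed vol-of-vol dynamics, which is exactly the model assumption you have to make on this route.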
The benefit of the d2=0 model-independent approximation (model-independent within the class of fractional stoch vol models) is that you do not need to assume any model. The paper shows that for any correlation and Hurst parameter value, and for whatever fractional stoch vol model, it is technically valid to approximate the volswap price using d2=0, and the errors can be rigorously quantified. We took the fractional SABR model only as one possible model to run numerical tests against. As you say, there is sensitivity to correlation even at the d2=0 point, but the d2=0 point appears to be the least correlation-sensitive model-independent point on the IV. There is a point on the IV which is completely insensitive to correlation, and the IV at that point would equal the exact volswap price; that point, however, is model-dependent.
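For readers who want to try this, here is a minimal sketch of how one might read the d2=0 point off a smile. It assumes the standard Black-Scholes d2 written on the forward, d2 = (ln(F/K) - 0.5*sigma(K)^2*T) / (sigma(K)*sqrt(T)), and a smile given as a function of log-moneyness k = ln(K/F). The function name, the bisection bracket and the quadratic example smile are all hypothetical choices of mine, not taken from the paper.

```python
import math

def d2_zero_point(smile, forward, expiry, k_lo=-5.0, k_hi=0.0, tol=1e-12):
    """Return (strike, implied_vol) at the point on the smile where d2 = 0.

    smile: callable mapping log-moneyness k = ln(K/F) to implied vol.
    The root is found by bisection on the numerator of d2.
    """
    def g(k):
        # Numerator of d2: ln(F/K) - 0.5 * sigma^2 * T, with ln(F/K) = -k
        vol = smile(k)
        return -k - 0.5 * vol * vol * expiry

    if g(k_lo) <= 0.0 or g(k_hi) >= 0.0:
        raise ValueError("d2 = 0 root is not bracketed by [k_lo, k_hi]")
    while k_hi - k_lo > tol:
        mid = 0.5 * (k_lo + k_hi)
        if g(mid) > 0.0:
            k_lo = mid
        else:
            k_hi = mid
    k_star = 0.5 * (k_lo + k_hi)
    return forward * math.exp(k_star), smile(k_star)

# Hypothetical skewed smile, quadratic in log-moneyness
def example_smile(k):
    return 0.20 - 0.10 * k + 0.05 * k * k

strike, vol = d2_zero_point(example_smile, forward=100.0, expiry=1.0)
print(strike, vol)  # vol is the d2=0 approximation to the volswap strike
```

Note that the d2=0 strike always lies slightly below the forward, since the root satisfies ln(K/F) = -0.5*sigma^2*T.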
You can see from the tables that for negative correlation (equity) and zero correlation (FX, more or less), the d2=0 approximation is quite accurate for H > 1/2 and longer tenors. For H < 1/2 and longer tenors it deteriorates. For H < 1/2 and short tenors it's 'ok'. This is encouraging in the sense that, e.g., Comte and Renault (1998) have argued that volatility persistence can be explained by an SV model with H > 1/2, as reflected in long-term implied vol smiles that are still too steep to be explained by traditional SV models (H = 1/2) but which can be explained by an SV model with H > 1/2.
There are other non-parametric approximations to volswap prices, such as Carr-Lee of course. But it's not clear (at least to me) how easy it would be to extend Carr-Lee to fractional SV. Furthermore, it is also not clear how to quantify the error between the exact volswap price and the Carr-Lee approximation.
Besides fractional models, you could also imagine that the observed smiles are driven by an LSV model. The d2=0 approximation has not been generalised to LSV, and I don't know whether that is possible.
Hope this clarifies somewhat.