November 25th, 2011, 6:43 pm
Quote
Originally posted by: Cuchulainn

Quote
Originally posted by: Traden4Alpha
It would seem that the guts of the engine would include:
* Sample generators (RNGs, weighted sampling, antithetic sampling, low-discrepancy, etc.)
* Sample functions (converts a "random number" to an application-specific quantity)
* Path constructors (constructs an event path as a sum of function-mapped samples, including path dependencies)
* Event aggregators (aggregates across samples or paths to compute MC statistics, MC-based solution estimates, and MC quality estimates)
* MC run managers (spawns/handles multiple MC runs, such as with different application parameter values or MC control parameter values)
On top of this would be an application layer that maps a user's domain application to a configuration of the MC engine components (e.g., defining the sample generator, path constructor, event aggregator, etc.). The application layer could/should mediate between a "friendly" interface onto MC and the more complex hardware-specific internals (e.g. SIMD/GPU details).

Taking a data/task view, I see 3 'levels': 1) market/calibrated data and models, 2) simulated data (e.g. paths), and 3) MIS, high-level data and high-level scenarios. These can be designed separately and they lead to parallel task and data decomposition. The above subtasks will each fall under one of the 3 parent tasks. Important in the engine are efficiency (very fast), functionality (what kinds of SDEs to support) and maintainability (add plug-ins, narrow interfaces between components). What do you think?

Could you please explain this 3-levels view a bit more?

In the meantime, a few down-to-earth thoughts on the MC engine, elaborating on T4A's. Some of them are obvious (but in need of explicit approval), others open a Pandora's box...

- antithetic variates: is it better to write a wrapper over existing random uniform *vector* generators, sending out antithetic pairs one component at a time (u1, 1-u1, u2, 1-u2, ...), thus leaving function sampling untouched and requiring only ex-post fixes to the sample variance estimators... or to work on the function itself as g(u) := (f(u) + f(1-u))/2? However, the latter implicitly means that we actually need such an *explicit* f(u) interface somewhere (which is often not the case). Both are ugly in some sense; we could call these two approaches "external AV" and "internal AV" for brevity (both flavours are sketched in the P.S. at the end of this post).

- control variates: what helpful infrastructure can be provided? The regression coefficient?
Most of the work must be done by users anyway, implementing the actual control variate for each problem, and machinery for the controlled estimator seems a bit silly/trivial/superfluous, unless it's all about the coefficient and maybe controlling the bias from adaptivity (a tiny helper along these lines is in the P.S.).

- support for QMC points/vectors: possibly unified with classic PRNGs; otherwise, retrofitting LDS points to a univariate framework is feasible but ugly (= perverse) and unleashes a cascade of other issues (had to do it once; never again, please!); the opposite adapter is the most sensible approach (see the P.S. for what I mean).

- other QMC support: lots of things (the list never ends, actually), but a minimum set would include at least fast quantiles, Sobol', bridging, PCA and smoothers/periodizers; I'd also love to be able to easily connect our adaptive LDS engine (both as LDS server and as model "client").

- SIMD/vector-readiness: meaning also that aggregation operators will need to operate on at least two levels, unless all data is cached for a final pass; and at the other end, random generators might need changes/rewrites too.
-- Are Boost accumulators SIMD-ready? I don't think so, especially the quantiles, but at least some first simple adapters can be prepared.
-- There might also be interactions with the weighted-samples topic below.

- random generators: as for the two random syntaxes provided by Boost/C++, I am not sure which one is technically best (definitely the simple nonuniform_sampler(uniform_RNG()) is best for novice users, but it might have drawbacks... haven't really tried it in our framework yet). E.g. there might be issues with caching/persistency. (The two syntaxes are shown side by side in the P.S.)

- parallel PRNG/QMC stream support: there are 2 (+1 fake) main approaches: split one huge stream into shorter chunks, or use multiple generator parameters. In the first case there are 3 algorithms for on-the-fly calculation (only one is fast enough), or I could provide a set of ready-made generator states for MT or WELL (and then the WELL generator too); a block-splitting sketch is in the P.S. The second approach also requires slow precalculation for MT, but other generators might be faster (e.g. combined LCGs). In both approaches it's not trivial to deal with reproducibility of results (do we want it at all?), especially once load balancing gets into the game.
-- Or, the fake third approach: one simply "randomly" initializes generator states and *hopes* for non-overlap/non-correlation (but this is a strange bet with unwarranted assumptions).

- kind of parallel engine: embarrassingly parallel is easy, but obviously a real-world MC engine should do much more: how to best parallelize early exercise? Adaptive methods (even a simple control variate!)? Accumulators such as quantiles? Subsimulations? And what about load balancing? Without forgetting that distributed systems, SMP and SIMD all raise different issues, which will interact...
-- Instead of dealing with shared objects first and going distributed later, I would think about serialization etc. from the start; retrofitting might become a pain. One might forget about SMP peculiarities at first and start with something more general; optimized algorithms for SMP can always be added later. Clearly this topic interacts a lot with the other discussion on static vs dynamic polymorphism vs components & signals (which warrants a dedicated thread, imho).

- early exercise algorithms: if we want to provide something automatic, here is a big question mark: LS is definitely not enough, so how advanced do we want to go? AND shall the framework be closed/black-box or open to (intrusive) third-party algorithms? Not an easy design issue!
- subsimulation: nowadays the MC engine should be able to handle multiple levels of simulation: derivative portfolio VaR is already >= 2 levels, some (American) pricers already have 2 simulation levels by themselves, and with credit and multiperiod optimization it's easy to get to 3-4 levels and beyond. This will become more and more common, so the framework should be ready (there are also specific algorithms improving results in such situations; some QMC developments might help too; and of course in some cases one might just use regression). Especially if we want the lib to be used for pre-trade simulation, it must all be damn FAST. This is a Major Point imho, otherwise it's just YetAnotherQuantLib.

- weighted samples: unify their treatment; they come from various sources: stratified/importance sampling & measure changes, weighted MC/entropy pooling, unequal-weight integration rules... They need to be supported at least by the aggregators: quantiles, moments, regressions, early exercise... Better a uniform concept for the weight+sample pair or an explicit type? As the coupling is there for algorithmic reasons anyway, it's probably better to go for the type (a possible shape is sketched in the P.S.). Still, I'm not an expert in metaprogramming, so I'd be glad if someone could check what boost::accumulators is doing with those binary operators in numeric::functional exactly.
-- weighted samples must allow for vectors (e.g. for regression with importance sampling)! Per se obvious, but: what about vector/matrix containers, as they are not as lightweight as scalars?

- in general, everything should work on vectors (or tensors) so as to have a unified interface to multivariate distributions; see also the requirements of antithetic variates, QMC, etc.

- GPU will be a mess, just a reminder :-)

There'd be other issues, but better to stop here for now. Requirements and less mature thoughts are mixed here because I feel one major topic is achieving modularity while satisfying the interoperability required by numerical constraints, so it's better to discuss everything together first. Just two cents.
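P.S. To make a few of the points above more concrete, here are some rough, untested sketches. All names and signatures are placeholders to fix ideas, nothing more. First, the two AV flavours: "external" (a wrapper that interleaves each uniform draw u with 1-u, leaving the sampling code untouched, but forcing the variance estimator to treat the pair as one sample) vs "internal" (averaging inside the functional, which presumes an explicit f(u) interface exists):

    #include <random>

    // "External AV": wrap a uniform generator so that each draw u is
    // immediately followed by its antithetic partner 1-u.
    template <class UniformRng>
    class AntitheticUniform {
    public:
        explicit AntitheticUniform(const UniformRng& rng)
            : rng_(rng), unif_(0.0, 1.0), last_(0.0), pending_(false) {}

        double operator()() {
            if (pending_) {             // second half of the pair: 1 - u
                pending_ = false;
                return 1.0 - last_;
            }
            last_ = unif_(rng_);        // fresh draw u in (0,1)
            pending_ = true;
            return last_;
        }

    private:
        UniformRng rng_;
        std::uniform_real_distribution<double> unif_;
        double last_;
        bool pending_;
    };

    // "Internal AV": average inside the functional itself.
    template <class F>
    double antitheticValue(const F& f, double u) {
        return 0.5 * (f(u) + f(1.0 - u));
    }

With the external flavour, AntitheticUniform<std::mt19937> gen(std::mt19937(42)); then gen() yields u1, 1-u1, u2, 1-u2, ... and only the variance estimator downstream has to know about the pairing.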
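For the control-variate point, the kind of (admittedly trivial) coefficient machinery I had in mind, assuming the user supplies paired samples of the target f and of a control g with known mean:

    #include <vector>
    #include <numeric>
    #include <cstddef>

    // Controlled estimator: estimate beta = Cov(f,g)/Var(g) from the
    // samples and return mean(f) - beta*(mean(g) - gMean). Estimating
    // beta from the same samples is exactly the (small) adaptivity bias
    // mentioned above.
    double controlledMean(const std::vector<double>& f,
                          const std::vector<double>& g,
                          double gMean) {
        const std::size_t n = f.size();
        const double fBar = std::accumulate(f.begin(), f.end(), 0.0) / n;
        const double gBar = std::accumulate(g.begin(), g.end(), 0.0) / n;

        double covFG = 0.0, varG = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            covFG += (f[i] - fBar) * (g[i] - gBar);
            varG  += (g[i] - gBar) * (g[i] - gBar);
        }
        const double beta = covFG / varG;
        return fBar - beta * (gBar - gMean);
    }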
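On QMC unification, the "opposite adapter" I meant: rather than squeezing LDS points through a univariate next() call, give the PRNGs the same point-at-a-time (vector) interface that LDS generators naturally have, e.g.:

    #include <random>
    #include <vector>
    #include <cstddef>
    #include <cstdint>

    // A plain PRNG dressed up with the vector interface of a QMC point set.
    class PseudoRandomPointSet {
    public:
        PseudoRandomPointSet(std::size_t dimension, std::uint64_t seed)
            : dim_(dimension), rng_(seed), unif_(0.0, 1.0) {}

        // Fill one d-dimensional point in (0,1)^d, coordinate by coordinate.
        std::vector<double> nextPoint() {
            std::vector<double> u(dim_);
            for (std::size_t i = 0; i < dim_; ++i)
                u[i] = unif_(rng_);
            return u;
        }

    private:
        std::size_t dim_;
        std::mt19937_64 rng_;
        std::uniform_real_distribution<double> unif_;
    };

A Sobol' generator (or our adaptive LDS engine) would then expose the same nextPoint(), and the path constructor wouldn't care which one sits behind it.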
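The two random syntaxes I was referring to, side by side (Boost.Random spelling, from memory, so double-check the exact names):

    #include <boost/random/mersenne_twister.hpp>
    #include <boost/random/normal_distribution.hpp>
    #include <boost/random/variate_generator.hpp>

    void twoSyntaxes() {
        boost::mt19937 eng(42);

        // (a) nonuniform_sampler(uniform_RNG()): distribution applied to the engine
        boost::normal_distribution<double> norm(0.0, 1.0);
        double x1 = norm(eng);

        // (b) engine and distribution coupled into one callable object
        boost::variate_generator<boost::mt19937&, boost::normal_distribution<double> >
            gauss(eng, norm);
        double x2 = gauss();

        (void)x1; (void)x2;
    }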
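For parallel streams, the block-splitting idea in its crudest form (here via discard(), which for MT is linear in the jump size, so in practice one would precompute the jumped states offline or pick a generator with a cheap skip-ahead, e.g. a combined LCG):

    #include <random>
    #include <vector>
    #include <cstddef>
    #include <cstdint>

    // Give worker k the sub-stream [k*blockSize, (k+1)*blockSize) of one
    // long sequence started from a common seed.
    std::vector<std::mt19937_64> makeStreams(std::uint64_t seed,
                                             std::size_t nWorkers,
                                             std::uint64_t blockSize) {
        std::vector<std::mt19937_64> streams;
        streams.reserve(nWorkers);
        for (std::size_t k = 0; k < nWorkers; ++k) {
            std::mt19937_64 g(seed);
            g.discard(k * blockSize);   // skip ahead to this worker's block
            streams.push_back(g);
        }
        return streams;
    }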
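For weighted samples, the explicit-type option could be as dumb as this (T being a scalar, a vector of cash flows, whatever - which is where the container-weight question above kicks in):

    #include <vector>

    // Explicit weight+sample type, rather than a loose pair-like concept.
    template <class T>
    struct WeightedSample {
        T      value;
        double weight;   // importance/stratification/entropy-pooling weight
    };

    // Minimal weighted-mean aggregator over scalar samples.
    double weightedMean(const std::vector<WeightedSample<double> >& xs) {
        double num = 0.0, den = 0.0;
        for (std::size_t i = 0; i < xs.size(); ++i) {
            num += xs[i].weight * xs[i].value;
            den += xs[i].weight;
        }
        return num / den;
    }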
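And for comparison, Boost.Accumulators already takes weights through a named parameter, if I remember the interface correctly (this is exactly where the numeric::functional machinery I mentioned comes into play, so someone should double-check it):

    #include <boost/accumulators/accumulators.hpp>
    #include <boost/accumulators/statistics/stats.hpp>
    #include <boost/accumulators/statistics/weighted_mean.hpp>

    using namespace boost::accumulators;

    void weightedAccumulatorExample() {
        // third template argument is the weight type
        accumulator_set<double, stats<tag::weighted_mean>, double> acc;
        acc(1.0, weight = 0.25);
        acc(2.0, weight = 0.75);
        double m = weighted_mean(acc);   // (0.25*1 + 0.75*2)/1.0 = 1.75
        (void)m;
    }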