Cuch, naturally ProfileBeforeOptimizing goes without saying.

At the same time: Herb Sutter, Andrei Alexandrescu, "C++ Coding Standards: 101 Rules, Guidelines, and Best Practices", Item 9: "Don't pessimize prematurely" (a small sketch of what that means in practice is at the end of this post). And: "Yet we should not pass up our opportunities in that critical 3%." (hence the profiling). E.g., DesignForPerformance.

See the discussion in PrematureOptimization -- in particular, this passage:

"However, PrematureOptimization can be defined (in less loaded terms) as optimizing before we know that we need to. Optimizing up front is often regarded as breaking YouArentGonnaNeedIt (YAGNI). But by the time we decide that we need to optimize, we might be too close to UniformlySlowCode to OptimizeLater. We can use PrematureOptimization as a RiskMitigation strategy to push back the point of UniformlySlowCode, and lower our exposure to the risk of UniformlySlowCode preventing us from reaching our performance target with OptimizeLater.

For those who don't work to strict memory or CPU-cycle limits, PrematureOptimization is an AntiPattern, since there is only cost and no benefit. For those who do, it is often confused with poor coding, or with misguided attempts at writing optimal code.

A common misconception is that optimized code is necessarily more complicated, and that therefore optimization always represents a trade-off. In practice, however, better-factored code often runs faster and uses less memory as well. In this regard, optimization is closely related to refactoring, since in both cases we are paying into the code so that we may draw back out again later if we need to. We don't (yet) have PrematureRefactoring regarded as CategoryEvil.

Another common misconception is that any level of execution speed, or resource usage, can be achieved once the code is complete. There are both practical and physical limits on any target platform. PrematureOptimization is not a solution to this, but it can help us DesignForPerformance. When working in an environment where resources are less limited, this is unlikely to be a problem."

Target architecture is important: for instance, if we want code that scales well under massive parallelism on GPUs, a DesignForPerformance would probably require avoiding mutable shared state -- precisely what would hurt performance on current CPUs (second sketch below). Hence, I don't think there are fixed sweet spots that we could just put into the big design up front. Identifying the moving parts (mutable shared state) and the differences that cause them to move (target architecture: CPUs vs. GPUs) might lead to a better design overall.

And we shouldn't forget what we know: there was a thread here where someone profiled QuantLib code and identified the performance bottlenecks -- AFAIR, those were multiple virtual function calls, and their removal brought a significant speedup (third sketch below). Using what we know from that is not necessarily "making decisions without first measuring", agreed?
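To make Item 9 concrete: not pessimizing prematurely means not writing gratuitously slow code when the equally clear alternative costs nothing. A minimal sketch (my own toy example, not one from the book):

```cpp
#include <string>
#include <vector>

// Premature pessimization: the vector (and every string in it) is
// copied on each call, with no gain in clarity.
std::size_t countLongCopy(std::vector<std::string> names) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < names.size(); ++i)
        if (names[i].size() > 8) ++n;
    return n;
}

// Equally clear, no gratuitous copies: pass by const reference.
std::size_t countLong(const std::vector<std::string>& names) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < names.size(); ++i)
        if (names[i].size() > 8) ++n;
    return n;
}
```

No profiler needed for that one -- it's the same algorithm either way, which is exactly why Item 9 isn't PrematureOptimization.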
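On the mutable-shared-state point, here is the design difference I mean, in deliberately simplified C++ (hypothetical names, nothing QuantLib-specific):

```cpp
#include <vector>

// Design A: workers fold results into one shared accumulator.  Cheap on a
// single CPU core, but under massive parallelism (e.g. on a GPU) the shared
// write becomes a synchronization/contention point.
struct SharedSum {
    double total;
    SharedSum() : total(0.0) {}
    void add(double x) { total += x; }   // shared mutable state
};

// Design B: each element is computed independently; combining results is a
// separate, associative reduction step.  No shared writes, so the same
// shape maps onto both CPUs and massively parallel hardware.
double callPayoff(double spot, double strike) {
    return spot > strike ? spot - strike : 0.0;
}

std::vector<double> payoffs(const std::vector<double>& spots, double strike) {
    std::vector<double> out(spots.size());
    for (std::size_t i = 0; i < spots.size(); ++i)
        out[i] = callPayoff(spots[i], strike);  // independent per element
    return out;
}
```

Which of the two is "the fast design" depends on the target architecture -- that's why I'd treat it as a moving part to isolate, not a fixed sweet spot to bake into a big design up front.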
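And on the virtual-call bottleneck: the usual remedy is to replace dynamic dispatch in the inner loop with compile-time dispatch. A sketch of the idea (again my own toy example -- not the actual change from that thread):

```cpp
#include <vector>

// Dynamic dispatch: one virtual call per element; the compiler cannot
// inline through the vtable inside the loop.
struct Payoff {
    virtual ~Payoff() {}
    virtual double operator()(double s) const = 0;
};

struct CallPayoff : Payoff {
    explicit CallPayoff(double k) : k_(k) {}
    double operator()(double s) const { return s > k_ ? s - k_ : 0.0; }
private:
    double k_;
};

double sumDynamic(const std::vector<double>& spots, const Payoff& p) {
    double acc = 0.0;
    for (std::size_t i = 0; i < spots.size(); ++i)
        acc += p(spots[i]);              // virtual call per element
    return acc;
}

// Static dispatch: the payoff is a plain functor passed as a template
// parameter, so the call can be inlined -- no per-element dispatch.
struct CallPayoffFn {
    explicit CallPayoffFn(double k) : k_(k) {}
    double operator()(double s) const { return s > k_ ? s - k_ : 0.0; }
private:
    double k_;
};

template <class PayoffT>
double sumStatic(const std::vector<double>& spots, const PayoffT& p) {
    double acc = 0.0;
    for (std::size_t i = 0; i < spots.size(); ++i)
        acc += p(spots[i]);              // resolved at compile time
    return acc;
}
```

Whether the template version is worth the loss of runtime flexibility is precisely what the profiling from that thread tells us -- measure first, then devirtualize the hot path.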