November 22nd, 2006, 3:32 pm
QuoteOriginally posted by: cdmurray80That's very clever...I didn't think of anything like that! It's sort of an O(1) update with a lag once in a whileI think, if all of the probability updates increase, rather than decrease, event probabilities, you can get guaranteed asymptotic average update time of O(1). All you do is wait until N updates accumulate...at this point your structure in memory is twice as big, and you just regenerate it all from scratch. So the average cost is O(1) for (N-1) of the updates, and O(N) for 1 of the updates, given an average of (1 / N ) ( (N-1)O(1) + O(N) ) = O(1)...very cool!!Actually, if all you have are probability increases, there's much less reason to regenerate. If the event-CDFdata structure is an array and you use bisection search, then adding new probability mass (in the form of added entries of existing events), incurs little work. If the search process to lookup an event is O(log(N)), then the growing N due to added event segments adds little to the search/lookup time. You can probably wait much longer than O(N) between regenerations so that the average regeneration cost might be less then O(1).QuoteOriginally posted by: cdmurray80What I worry about is if there are a lot of probability downgrades. In this case, your random draws may "miss" any event, and you have no performance guarantee, unless I misunderstand.........Or: you start out with one event with an overwhelmingly large probability, and it gets downgraded. Now almost all of the numbers drawn will be misses.Yes, you do understand. Probability downgrades (especially ones on high-probability events) do erode performance quickly and severely. Performance drops (and becomes stochastic) in proportion to the total probability removed from the set of intervals. If the event set contains a highly skewed distribution and a few high probability events are downgraded, then performance suffers quickly. One partial solution is to order the event-CDF data structure to put all the high-probability events to one end or subtree of structure. One can then do a partial regeneration only on that end by recompacting the CDF intervals (and dropping the total CDF size used in drawing rands).