Serving the Quantitative Finance Community

 
User avatar
ThomasJ
Topic Author
Posts: 1
Joined: October 9th, 2007, 2:39 am

Faster than O(n) min & max of time series

March 3rd, 2011, 4:39 pm

Here's the situation: I have a moving window of length T over a series of prices. I want to ask, "Between some time q and now, what is the min and max of the time series?" ([q,now] is in [T,now], i.e. I'm always asking about some time range in my window). I'm willing to spend O(1) every time I add an element to the list, but I'd like to spend less than O(n) when I query for the min and max. Is there a better than O(n) solution?
 
User avatar
Hansi
Posts: 41
Joined: January 25th, 2010, 11:47 am

Faster than O(n) min & max of time series

March 3rd, 2011, 4:46 pm

Is q constant? Could you store binned averages for specific ranges?
 
User avatar
quantmeh
Posts: 0
Joined: April 6th, 2007, 1:39 pm

Faster than O(n) min & max of time series

March 3rd, 2011, 4:54 pm

can you store min/max for each data point from time 0?
 
User avatar
ThomasJ
Topic Author
Posts: 1
Joined: October 9th, 2007, 2:39 am

Faster than O(n) min & max of time series

March 3rd, 2011, 5:34 pm

QuoteOriginally posted by: HansiIs q constant? Could you store binned averages for specific ranges?Unfortunately, no, q is not constant.quantmeh, how would it help to store min/max for each data point from time 0? I need to ask between [q,now], not [0,q]
 
User avatar
quantmeh
Posts: 0
Joined: April 6th, 2007, 1:39 pm

Faster than O(n) min & max of time series

March 3rd, 2011, 7:34 pm

QuoteOriginally posted by: ThomasJquantmeh, how would it help to store min/max for each data point from time 0? I need to ask between [q,now], not [0,q]if it happens so that min[0,now]<min[0,q], then min[q,now] = min[0,now]otherwise it's O[n]
 
User avatar
nwhitehe
Posts: 0
Joined: March 3rd, 2006, 6:57 am

Faster than O(n) min & max of time series

March 3rd, 2011, 8:01 pm

Here's a thought about the problem, not a full solution yet though.Construct a binary tree. The root node contains the min and max values and their locations within [0,now]. The left side records the min and max values and locations on [0,now/2]. The right side records the min/max for the interval [now/2,now]. The binary tree keeps subdividing the interval down to intervals of length 1 (base case).To get the min/max for an interval [a,b] requires doing a binary search down the tree for a and again for b. Using the data on these two paths down the tree, you can get the min/max for the interval [a,b]. If n is the number of data points, the binary searches take O(log n).Constructing the original tree I think takes O(n) time if you do it properly (not naively). The right way is to solve the recursive problems first, then use this info to get the data for the current node. I.e. at the top of the tree, get the min/max for [0,now/2] and [now/2,now] from recursion. Take the min of the mins and max of the maxes to get the min/max for the top node, rather than naively loop through everything again.But then I'm not sure about how to add one new data point quickly into the already computed tree. It seems like it should be possible in O(log n), but I'm not sure exactly how it would work.Interesting problem!
 
User avatar
Traden4Alpha
Posts: 3300
Joined: September 20th, 2002, 8:30 pm

Faster than O(n) min & max of time series

March 3rd, 2011, 10:32 pm

Here's one possible approach that should average O(log(n)) for most sequences. I've not tried it so you might want to ponder it carefully before taking my word that it works. Perhaps one trick is to remove all incoming values that will never be a maximum in any [q,now] window. For example, if X[now] > X[now-1], then the X[now-1] value will never be the maximum in any [q,now] interval. Thus, you can cull low values that can't be maxima from a list of candidate values as they come in. You could build a double-ended list where each element stores an (X[timestamp],timestamp) pair of candidate maxima. On this List, the first item is the oldest item on the list and the last item is the most recent candidate maximum value on the list.Manage the list:1. IF time stamp on the first_item > T THEN Delete First_Item2. IF X[now] > X[Last_Item], THEN delete Last_Item and repeat step 23. append (X[now],now) to listThis list manager creates and manages a data structure of potential candidate maxima in [T,now]. It will contain a monotonically non-increasing list of values with their corresponding monotonically-increase timestamps (assuming now>T). Step 2 of the algorithm could be O(n) worst case, but only in the bizarre case where X is steadily falling and then one gets a massive jump with a new X that's higher than any X in the entire [T,now] range. (For extra credit, one might be able to construct a better search algo for step 2 but I doubt it would be worth it for most semi-random time series)To find the maximum value in a [q,now] window, do a bisection search on the list looking for the element with last timestamp within the [q,now] range and that will be the answer. That search should be O(log(n))Duplicate the above logic for the minima with the value-comparison conditionals appropriately reversed. EDIT: Making the conditional in step 2 a "≥" instead of ">" will improve performance somewhat although it also means the algorithm will report the most recent timestamp if two data points are tied for highest value (the version with ">" would report the older timestamp in the case of a tie).
Last edited by Traden4Alpha on March 3rd, 2011, 11:00 pm, edited 1 time in total.
 
User avatar
quantmeh
Posts: 0
Joined: April 6th, 2007, 1:39 pm

Faster than O(n) min & max of time series

March 4th, 2011, 2:18 am

t4a, it's not gonna work
 
User avatar
Traden4Alpha
Posts: 3300
Joined: September 20th, 2002, 8:30 pm

Faster than O(n) min & max of time series

March 4th, 2011, 12:08 pm

QuoteOriginally posted by: quantmeht4a, it's not gonna work You may be right, but I'd love to hear why you think it won't work.EDIT: I've pondered this a bit more and can't find a flaw. (Doesn't mean there isn't one).
Last edited by Traden4Alpha on March 3rd, 2011, 11:00 pm, edited 1 time in total.
 
User avatar
quantmeh
Posts: 0
Joined: April 6th, 2007, 1:39 pm

Faster than O(n) min & max of time series

March 4th, 2011, 5:20 pm

QuoteOriginally posted by: Traden4AlphaEDIT: I've pondered this a bit more and can't find a flaw. (Doesn't mean there isn't one).i'll trash your flawless algo on weekend. i promise
 
User avatar
Traden4Alpha
Posts: 3300
Joined: September 20th, 2002, 8:30 pm

Faster than O(n) min & max of time series

March 4th, 2011, 5:31 pm

QuoteOriginally posted by: quantmehQuoteOriginally posted by: Traden4AlphaEDIT: I've pondered this a bit more and can't find a flaw. (Doesn't mean there isn't one).i'll trash your flawless algo on weekend. i promise Thank you! (And I do mean that because it's become like a brain teaser to me)
 
User avatar
ThomasJ
Topic Author
Posts: 1
Joined: October 9th, 2007, 2:39 am

Faster than O(n) min & max of time series

March 4th, 2011, 5:42 pm

QuoteOriginally posted by: Traden4AlphaQuoteOriginally posted by: quantmehQuoteOriginally posted by: Traden4AlphaEDIT: I've pondered this a bit more and can't find a flaw. (Doesn't mean there isn't one).i'll trash your flawless algo on weekend. i promise Thank you! (And I do mean that because it's become like a brain teaser to me) If you guys figure it out over the weekend, let us know!
 
User avatar
tradelink
Posts: 0
Joined: March 9th, 2010, 9:55 pm

Faster than O(n) min & max of time series

March 4th, 2011, 5:46 pm

unless I don't understand the problem, I think the premise of t4a's idea is correct.... his suggested implementation of it I don't have the time to review.
 
User avatar
ryans
Posts: 0
Joined: March 4th, 2011, 12:28 pm

Faster than O(n) min & max of time series

March 5th, 2011, 2:54 pm

Why don't you maintain a sorted list of the T elements? Each element can be a tuple of (Price, Unsorted Index) which is sorted by Price. Thus, if you are looking for the min/max over q the worst case is O(T-q) when looking from the beginning/end of the list. The overhead comes in maintaining the list where you can either pay for the time to search the unsorted or extra memory to maintain another list mapping from Unsorted index to sorted index. Am I misunderstanding something?
 
User avatar
quartz
Posts: 3
Joined: June 28th, 2005, 12:33 pm

Faster than O(n) min & max of time series

August 5th, 2011, 8:54 am

Maybe T4A method is O(log n) on average and O(n) worst case? Oh well I should read it first But hey, the tree method is so simple and effective there's no reason to use more convoluted approaches.
Last edited by quartz on September 4th, 2011, 10:00 pm, edited 1 time in total.