<t>Let's see. We have a sample of X_i, each a vector, drawn independently; then the MLE is the argmax over parm of \sum_i Ln p(X_i | parm). Sometimes we only observe one point X, e.g. a time series; then it becomes maximizing Ln p(X | parm) = Ln p(X[1], ..., X[n] | parm) = \sum_j Ln p(X[j] | X[j-1], ..., X[1], parm), where the j = 1 term is just Ln p(X[1] | parm). Quote: Originally posted by: rezaalm...