Serving the Quantitative Finance Community

 
outrun
Topic Author
Posts: 4573
Joined: January 1st, 1970, 12:00 am

Re: Is there a name for this transform method?

June 15th, 2017, 8:00 am

Thanks Alan. 
Maybe it would be an idea to start a competition if we want to get an overview of methods and their relative performance? Adding all this a priori knowledge to some manually designed model is precisely what I want to avoid and outsource to the neural network.
I think the ability (or not) of the neural network to outperform traditional econometric methods is a critically important idea. So far, I have learned that your cross-entropy performance measure is close to a likelihood in MLE, so it indeed appears possible to compare things on essentially the same basis. But, I am interested in learning these new methods, not the old ones! So, I am trying to do that ... slowly ... while tanning. :D 
That's exactly what it is. It's a log likelihood!
In ML the two most important elements (IMO) are the "cost function" and "how to check what performance will be on unseen data". Most deep learning models work by cost minimization, and there are simple rules that tell you which cost metric to use for which type of problem. If you pick the wrong metric you'll get a wrong solution (I ran into that a couple of times). Cross entropy is really good for classification-type problems, like "from which of these 100 bins do you think tomorrow's return will get drawn?".
Here is a nice intro
Enjoy the sun! 
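To make the "it's a log likelihood" point concrete, here is a minimal sketch, assuming the forecast is a probability vector over return bins and the outcome is the index of the bin that was actually hit. The function name and the toy numbers (4 bins instead of 100) are hypothetical, purely for illustration.

```python
import math

def cross_entropy(predicted_probs, realized_bins):
    """Mean negative log-probability assigned to the bin that actually occurred."""
    total = 0.0
    for probs, k in zip(predicted_probs, realized_bins):
        total -= math.log(probs[k])
    return total / len(realized_bins)

# Hypothetical toy forecasts over 4 bins (the thread uses 100 bins).
forecasts = [
    [0.10, 0.20, 0.60, 0.10],
    [0.25, 0.25, 0.25, 0.25],
    [0.70, 0.10, 0.10, 0.10],
]
realized = [2, 1, 0]  # index of the bin each observed return fell into

ce = cross_entropy(forecasts, realized)

# The same quantity written as a log likelihood of the observed bins.
log_likelihood = sum(math.log(p[k]) for p, k in zip(forecasts, realized))

print(ce, -log_likelihood / len(realized))  # the two numbers coincide
```

Minimizing this cross entropy is the same as maximizing the average log likelihood of the observed bins, which is why the comparison with MLE-fitted models is on the same basis.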
 
outrun
Topic Author
Posts: 4573
Joined: January 1st, 1970, 12:00 am

Re: Is there a name for this transform method?

June 16th, 2017, 8:07 pm

I think this MIT lecture gives a very good intro to Recurrent Neural Networks (RNNs), whose hidden state is the equivalent of the latent vol state in stock vol models.

The bit starts at 5:55
 
outrun
Topic Author
Posts: 4573
Joined: January 1st, 1970, 12:00 am

Re: Is there a name for this transform method?

June 17th, 2017, 10:24 am

After trying lots and lots of networks and techniques (and first speeding up training by a factor of 100 with better code and hardware), I think I've squeezed all predictability out of the data for the "10 sec ahead density forecasts". It hovers around 4.07 cross entropy (equiv. 59/100 buckets), which is kind of nice. How to leverage that, if possible, is still something to examine.
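For readers wondering where "59/100 buckets" comes from, here is a small sketch of the arithmetic, assuming the cross entropy is measured in nats (natural log). Under that assumption, exp(cross entropy) is the number of equally likely buckets that would give the same score, and ln(100) is the score of a forecaster who knows nothing.

```python
import math

n_bins = 100
reported_ce = 4.07  # cross entropy quoted in the post, assumed to be in nats

# A uniform forecast over 100 bins scores ln(100).
uniform_ce = math.log(n_bins)

# Equivalent number of equally likely buckets ("perplexity").
effective_bins = math.exp(reported_ce)

# Improvement over the uniform forecast, in nats per observation.
gain_nats = uniform_ce - reported_ce

print(round(uniform_ce, 3), round(effective_bins, 1), round(gain_nats, 3))
```

So the network narrows 100 buckets down to the equivalent of roughly 59, a gain of about half a nat per forecast over the uniform baseline.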

My focus will now switch to multivariate analysis, looking at 1,000-10,000 stocks in parallel and their underlying behaviour. What I've learned is that lack of data is the main bottleneck for single-stock daily-resolution data, and to overcome that you need to go "wider". Another thing I've learned is that NNs are very good at learning (latent) representations.

I'll probably start a new thread in a different subforum when the scope is clear; some people find the topic of this thread unclear because it keeps moving around. It started out technical but now it's more of a blog. On the other hand, I don't want to start dozens of threads for every little new experiment/direction, since the audience for these types of topics is small here.
 
Alan
Posts: 2958
Joined: December 19th, 2001, 4:01 am
Location: California

Re: Is there a name for this transform method?

June 18th, 2017, 3:40 pm

I finally upgraded my Mathematica to the latest version. Since V11 they have added a major set of neural net features, including GPU support (I don't have a GPU, but I'm still in the market for a new desktop system). Anyway, I was browsing the Wolfram Community forum, which had this nice post:

http://community.wolfram.com/groups/-/m/t/1016315,

which led me to this:

Higgs Data Challenge: Particle Physics meets Machine Learning

There is some nice commentary there that probably has some applicability to finance applications, esp. the discussion of the characteristics of the winning entry.
 
outrun
Topic Author
Posts: 4573
Joined: January 1st, 1970, 12:00 am

Re: Is there a name for this transform method?

June 18th, 2017, 4:14 pm

Thanks for the link!
Very interesting. T4A and I did quite a few Kaggle competitions, and I expect we would have used the 2nd place method. You don't see many neural networks amongst the Kaggle winners, and I have to agree that good use of statistical methods is the key to success. Competitions are a great way to test your knowledge in the real world. We started out very confused: why weren't our awesome ideas working??

I remember T4A and I did a competition (I think it was finding causal relations in data), and during model development you always get feedback about the performance of your model on a small test set. The final rank is based on a large unseen set. People were beating us in the small test set ranking, but we decided to stick to good statistical methods that don't overfit and which give slightly lower scores on the small data set. I remember that it was a scary thing to do, picking a model with a lower score. During the final ranking we shot up to 3rd place and lots of people were upset about the shakeout. We, on the other hand, were very pleased that our hypothesis (that everyone was overfitting) ended up being true!
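The shakeout can be illustrated with a toy simulation. This is only a sketch under strong assumptions: every entrant has the same true skill, the set sizes are made up, and the only thing that differs is luck on the small feedback set. It is not a reconstruction of the actual competition.

```python
import random

random.seed(0)

def accuracy(skill, n):
    """Observed fraction correct for a model that is right with probability `skill` on n cases."""
    return sum(random.random() < skill for _ in range(n)) / n

# Hypothetical setup: 200 entrants, all with the same true accuracy of 0.70.
true_skill = 0.70
n_small, n_large = 100, 10000  # small feedback set vs large final set

entrants = [
    (accuracy(true_skill, n_small), accuracy(true_skill, n_large))
    for _ in range(200)
]

# Pick the entrant that tops the small-set ranking and look at its final score.
best_small, its_final = max(entrants, key=lambda e: e[0])

print(best_small, its_final)
```

The top score on the small set sits well above 0.70 purely by chance, while the same entrant's score on the large set falls back to about 0.70. Tuning a model to climb the small-set ranking chases exactly that noise.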

The 1st place Particle Physics winner seems to use "ensemble" methods (averaging lots of models) and the common "dropout" method used in NNs that helps prevent overfitting. Cool peek behind the scenes!
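For anyone new to the two terms, here is a bare-bones sketch of what they mean mechanically. It uses the "inverted dropout" convention (rescale the surviving units during training) and a plain average for the ensemble. The function names and numbers are hypothetical, and this is not the winning entry's code.

```python
import random

random.seed(1)

def dropout(activations, p_drop):
    """Inverted dropout: zero each unit with probability p_drop, rescale the survivors."""
    keep = 1.0 - p_drop
    return [a / keep if random.random() < keep else 0.0 for a in activations]

def ensemble_mean(predictions):
    """Average the predictions of several models, element by element."""
    return [sum(col) / len(col) for col in zip(*predictions)]

# Dropout on a layer of 10,000 units that all output 1.0.
hidden = [1.0] * 10000
dropped = dropout(hidden, p_drop=0.5)
mean_after = sum(dropped) / len(dropped)  # stays close to 1.0 thanks to the rescaling

# Ensemble of three hypothetical models predicting P(class 0), P(class 1).
avg = ensemble_mean([[0.2, 0.8], [0.4, 0.6], [0.3, 0.7]])

print(mean_after, avg)
```

Dropout randomly silences units during training so the network cannot lean on any single one, and averaging several models smooths out their individual errors.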

So, we'll be seeing various machine learning models in your next book (once you've bought that new machine)? :-)
 
Alan
Posts: 2958
Joined: December 19th, 2001, 4:01 am
Location: California

Re: Is there a name for this transform method?

June 18th, 2017, 4:30 pm

Interesting! I just learned two new terms today: "bagging" and "dropout".

I do have a strong idea for my next book, relying on more conventional parameterized models. But I should definitely think about whether or not some machine learning method could improve upon that. I am so early on the learning curve for those that I don't even know how to think about that yet. Thanks for the suggestion!
 
Traden4Alpha
Posts: 3300
Joined: September 20th, 2002, 8:30 pm

Re: Is there a name for this transform method?

June 19th, 2017, 1:24 pm

In addition to spending a lot of time thinking about the cost function and overfitting, outrun and I also spent time on two other key facets of the problem:

First, we spent a lot of time constructing intermediate statistical indicators that pre-processed the raw data into some kind of normalized value that we felt should be strongly indicative of the goal estimator (which was to classify each data set into {X->Y, Y->X, Z->X,Y, or no causality}). The machine learning methods then did their magic on those intermediate indicators (over 100 of them). Thus, machine learning still seems to benefit from a lot of domain expertise about the underlying expected invariants, structure, and estimators for the system. Yet, at the same time, we did not try to find or pick a "silver bullet" indicator. It was more of a smorgasbord approach -- giving the machine learning methods a buffet of prepared/preprocessed data to choose from.

Second, we did a lot of visualization to understand both the properties of the underlying data and what our code was doing to it. In some cases, these plots revealed important quirks in the data. In the context of contrasting machine learning versus conventional parametric models, this kind of visualization helps find discrepancies between the empirical data and the theoretical modeler's model(s). And whether those discrepancies reflect coding errors, measurement errors, or modeling errors is another huge open issue. Obviously, code can and should be fixed, but the latter two errors have diametrically opposite solutions because one assumes the discrepancy is "noise" and the other assumes it is "signal"!
 
Alan
Posts: 2958
Joined: December 19th, 2001, 4:01 am
Location: California

Re: Is there a name for this transform method?

June 19th, 2017, 3:54 pm

Very interesting -- thanks for the elaboration. There's a very interesting finance/trading application that comes to mind which is the causal link between volatility and returns (realized or expected).  Continuous-time finance theory suggests the relationships will be very close to "simultaneous". Casual observation seems to confirm that. But there are many issues of hidden states and micro-structure. 

For trading, one issue is the lead-lag relationship and causality between the active vol ETFs (VXX/XIV) and broad market index moves. Even if simultaneous at the level of, say, O(1) sec, I suspect there is a lot of fine detail going on at the very high-frequency level.
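One simple first look at a lead-lag relationship is the cross-correlation of the two return series at a range of lags. The sketch below runs on synthetic data only: the two-tick lead, the -0.8 loading and the series names are assumptions made up for the example, and it ignores the timestamp and micro-structure issues that matter with real tick data.

```python
import random

random.seed(2)

def corr(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def lagged_corr(x, y, lag):
    """Correlation of x[t] with y[t + lag]; a peak at lag > 0 means x leads y."""
    if lag >= 0:
        return corr(x[:len(x) - lag], y[lag:])
    return corr(x[-lag:], y[:len(y) + lag])

# Synthetic data: index returns lead the vol-product returns by 2 ticks, with opposite sign.
n, true_lag = 5000, 2
index_ret = [random.gauss(0, 1) for _ in range(n)]
vol_ret = []
for t in range(n):
    lead = index_ret[t - true_lag] if t >= true_lag else 0.0
    vol_ret.append(-0.8 * lead + 0.6 * random.gauss(0, 1))

profile = {k: lagged_corr(index_ret, vol_ret, k) for k in range(-5, 6)}
best_lag = max(profile, key=lambda k: abs(profile[k]))

print(best_lag, round(profile[best_lag], 3))
```

On this synthetic example the correlation profile peaks (negatively) at a lag of +2, recovering the lead that was built in. A peak in cross-correlation shows timing, not causality, so it is only a starting point.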

Googling I see this -- is that the one?
 
Traden4Alpha
Posts: 3300
Joined: September 20th, 2002, 8:30 pm

Re: Is there a name for this transform method?

June 19th, 2017, 5:33 pm

Yes, that's the Kaggle we did. It was quite interesting because the data set of data sets included all 9 combinations of X = {binary, categorical, numerical} x Y = {binary, categorical, numerical} and included both true empirical and simulated data sets.

Yet I suspect all of these data sets were pretty straightforward in contrast to what happens in the financial markets. The four biggest "causality" issues that I see are:

1) The distributed nature of the system and the impossibility of a single global clock: This affects not only the system dynamics but also any attempt to instrument the system and estimate "correct" lead-lag. Even the most basic question of did X happen before or after Y has a non-binary answer as X and Y approach simultaneity as measured by any given observer.

2) The game-theoretic reasoning by which participants anticipate the future evolution of many variables based on many other variables: The regress of "I know that he knows that I know that he knows that .... X has happened in the market" means market participants may act as if some future event has happened. I'm not sure what you would call a system that acts as if it has memory of the future.

3) Data on price flows much faster than data on fundamentals. This is why HFT seems so scary to me. Moreover, the very existence of ostensibly private information (and its leakage) justifies reactions to changes in price without confirmation of changes in fundamentals. Depending on the nature of f in Price(t) = f(Price(t-∆)), all manner of pathological dynamics can occur. (I'm reminded of some humorous Amazon.com used-book pricing stories in which an obscure non-rare book gets priced into the stratosphere because two pricing bots are each calculating a price relative to the other's posted price, which leads to an exponential increase in price.)

4) More broadly, some participants assume EMH is true and make no attempt to estimate a "correct" price for the instrument they are trading. That is, they assume the market price is fair. Yet in doing so, they fail to contribute to the very price finding process that EMH depends on. No doubt the market can be efficient if a sufficient percentage of transactions have at least one diligent estimator as a counterparty but what happens if the percentage of non-diligent participants approaches 1? (This was a key element of the long-running discussion/debate I had with numbersix on the Blank Swan).
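The pricing-bot story in point 3 is a simple instance of Price(t) = f(Price(t-∆)) with a linear f. Here is a minimal sketch, with made-up multipliers and starting prices (they are not taken from any real incident): bot A slightly undercuts B, while B prices at a premium to A.

```python
def simulate_bots(p_a, p_b, mult_a, mult_b, days):
    """Each day bot A reprices relative to B, then bot B reprices relative to A."""
    history = [(p_a, p_b)]
    for _ in range(days):
        p_a = mult_a * p_b  # A sets its price from B's posted price
        p_b = mult_b * p_a  # B sets its price from A's new price
        history.append((p_a, p_b))
    return history

# Hypothetical multipliers: A undercuts B by 1%, B marks up 25% over A.
runaway = simulate_bots(20.0, 20.0, mult_a=0.99, mult_b=1.25, days=30)

# If the product of the multipliers is below 1, the prices decay instead.
decaying = simulate_bots(20.0, 20.0, mult_a=0.99, mult_b=1.00, days=30)

print(round(runaway[-1][1], 2), round(decaying[-1][1], 2))
```

Each round multiplies both prices by mult_a * mult_b, so the whole dynamic hinges on whether that product is above or below 1. At 0.99 * 1.25 = 1.2375, a 20.00 book costs roughly 12,000 after 30 rounds, with no reference to any fundamental value.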
 
mkoerner
Posts: 2
Joined: April 8th, 2009, 12:17 pm

Re: Is there a name for this transform method?

July 9th, 2017, 12:48 pm

I think this MIT lecture gives a very good intro to Recurrent Neural Networks (RNNs), whose hidden state is the equivalent of the latent vol state in stock vol models.
If you are interested in RNNs applied to stochastic volatility models, "A Neural Stochastic Volatility Model" might be worth reading.

I have tried to use a simpler version as a generative model, but never really managed to get it to learn/calibrate in a stable fashion.