SERVING THE QUANTITATIVE FINANCE COMMUNITY

 
outrun
Posts: 4573
Joined: April 29th, 2016, 1:40 pm

Re: Looking for hardware recommendations

July 28th, 2017, 9:56 pm

Sounds indeed like a good way to rule things out. Only thing is I can't reboot the server now :-( (and I need to find that MC code again!).
No need to reboot your server as I did it with mine a few days ago and got this run time for my Test 1:
Test 1: ST: 162s, MT no HT: 40.45s
Now that HT is back ON, I forced the same test to use only 4 threads (while 8 are available now) and I get:
Test 1: ST: 162s, MT with HT ON (4 threads): 45s
So it seems that while the ST performance stays the same, with HT ON, setting 4 threads to work (as many as the physical cores) doesn't give the same performance as 4 cores used w/o HT, but rather a loss of about 10%.
That's because the machine will be doing other things on those cores besides your threads? If it does, then those 4 threads on those 4 cores need to give up time slices. If you have HT ON then it can grab a bit of HT resources and be faster.
 
Billy7
Posts: 282
Joined: March 30th, 2016, 2:12 pm

Re: Looking for hardware recommendations

July 28th, 2017, 11:57 pm

Sorry Thijs, I guess I edited my post a little after you started your reply. HT was actually ON for that 45s. You're right though, we can never be too precise about this as background processes can't be stopped 100%, but the same goes when you try to use all available threads, no? Anyway, what I did was kill everything obvious that might be using the CPU and give high priority to the test process. I also watched the task manager or process explorer to make sure that "nothing" else was going on, ran the test 6-7 times and reported the fastest run.
 
outrun
Posts: 4573
Joined: April 29th, 2016, 1:40 pm

Re: Looking for hardware recommendations

July 29th, 2017, 8:53 am

Thanks Billy!
What puzzles me is that with HT ON you get 31.45 if you use either 4 or 8 threads, which is a parallel speedup of more than 4? One would expect not to be able to go faster than ST/4 with 4 threads, right?

Thoughts ?
 
Billy7
Posts: 282
Joined: March 30th, 2016, 2:12 pm

Re: Looking for hardware recommendations

July 29th, 2017, 10:59 am

Maybe my previous posts were confusing. Let me summarize in an effort to make it clearer.
No, with HT ON I get 45 secs when I use only 4 out of the 8 available threads.
With HT turned OFF from the BIOS and so using 4 out of the 4 available threads I get 40.4 secs. And that's indeed 1/4 of the ST time of 162 secs, which implies 100% parallel efficiency. With HT ON and using all 8 threads I get 31.45 secs, so more than a 4x speedup as expected. 
So it just seems to me that I get a very similar behavior to that suggested by your graph and I think we can agree that the only way to be sure about what HT adds, is to first turn it off. Right?
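
To put the figures above side by side, here is a minimal sketch that turns the timings quoted in this thread into speedup and parallel efficiency. The variable and function names are my own, and efficiency is taken relative to the 4 physical cores, which is an assumption about how to normalise the 8-thread run.

```python
# Timings quoted in the thread, in seconds. Names are illustrative only.
ST = 162.0           # single thread
MT4_HT_OFF = 40.4    # 4 threads, HT disabled in the BIOS
MT4_HT_ON = 45.0     # 4 threads, HT enabled
MT8_HT_ON = 31.45    # 8 threads, HT enabled
PHYSICAL_CORES = 4   # assumption: efficiency is measured per physical core


def speedup(single, multi):
    """Speedup of a multi-threaded run over the single-threaded run."""
    return single / multi


def efficiency(single, multi, cores=PHYSICAL_CORES):
    """Speedup divided by the number of physical cores."""
    return speedup(single, multi) / cores


runs = [
    ("4 threads, HT OFF", MT4_HT_OFF),
    ("4 threads, HT ON", MT4_HT_ON),
    ("8 threads, HT ON", MT8_HT_ON),
]
for label, seconds in runs:
    print(f"{label}: speedup {speedup(ST, seconds):.2f}x, "
          f"efficiency {efficiency(ST, seconds):.0%}")

# Extra throughput from HT: 8 threads with HT ON vs 4 threads with HT OFF.
ht_gain = MT4_HT_OFF / MT8_HT_ON - 1.0
print(f"HT gain over 4 cores without HT: {ht_gain:.1%}")
```

On these numbers the speedups come out at roughly 4.0x, 3.6x and 5.15x, and the gain from HT is a bit under 30% when measured against the HT OFF run.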
 
outrun
Posts: 4573
Joined: April 29th, 2016, 1:40 pm

Re: Looking for hardware recommendations

July 29th, 2017, 11:05 am

Yes, that's clear, agreed. Another option, it seems, is to compare things to ST. HT ON or OFF doesn't seem to affect ST.

Are you on Linux?
 
Billy7
Posts: 282
Joined: March 30th, 2016, 2:12 pm

Re: Looking for hardware recommendations

July 29th, 2017, 11:23 am

Indeed it doesn't seem to affect ST in this particular case. Though I do remember reading comments from people who claimed that with HT ON they saw a lower ST performance than with HT OFF, but of course I'm not sure about the validity of these claims. So I think the most proper way is to turn it off. No, I'm on Win7:-)
 
outrun
Posts: 4573
Joined: April 29th, 2016, 1:40 pm

Re: Looking for hardware recommendations

July 29th, 2017, 11:56 am

There might still be something else going on. I think it's strange that with HT ON you don't get 100% parallel efficiency up to 4 threads. Maybe I can find the raw numbers of my plot: I can't tell if the ST is 4mln/sec or 4.4mln/sec (10% higher for ST, as you observed). If I had to guess, I would say it looks like 100% efficiency instead of 90% with HT ON on my machine.
 
Billy7
Posts: 282
Joined: March 30th, 2016, 2:12 pm

Re: Looking for hardware recommendations

July 29th, 2017, 12:29 pm

Maybe there's something else. Maybe also the fact that you have 2 CPUs affects measurements as well? But if I were a betting man (which I am occasionally!) I'd bet that if you redid your test with HT OFF, you'd get more than 32.5mln using 8 cores.  Admittedly this hunch is also driven by the fact that I've never seen anyone claiming close to 40% HT improvement.

See here for a relevant discussion:
http://blog.beamr.com/blog/2016/02/11/d ... r-threads/
 
outrun
Posts: 4573
Joined: April 29th, 2016, 1:40 pm

Re: Looking for hardware recommendations

July 29th, 2017, 3:42 pm

I will have to do that, I'll try in the coming days. There is also a possible difference between the Windows and Linux thread schedulers, the compilers we use, and the threading software.

To be tested!
 
Cuchulainn
Posts: 60518
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam

Re: Looking for hardware recommendations

August 1st, 2017, 4:24 pm

As a sanity check, can you apply Amdahl's law? What is the serial fraction?
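
For reference, a minimal sketch of that check, assuming the timings quoted earlier in the thread (162s ST, 40.4s on 4 threads with HT OFF, 45s on 4 threads with HT ON). Amdahl's law gives the speedup on p workers as S = 1 / (f + (1 - f) / p) for serial fraction f, and this can be inverted to back out the f implied by a measured speedup. The function names are illustrative.

```python
def amdahl_speedup(serial_fraction, n_workers):
    """Speedup predicted by Amdahl's law for a given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_workers)


def implied_serial_fraction(observed_speedup, n_workers):
    """Invert Amdahl's law: the serial fraction that would explain a speedup."""
    return (n_workers / observed_speedup - 1.0) / (n_workers - 1.0)


ST = 162.0  # single-thread time quoted in the thread, seconds

# 4 threads with HT OFF (40.4s) and with HT ON (45s), as quoted in the thread.
f_ht_off = implied_serial_fraction(ST / 40.4, 4)
f_ht_on = implied_serial_fraction(ST / 45.0, 4)

print(f"implied serial fraction, HT OFF: {f_ht_off:.4f}")
print(f"implied serial fraction, HT ON:  {f_ht_on:.4f}")
```

With HT OFF the implied serial fraction is essentially zero, while the HT ON run would need about 3.7%. Since it is the same code in both runs, that gap is probably not a serial-fraction effect, though this is only a reading of the posted numbers.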
http://www.datasimfinancial.com
http://www.datasim.nl

Approach your problem from the right end and begin with the answers. Then one day, perhaps you will find the final question..
R. van Gulik
 
outrun
Posts: 4573
Joined: April 29th, 2016, 1:40 pm

Re: Looking for hardware recommendations

August 1st, 2017, 4:49 pm

As a sanity check, can you apply Amdahl's law? What is the serial fraction?
Yes, that's why I keep being doubtful (the only way forward is for me to test, I honestly don't know).

The thing that doesn't make sense to me is:

* When enabling HT, the performance of the normal cores is claimed to drop by 10%
* When testing a single thread on a single core, the speed is not affected by enabling HT.
 
Cuchulainn
Posts: 60518
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam

Re: Looking for hardware recommendations

August 1st, 2017, 7:54 pm

Would it be possible to reverse engineer the code to show the major tasks in a task dependency graph?
Also, maybe you said it already, but which MT library are you using?

More threads does not necessarily mean better speedup.

Also, do you have a 'picture' of the data flow like page 6 here?
https://www3.nd.edu/~zxu2/acms60212-402 ... c-05-1.pdf
 
outrun
Posts: 4573
Joined: April 29th, 2016, 1:40 pm

Re: Looking for hardware recommendations

August 1st, 2017, 8:24 pm

It's a very simple MC experiment: you have N (a large number) jobs and T (a small number) worker threads. Each job is something like "do 1,000 Monte Carlo draws from some distribution and report back the average".

If the efficiency is 100%, and if a single thread does 1 job in 1 sec, then all jobs are expected to be processed in ceil[N/T] seconds, right? However, with C cores this will be ceil[N/C] if T is a multiple of C?
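
The idealised model in the question can be sketched as below. This assumes perfect efficiency, equal-length jobs, no scheduling overhead and no gain from HT, so the number of jobs running at once is capped at min(T, C). The function name and the `secs_per_job` parameter are hypothetical, and when T exceeds C the real threads time-share the cores, so the result is only an approximation.

```python
import math


def ideal_wall_time(n_jobs, n_threads, n_cores, secs_per_job=1.0):
    """Wall time under 100% efficiency with equal-length jobs.

    At most min(n_threads, n_cores) jobs make progress at full speed,
    so jobs are processed in rounds of that size (HT is ignored).
    """
    effective_workers = min(n_threads, n_cores)
    return math.ceil(n_jobs / effective_workers) * secs_per_job


# 100 one-second jobs on a 4-core machine.
print(ideal_wall_time(100, n_threads=4, n_cores=4))
print(ideal_wall_time(100, n_threads=8, n_cores=4))
```

Under these assumptions, going from 4 to 8 threads on 4 cores changes nothing, so any measured gain beyond ceil[N/C] would have to come from something the model leaves out, such as HT.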


(I'm working on it now! I'll have plots in 10 min I hope)
 
Cuchulainn
Posts: 60518
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam

Re: Looking for hardware recommendations

August 1st, 2017, 8:39 pm

It's a very simple MC experiment: you have N (a large number) jobs and T (a small number) worker threads. Each job is something like "do 1,000 Monte Carlo draws from some distribution and report back the average".

If the efficiency is 100%, and if a single thread does 1 job in 1 sec, then all jobs are expected to be processed in ceil[N/T] seconds, right? However, with C cores this will be ceil[N/C] if T is a multiple of C?


(I'm working on it now! I'll have plots in 10 min I hope)
I suppose you have something along the lines of tasks for 1 job: 
1. Path Evolve (many paths)
2. RNG (single or multiple threads)
3. Path assembler
4. Pricer

Threads are possible; what about C++ futures which do load balancing and scheduling automatically?
(The OS scheduler knows how to schedule better than developer-based code IMO).
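
To illustrate the futures idea, here is a small sketch using Python's `concurrent.futures` purely as a stand-in for the C++ futures mentioned above; it is not the code discussed in this thread. The job (1,000 standard normal draws, averaged) and the names `mc_job` and `run_jobs` are assumptions for illustration. Note that in CPython the GIL prevents pure-Python CPU-bound threads from running in parallel, so this shows the submit-and-collect pattern rather than a speedup; `ProcessPoolExecutor` would be the usual choice for real timing tests.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def mc_job(seed, n_draws=1000):
    """One job: average of n_draws standard normal draws.

    Each job gets its own seeded generator, so results do not depend
    on which worker runs it or in what order.
    """
    rng = random.Random(seed)
    return sum(rng.gauss(0.0, 1.0) for _ in range(n_draws)) / n_draws


def run_jobs(n_jobs, n_workers):
    """Submit all jobs to a pool and collect results in submission order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(mc_job, seed) for seed in range(n_jobs)]
        return [future.result() for future in futures]


averages = run_jobs(n_jobs=32, n_workers=4)
print(f"{len(averages)} jobs done, mean of averages = "
      f"{sum(averages) / len(averages):.4f}")
```

The pool hands the next pending job to whichever worker becomes free, which is the kind of load balancing the post refers to; the caller only sees futures and their results.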