Hi all,
More and more companies are talking about implementing the QA lib on FPGA infrastructure. Does anyone have thoughts on this that they would like to share with us?
Thanks,
Xin Zhao
Hi,
I have just begun a proof-of-concept project to run a quant algorithm on an Altera Stratix-II GX FPGA, which has 192 multipliers on-chip. It's technically quite challenging, to say the least, but the returns are potentially huge. People talk about seeing performance speed-ups of between ten-fold and a thousand-fold. I'd be happy to achieve a 100x increase in throughput for the algorithm that I'm porting to the FPGA (a European lookback call). Is anyone else looking at this technology?
Regards,
Peter Moreton
Getronics UK Ltd
________________________________
From: [hidden email] on behalf of Xin Zhao
Sent: Thu 13/07/2006 14:15
To: [hidden email]
Subject: [Quantlib-users] FPGA (field programmable gate array)
When you talk about porting, what kind of procedure are you using? Do you have an existing tool to port the existing code to the new hardware, or do you need to rewrite the algorithm in some native language, assuming there is no OO concept at that level?
Thanks!
Xin
Xin,
Porting to an FPGA involves migrating an existing C/C++ routine to a hardware description language (HDL), such as Verilog or VHDL, and then compiling the HDL into a bitstream which you upload to the FPGA for execution. There are tools available to cross-compile C/C++ to the HDL (Handel-C, SystemC, Impulse-C, SystemCrafter-C), but the process is not fully automatic yet.
The key skill is being able to split an existing sequentially executed C program into code that decomposes into multiple parallel streams. For example, a current-generation Opteron CPU can execute 3 integer instructions in parallel, assuming the compiler generates binaries that are 'sympathetic' to the CPU architecture. A big FPGA could have up to 512 dedicated multipliers on chip. So an operation on a 512-element array could be executed in (512 / 3) ≈ 171 cycles on an Opteron, and in 1 cycle on the FPGA. (A huge simplification, but you get the idea...)
At the hardware level, OO concepts don't exist, and trying to impose them is a bad idea, since they take you away from the hardware. FPGAs are about hardware.
Regards,
Peter Moreton
Getronics UK Ltd
________________________________
From: Xin Zhao [mailto:[hidden email]]
Sent: Thu 13/07/2006 14:44
To: Moreton, Peter
Cc: [hidden email]
Subject: Re: [Quantlib-users] FPGA (field programmable gate array)
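To make the 512-element example concrete, here is a minimal sketch in Python (for brevity, not an HDL) of splitting one sequential multiply loop into independent streams. The function names `multiply_sequential` and `multiply_in_lanes` are made up for illustration, and the model assumes one multiply per unit per cycle with no memory, pipeline, or clock-rate effects, so it is the same huge simplification as above. Rounding 512 / 3 up gives 171 steps for the 3-wide case.

```python
def multiply_sequential(a, b):
    """Element-wise product computed as one sequential stream."""
    return [x * y for x, y in zip(a, b)]


def multiply_in_lanes(a, b, lanes):
    """Split the same work into `lanes` independent streams.

    Stream k handles elements k, k + lanes, k + 2*lanes, ... and never
    reads another stream's result, which is what would let hardware run
    the streams side by side. Returns the product and the length of the
    longest stream, i.e. the idealised cycle count (an assumption: one
    multiply per unit per cycle, nothing else modelled).
    """
    out = [0] * len(a)
    steps = 0
    for k in range(lanes):
        indices = range(k, len(a), lanes)
        for i in indices:
            out[i] = a[i] * b[i]
        steps = max(steps, len(indices))
    return out, steps


a = list(range(512))
b = [2] * 512

cpu_result, cpu_steps = multiply_in_lanes(a, b, 3)      # 3 integer units
fpga_result, fpga_steps = multiply_in_lanes(a, b, 512)  # 512 multipliers

# Both decompositions must agree with the plain sequential loop.
assert cpu_result == fpga_result == multiply_sequential(a, b)
print(cpu_steps, fpga_steps)  # 171 1
```

The part that matters for porting is that no stream depends on another; a loop where each iteration needs the previous result would not decompose this way without being restructured first.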
_______________________________________________
QuantLib-users mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/quantlib-users
Peter,
Thanks! As you said, FPGAs are about hardware. Does that mean the code will be hard to maintain and hardware-dependent?
Xin
Xin, yes and no. I think it is true to say that FPGA-ported code will be targeted at a particular device family and I/O system to gain the maximum possible performance, but code that has been ported to a particular FPGA device could be quickly re-ported to another FPGA model or family.
Sure, the FPGA code will be harder to maintain than ANSI C or C++, but it is perhaps worth the effort *if* the ported code runs 100x faster than on a regular CPU. (So that a one-thousand-node grid can be replaced with ten FPGA servers?)
Peter Moreton
Getronics UK Ltd
________________________________
From: Xin Zhao [mailto:[hidden email]]
Sent: Thu 13/07/2006 15:35
To: Moreton, Peter
Cc: [hidden email]
Subject: Re: [Quantlib-users] FPGA (field programmable gate array)