| Title | : | Compilation Techniques for High-Performance Embedded Systems with Multiple Processors |
| Researcher | : | Franke, Bjorn |
| Keywords | : | Multiple Processors, Compilation Techniques, High-Performance Embedded Systems |
| Organisation | : | Edinburgh Research Archive, United Kingdom |
| Collaborators | : | - |
| Year of publication | : | 2547 BE (2004 CE) |
| Reference | : | http://hdl.handle.net/1842/568 |
| Source | : | - |
| Expertise | : | - |
| Relation | : | - |
| Coverage | : | - |
| Abstract/Description | : | Institute for Computing Systems Architecture. Despite the progress made in developing more advanced compilers for embedded systems, programming of embedded high-performance computing systems based on Digital Signal Processors (DSPs) is still a highly skilled manual task. This is true for single-processor systems, and even more so for embedded systems based on multiple DSPs. Compilers often fail to optimise existing DSP codes written in C due to the employed programming style. Parallelisation is hampered by the complex multiple address space memory architecture, which can be found in most commercial multi-DSP configurations. This thesis develops an integrated optimisation and parallelisation strategy that can deal with low-level C codes and produces optimised parallel code for a homogeneous multi-DSP architecture with distributed physical memory and multiple logical address spaces. In a first step, low-level programming idioms are identified and recovered. This enables the application of high-level code and data transformations well known in the field of scientific computing. Iterative feedback-driven search for “good” transformation sequences is investigated. A novel approach to parallelisation based on a unified data and loop transformation framework is presented and evaluated. Performance optimisation is achieved through exploitation of data locality on the one hand, and utilisation of DSP-specific architectural features such as Direct Memory Access (DMA) transfers on the other. The proposed methodology is evaluated against two benchmark suites (DSPstone and UTDSP) and four different high-performance DSPs, one of which is part of a commercial four-processor multi-DSP board also used for evaluation. Experiments confirm the effectiveness of the program recovery techniques as enablers of high-level transformations and automatic parallelisation. Source-to-source transformations of DSP codes yield an average speedup of 2.21 across four different DSP architectures. The parallelisation scheme, in conjunction with a set of locality optimisations, is able to produce linear and even super-linear speedups on a number of relevant DSP kernels and applications. |
| Bibliography | : |
Franke, Bjorn. (2547 BE [2004 CE]). Compilation Techniques for High-Performance Embedded Systems with Multiple Processors. Bangkok: Edinburgh Research Archive, United Kingdom.
Franke, Bjorn. 2547 BE [2004 CE]. "Compilation Techniques for High-Performance Embedded Systems with Multiple Processors". Bangkok: Edinburgh Research Archive, United Kingdom.
Franke, Bjorn. "Compilation Techniques for High-Performance Embedded Systems with Multiple Processors." Bangkok: Edinburgh Research Archive, United Kingdom, 2547 BE [2004 CE]. Print.
Franke, Bjorn. Compilation Techniques for High-Performance Embedded Systems with Multiple Processors. Bangkok: Edinburgh Research Archive, United Kingdom; 2547 BE [2004 CE].
|
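
The abstract mentions that low-level programming idioms are identified and recovered so that high-level transformations become applicable. It does not say which idioms are meant, so the following is only an illustrative sketch under the assumption that pointer-based array traversal is one such idiom. The function names `dot_pointer` and `dot_array` are hypothetical and do not come from the thesis.

```c
#include <stddef.h>

/* Low-level idiom (assumed example): a dot product written with
 * post-incremented pointers, a style common in hand-tuned DSP code.
 * The access pattern is implicit in the pointer arithmetic. */
int dot_pointer(const int *a, const int *b, size_t n)
{
    int sum = 0;
    while (n-- > 0) {
        sum += *a++ * *b++;
    }
    return sum;
}

/* Recovered form (assumed example): the same computation with explicit
 * array indexing in a counted loop, which exposes the access pattern
 * a[i], b[i] to loop and data transformations. */
int dot_array(const int a[], const int b[], size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Both functions compute the same result; the second form simply makes the loop bounds and index expressions explicit, which is what loop-level analyses generally rely on.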
