Coarse Grain Reconfigurable Architectures


DASS 2003 and SDA 2003, Dresden, May 8-9, 2003
Reiner Hartenstein, Kaiserslautern University of Technology
Data-Stream-based Reconfigurable Computing
http://hartenstein.de

New terms (only the terms are new, not their subject)
Software: you all know
Hardware: you all know
Morphware: structurally programmable hardware
Configware: sources for programming morphware
Flowware*: similar to software, but based on data counter manipulation: data streams instead of instruction streams
*) no relation to the dataflow machine (a dead area)
A clean terminology and taxonomy is needed for comprehensibility.

Flowware
Flowware defines which data item appears at which time at which port.
[Figure: a DPA with input and output data streams, each drawn as port number over time]
History:
1980: data streams (Kung, Leiserson)
1995: super systolic rDPA (Kress)
1996+: SCCC (LANL), SCORE, ASPRC, Bee (UCB), ...
(tutorials and courses available on all this)

Programming: procedural vs. data-stream-based structural
Domain: embedded systems
Procedural: computing in time only
Structural: computing in space and time
Program sources named in the table: software, configware, flowware
Binding times named in the table: before fabrication, at loading time, at run time
**) currently, software simulates flowware
[Table: the exact cell layout is not recoverable from the extracted text]
Fully hardwired: algorithms fixed, resources fixed
CPU: algorithms variable, resources fixed
Reconfigurable (emerging): algorithms variable, resources variable

Digital System Platforms clearly distinguished (1)
Platform; program source running on it; machine paradigm:
Hardware (not programmable); none; none
Morphware, fine grain: rGA (FPGA); configware; none
Morphware, coarse grain: rDPU, rDPA (reconfigurable data stream processor); flowware & configware; anti machine
Data stream processor (hardwired); flowware; anti machine
Instruction stream processor; software; von Neumann machine

Crusty Computing Sciences
More and more effort yields only marginal improvements.
Areas fade away: dataflow machines are dead; supercomputing conferences are shrinking.
98.5% vN-only [David Padua, John Hennessy]
This monopoly is dangerous.

Dead Supercomputer Society [Gordon Bell, keynote at ISCA 2000]
ACRI, Alliant, American Supercomputer, Ametek, Applied Dynamics, Astronautics, BBN, CDC, Convex, Cray Computer, Cray Research, Culler-Harris, Culler Scientific, Cydrome, Dana/Ardent/Stellar/Stardent, DAPP, Denelcor, Elexsi, ETA Systems, Evans and Sutherland Computer, Floating Point Systems, Galaxy YH-1, Goodyear Aerospace MPP, Gould NPL, Guiltech, ICL, Intel Scientific Computers, International Parallel Machines, Kendall Square Research, Key Computer Laboratories, MasPar, Meiko, Multiflow, Myrias, Numerix, Prisma, Tera, Thinking Machines, Saxpy, Scientific Computer Systems (SCS), Soviet Supercomputers, Supertek, Supercomputer Systems, Suprenum, Vitesse Electronics

Stealthy CS Crisis
Progress in CS is stalled by qualification problems in industry and academia.
Often hardware people are needed to solve CS problems.
Communication barriers between disciplines.
Not only in embedded systems: a comprehensibility barrier between the procedural and the structural mind set.
Severe software quality problems.
Exploding design cost and implementation cost.
80% of designers hate their tools ... which are unusable for SW people.

What are the Challenges ? (1)
[ST microelectronics, MorphICs, Dataquest, eASIC]
[Figure: growth factor over time (axis ticks in months: 10, 12, 18; markers 4y, 10y). Curves: communication bandwidth [Hansen's law]; integration density (1.4/year) [Moore's law]; embedded software: 90% by 2010 [DTI* law]]
*) Department of Trade and Industry, London

McKinsey Curve: dynamics of R&D disciplines
[Figure: maturity of a discipline over the years. Phases: fundamental issues; evangelists create awareness; innovation, challenges and motivation; consolidation; saturation: limitations met. The CS discipline gets crusted; a new discipline grows on top of it: new CS by innovation]

History of Computing
[Figure: maturity over the years 1957, 1967, 1977, 1987, 1997, 2007. Classical CS: mainframes, PC. New CS: data streams ..., morphware. Annotations: technology issue and business model; free rider ? here ?]
The new CS already exists ... but awareness is still missing ... it is still ignored by most CS curricula.

Semiconductor Revolutions
Makimoto's Wave: the mainstream silicon application is switching every 10 years.
[Figure: wave alternating between standard and custom over the years 1957 to 2007. Labels: TTL; LSI, MSI; proc., memory; ASICs, accels; reconfigurable. Overlay from the previous slide: mainframes, PC, data streams ..., morphware; technology issue and business model; free rider ? here ?]

The next EDA Industry Revolution
EDA industry paradigm switching every 7 years [Hartenstein], courtesy [Keutzer / Newton]
Time axis: 1978, 1985, 1992, 1999, 2006
Waves, oldest first:
Transistor entry: Applicon, Calma, CV ...
Schematics entry: Daisy, Mentor, Valid ...
Synthesis: Cadence, Synopsys ...
(Co-) Compilation: data-stream-based DPU arrays
Annotation: time of Makimoto's 3rd wave

It's time for a new CS
CS crisis: qualification problems
Opportunities
Next EDA wave: high level languages are urging us
Embedded systems: hw/cw/sw co-design
Flowware, configware
It's time for a new CS ... a dichotomy of 2 machine paradigms

Matter & Antimatter
The World of Matter: machine paradigm: the Atom (electron spinning)
The World of Anti Matter: machine paradigm: the Anti Atom (positron spinning)

Matter & Antimatter of Informatics
Machine paradigm (von Neumann): CPU, instruction stream spinning
Anti Machine paradigm: DPU, data streams spinning; nothing central !

Drafting a Road Map
The talk gives a draft of a road map toward a symbiosis of basic computing paradigms.
What delays the breakthrough of Reconfigurable Computing ?

Machine paradigms
[Figure legend. von Neumann machine (instruction-stream machine): memory M, I/O, CPU = DPU + instruction sequencer; Software is downloaded; an instruction stream drives the CPU. Data-stream machine: memory M (asM*) with data address generator (data sequencer), I/O, (reconf.) DPU or rDPU; Flowware and Configware are downloaded; a data stream drives the DPU]

Flowware: data streams spinning around heavy anti atoms
DPA = DPU array
[Figure: DPUs arranged as an array, surrounded by data streams]

Machine paradigms
[Figure legend as above, extended: several memories M with I/O surround an (r)DPA built from (r)DPUs]
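The data address generator (data sequencer) named in the legend can be pictured as a small nested-loop counter that walks memory while the DPU applies one fixed, configured operation. The following is a minimal sketch under stated assumptions: a row-major memory image, and the hypothetical names `scan_addresses` and `run_dpu`, none of which come from the slides.

```python
from typing import Callable, Iterator, List


def scan_addresses(base: int, rows: int, cols: int, row_stride: int) -> Iterator[int]:
    """Hypothetical data sequencer: yields the addresses of a 2-D block scan.

    The data counter is the pair (r, c); no instruction is fetched per step.
    """
    for r in range(rows):
        for c in range(cols):
            yield base + r * row_stride + c


def run_dpu(memory: List[int], addresses: Iterator[int],
            op: Callable[[int, int], int], init: int) -> int:
    """Hypothetical DPU: one configured operation folded over the data stream."""
    acc = init
    for addr in addresses:
        acc = op(acc, memory[addr])
    return acc


# 4 x 4 memory image stored row-major; sum the 2 x 2 block at row 1, column 1.
memory = list(range(16))
block_sum = run_dpu(memory,
                    scan_addresses(base=5, rows=2, cols=2, row_stride=4),
                    lambda a, x: a + x,
                    0)
print(block_sum)
```

Changing the scan pattern only changes the sequencer parameters; the DPU operation stays untouched, which is the separation the legend draws between flowware and configware.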

rDPA example: SNN filter
KressArray Mapping Example
Array size: 10 x 16 = 160 rDPUs
Legend: rDPU not used; rDPU used for routing only (rout thru only); operator and routing; backbus connect; backbus connect not used; port location marker

PACT XPP: Reference Module: XPU128
[Figure labels: CoProcessor; XPP128 ALU-Array; ALU-PAE; PAE core; ALU; Ctrl; CFG]
Full 32 or 24 Bit Design
2 x PACs (Cluster)
2 Configuration Hierarchies
128 x ALU-PAEs
32 x 1 Kbyte RAM-PAEs
8 x I/O Elements
Evaluation Board (2001)
XDS Development Tool with Simulator
PAE Core is a 32- or 24-Bit ALU with DSP-Instruction Set and Controller
Connections: Inputs + Outputs (Channels) + Events
[Jürgen Becker, Univ. Karlsruhe]

Throughput vs. Efficiency
[Figure: MOPS / mW (log scale, 0.001 to 1000) over feature size (2, 1, 0.5, 0.25, 0.13, 0.1, 0.07). Curves: hardwired; rDPAs (reconfigurable computing)*; FPGAs (reconfigurable logic); DSP; instruction set processors; standard microprocessor. Inset: wiring by abutment, 32 Bit example, compared with a 1 Bit CLB; area used by application vs. resources needed for reconfigurability]
T. Claasen et al.: ISSCC 1999
*) R. Hartenstein: ISIS 1997

Throughput vs. Flexibility
[Figure: same chart as the previous slide, with arrows for throughput and flexibility. Labels: hardwired; coarse grain; FPGAs; von Neumann]
Comment: coarse grain goes far beyond bridging the gap.
T. Claasen et al.: ISSCC 1999
*) R. Hartenstein: ISIS 1997

Machine paradigms
[Figure legend as on the earlier "Machine paradigms" slides, with the memories M around the (r)DPA marked as embedded memory architecture*]
*) new discipline: came just in time: Herz et al.: Proc IEEE ICECS 2002

Configware / Flowware Compilation

[Figure: compilation flow. Labels: high level source program; wrapper; intermediate; mapper; configware; r. Data Path Array (rDPA); scheduler; flowware; data sequencer; address generator; data streams; memories M around the array. The mapper output is configware, the scheduler output is flowware]

An example by Nageldinger's KressArray Xplorer
Synthesizable Memory Communication
Efficient memory communication should be directly supported by the mapper tools.
Legend: optimized parallel memory ports; memory controller; sequencers; application; not used
http://kressarray.de

Data-Stream-based Soft Machine
[Figure labels: Compiler; Scheduler; rDPA; Memory (data memory) built from memory banks; Sequencers (data stream generator); instructions]

The Disk Farm? or a System On a Card?
[Gordon Bell, Jim Gray, ISCA2000]
The 500GB disc card: LOTS of bandwidth
14"
MicroDrive: 2006: 9 GB, 50 MB/s ? (1.6X/yr capacity, 1.4X/yr BW)
A few disks replaced by >10s Gbytes RAM and a processor
Integrated IRAM processor, growing like Moore's law: 16 Mbytes; 1.6 Gflops; 6.4 Gops
Connected via crossbar switch
100/board = 1 TB; 0.16 Tflops
10,000+ nodes in one rack!

Computing paradigms and methodologies
1946: machine paradigm (von Neumann)
1980: data streams (Kung, Leiserson)
1989: anti machine paradigm introduced
1990: anti machine implementation methodology
1990: rDPU (Rabaey)
1994: anti machine high level programming language
1995: super systolic rDPA (Kress)
1996+: SCCC (LANL), SCORE, ASPRC, Bee (UCB), ...
1997: configware / software partitioning compiler (Becker)
2000: generator for rDPA with high memory bandwidth
(tutorials and courses available on all this)

Digital System Platforms clearly distinguished (2)
Platform; program source running on it; machine paradigm:
Hardware (not programmable); none; none
Morphware, fine grain: rGA (FPGA); configware; none
Morphware, coarse grain: rDPU, rDPA (reconfigurable data stream processor); flowware & configware; anti machine
Data stream processor (hardwired); flowware; anti machine
Instruction stream processor; software; von Neumann machine

Software Industry
Software industry's secret of success: procedural personalization via a RAM-based machine paradigm
[Figure: Makimoto's wave, standard vs. custom, 1957 to 2007: TTL; LSI, MSI; proc., memory; ASICs, accels; reconfigurable]

Configware Industry ?
Repeat the success story by a new machine paradigm !
Structural personalization: RAM-based, before run time
[Figure: Makimoto's wave as on the previous slide. A diagonal overlay text is only partly legible; fragments read "niche", "qualified people" and "available"]

The Secret of Success: Co-Compilation
Not a niche market; supporting platform-based design
[Figure: a high level PL source enters the Partitioner (with Analyzer / Profiler). The SW compiler emits SW code for the "vN" machine paradigm; the CW compiler emits CW code for the anti machine paradigm. Resource Parameters support different platforms. Annotations: "is looking at it"; "could provide the platforms"]

Thank you for your patience

END

Appendix (for discussion)
Xputer Lab, University of Kaiserslautern
http://KressArray.de

The Secret of Success: Co-Compilation
Not a niche market; supporting platform-based design
[Figure: same co-compilation flow as in the main talk: high level PL source; Partitioner; Analyzer / Profiler; SW compiler and SW code ("vN" machine paradigm); CW compiler and CW code (anti machine paradigm); Resource Parameters supporting different platforms. Annotations: "is looking at it"; "should provide the platforms"]

Machine Paradigms
Machine category: Computer (the Machine: v. Neumann) / The Anti Machine
Driven by: instruction streams / data streams (no dataflow)
Engine principles: instruction sequencing / sequencing data streams
State register: single program counter / (multiple) data counter(s)
Communication path set-up: at run time (instruction fetch) / at load time
Data path resource: DPU (e.g. single ALU) / DPU or DPA (DPU array) etc.
Data path operation: sequential / parallel pipe network etc.
The anti machine also has hardwired implementations*
*) e.g. Bee project, Prof. Brodersen

Programming Language Paradigms
Language category: Computer Languages / Languages f. Anti Machine
Both: deterministic; procedural sequencing: traceable, checkpointable
Operation sequence driven by:
  Computer languages: read next instruction, goto (instr. addr.), jump (to instr. addr.), instr. loop, loop nesting, no parallel loops, escapes, instruction stream branching
  Anti machine languages: read next data item, goto (data addr.), jump (to data addr.), data loop, loop nesting, parallel loops, escapes, data stream branching
State register: program counter / data counter(s) (multiple GAGs)
Address computation: massive memory cycle overhead / overhead avoided
Instruction fetch: memory cycle overhead / overhead avoided
Parallel memory bank access: interleaving only / no restrictions
Annotation: very easy to learn

Jürgen Becker's Co-DE-X Co-Compiler
Xputer Lab, University of Kaiserslautern
Supporting platform-based design
X-C is the C language extended by MoPL.
[Figure: X-C source enters the Partitioner (with Analyzer / Profiler). The GNU C compiler produces Software for the Host (computer machine paradigm); the X-C compiler with DPSS produces Configware for the KressArray (Xputer machine paradigm). Further labels: Loop Transformations; Resource Parameters supporting different platforms]
http://www.fpl.uni-kl.de

KressArray Family: generic Fabrics, a few examples
Select the function repertory of the rDPU.
Select mode, number, and width of NN ports (widths shown: 2, 4, 8, 16, 24, 32).
Select the Nearest Neighbour (NN) interconnect: an example.
rDPU modes: rout-through only; rout-through and function
More NN ports: rich rout resources
Examples of 2nd level interconnect: laid out over the rDPU cell, no separate routing areas !
http://kressarray.de

Impact of Makimoto's wave
Personalization (CAD) before fabrication
Software industry's secret of success: procedural personalization via a RAM-based machine paradigm
Configware industry: repeat the success story by a new machine paradigm ! Structural personalization: RAM-based, before run time
[Figure: Makimoto's wave, standard vs. custom, 1957 to 2007: TTL; LSI, MSI; proc., memory; ASICs, accels; reconfigurable]

The Dominance of the Submarine Model ...
[Figure labels: (procedural); structurally disabled; Hardware]
... indicates that our CS education system produces zillions of structurally disabled people, completely unable to cope with solutions other than software only.
It's time to attack the software faculty dictatorship. Get involved!

However, current CS Education is based on the Submarine Model
[Figure: Algorithm; Software; procedural high level Programming Language; Assembly Language; Hardware. The hardware is invisible: under the surface]
This model disables ...
Brain usage: procedural-only
Software faculty colleagues shy away from the paradigm shift. Brain hurts? Can't be: their [...] has been amputated.

Hardware and Software as Alternatives
[Figure: Algorithm, then partitioning into Software (procedural) and Hardware, Configware (structural)]
Brain usage: both hemispheres
Alternatives: Software only; Software & Hardw/Configw; Hardw/Configw only

Why Coarse Grain instead of FPGA ?
Sources: Proc ISSCC, ICSPAT, DAC, DSPWorld
[Figure: transistors / chip (log scale, 1000 to 100 000 000 000) over the years 1980 to 2010. Curves: memory; microprocessor; FPGA physical; FPGA logical; FPGA routed; super systolic (physical and logical). Gap annotations: ~10, ~1000, ~10 000]
Reduced reconfigurability overhead
Drastically smaller configuration memory
Much faster loading
A lot more benefits

Second Blossom of CS
Progress in CS is stalled by qualification problems in industry and academia.
Communication barriers between disciplines
Exploding design and implementation cost
Not only in embedded systems: a comprehensibility barrier between the procedural and the structural mind set
Severe software quality problems
Bad hardware / configware design tools: more than 80% of designers hate their tools

Procedural vs. structural
Progress in CS is stalled by qualification problems in industry and academia.
Like microprocessors, morphware is RAM-based: the secret of success of the software industry.
Could the configware industry repeat this success story ?
Configware will remain a niche market unless it comes along with hardware / configware / software co-design.

Algorithms and Data Structures People ...
... have to go beyond pointers, queues, and stacks

Roadmap
Old CS lab course philosophy: given an application, implement it by a program.
New CS freshman lab course environment: given an application,
a) implement it by writing a program
b) implement it as a morphware prototype
c) partition it into P and Q
c.1) implement P by software
c.2) implement Q by morphware
c.3) implement the P / Q communication interface
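Step c) of the lab exercise can be illustrated with a toy partitioning. This is a minimal sketch under a made-up cost model: each task carries a profiled software time and an estimated morphware speed-up, and tasks above a threshold go to Q. The task names, the threshold, and the functions `partition` and `estimated_time_ms` are all hypothetical; a real co-compiler would also weigh area and communication cost per task.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Task:
    name: str
    sw_time_ms: float   # profiled run time in software (assumed input)
    mw_speedup: float   # estimated speed-up on morphware (assumed input)


def partition(tasks: List[Task], min_speedup: float = 4.0) -> Tuple[List[Task], List[Task]]:
    """Split tasks into P (stays in software) and Q (moves to morphware)."""
    p = [t for t in tasks if t.mw_speedup < min_speedup]
    q = [t for t in tasks if t.mw_speedup >= min_speedup]
    return p, q


def estimated_time_ms(p: List[Task], q: List[Task], interface_ms: float) -> float:
    """Sequential estimate: P in software, Q accelerated, plus P/Q interface cost."""
    sw = sum(t.sw_time_ms for t in p)
    mw = sum(t.sw_time_ms / t.mw_speedup for t in q)
    return sw + mw + interface_ms


tasks = [
    Task("parse_input", 2.0, 1.0),
    Task("fir_filter", 40.0, 20.0),
    Task("report", 1.0, 1.0),
]
P, Q = partition(tasks)
total = estimated_time_ms(P, Q, interface_ms=0.5)
print([t.name for t in P], [t.name for t in Q], total)
```

The `interface_ms` term stands in for step c.3: if the P / Q communication cost grows, moving a task to morphware can stop paying off even when its raw speed-up is large.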

Algorithms and Data Structures
... have to go beyond pointers, queues, and stacks
Extend by including:
algorithmic issues in software / morphware / hardware migration
additional levels of parallelism: chaining, pipelining, systolic, super-systolic, wavefront arrays
additional data structures and storage organization: the new distributed memory discipline

Computer Organization / Architecture
... have to go beyond von Neumann
Extend by including:
nested machines, address generators
the anti machine paradigm
extended taxonomy of platforms: procedural, structural, hardwired, reconfigurable, hybrid systems

Languages and Compilers
... have to go beyond von Neumann
Extend by including:
configware / flowware compilers
procedural / structural co-compilers
(data-procedural) flowware languages

Semiconductor Revolutions
Makimoto's Wave: the mainstream silicon application is switching every 10 years.
[Figure: wave alternating between standard and custom over the years 1957 to 2007. Labels: TTL; LSI, MSI; proc., memory (instruction streams); ASICs, accels; reconfigurable (data streams). Annotations: software people; hardware people; 1st design crisis; structured VLSI design; new breed (M&C); 2nd design crisis; new breed needed; communication gap: terminology]

EDA: the main bottleneck

Biggest Mistake of EDA
Guess it !

Innovation Stalled ?
What is next after VHDL ? [Richard Newton]

Flowware and Software
Software: instruction-stream-based, i.e. based on program counter manipulation
Flowware: data-stream-based, i.e. based on data counter manipulation
Software and flowware: like fraternal twins (author's note: introduce this comparison)
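One way to read "data counter manipulation" is as a schedule stating which data item reaches which port at which time. The sketch below assumes one stream per port and a one-step skew between neighbouring ports, a common systolic convention that the slides do not specify; `flowware_schedule` is a hypothetical name.

```python
from typing import Dict, List, Optional, Sequence


def flowware_schedule(streams: Sequence[Sequence[int]],
                      skew: int = 1) -> List[Dict[int, Optional[int]]]:
    """Return, per time step, the data item presented at each port (None = idle).

    Port p starts p * skew steps late; a per-port data counter selects the item.
    """
    n_steps = max((p * skew + len(s) for p, s in enumerate(streams)), default=0)
    schedule = []
    for t in range(n_steps):
        step = {}
        for port, stream in enumerate(streams):
            data_counter = t - port * skew  # index into this port's stream
            in_range = 0 <= data_counter < len(stream)
            step[port] = stream[data_counter] if in_range else None
        schedule.append(step)
    return schedule


schedule = flowware_schedule([[1, 2, 3], [4, 5, 6]])
for t, step in enumerate(schedule):
    print(t, step)
```

No program counter appears anywhere in the sketch: the only state that advances is the per-port data counter, which is the contrast the slide draws between flowware and software.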

Models (1)
1. There is a very wide variety of architectures.
2. Most papers are badly organized: to show the authors' creativeness, less relevant details are often stressed in a confusing mix of abstraction levels.
3. Architectures are not described in terms of a common model.
4. A common model exists, but it is usually ignored.
5. We need a comprehensible taxonomy of architectures.

Models (2)
1. Reconfigurable instruction set extension
2. Reconfigurable co-processor
2a. FPGA
2b. Coarse grain
3. Hardwired accelerators (omitted here)
I do not talk about reconfigurable instruction set processors.
M&C structured VLSI design: max. no. of transistors within regular structures
Craig Mudge: regularity factor
Structured Configware Design

http://www.uni-kl.de >> history & terminology
history & terminology
skyrocketing requirements
destructive von Neumann monopoly
high mask cost
low battery capacity
new compilation model
conclusions

Semiconductor Revolutions
[Figure: Makimoto's Wave with the same labels and annotations as on the earlier "Semiconductor Revolutions" slide]

Terminology: DPU versus CPU
DPU: data path unit
DPA: DPU array
GA: gate array
rDPU: reconfigurable DPU
rDPA: reconfigurable DPA
rGA: reconfigurable GA
A DPU is no CPU: there is nothing central, just as in a DPA.
[Figure: CPU = DPU + instruction sequencer]

Flowware defines ...
... which data item at which time at which port
Software manipulates the program counter; flowware manipulates the data counter(s).
[Figure: a DPA with input and output data streams, each drawn as port number over time]

History of data-streams
1980: data streams (Kung, Leiserson)
1995: super systolic rDPA (Kress)
1996+: SCCC (LANL), SCORE, ASPRC, Bee (UCB), ...
(tutorials and courses available on all this)

>> skyrocketing requirements

What are the Challenges ? (1)
[ST microelectronics, MorphICs, Dataquest, eASIC]
[Figure: growth factor over time (axis ticks in months: 10, 12, 18; marker 4y). Curves: communication bandwidth [Hansen's law]; integration density (1.4/year) [Moore's law]; processor integration density (1.2/year)]

Changing Models of Computing
Software design, then hardware / software co-design
[Figure. Software (procedural): software spec, downloading into RAM; von Neumann machine with I/O, data path, and instruction sequencer. Hardware: CAD; host with hardwired accelerator(s)]
The problem with typical CS people:
the dominance of von Neumann
they cannot partition
they cannot migrate
Hardware people needed

>> destructive von Neumann monopoly

Which machine paradigm ?
Von Neumann does not support morphware.

What about CS people ?
[Figure: timeline 1957 to 2007: TTL; proc., memory; LSI, MSI; ASICs, accels; FPGAs; soft CPUs; coarse grain. CS people: procedural programming languages, compiler, computer architecture]

Flag ship example: annual IEEE ISCA conference series
Statistics [David Padua, John Hennessy, et al.]: 98.5% vN
The dataflow machine is dead.
Parallelism: resignation ?
Interconnect fabrics: taken over by the opposition: Reconfigurable Computing

There are more Levels of Parallelism
Process level
Loop level (data-stream-based, pipe nets, etc.)
Instruction level (VLIW etc.)
RT level (special architectures etc.)
Logic level (FPGAs)
Annotation: ignored by typical CS people and ignored by CS curricula

What are the Challenges ? (2)
[ST microelectronics, MorphICs, Dataquest, eASIC]
[Figure: growth factor over time (axis ticks in months: 10, 12, 18; markers 4y, 10y). Curves: communication bandwidth [Hansen's law]; embedded software: 90% by 2010 [DTI* law]; integration density (1.4/year) [Moore's law]; processor integration density (1.2/year); memory bandwidth (1.07/year) [Patterson's law]]
*) Department of Trade and Industry, London

Changing Models of Computing
Software design, then hardware / software co-design, then hardware / configware / software co-design
[Figure. Software (procedural): downloading into RAM; von Neumann machine with I/O, data path, and instruction sequencer. Hardware: CAD; host with hardwired accelerator(s). Configware (structural): downloading into RAM; host with reconf. accelerator(s). Bottom labels: Hardware; Morphware]

No von Neumann bottleneck ?
Typical CS people think in terms of machine models: sequencing instruction by instruction.
They cannot be turned into hardware people.
How to provide more performance to these people ?
A new machine paradigm is needed which does not have a von Neumann bottleneck.
The anti machine has no von Neumann bottleneck:
data streams instead of an instruction stream
flowware instead of software

Just in time

The new distributed memory discipline: just in time to implement the anti machine.
[3] M. Herz et al. (invited): Memory Organization for Data-Stream-based Reconfigurable Computing; Proc. ICECS 2002

>> high mask cost

What are the Challenges ? (3)
[ST microelectronics, MorphICs, Dataquest, eASIC]
Avoid application-specific silicon !
[Figure: growth factor over time (axis ticks in months: 10, 12, 18; markers 3y, 4y, 10y, 30y). Curves: communication bandwidth [Hansen's law]; embedded software [DTI* law]; integration density (1.4/year) [Moore's law]; NRE and mask cost (1.25/year); processor integration density (1.2/year); memory bandwidth (1.07/year) [Patterson's law]]
*) Department of Trade and Industry, London

Coarse grain vs. Fine grain
Reconfigurability:
fine grain (FPGAs, rGAs)
coarse grain (PACT AG, Munich)
multi grain (e.g. by slice bundling)
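The annual growth factors quoted on the "What are the Challenges ?" charts can be turned into doubling times with t2 = ln 2 / ln g. The following is a minimal sketch; the pairing of factors with curve labels is read off the garbled chart text and should be treated as an assumption, and `doubling_time_years` is a hypothetical helper name.

```python
import math

# Annual growth factors as they appear on the charts
# (the pairing with curve labels is an assumption).
growth_per_year = {
    "integration density (Moore's law)": 1.4,
    "NRE and mask cost": 1.25,
    "processor integration density": 1.2,
    "memory bandwidth (Patterson's law)": 1.07,
    "battery capacity": 1.03,
}


def doubling_time_years(g: float) -> float:
    """Years needed to double at a constant annual growth factor g > 1."""
    return math.log(2.0) / math.log(g)


for label, g in growth_per_year.items():
    years = doubling_time_years(g)
    print(f"{label}: x{g}/year, doubles every {years:.1f} years")
```

The widening spread between the fastest and the slowest curve is the gap the slides point at: requirements that grow at 1.4/year outrun resources that grow at 1.07/year or 1.03/year by a factor that itself grows every year.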

>> low battery capacity
Outline: history & terminology; skyrocketing requirements; destructive von Neumann monopoly; high mask cost; low battery capacity; new compilation model; conclusions

What are the Challenges ? (4)
[ST microelectronics, MorphICs, Dataquest, eASIC]
[Figure: growth factor (1, 10, 100) over time, with doubling times from 12 months to 30 years marked. Curves: communication bandwidth [Hansen's law]; embedded software [DTI*]; integration density (1.4/year) [Moore's law]; NRE and mask cost; processor integration density; memory bandwidth (1.07/year) [Patterson's law]; battery capacity (1.03/year). Further rates shown: 1.25/year, 1.2/year.]
*) Department of Trade and Industry, London

Algorithmic cleverness
Very high throughput on low-power, slow FPGAs may be obtained only by algorithmic cleverness, which is not yet taught by CS & CSE at universities: an urgent educational problem.

>> new compilation model
Outline: history & terminology; skyrocketing requirements; destructive von Neumann monopoly; high mask cost; low battery capacity; new compilation model; conclusions

What are the Challenges ? (5)
[ST microelectronics, MorphICs, Dataquest, eASIC]
[Figure: growth factor (1, 10, 100) over time, with doubling times of 12 months, 18 months, 2y, 3y, 4y, 5y, 10y and 30y marked. Curves: communication bandwidth [Hansen's law]; embedded software [DTI*]; integration density (1.4/year) [Moore's law]; design complexity (1.4/year); NRE and mask cost; processor integration density; design productivity (1.15/year); memory bandwidth (1.07/year) [Patterson's law]; battery capacity (1.03/year). Further rates shown: 1.25/year, 1.2/year.]
new compilation techniques needed ! supported by a new machine paradigm
*) Department of Trade and Industry, London
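The challenge charts quote annual growth factors (for example 1.4/year, 1.15/year, 1.07/year, 1.03/year) next to doubling-time marks. As a minimal sketch of how the two relate, assuming plain compound growth, the doubling time is ln 2 / ln(factor). The function name and the selection of factors below are illustrative and not taken from the slides.

```python
import math

def doubling_time_years(annual_factor: float) -> float:
    """Years needed to double under compound growth by annual_factor per year."""
    if annual_factor <= 1.0:
        raise ValueError("annual_factor must be greater than 1 to ever double")
    return math.log(2.0) / math.log(annual_factor)

# Growth factors as quoted in the charts (illustrative selection).
for factor in (1.4, 1.25, 1.2, 1.15, 1.07, 1.03):
    print(f"{factor}/year doubles in about {doubling_time_years(factor):.1f} years")
```

Under this assumption 1.4/year doubles in about 2.1 years and 1.15/year in about 5 years, in line with the 2y and 5y marks, while 1.07/year comes out at roughly 10 years.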

>> conclusions
Outline: history & terminology; skyrocketing requirements; destructive von Neumann monopoly; high mask cost; low battery capacity; new compilation model; conclusions

Conclusion
No, we are not ready for the breakthrough, since our computing education is obsolete because of the von Neumann monopoly.
But all ingredients are available to jazz up our CS & CSE curricula.

>>> thank you
thank you for your patience

scalability
The Scalability Problem
The routing congestion problem grows with the size of the FPGA.

SNN filter KressArray Mapping Example
http://kressarray.de
array size: 10 x 16 = 160 rDPUs
[Figure legend: rDPU not used; rDPU used for routing only (route thru only); operator and routing; backbus connect; backbus connect not used; port location marker]

Xplorer Plot: SNN Filter Example [13]
http://kressarray.de
2 hor. NNports, 32 bit
3 vert. NNports, 32 bit
[Figure legend: route-thru-only rDPU; operator; operand; result; route thru; backbus connect]

Conclusion: all knowledge needed is available
available as courses / embedded tutorials and as full day courses:
machine paradigm (anti machine)
languages
compilation techniques
architectural resources
sequencing methodology: hw & sw
hw / sw partitioning methodology
parallel memory IP core and module generator vendors
anything else needed

... has a chance
Configware Industry has a Chance

Conclusions
the anti machine is the way to go for massive parallelism, also for data-intensive applications
reconfigurable anti machine: for high performance with short product life cycles and unstable standards
reconfigurable: for low-cost, low-volume production
spare part problem: needs new infrastructures
Giga FPGAs are highly promising, but only with a new design flow: configware could repeat the success of the software industry

Paradigm Shifts: Nick Tredennick's view
why 2 program sources ?
instruction-stream-based computing: algorithms variable, resources fixed
reconfigurable computing: algorithms variable, resources variable (programmable)

Compilation for (r)DPA of anti machine
[Figure: compilation flow with the elements: high level source program (software notation); wrapper; expression tree; DPU library; parameters; mapper; scheduler; code generators; outputs configware (for morphware) and flowware (streamware)]

Misleading predictors
Moore's Law is becoming a misleading predictor of future developments.

High mask cost
High mask cost may be avoided completely by morphware use, or partly by GAs (ASICs).

Fault tolerance
Morphware is the only way to obtain fault-tolerant ICs.

World-wide services
FPGAs may provide an important benefit for world-wide services and all other after-sales consequences.

Re-configurable Hardware ??
Terminology has been highly confusing.
Re-configurable Hardware ?? this hardware is not hard ! it's Morphware
We need a concise terminology: a consensus is on the way.

Super Pipe Networks
The key is mapping, rather than architecture.

systolic array:
  applications: regular data dependencies only
  pipeline properties: shape: linear only; resources: uniform only
  mapping: linear projection or algebraic synthesis
  scheduling (data stream formation): (e.g. force-directed) scheduling algorithm
supersystolic rDPA *:
  applications: no restrictions
  pipeline properties: no restrictions
  mapping: simulated annealing or P&R algorithm
  scheduling (data stream formation): (e.g. force-directed) scheduling algorithm
*) KressArray [1995]

An example by Nageldinger's KressArray Xplorer
Synthesizable Memory Communication
Efficient memory communication should be directly supported by the mapper tools.
http://kressarray.de
[Figure legend: optimized parallel memory ports; memory controller; sequencers; application; not used]

Stream-based Soft Machine
[Figure: Compiler and Scheduler; instructions for the rDPA; Memory (data memory) organized as memory banks, each with Sequencers (data stream generator)]

JPEG zigzag scan pattern
*> Declarations
goto PixMap[1,1]
HalfZigZag;
SouthWestScan
uturn (HalfZigZag)

EastScan is
  step by [1,0]
end EastScan;

SouthScan is
  step by [0,1]
end SouthScan;

SouthWestScan is
  loop 8 times until [1,*]
    step by [-1,1]
  endloop
end SouthWestScan;

NorthEastScan is
  loop 8 times until [*,1]
    step by [1,-1]
  endloop
end NorthEastScan;

HalfZigZag is
  EastScan
  loop 3 times
    SouthWestScan
    SouthScan
    NorthEastScan
    EastScan
  endloop
end HalfZigZag;

[Figure: zigzag path over x and y, scan segments numbered 1 to 4, the two HalfZigZag parts marked; the scans are driven by data counters]

Similar Programming Language Paradigms
very easy to learn
language category: Computer Languages vs. Xputer Languages
both deterministic; procedural sequencing: traceable, checkpointable
Computer Languages, sequencing driven by: read next instruction; goto (instruction addr.); jump (to instruction addr.); instruction loop; instruction loop nesting; no parallel loops; instruction loop escapes; instruction stream branching
Xputer Languages, sequencing driven by: read next data object; goto (data addr.); jump (to data addr.); data loop; data loop nesting; parallel data loops; data loop escapes; data stream branching

GAG Scheme
GAG = Generic Address Generator
[Figure: GAG consisting of Limit Stepper (L0, limit L), Base Stepper (B0) and Address Stepper (address A)]

GAG: Address Stepper
GAG = Generic Address Generator
[Figure: Address Stepper with inputs Base (B0), Limit (L), stepVector, maxStepCount, init and tag; internal Step Counter, +/- and = blocks, Escape Clause and End Detect; outputs Address (A) and endExec]
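The address stepper figure names a base, a limit, a step vector, a maximum step count and an end detect. As a minimal software sketch of that idea (not the published GAG circuit), the generator below emits addresses from a base in fixed steps until the limit is crossed or the step count is exhausted. The function names, the one-dimensional step and the nested `scan_2d` composition are assumptions made for illustration.

```python
from typing import Iterator, Tuple

def address_stepper(base: int, limit: int, step: int, max_step_count: int) -> Iterator[int]:
    """Yield addresses base, base+step, ... (hypothetical software model).

    Stops when the address passes `limit` in the stepping direction
    (end detect) or after `max_step_count` addresses (step counter).
    """
    address = base
    for _ in range(max_step_count):
        passed_limit = (step > 0 and address > limit) or (step < 0 and address < limit)
        if passed_limit:
            return
        yield address
        address += step

def scan_2d(width: int, height: int) -> Iterator[Tuple[int, int]]:
    """Compose two steppers into a row-by-row scan (illustrative only)."""
    for y in address_stepper(base=0, limit=height - 1, step=1, max_step_count=height):
        for x in address_stepper(base=0, limit=width - 1, step=1, max_step_count=width):
            yield (x, y)

print(list(address_stepper(base=0, limit=10, step=3, max_step_count=100)))
print(list(scan_2d(3, 2)))
```

The point of the sketch is that the address sequence is produced by a small parameterized generator rather than by instructions computing each address, which is the role the slides assign to data counters.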

Generic Sequence Examples
[Figure: GAG with Limit Slider (L0), Base Slider (B0) and Address Stepper (A); example address sequences a) to g)]

Slider Operation Demo Example
[Figure: address range over x and y between floor F and ceiling C, with base B (B0), address A and limit L (L0)]

What are the Challenges ?
[ST microelectronics, MorphICs, Dataquest, eASIC]
[Figure: growth factor (1, 10, 100) over time, with doubling times from 12 months to 30 years marked. Curves: communication bandwidth [Hansen's law]; embedded software [DTI*]; integration density (1.4/year) [Moore's law]; NRE and mask cost; processor integration density; memory bandwidth (1.07/year) [Patterson's law]; battery capacity (1.03/year). Further rates shown: 1.25/year, 1.2/year.]
*) Department of Trade and Industry, London

What are the Challenges ?
[ST microelectronics, MorphICs, Dataquest, eASIC]

[Figure: growth factor (1, 10, 100) over time, with doubling times from 12 months to 30 years marked. Curves: communication bandwidth [Hansen's law]; embedded software [DTI*]; integration density (1.4/year) [Moore's law]; NRE and mask cost; processor integration density; memory bandwidth (1.07/year) [Patterson's law]; battery capacity (1.03/year). Further rates shown: 1.25/year, 1.2/year.]
design complexity: +40%/year, doubling every 2 years
design productivity: +15%/year, doubling every 5 years [SIA roadmap]
*) Department of Trade and Industry, London

>> Outline
Morphware
Changing Models by SoC Development
New Machine Paradigm needed
The Dichotomy of Paradigms
Outlook

The Morphware Market
coarse-grained: rDPUs: configurable functional blocks; PACT AG, Munich, Germany, http://pactcorp.com
fine-grained: cLBs, rLBs: configurable logic blocks

Top 4 PLD Manufacturers 2000 [Dataquest]: Xilinx 42%, Altera 37%, Lattice 15%, Actel 6%; total: $3.7 billion
> $7 billion by 2003: fastest growing semiconductor market segment
Configware Market: PLD vendors and their alliances provide libraries of soft IPs

Coarse grain vs. Fine grain
Reconfigurability:
fine grain (FPGAs, rGAs)
coarse grain (PACT AG, Munich)
multi grain (e. g. by slice bundling)

Xplorer Plot: SNN Filter Example [13]
http://kressarray.de
2 hor. NNports, 32 bit
3 vert. NNports, 32 bit
[Figure legend: route-thru-only rDPU; operator; operand; result; route thru; backbus connect]

Morphware only: some soft CPU core examples
core | architecture | platform
MicroBlaze (125 MHz, 70 D-MIPS) | 32 bit standard RISC, 32 reg. by 32 LUT RAM-based reg. | Xilinx (up to 100 on one FPGA)
Nios | 16-bit instr. set | Altera Mercury
Nios (50 MHz, 22 D-MIPS) | 32-bit instr. set | Altera
Nios | 8 bit | Altera Mercury
Leon (25 MHz) | SPARC |
ARM7 clone | ARM |
uP1232 | 8 bit CISC, 32 reg. | 200 XC4000E CLBs
REGIS | 8 bits instr. + ext. ROM | 2 XILINX 3020 LCA
Reliance-1 | 12 bit DSP | Lattice 4 isp30256, 4 isp1016
Popcorn-1 | 8 bit CISC | Altera, Lattice, Xilinx
gr1040 | 16-bit |
gr1050 | 32-bit |
My80 | i8080A | FLEX10K30 or EPF6016
YARD-1A | 16-bit RISC, 2 opd. instr. | old Xilinx FPGA Board
DSPuva16 | 16 bit DSP | Spartan-II
xr16 | RISC integer C | SpartanXL
Acorn-1 | | 1 Flex 10K20

soft CPUs in academic teaching
UCSC: 1990!; Mälardalen University; Chalmers University; Cornell University; Gray Research; Georgia Tech; Hiroshima City Univ.; Michigan State; Univ. de Valladolid; Virginia Tech; Washington U. St. Louis; New Mexico Tech; UC Riverside; Tokai University

>> New Machine Paradigm needed
Outline: Morphware; Changing Models by SoC Development; New Machine Paradigm needed; The Dichotomy of Paradigms; Outlook

>> The Dichotomy of Paradigms

>> Outlook

Why fine grain ?
no specific silicon: low production volume (aerospace, automotive, military, industrial controllers, et al.)
the spare part problem
design flow
coming Giga-FPGA

Configware Industry vs. Software Industry
can configware industry repeat the success story?
RAM-based
Compatibility
Scalability
Education problems

Problems of Parallelism
enormous speed-ups: factor of 3 to >10 000
Software to FPGA migration: algorithmic cleverness missing, no education; no methodology for interconnect estimation
Software to rDPA migration: methodology only in special areas (DSP, wireless ....)
the area of parallel algorithms needs a complete re-orientation of its scope ... far beyond traditional platforms

Evolution of FPGA and its design flow [à la S. Guccione]
http://KressArray.de
[Figure: FPGA design flow (Schematics/HDL, Netlister, Netlist, Place and Route, Bitstream) next to the software flow (User Code, HLL Compiler, Executable); FPGA core with interfaces, CPU core and Memory core; as soon as Giga FPGA is available: soft CPU and soft rDPA on the FPGA core, or an rDPA core, each with an HLL Compiler]

ASIC emulation
ASIC emulation / Rapid Prototyping: to replace simulation
Quickturn (Cadence), IKOS (Synopsys), Celaro (Mentor)
hours of compilation run: inefficient since netlist-based
ASIC emulators will become obsolete soon
by RTR: in-circuit execution debugging instead of emulation
new business model: upgradable morphware is the product
emulation for solving the spare part problem in many areas

the wrong machine paradigm
Nasty Matter
extremely power hungry and area inefficient
performance problems
[Figure: CPU = Data Path + instruction sequencer, with RAM; Data Path reconfigurable?]
central von Neumann bottleneck
Instruction Fetch Overhead
Address Computation Overhead

Matter vs. Antimatter: CPU vs. DPU
[Figure: CPU (Data Path + instruction sequencer) versus DPU (Data Path Unit: Data Path without instruction sequencer, fed by data streams)]

CPU
CPU: Data Path + instruction sequencer, with RAM
RAM-based + simple machine paradigm + scalability + relocatability + compatibility = secret of success of software industry
for configware industry is missing: FPGA compatibility, fully scalable FPGA, relocatable configuration code

Success Factors
rDPUs and rDPAs do much better than FPGAs
[Table: the properties RAM-based, machine paradigm, compatibility, scalability and code relocatability, rated for instruction stream based platforms (success of software industry) and for data stream based platforms (reconfigurable fine grain (FPGA), reconfigurable coarse grain, hardwired); ratings used: yes, no, limited, feasible, available, good, (hardwired)]
*) if KressArray used
**) mapping coarse grain onto FPGA

>>> Problems with Concurrency
Outline: The Computer Architecture Crisis; The Impact of Reconfigurable Platforms; The Dichotomy of Models; Parallelism; Conclusions

Parallelism by Concurrency
independent instruction streams
[Figure: several CPUs (each Data Path + instruction sequencer) connected by Bus(es) or switch box]
difficult coordination
massive run time overhead

>> The Dominance of Embedded Systems

Outline: The Computer Architecture Crisis; The Impact of Reconfigurable Platforms; The Dichotomy of Models; Parallelism; Conclusions

Summary of the Anti Machine Paradigm
anti language primitives are almost the same (slightly extended)
anti machine execution potential is dramatically more powerful
it provides drastically more flexibility
not always replacing von Neumann

JPEG zigzag scan pattern
*> Declarations
goto PixMap[1,1]
HalfZigZag;
SouthWestScan
uturn (HalfZigZag)

EastScan is
  step by [1,0]
end EastScan;

SouthScan is
  step by [0,1]
end SouthScan;

SouthWestScan is
  loop 8 times until [1,*]
    step by [-1,1]
  endloop
end SouthWestScan;

NorthEastScan is
  loop 8 times until [*,1]
    step by [1,-1]
  endloop
end NorthEastScan;

HalfZigZag is
  EastScan
  loop 3 times
    SouthWestScan
    SouthScan
    NorthEastScan
    EastScan
  endloop
end HalfZigZag;

[Figure: zigzag path over x and y, scan segments numbered 1 to 4, the two HalfZigZag parts marked; the scans are driven by data counters]

>> Address Generators for Data Streams
(data streams introduced earlier in this session)
Outline: Introduction; Smart Address Generators; Address Generators for Data Streams; Customized Memory Organization; Conclusions

2-D Generic Data Sequence Examples
[Figure: example sequences a) to g)]
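The JPEG zigzag declarations above build the scan from four step vectors: east [1,0], south-west [-1,1], south [0,1] and north-east [1,-1]. As a minimal sketch of the resulting data address sequence, the generator below walks an 8 x 8 block with the same four moves, using 1-based (x, y) addresses as in PixMap[1,1]. It is a conventional zigzag walk written for illustration, not a translation of the declarations, and the function name is hypothetical.

```python
from typing import Iterator, Tuple

def zigzag_addresses(n: int = 8) -> Iterator[Tuple[int, int]]:
    """Yield 1-based (x, y) addresses of an n x n block in zigzag order."""
    x, y = 1, 1
    for _ in range(n * n):
        yield (x, y)
        if (x + y) % 2 == 0:
            # On an even diagonal the walk runs north-east: step by [1,-1].
            if x == n:
                y += 1            # east edge reached: step south [0,1]
            elif y == 1:
                x += 1            # north edge reached: step east [1,0]
            else:
                x, y = x + 1, y - 1
        else:
            # On an odd diagonal the walk runs south-west: step by [-1,1].
            if y == n:
                x += 1            # south edge reached: step east [1,0]
            elif x == 1:
                y += 1            # west edge reached: step south [0,1]
            else:
                x, y = x - 1, y + 1

sequence = list(zigzag_addresses())
print(sequence[:6])
```

Each address is visited exactly once, and the walk ends at (8, 8). In the slides' terms this sequence is what a data counter produces, with no instruction fetched per address.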

GAU generic address unit Scheme
GAG = Generic Address Generator
published in 1990
[Figure: GAU consisting of Limit Slider (L0, limit L), Base Slider (B0) and Address Stepper (address A)]
all 3 are copies of the same BSU stepper circuit

GAG: Address Stepper
GAG = Generic Address Generator
[Figure: Address Stepper with inputs Base (B0), Limit (L), stepVector, maxStepCount, init and tag; internal Step Counter, +/- and = blocks, Escape Clause and End Detect; outputs Address (A) and endExec]

GAG Slider Model
[Figure: Generic Address Generator with Limit Stepper (L0), Base Stepper (B0) and Address Stepper (A); sliders moving between floor and ceiling]

GAG Complex Sequencer Implementation
[Figure: several GAUs, each with Limit Slider (L0), Base Slider (B0) and Address Stepper (A), combined with SDS and VLIW stack into the Generic Address Generator]
all has been published in 1990

GAG Slider Operation Demo Example
[Figure: address range over x and y between floor F and ceiling C, with base B (B0), address A and limit L (L0)]

The microelectronics spare part problem [Hartenstein 2002]
Demand: several decades of availability
e. g. car price: ~25% electronics
ICs do not survive storage time
Original fab line no longer exists
[Figure: IC market volume, demand (years of availability) and IC physical life expectancy (years) versus feature size (2, 1, 0.5, 0.25, 0.13, 0.1, 0.07)]

The microelectronics spare part problem [Hartenstein 2002]
key problem in many application areas: medical, aerospace, automotive, other transportation, military, industrial equipment controllers, et al.
[Figure: IC market volume, demand (years of availability) and IC physical life expectancy (years) versus feature size (2, 1, 0.5, 0.25, 0.13, 0.1, 0.07)]

Dead Supercomputer Society

[Gordon Bell, keynote at ISCA 2000]
ACRI, Alliant, American Supercomputer, Ametek, Applied Dynamics, Astronautics, BBN, CDC, Convex, Cray Computer, Cray Research, Culler-Harris, Culler Scientific, Cydrome, Dana/Ardent/Stellar/Stardent, DAPP, Denelcor, Elexsi, ETA Systems, Evans and Sutherland Computer, Floating Point Systems, Galaxy YH-1, Goodyear Aerospace MPP, Gould NPL, Guiltech, ICL, Intel Scientific Computers, International Parallel Machines, Kendall Square Research, Key Computer Laboratories, MasPar, Meiko, Multiflow, Myrias, Numerix, Prisma, Tera, Thinking Machines, Saxpy, Scientific Computer Systems (SCS), Soviet Supercomputers, Supertek, Supercomputer Systems, Suprenum, Vitesse Electronics

CS: young ? dynamic?
after >10 technology generations ...
1st: 4004
2nd: 8008
3rd: 8086
4th: 80286
5th: 80386
6th: 80486
7th: P5 (Pentium)
8th: P6 (Pentium Pro / Pentium II)
9th: Pentium III
10th: ....
11th: .......
... but the von Neumann Paradigm is still the dominant doctrine
... the vN Microprocessor is a Methuselah, the steam engine of the silicon age
... still pushing the basic models from the times of mainframe dinosaurs
Microelectronics is ignored (except falling cost of computational effort)
computing sciences are ultraconservative, to avoid saying: senile

better to go for reconfigurable platforms
[Dataquest] PLD market > $7 billion by 2003
fastest growing segment of semiconductor market
IP reuse and silicon reuse
FPGAs are going into every type of application

Throughput vs. Flexibility
T. Claasen et al.: ISSCC 1999
*) R. Hartenstein: ISIS 1997
[Figure: throughput in MOPS / mW (0.001 to 1000) versus flexibility and feature size (2, 1, 0.5, 0.25, 0.13, 0.1, 0.07) for: hardwired; hardwired anti machine; reconfigurable computing* (rDPAs); reconfigurable logic (FPGAs); DSPs; instruction set processors; standard microprocessor (von Neumann)]
the anti machine goes far beyond bridging the gap

Why coarse grain ?

Terminology: consensus is near
DPU: data path unit
rDPU: reconfigurable DPU
DPA: data path array (DPU array)
rDPA: reconfigurable DPA
RA: reconfigurable array
ISP: instruction set processor
AM: anti machine
AMP: data stream processor*
rAMP: reconfigurable AMP
*) no dataflow machine
FPGA: field-programmable gate array
FPL: field-programmable logic
PLD: programmable logic device
CPLD: complex PLD

digital system platforms:
platform category | programming source | machine paradigm
hardware | (not programmable) | none
ISP | software | von Neumann
morphware (FPGA) | configware | none
data stream processor (AMP) | streamware | anti machine
reconfigurable AMP (rAMP) | streamware & configware | anti machine

categories of morphware:
morphware use | granularity (path width) | (re)configurable blocks
reconfigurable logic | fine grain (~1 bit) | CLBs
reconfigurable computing | coarse grain (e.g. 32 bits) | rDPUs (e.g. ALU-like)
reconfigurable computing | multi granular: by slice bundling | rDPU slices (e.g. 4 bits)

>> Problems to be solved
Configware Market
FPGA Market
Embedded Systems (Co-Design)
Hardwired IP Cores on Board
Run-Time Reconfiguration (RTR)
Rapid Prototyping & ASIC Emulation
Evolvable Hardware (EH)
Academic Expertise
ASICs dead

Soft CPU
HLLs
Problems to be solved

EDA industry shift into CS mentality [Wojciech Maly]
patches instead of engineering
innovation stalled many years ago
85% of users hate their tools
netlist-based: do not care about efficiency, ...
... do not care about transistor density

[Jonathan Rose] FPGAs Give You
Instant Fabrication
Get to Market Fast
Fix 'em quick
Zero NRE Charges
Low Risk
Low Cost at good volume

The Crisis of Computing Sciences
Computing Sciences are in a severe crisis
Computing curricula are obsolete because of strictly enforced procedural-only blinders
Computer Architecture and related areas have lost leadership in digital system implementation
CS ignores > 90% of processors, which are in embedded systems: 10 times more programmers will write embedded applications than computer software by 2010
A promising disruptive therapy is introduced by new approaches coming with Reconfigurable Computing

Ubiquitous embedded systems
Embedded systems means:
hardware / software co-design
configware / software co-design
hardware / configware / software co-design
20 billion processors (2001), > 90% in embedded systems
10 times more programmers will write embedded applications than computer software by 2010
That's where our graduates will go

The Situation in Computing Sciences
Computing Sciences are in a severe crisis
New fundamentals and R&D directions are inevitable
my mission: getting you involved
All knowledge needed is readily available ...
... even from Computing Sciences
Silicon application and EDA provide useful concepts
Reconfigurable Computing has the remedy

the edu gap has dramatic consequences
Key R&D scenes are drying out or dying because of a lack of qualified researchers
the embedded system design crisis gets worse because of a lack of qualified designers
many innovative products cannot be sold because of a lack of qualified customers
the edu gap is widening dramatically because of a lack of qualified educators

Super Pipe Networks
The key is mapping, rather than architecture.

systolic array:
  applications: regular data dependencies only
  pipeline properties: shape: linear only; resources: uniform only
  mapping: linear projection or algebraic synthesis
  scheduling (data stream formation): (e.g. force-directed) scheduling algorithm
supersystolic DPA *:
  applications: no restrictions
  pipeline properties: no restrictions
  mapping: simulated annealing or P&R algorithm
  scheduling (data stream formation): (e.g. force-directed) scheduling algorithm
*) KressArray [ASP-DAC-1995]

.... it's an alternative culture ....
now the area is going mainstream: a rapidly widening audience of non-specialists gets interested ...
severe communication gaps due to educational deficits
not only to users: still many hardware and EDA experts ask: isn't it just logic design on a strange platform ?
it is time to clarify and popularize fundamental aspects
and to explain that it is a fundamentally different culture

von Neumann Computer: the wrong Machine Paradigm
Xputer Lab, University of Kaiserslautern
[Figure, Computer (von Neumann): Compiler; RAM holding instructions; Sequencer with program counter; Datapath: hardwired; tightly coupled by compact instruction code; does not support soft data paths]
Xputer: The Soft Machine Paradigm
[Figure, Xputer (anti machine): Compiler and Scheduler; RAM holding data; (multiple) sequencer with data counters; Datapath Array: reconfigurable (also for hardwired); state register; loosely coupled by decision data bits only; does support soft data paths]
some differences

Semiconductor Revolutions
Makimoto's Wave (published in 1989)
Mainstream Silicon Application is switching every 10 Years
[Figure: wave alternating between standard and custom from 1957 to 2007: TTL (hardwired); custom LSI, MSI; standard proc., memory (procedural programming); custom ASICs, accels; reconfigurable (structural programming)]
The Programmable System-on-a-Chip is the next wave
Tredennick's Paradigm Shifts:
algorithm: fixed, resources: fixed (hardwired)
algorithm: variable, resources: fixed (vN machine paradigm)
algorithm: variable, resources: variable (anti machine paradigm)

Impact of Data-stream-based ... Embedded Hardware / Configware Industry

Repeat Success Story by new Machine Paradigm !
structural personalization: hardwired, before fabrication
[Figure: Makimoto's Wave from 1957 to 2007: TTL; custom LSI, MSI; standard proc., memory; ASICs, accels; reconfigurable]
qualified people are not available

Rapidly growing CS education gap
Our computing curricula are obsolete
introduction is strictly procedural-only, vN-only
use of terms like computer organisation, computer structures, computer architecture
graduates are not prepared for the real world
most applications are for embedded systems (>90% by 2010)
our graduates are unable to compete with EE graduates
only a few % of the curricula need to be changed
my mission: getting you involved

Binding Time vs. Computing Domain
Binding time (Set-up of Communication Channels):
at run time: microprocessor, parallel computer, array processor
at loading time: Reconfigurable Computing
at compile time
later fabrication step
before fabrication
[Figure: the binding times above plotted against the programming domain: time domain (procedural), time & space (hybrid), space domain (structural); also placed in the plane: supersystolic arrays, systolic arrays, ASICs, full custom ICs]

Why Coarse Grain instead of FPGA ?
Sources: Proc ISSCC, ICSPAT, DAC, DSPWorld
[Figure: Transistors / chip (1000 to 100 000 000 000) versus year (1980 to 2010) for memory, microprocessor, FPGA physical, FPGA logical, FPGA routed and supersystolic (physical, logical); gaps of ~10 to ~10 000 marked]
reduced reconfigurability overhead by up to ~ 1000
drastically smaller configuration memory
much faster loading
a lot more benefits

What are the differences ?
vN* computing:
computing in time
instruction fetch at run time
instruction scheduling
procedural programming
Reconfigurable Computing:
computing in space and time
instruction fetch at compile time
data scheduling
structural programming, i. e. data-stream-based
also hardwired implementations**: instruction fetch before fabrication
*) vN stands for von Neumann
**) e. g. Bee project, Prof. Brodersen

Basics of Binding Time
Instruction generalized: including complex expressions and other datapaths
strong impact on the machine paradigm !
time of Instruction Fetch:
run time: microprocessor, parallel computer
loading time: Reconfigurable Computing
compile time

Data-stream-based Parallelism
See my other talk:
ICECS 2002, IEEE 9th International Conference on Electronics, Circuits and Systems, Dubrovnik, Croatia, September 15-18, 2002 (invited paper)
Michael Herz, Agilent Technologies
Reiner Hartenstein, University of Kaiserslautern
Miguel Miranda, Erik Brockmeyer, Francky Catthoor, IMEC, Leuven

Memory Organisation for Datastream-based Reconfigurable Computing 169 http://hartenstein.de Kaiserslautern University of Technology Machine paradigms von instruction Neumann M instruction stream data-stream machine M Flowware memory data address generator stream machine (data sequencer) data path I /O I /O (ALU) data asM* stream d a ta p a th u n it

CPU instruction sequencer Software Configware DPU or rDPU embedded memory architecture* I/O M M M M M M M M M M memory I/O (r)DPA (r)DPU 2003, [email protected] 170 http://hartenstein.de Kaiserslautern University of Technology An example by Nageldingers KressArray Xplorer Synthesizable Memory Communication Efficient Memory Communication should be directly supported by the Mapper Tools Legend:

Optimized Parallel memory ports Memory Controller sequencers application not used http://kressarray.de 2003, [email protected] 171 http://hartenstein.de Kaiserslautern University of Terminology has Technology been highly ############### confusing Em be 2 n tio a r eg t In y si t n de ) ar e y 4/ . (1

4y s re ] o o aw [M l ar) 2/ye . 1 ( ity dens n o i rat nteg i r o cess aw] pro sons l ) r e t t a P ear th [ (1.07/y andwid b y r o Mem 10y 30y Battery capacity (1.03/year) 1 0 *) Department

10 12 18 24 of Trade and Industry, London 2003, [email protected] 172 36 48 http://hartenstein.de months dd Co ed m so m ftw un ar ic e at [D io TI n *] ba nd w id th [H an se la ns w ]

factor Kaiserslautern University of Technology Semiconductor Revolutions e v a W s o t o m i k a M standard custom procedural programming proc., memory 1967 LSI, MSI 1977 structural programming 2007 1987 ASICs, accels le ab ur fig on

1957 The Programmable System-on-a-Chip is the next wave c re TTL hardwired Mainstream Silicon Application is switching every 10 Years 1997 d e h is 9 l b Pu 198 in algorithm: fixed algorithm: variable algorithm: variable resources: fixed resources: fixed resources: variable vN machine Tredennicks paradigm Paradigm Shifts 2003, [email protected] ne i h c a m i 173

ant anti machine paradigm paradigm http://hartenstein.de Kaiserslautern University of Technology No vN bottleneck The anti machine has no von Neumann bottleneck. 2003, [email protected] 174 http://hartenstein.de Kaiserslautern University of Technology 3 different mind sets hardware people TTL 1957 CS people proc., memory 1967 LSI, MSI 1977 new breed needed 1987 ASICs, accels FPGAs

2007 soft CPUs coarse grain 1997 Common terminology needed 2003, [email protected] 175 http://hartenstein.de Kaiserslautern University of Technology Throughput vs. Flexibility T. Claasen et al.: ISSCC 1999 *) R. Hartenstein: ISIS 1997 MOPS / mW 1000 throughput 100 d har 10 1 ed r i w g)* n i t pu m o c

ble a r gu fi n gic eco o l r ( le PAs r ab P DS hardwired anti machine FPGAs u g rs fi o s n s o ce o r Rec p et s n o i ssor t e c c

o u r p micro instr d r a nd rD 0.1 sta 0.01 0.001 the anti machine goes far beyond bridging the gap 2 1 0.5 2003, [email protected] 0.25 von Neumann flexibility 0.13 0.1 0,07 feature size 176 http://hartenstein.de Kaiserslautern University of Technology Programming sources von Neumann

instruction stream machine hardware resources fixed algorithms variable software hardwired only morphware resources variable configware algorithms variable streamware flowware 2003, [email protected] 177 Anti machine data stream machine reconfigurable or hardwired http://hartenstein.de Kaiserslautern University of Technology Some soft CPU core examples core architecture platform MicroBlaze 125 MHz 70 D-MIPS 32 bit standard RISC 32 reg. by 32 LUT RAMbased reg.

Xilinx up to 100 on one FPGA Nios 16-bit instr. set Altera Mercury Nios 50 MHz 32-bit instr. set Altera 22 D-MIPS Nios 8 bit Altera Mercury core architecture platform Leon 25 Mhz SPARC ARM7 clone ARM uP1232 8bit CISC, 32 reg. 200 XC4000E CLBs REGIS

8 bits Instr. + ext. ROM 2 XILINX 3020 LCA Reliance-1 12 bit DSP Lattice 4 isp30256, 4 isp1016 1Popcorn-1 8 bit CISC Altera, Lattice, Xilinx gr1040 16-bit gr1050 32-bit My80 i8080A FLEX10K30 or EPF6016 YARD-1A 16-bit RISC, 2 opd. Instr. old Xilinx FPGA Board DSPuva16 16 bit DSP Spartan-II xr16

RISC integer C SpartanXL 2003, [email protected] Acorn-1 178 1 Flex 10K20 http://hartenstein.de Kaiserslautern University of Technology UCSC: 1990! Mraldalen University, Eskilstuna, Sweden Chalmers University, Gteborg, Sweden Cornell University Gray Research Georgia Tech Hiroshima City University, Japan 2003, [email protected] FPGA CPUs in teaching and academic research Michigan State Universidad de Valladolid, Spain Virginia Tech Washington University, St. Louis New Mexico Tech UC Riverside Tokai University, Japan 179 http://hartenstein.de Kaiserslautern University of Technology
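The two transformations named on the Loop Transformation Examples slide that follows (loop unrolling and strip mining) can be sketched in ordinary software. This is only a minimal illustration under stated assumptions: the `body` function, the thread pool standing in for fork/join, and all function names are hypothetical and not taken from the deck, and Python threads merely simulate what a reconfigurable array would do in space.

```python
from concurrent.futures import ThreadPoolExecutor

N = 16  # iteration count, matching the slide's "loop 1-16"


def body(i):
    """Hypothetical loop body: any side-effect-free per-iteration work."""
    return i * i


def sequential():
    # loop 1-16 / body / endloop
    return [body(i) for i in range(1, N + 1)]


def unrolled():
    # Loop unrolling by 2: half as many iterations (loop 1-8),
    # two copies of the body per iteration.
    out = []
    for k in range(1, N // 2 + 1):
        out.append(body(2 * k - 1))
        out.append(body(2 * k))
    return out


def strip_mined(strips=2):
    # Strip mining: cut the iteration space into contiguous strips
    # (1-8 and 9-16 for two strips), fork one worker per strip, then join.
    # Assumes N is divisible by the number of strips.
    size = N // strips
    ranges = [range(s * size + 1, (s + 1) * size + 1) for s in range(strips)]
    with ThreadPoolExecutor(max_workers=strips) as pool:          # fork
        parts = list(pool.map(lambda r: [body(i) for i in r], ranges))
    return [x for part in parts for x in part]                    # join


if __name__ == "__main__":
    assert sequential() == unrolled() == strip_mined()
    print(strip_mined(4)[:4])
```

All three variants compute the same results; they differ only in how many body instances are active per step, which is the resource parameter a co-compiler would trade against available array size.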

Loop Transformation Examples
  sequential processes: resource parameter driven Co-Compilation
  [Figure: loop variants, reconstructed from the slide.
   original:       loop 1-16 / body / endloop
   loop unrolling: loop 1-8 / body body / endloop
   strip mining:   fork / loop 1-8 and loop 9-16, each: body / endloop / join
   host:           loop 1-8 trigger endloop; loop 1-4 trigger endloop; loop 1-2 trigger endloop
   reconf. array:  executes the loop bodies]

However, current CS Education is based on the Submarine Model
  This model disables ...
  Algorithm
  Brain usage: procedural-only
  Software: procedural high level Programming Language; Assembly Language
  Hardware invisible: under the surface
  Faculty colleagues shy away from the Paradigm Shift: Brain hurts? It can't be: their ... has been amputated

Hardware and Software as Alternatives
  Algorithm partitioning: procedural (Software) and structural (Hardware, Configware)
  Brain Usage: both Hemispheres
  Alternatives: Hardw/Configw only; Software only; Software & Hardw/Configw

The Dominance of the Submarine Model
  (procedural) Software; Hardware: structurally disabled
  ... indicates that our CS Education System produces Zillions of Mentally Disabled Persons, completely unable to cope with Solutions other than Software only

Design Space Exploration Systems
  Explorer System | year | source | interactive | status evaluation | status generation
  DPE | 1991 | [66] | no | abstract models | rule-based
  Clio | 1992 | [67] | yes | prediction models | device generator
  DIA | 1998 | [68] | yes | prediction from library | rule-based
  DSE for RAW | 1998 | [49] | no | analytical models | analytical
  ICOS | 1998 | [76] | no | fuzzy logic | greedy search
  DSE for Multimedia | 1999 | [77] | no | simulation | branch and bound
  Xplorer | 1999 | [11] [50] | yes | fuzzy rule-based | simulated annealing

History of Computing
  [Figure: Makimoto's Wave, 1957 to 2007 (TTL; proc., memory; reconfigurable). Labels: classical CS; new CS; mainframes; PC; ?]

Wintel Business Model
  [Chart: consumer PC, 1997 to 2002. Series: US market in billion US-$ [Forrester]; million devices delivered in the U.S. [IDC]; consumer PC average resale price in $ [Forrester]. Axis marks: 10, 15, 20; 1000 $, 1500 $.]

Tredennick's Paradigm Shifts
  [Figure: Makimoto's Wave, 1957 to 2007: TTL; LSI, MSI; proc., memory; ASICs, accels; reconfigurable; alternating standard and custom; procedural programming, then structural programming.]
  hardwired: algorithm fixed, resources fixed
  vN machine paradigm: algorithm variable, resources fixed
  algorithm variable, resources variable: 2 sources; new machine paradigm needed
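To make the contrast between the two machine paradigms in this deck more concrete, here is a minimal sketch of a data-stream machine as described on the slides: the datapath is set up once before the run (the role of configware), and data counters then stream operands through it (the role of flowware), with no instruction fetched per data item. Every name here (`data_sequencer`, `configure_datapath`, `run_anti_machine`), the linear address pattern, and the multiply-add operation are assumptions chosen for illustration. The code is ordinary software and therefore only simulates the behaviour.

```python
from typing import Callable, Iterator, List


def data_sequencer(base: int, stride: int, count: int) -> Iterator[int]:
    """Data counter: yields a memory address sequence.

    A simple linear scan is assumed here; real address generators
    support richer scan patterns.
    """
    for k in range(count):
        yield base + k * stride


def configure_datapath() -> Callable[[int, int], int]:
    """Stand-in for configware: fix the datapath operation before the run."""
    return lambda a, b: a * b + 1


def run_anti_machine(memory: List[int], n: int) -> List[int]:
    dpu = configure_datapath()                       # bound before run time
    src_a = data_sequencer(base=0, stride=1, count=n)
    src_b = data_sequencer(base=n, stride=1, count=n)
    dst = data_sequencer(base=2 * n, stride=1, count=n)
    # The loop is driven purely by the three address streams;
    # the operation itself never changes while data is flowing.
    for a, b, d in zip(src_a, src_b, dst):
        memory[d] = dpu(memory[a], memory[b])
    return memory[2 * n:3 * n]


if __name__ == "__main__":
    mem = [1, 2, 3, 4, 10, 20, 30, 40, 0, 0, 0, 0]
    print(run_anti_machine(mem, 4))  # prints [11, 41, 91, 161]
```

In this sketch the "program" is split into two sources, matching the slide's remark: the datapath configuration and the schedule of the data streams.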
