Using Network Processors in Genomics

Herbert Bos*† Kaiming Huang*
{herbertb,khuang}@liacs.nl
*Leiden Universiteit, Netherlands
†Vrije Universiteit, Netherlands
http://www.liacs.nl/~herbertb/projects/biocomp/

H. Bos – Leiden University 13/02/2004


Case study: BLAST

● search nucleotide/protein database for query
● BLAST discovers similarity rather than exact match
● two main phases:
  1. scoring (registering where query and DNADB match)
  2. alignment (dynamic programming)
● only the first phase on NPUs


Window matching

[Figures: window matching illustration, shown over several slides]

● naïve approach: roughly W*N*M comparisons
● does not scale
● string search algorithms: Aho-Corasick
  – all windows matched at the same time
  – shifting genome one nucleotide at a time
  – matching algorithm transformed into a DFA
● DFA may be quite large


Aho-Corasick

● Alphabet: acgt
● Window size: 3
● Query: acgccga
● Windows: {acg,cgc,gcc,ccg,cga}

[Figure: goto graph for the five windows, built up over several slides]

Goto transitions (state 0 is the start state and loops to itself on t):

  0 -a-> 1 -c-> 2 -g-> 3
  0 -c-> 4 -g-> 5 -c-> 6
               5 -a-> 12
         4 -c-> 10 -g-> 11
  0 -g-> 7 -c-> 8 -c-> 9

Output (state: window recognised):

  state    3    6    9    11   12
  window   acg  cgc  gcc  ccg  cga

Failure function:

  s      1   2   3   4   5   6   7   8   9   10  11  12
  f(s)   0   4   5   0   7   8   0   4   10  4   5   1

Example input: tacgcga


IXPBlast Architecture

[Figure: block diagram of the NPU (IXP1200), built up over several slides: Gbps ports; Control Processor (StrongARM); six Microengines (ME); scratch, SRAM and DRAM memory; PCI bus. Later builds add the Aho-Corasick goto graph to the diagram.]


IXPBlast: packet handling

[Figure: diagram with positions numbered 0 to 31]

● packets read and processed in batches of 100,000
● “spilling” must be taken into account
● currently no feedback


Results

● 232 MHz IXP1200 ~ 1.8 GHz Pentium-4
● 1611 nucleotide query (MyD88)
● 1.4 GB genome (Zebrafish)
  – IXP1200: 90 sec with DFA
  – IXP1200: 129 sec with “trie”
  – P4: 132 sec with “trie”
● number of matches: 524856


Results

  Query size   DNADB size   Impl.         Performance
  1611         1.4 GB       P4            132 sec
  1611         1.4 GB       IXP1200       129 sec
  1611         1.4 GB       IXP1200 DFA   90 sec


Conclusions

● NPUs are useful in other application domains
● Newer hardware is expected to perform much better
● “Throughput processors”
● Adapting our current approach to use BLAST tricks/heuristics


Network processors

● geared for high throughput
● used exclusively in network systems
● example: intrusion detection
● similar to looking for genes in genomes
● differences

[Figure: Radisys ixp1200 board]


Application domain: “Genomics”

● example: search genome for occurrence of “patterns”
● similar problems as IDS, poor performance on GPP
● cannot exploit parallelism
  – throughput-driven
  – how about FPGAs?
  – how about clusters?

NPU
  – easier to program than FPGAs
  – cheaper than cluster computing
  – “on the desktop”: IP never leaves the room
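
Aho-Corasick: a sketch of the scoring phase

The Aho-Corasick slides describe matching all query windows at once while the genome is shifted in one nucleotide at a time. The Python sketch below is a minimal host-side illustration of that idea, not the IXPBlast microengine code. The function names (windows, build_automaton, step, to_dfa, search) are hypothetical. The slides mention a “trie” and a “DFA” variant without detail; the sketch assumes the “DFA” is the fully expanded transition table, which trades memory for one lookup per nucleotide.

```python
from collections import deque


def windows(query, w):
    """Distinct length-w substrings of the query, in order of first occurrence."""
    seen = []
    for i in range(len(query) - w + 1):
        win = query[i:i + w]
        if win not in seen:
            seen.append(win)
    return seen


def build_automaton(patterns):
    """Build the goto graph, failure function and output sets.

    States are numbered in creation order, starting from state 0.
    """
    goto = [{}]   # goto[s][ch] -> next state (trie edges)
    out = [[]]    # out[s] -> patterns recognised on entering state s
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(p)

    fail = [0] * len(goto)
    queue = deque(goto[0].values())   # depth-1 states fail to state 0
    while queue:
        r = queue.popleft()
        for ch, s in goto[r].items():
            queue.append(s)
            f = fail[r]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[s] = goto[f].get(ch, 0)
            out[s] = out[s] + out[fail[s]]
    return goto, fail, out


def step(goto, fail, s, ch):
    """One transition: follow failure links until ch can be consumed."""
    while s and ch not in goto[s]:
        s = fail[s]
    return goto[s].get(ch, 0)


def to_dfa(goto, fail, alphabet="acgt"):
    """Expand to a full table (assumed meaning of the 'DFA' variant)."""
    return [{ch: step(goto, fail, s, ch) for ch in alphabet}
            for s in range(len(goto))]


def search(text, automaton):
    """Return (start position, window) for every window occurrence in text."""
    goto, fail, out = automaton
    s = 0
    hits = []
    for pos, ch in enumerate(text):
        s = step(goto, fail, s, ch)
        for p in out[s]:
            hits.append((pos - len(p) + 1, p))
    return hits


if __name__ == "__main__":
    automaton = build_automaton(windows("acgccga", 3))
    print(search("tacgcga", automaton))
```

For the query acgccga with window size 3, the states come out in the same order as on the slides, so the computed failure function can be compared directly with the f(s) table. Scanning the example input tacgcga reports acg at position 1, cgc at position 2 and cga at position 4 (0-based).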