SLR Validation: procedures and prospects
Eric Sanders
Henk van den Heuvel

• WHAT is validation?
• WHY validate databases?
• WHEN validate databases?
• WHO validates databases?
• HOW do we validate databases?
• WHERE do we go from here?

WHAT is validation?
1. checking a SLR against a fixed set of requirements
2. putting a quality stamp on a SLR as a result of this check
3. the evaluation of a SLR in a field test

WHY validate databases?
• increasing number of databases
• high costs and price of databases
• fair trade (SpeechDat)

WHO validates databases?
• for SpeechDat?
• for ELRA?
• in general?

WHO validates databases and WHEN?

                          Validator
Validation scheduling     internal    external
during production         1           3
after production          2           4

HOW do we validate databases?
• procedure
• check points
• rank order
• quality values

Validation procedure
SLR
1. Prevalidation (10 spk)
2. Validation
3. Revalidation
OK?
• no
• yes: Ready for distribution

Check points
• documentation
• database format
• design
• speech files
• label files
• phonemic lexicon
• speaker & environment distributions
• orthographic transcriptions

Rank order
1. indispensable: speech signals, orthographic transcription, documentation
2. some flaws allowed: design, speaker and environment distributions
3. not very important: label files, database format, lexicon

Quality values
• OK
• Not OK, but acceptable
• Not acceptable

WHERE do we go from here?
• validate databases from ELRA’s catalogue
• bug report
• SPEECON
• NETWORK-DC
• exchange ideas

References
• H. vd Heuvel et al., SLR Validation: present state of affairs and prospects, LREC 2000.
• H. vd Heuvel, The Art of Validation, to appear in ELRA News, Oct. 2000.
• www.spex.nl/spex