Presentation on validation at COCOSDA

SLR Validation: procedures and prospects
Eric Sanders
Henk van den Heuvel
• WHAT is validation?
• WHY validate databases?
• WHEN validate databases?
• WHO validates databases?
• HOW do we validate databases?
• WHERE do we go from here?
WHAT is validation?
1. checking an SLR against a fixed set of requirements
2. putting a quality stamp on an SLR as a result of this check
3. the evaluation of an SLR in a field test
WHY validate databases?
• increasing number of databases
• high costs and price of databases
• fair trade (SpeechDat)
WHO validates databases?
• for SpeechDat?
• for ELRA?
• in general?
WHO validates databases and WHEN?
Validation scheduling    Validator: internal    Validator: external
during production        1                      3
after production         2                      4
HOW do we validate databases?
• procedure
• check points
• rank order
• quality values
Validation procedure
SLR -> 1. Prevalidation (10 speakers) -> 2. Validation -> OK?
• OK? no: 3. Revalidation, then OK? again
• OK? yes: Ready for distribution
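The procedure above can be sketched as a small loop. This is only an illustration under stated assumptions, not the validation centre's actual tooling: the function name `run_procedure`, the three callables, and the `max_revalidations` cap are all hypothetical, and the slide does not say how many revalidation rounds are allowed.

```python
from typing import Callable, List


def run_procedure(
    prevalidate: Callable[[], None],
    validate: Callable[[], bool],
    revalidate: Callable[[], bool],
    max_revalidations: int = 3,  # hypothetical cap, not stated on the slide
) -> List[str]:
    """Walk through the three steps and return a log of what happened."""
    log = ["prevalidation"]
    prevalidate()  # step 1: early check on a small sample (10 speakers)

    log.append("validation")
    ok = validate()  # step 2: check of the complete SLR

    rounds = 0
    while not ok and rounds < max_revalidations:
        log.append("revalidation")
        ok = revalidate()  # step 3: check again after corrections
        rounds += 1

    # "OK? yes" leads to distribution; otherwise the SLR is held back.
    log.append("ready for distribution" if ok else "not released")
    return log


if __name__ == "__main__":
    # Hypothetical run: validation fails once, revalidation succeeds.
    print(run_procedure(lambda: None, lambda: False, lambda: True))
```

The callables stand in for whatever checks a validator actually performs; only the order of the steps and the OK? decision come from the slide.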
Check points
• documentation
• database format
• design
• speech files
• label files
• phonemic lexicon
• speaker & environment distributions
• orthographic transcriptions
Rank order
1. indispensable: speech signals, orthographic transcription, documentation
2. some flaws allowed: design, speaker and environment distributions
3. not very important: label files, database format, lexicon
Quality values
• OK
• Not OK, but acceptable
• Not acceptable
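One way to picture how check points, rank order, and quality values fit together is a small report structure. The sketch below is an assumption-laden illustration: the names `Quality`, `RANK`, and `problems_by_rank` are hypothetical, and mapping "speech signals" and "lexicon" from the rank-order slide onto the "speech files" and "phonemic lexicon" check points is a guess. It only sorts the flagged check points by rank; the slides do not state a rule for the overall verdict, so none is implemented.

```python
from enum import Enum
from typing import Dict, List, Tuple


class Quality(Enum):
    OK = "OK"
    ACCEPTABLE = "Not OK, but acceptable"
    NOT_ACCEPTABLE = "Not acceptable"


# Rank per check point: 1 = indispensable, 2 = some flaws allowed,
# 3 = not very important. The name mapping is an assumption (see lead-in).
RANK: Dict[str, int] = {
    "speech files": 1,
    "orthographic transcriptions": 1,
    "documentation": 1,
    "design": 2,
    "speaker & environment distributions": 2,
    "label files": 3,
    "database format": 3,
    "phonemic lexicon": 3,
}


def problems_by_rank(report: Dict[str, Quality]) -> List[Tuple[int, str, Quality]]:
    """Return every check point that is not OK, most important rank first."""
    issues = [
        (RANK[name], name, quality)
        for name, quality in report.items()
        if quality is not Quality.OK
    ]
    # Sort by rank, then by name so the order is deterministic.
    return sorted(issues, key=lambda item: (item[0], item[1]))


if __name__ == "__main__":
    # Hypothetical validation report for illustration only.
    report = {
        "label files": Quality.NOT_ACCEPTABLE,
        "documentation": Quality.ACCEPTABLE,
        "design": Quality.OK,
    }
    for rank, name, quality in problems_by_rank(report):
        print(rank, name, quality.value)
```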
WHERE do we go from here?
• validate databases from ELRA’s catalogue
• bug report
• SPEECON
• NETWORK-DC
• exchange ideas
References
• H. van den Heuvel et al., SLR Validation: present state of affairs and prospects, LREC 2000.
• H. van den Heuvel, The Art of Validation, to appear in ELRA News, October 2000.
• www.spex.nl/spex