The encoding of adjectives in the Dutch semantic database CORNETTO

Isa Maks (1), Piek Vossen (1, 2), Roxane Segers (1), Hennie van der Vliet (1)
1: Faculty of Arts, Vrije Universiteit Amsterdam
2: Irion Technologies, Delft

LREC, Marrakech 28-29-30 May 2008


Presentation Overview

- The project Cornetto: overview
- The semantic database: Cornetto
- The encoding of general language adjectives in Cornetto: problems and solutions
- A possible strategy for disambiguating adjective senses with the help of the Cornetto database and the ontology


Project Cornetto

Financed by
- NTU (Dutch Language Union)
- STEVIN: Dutch-Flemish Research Programme for Dutch Language and Speech Technology (2004-2011)

Consortium partners
- VUA (Vrije Universiteit Amsterdam, General Linguistics Department)
- UvA (University of Amsterdam, Informatics Institute)
- K.U. Leuven (Katholieke Universiteit Leuven, Department of Computer Science)
- Irion Technologies BV, Delft

Results
- the Cornetto Database: a semantic database for Dutch
- Lexical Acquisition Toolkit
- Domain Acquisition Toolkit


COmbinatorial Relational NEtwork voor Taal Toepassingen (Combinatorial Relational Network for Language Applications)

Resources combined
- Referentie Bestand Nederlands
- Dutch Wordnet
- English Wordnet (PWN)
- SUMO (KIF)

Align/Merge
1. Automatic Alignment
2. Manual Correction

Cornetto Entry
- Lexical Unit
- Synset
- PWN-pointer
- SUMO term

Manual Correction: 40,000 entries, the generic and central part of the language

[Figure; visible relation labels: antonym, near-syn]


Combining existing resources

- Automatic Alignment of LUs and Synsets
- Manual Correction of most frequent entries
  - Adding missing senses
  - Correcting conflicts
  - Editing synsets
  - Linking to SUMO, PWN


Adjectives: problems

- Adjectives in general: what is the real meaning of an adjective?
  a warm coat, warm water
- DWN adjective synsets not corrected manually
  - large fuzzy synsets
  - bad alignment with English Wordnet
  - bad alignment with SUMO
- SUMO not tested for adjective concepts
  - the alignment between SUMO and the PWN adjectives is corrected manually only for a few cases
  -> the ontology is not complete


Adjectives: solution

- Using an existing classification for adjectives (Hundsnurscher & Splett, 1982)
  - coverage of the 500 most frequent adjectives
  - descriptive framework for all adjectives in German
  - lexicographic approach
  - used (and tested) in GermaNet
- Merging with SUMO
  - coverage of all general language concepts
  - rich hierarchy
  - axioms
  - formal definitions


Merging H&S and SUMO

H&S                   SUMO
perception-related    PerceptionalAttribute
material-related      PhysicalAttribute
body-related          ++
spirit-related        ++
character/behaviour   TraitAttribute
mood-related          EmotionalState
relational            RelationalAttribute
evaluative            NormativeAttribute
..
temporality-related   ++
social-related        RelationalAttribute

material-related: TextureAttribute, GravityAttribute, SaturationAttribute, ConsistencyAttribute, PurityAttribute


SUMO's Attribute hierarchy

[Figure: SUMO's Attribute hierarchy]


SubjectiveAssessmentAttribute

SUMO: most PWN adjectives -> SubjectiveAssessmentAttribute
- he is a good cook, this is a complete failure: OK
- a friendly woman, a heavy bag: ?

Cornetto
- friendly -> TraitAttribute
- heavy -> GravityAttribute
- good -> PositiveEvaluationAttribute (subsumed by SubjectiveAssessmentA)

Ex.:
- Sense 1: aardig (kind), een aardige man (a kind man) -> TraitAttribute
- Sense 2: aardig (rather good), een aardig boek (a nice book) -> PositiveEvaluationAttribute


'Word Sense Disambiguation'

Can we make use of the ontology to disambiguate between emotion-related and temperature-related senses (of one word)?
- kil (1) (chilly, cold): een kille zomeravond (a chilly summer evening) -> TemperatureAttribute
- kil (2) (chilly, unfriendly): een kille moeder (a cold mother) -> TraitAttribute
- warm (1) (warm, hot): het water is nog warm (the water is still warm) -> TemperatureAttribute
- warm (2) (cordial, warm): een warme begroeting (a warm greeting) -> TraitAttribute


TraitAttribute

Adjective + noun                    SUMO term attributed to the noun

chilly, unfriendly
kille blik (look)                   FacialExpression
kille begroeting (greeting)         Greeting
kille reactie (reaction)            Expressing
kille vijandigheid (hostility)      ViolentContest
kille berekening (calculation)      Expressing
kille woede (anger)                 EmotionalState
kille passie (passion)              EmotionalState
kille sfeer (atmosphere)            PsychologicalAttribute

warm, cordial
warme ontvangst (reception)         Greeting
warme steun (support)               Cooperating
warm applaus (applause)             Expressing
warm contact (contact)              Communication
uit een warm hart (heart)           EmotionalState
warme gevoelens (feelings)          PsychologicalAttribute


SocialInteraction, PsychologicalAttribute

- SocialInteraction: Communication, expressing, greeting, gesture, facialexpression, Cooperation, Contest, ViolentContest
- PsychologicalAttribute: EmotionalState, CognitiveAttribute

SocialInteraction (Noun) + TraitAttribute (ADJ)
PsychologicalAttribute (Noun) + TraitAttribute (ADJ)


TemperatureAttribute

Adjective + noun                    SUMO term

chilly, cold
kille tocht (draught)               GasMotion
kille wind (wind)                   Wind
kil marmer (marble)                 Mineral
kille zomerdag (summer day)         Day
kille lucht (air)                   Air

warm, hot
warme zomer (summer)                SeasonOfYear
warm water (water)                  Water
warm koffie (coffee)                Food
warm eten (food)                    Food
warme jas (coat)                    Clothing
warme junimaand (month)             Month
warm sopje (soapsuds)               Water


Substance, TimePeriod, GasMotion, ..
- Substance: PureSubstance, CompoundSubstance, Water, Food, Beverage, Coffee, Mineral
- GasMotion: Wind
- TimePeriod: Day, SeasonOfYear, Month

Substance (Noun) + TemperatureAttribute (ADJ)
TimePeriod (Noun) + TemperatureAttribute (ADJ)
GasMotion (Noun) + TemperatureAttribute (ADJ)


So..

If an adjective has 2 senses, 1 related to emotion and the other to temperature, then the SUMO class of the noun indicates the sense:

SocialInteraction (Noun) + TraitAttribute (ADJ)
PsychologicalAttribute (Noun) + TraitAttribute (ADJ)
Substance (Noun) + TemperatureAttribute (ADJ)
TimePeriod (Noun) + TemperatureAttribute (ADJ)
GasMotion (Noun) + TemperatureAttribute (ADJ)


More Information

- Cornetto project
- Cornetto Database: licensed from TST-centrale, Nederlandse Taalunie (by September 2008)
- SUMO
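

Sketch of the selection rule

The noun-class patterns from the "So.." slide can be read as a lookup: find the SUMO class of the noun, climb the hierarchy until one of the five listed classes is reached, and keep the adjective sense whose attribute matches. The Python below is only a minimal sketch of that reading, not the Cornetto implementation. The `PARENT` links are an assumed, hand-written fragment of the hierarchy restricted to terms from the slides, and the function names (`ancestors`, `select_attribute`, `disambiguate`) are hypothetical.

```python
# Assumed fragment of the SUMO hierarchy (child -> parent), limited to
# terms shown on the slides; the exact parent links are an assumption.
PARENT = {
    "Greeting": "Expressing",
    "Expressing": "Communication",
    "FacialExpression": "Gesture",
    "Gesture": "Communication",
    "Communication": "SocialInteraction",
    "Cooperating": "SocialInteraction",
    "Contest": "SocialInteraction",
    "ViolentContest": "Contest",
    "EmotionalState": "PsychologicalAttribute",
    "Water": "Substance",
    "Food": "Substance",
    "Mineral": "Substance",
    "Air": "Substance",
    "Wind": "GasMotion",
    "Day": "TimePeriod",
    "SeasonOfYear": "TimePeriod",
    "Month": "TimePeriod",
}

# The five patterns from the slide: noun class -> adjective attribute.
NOUN_CLASS_TO_ADJ_ATTRIBUTE = {
    "SocialInteraction": "TraitAttribute",
    "PsychologicalAttribute": "TraitAttribute",
    "Substance": "TemperatureAttribute",
    "TimePeriod": "TemperatureAttribute",
    "GasMotion": "TemperatureAttribute",
}


def ancestors(term):
    """Yield the term itself, then each ancestor reachable via PARENT."""
    seen = set()
    while term is not None and term not in seen:
        seen.add(term)
        yield term
        term = PARENT.get(term)


def select_attribute(noun_sumo_term):
    """Return the adjective attribute predicted by the noun's class, or None."""
    for cls in ancestors(noun_sumo_term):
        if cls in NOUN_CLASS_TO_ADJ_ATTRIBUTE:
            return NOUN_CLASS_TO_ADJ_ATTRIBUTE[cls]
    return None


def disambiguate(senses, noun_sumo_term):
    """Keep the senses (sense id -> SUMO attribute) that match the prediction."""
    wanted = select_attribute(noun_sumo_term)
    return [sense for sense, attr in senses.items() if attr == wanted]


KIL = {"kil (1)": "TemperatureAttribute", "kil (2)": "TraitAttribute"}
WARM = {"warm (1)": "TemperatureAttribute", "warm (2)": "TraitAttribute"}

print(disambiguate(KIL, "Greeting"))   # kille begroeting -> ['kil (2)']
print(disambiguate(KIL, "Wind"))       # kille wind       -> ['kil (1)']
print(disambiguate(WARM, "Clothing"))  # warme jas        -> []
```

The last call returns an empty list because Clothing falls under none of the five noun classes in this fragment, so the sketch makes no choice for "warme jas".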