Infrastructure for Digital Humanities
Jan Odijk
Digital Humanities Session
Utrecht, 2015-06-10

European DH RIs
• European Research Infrastructures (RI) for DH:
  – CLARIN
  – DARIAH
• NL contributions:
  – CLARIN-NL (2009-2015)
  – CLARIAH-SEED (2013-2014)
  – CLARIAH-CORE (2015-2018)

CLARIN Infrastructure
• What to expect from an RI?
  – Find data
  – Find tools and services
  – Apply tools and services seamlessly
  – Store data and tools safely
  – Education, Training, Support

Find Data
• Virtual Language Observatory
  – Faceted browsing and geographical navigation
  – CLARIN-prep
• CLARIN-NL Portal Data overview
  – Faceted browsing in CLARIN-NL results by research discipline, data type, annotations present, language, etc.

Find Tools
• CLARIN Resource and Tool Inventory
  – Faceted browsing
• CLARIN-NL Portal Tool and Services overview
  – Faceted browsing by research discipline, tool type, tool task, language, etc.

Apply Tools
• Big steps made in CLARIN-NL and CLARIAH-SEED
• Tool classes:
  – Search in and through the data
  – (Semi-)automatic annotation and enrichment
  – Analysis of data and search results
  – Visualisation of data and analyses

Store Data
• Network of CLARIN Centres in NL
  – Type B: MPI, Meertens, INL, Huygens, DANS
  – Type D: KB (incl. DBNL), UBU, Beeld & Geluid
• Strengthened in CLARIAH
  – IISG
• Safe
  – CLARIN certified (MPI, MI, INL, HI)
  – Data Seal of Approval (all 5 type B centres)
• Many supporting tools

Education, Training, Support
• Education & Training
  – Educational packages
  – Regular tutorials and workshops
  – Seasonal Schools courses (e.g. LOT 2014, 2015)
  – In (small) part in the regular curriculum
• Support
  – Clear guidelines
  – Introductory Documents: ex1 ex2
  – FAQ pages
  – Via the CLARIN Centres in NL
  – Helpdesk: [email protected]

Thanks for your attention!
http://portal.clarin.nl
http://www.clariah.nl

DO NOT ENTER HERE

UU Data
• DUELME-LMF
  – MWE lexical database (Jan Odijk)
• C-DSD
  – Song metadata (Els Stronks)
• EMIT-X
  – Emblem metadata (Els Stronks)
• D-LUCEA
  – Longitudinal Spoken English Database (Hugo Quené)
• DISCAN
  – Discourse annotated text corpora (Ted Sanders)
• VALID
  – Curated language impairment data (Frank Wijnen)

UU Software
• MIMORE
  – Search in combined dialect databases (Sjef Barbiers)
• DUELME-LMF
  – Search in MWE lexical database (Jan Odijk)
• TDS-Curator
  – Interface to Typological database system (Alexis Dimitriadis)
• TTNWW/Semantics
  – Semantic Role Assigner (Paola Monachesi)
• WAHSP / BILAND
  – Web application for (bilingual) historical sentiment mining (Toine Pieters)

UU Software
• Arthurian Fiction
  – Search interface to Arthurian fiction DB (Bart Besamusca)
• MIGMAP
  – Migration mapping application (Gerrit Bloothooft)
• AVResearcherXL
  – Audiovisual metadata explorer (Jasmijn van Gorp)
• TPC
  – Links from Taalportaal to Corpora (Marjo van Koppen)
• CKCC (Geleerdenbrieven)
  – Interface to 17th century scientists’ letters (Wijnand Mijnhardt)