Institute for Information Law
Text & Data Mining: Barriers, Paths and Passable Roads
L. Guibault, Designing and Shaping Open Science, 05.04.2016
Instituut voor Informatierecht - IViR

Emerging solution(s)
- Machine reading: process textual sources, organise and classify in various dimensions, extract main (indexical) information items,
- … and “understanding”: identify and extract entities and relations between entities, facilitate the transformation of unstructured textual sources into structured data
- … and predicting: enable the multidimensional analysis of structured data to extract meaningful insights and improve the ability to predict

OpenMinted’s focus
Establish an open and sustainable Text and Data Mining (TDM) platform and infrastructure where researchers can discover, collaboratively create, share and re-use knowledge from a wide range of text-based scientific and scholarly related sources.
RDA666 BoF, Tokyo, March 3, 2016

Key aspects
- service oriented: discovery, re-use of content and tools
- build on existing TDM tools: no focus on new algorithms
- infrastructure: focus on interoperability
- community driven: user-centric requirements
- open science: openness at all levels

Main Routes

WORKING GROUP: Intellectual Property Rights & Licensing
Identify copyright and related rights (e.g. sui generis database right), restrictions and exceptions to the use and reuse of sources (both textual sources and text mining services) in TDM activities.
Interoperability Outcomes
- Inventory of existing licences that grant access and reuse rights for content
- Policy specification for permissions empowered by copyright law and for reuse based on licensing contracts
- Rights representation in appropriate standardised rights expression languages (ODRL, CCREL) for open content consumption and reuse
- Translation of legal and policy aspects into authentication & authorization specifications for user-to-service and service-to-service interactions

- Provides a critical, up-to-date assessment of legal regulations and policies impacting TDM in the EU, placed in the international research and innovation context.
- Adopts a bottom-up approach by initiating dialogue between all relevant stakeholders to help identify barriers and common solutions, and to increase awareness of TDM practices and their potential.
- Develops novel policy frameworks and interdisciplinary, case-driven practitioner guidelines facilitating the spread of TDM activities.

THE LEGAL FRAMEWORK
- Directive 2001/29/EC on Copyright in the Information Society
- Directive 96/9/EC on the Legal Protection of Databases
- [Directive 97/66/EC on data protection and ePrivacy Directive (2002/58/EC)]

Copyright Protection
- Protected content mined:
  - Text, image, sound, video, collections of data
- Scope of protection:
  - Right of reproduction
  - Right of communication to the public, incl. making available

Copyright Exceptions
- Transient and incidental reproduction (mandatory)
- Private use
- Reproductions by libraries, archives, education establishments etc.
- Educational use and research exception
- Press exception
- Quotation

Sui generis Database right
- Subject matter of protection
  ‘database which shows that there has been qualitatively and/or quantitatively a substantial investment in either the obtaining, verification or presentation of the contents to prevent extraction and/or re-utilization of the whole or of a substantial part, evaluated qualitatively and/or quantitatively, of the contents of that database’.

Sui generis Database right
- Right of extraction
  The permanent or temporary transfer of all or a substantial part of the contents of a database to another medium by any means or in any form
- Right of re-utilization
  Any form of making available to the public all or a substantial part of the contents of a database by the distribution of copies, by renting, by on-line or other forms of transmission.
- Repeated and systematic extraction and/or re-utilization of insubstantial parts of the contents

Sui generis Database right
- Case C-202/12, Decision of 19 December 2013 (Innoweb v. Wegener)
  A dedicated meta search engine infringes the database right, since it constitutes “re-utilising” of the “whole or a substantial part” of a database.
Text and Data Mining – 2013

Sui generis Database right
- Educational use and research exception (optional)
  (b) in the case of extraction for the purposes of illustration for teaching or scientific research, as long as the source is indicated and to the extent justified by the non-commercial purpose to be achieved;

THE PRACTICE
- Varying attitudes of content providers towards allowing TDM, ranging from
  - A strict prohibition
  - Only allowed through an API
  - Implicitly allowed
  - Expressly allowed (Open Access)

Most pressing barriers to TDM
- Restrictiveness of legal provisions
- Cross-border differences in the law
- Lack of clarity in the applicable rules
- Problems of interoperability

Conclusion
- Imperative to make room for TDM in research
- EU competitiveness is at stake
- Good timing to review IP laws

Thank you for your attention!
For more information: [email protected]
This presentation is licensed under a Creative Commons Attribution 4.0 Licence