Institute for Information Law
Text & Data Mining: Barriers, Paths and Passable Roads
L. Guibault,
Designing and Shaping Open Science
05.04.2016
Instituut voor Informatierecht - IViR
Emerging solution(s)
Machine reading: process textual sources, organise and classify them in various dimensions, extract the main (indexical) information items
… and “understanding”: identify and extract entities and relations between entities, facilitate the transformation of unstructured textual sources into structured data
… and predicting: enable the multidimensional analysis of structured data to extract meaningful insights and improve the ability to predict
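As a rough illustration of the steps above, the following Python sketch turns free text into structured (subject, relation, object) triples and then aggregates them. The pattern, the relation verbs and the sample sentence are hypothetical and chosen only for this example; real TDM pipelines rely on trained language-processing tools rather than a single regular expression.

```python
import re
from collections import Counter

# Hypothetical toy pattern: "<Name> inhibits|activates <Name>".
# It stands in for the entity and relation extraction step.
RELATION = re.compile(
    r"\b([A-Z][A-Za-z0-9]+) (inhibits|activates) ([A-Z][A-Za-z0-9]+)\b"
)


def extract_relations(text):
    """Turn unstructured text into (subject, relation, object) triples."""
    return [match.groups() for match in RELATION.finditer(text)]


def count_relations(triples):
    """Aggregate the structured triples along one dimension: the relation type."""
    return Counter(relation for _, relation, _ in triples)


# Invented sample sentence, not taken from any real corpus.
sample = "GeneA inhibits GeneB. In later work, GeneC activates GeneB."
triples = extract_relations(sample)
summary = count_relations(triples)
```

Note that even this toy version has to read and copy the source text in order to build the triples, which is where the legal questions discussed in the rest of the presentation arise.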
OpenMinTeD’s focus
Establish an open and sustainable Text and Data Mining (TDM) platform and infrastructure where researchers can discover, collaboratively create, share and re-use knowledge from a wide range of text-based scientific and scholarly related sources.
RDA666 BoF, Tokyo, March 3, 2016
Key aspects
service oriented – discovery, re-use of content and tools
build on existing TDM tools - no focus on new algorithms
infrastructure – focus on interoperability
community driven - user-centric requirements
open science - openness at all levels
Main Routes
WORKING GROUP: Intellectual Property Rights & Licensing
Identify copyright and related rights (e.g. sui generis database right), restrictions and exceptions to the use and reuse of sources (both textual sources and text mining services) in TDM activities.
Interoperability Outcomes
Inventory of existing licences that grant access & reuse rights for content
Policy specification for reuse based on copyright-law permissions and on licensing contracts
Rights representation in appropriate standardised rights expression languages (ODRL, ccREL) for open content consumption and reuse
Translation of legal and policy aspects into authentication & authorization specifications for user-to-service and service-to-service interactions
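To make the idea of a machine-readable rights representation more concrete, here is a minimal Python sketch that builds an ODRL-style policy as a JSON-LD dictionary and checks it before a mining step. The helper names `odrl_policy` and `is_permitted`, the example.org identifiers and the choice of actions are assumptions made for illustration only. The action names are assumed to come from the ODRL common vocabulary and should be verified against the specification. This is not the working group's actual specification.

```python
import json


def odrl_policy(policy_uid, target, permitted_actions, prohibited_actions=()):
    """Build a simple ODRL-style policy of type "Set" as a JSON-LD dictionary."""
    policy = {
        "@context": "http://www.w3.org/ns/odrl.jsonld",
        "@type": "Set",
        "uid": policy_uid,
        "permission": [
            {"target": target, "action": action} for action in permitted_actions
        ],
    }
    if prohibited_actions:
        policy["prohibition"] = [
            {"target": target, "action": action} for action in prohibited_actions
        ]
    return policy


def is_permitted(policy, action):
    """Return True if the action is permitted and not also prohibited."""
    permitted = {rule["action"] for rule in policy.get("permission", [])}
    prohibited = {rule["action"] for rule in policy.get("prohibition", [])}
    return action in permitted and action not in prohibited


# Hypothetical identifiers; a real deployment would use the provider's own URIs.
policy = odrl_policy(
    "http://example.org/policy/tdm-1",
    "http://example.org/corpus/articles",
    permitted_actions=["reproduce", "extract"],
    prohibited_actions=["distribute"],
)
print(json.dumps(policy, indent=2))
```

A service could call `is_permitted(policy, "extract")` before mining a source, which is one way the legal and policy aspects might be carried into service-to-service authorization.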
Provides a critical, up-to-date assessment of legal regulations and policies impacting TDM in the EU, placed in the international research and innovation context.
Adopts a bottom-up approach by initiating dialogue between all relevant stakeholders to help identify barriers and common solutions and to increase awareness of TDM practices and their potential.
Develops novel policy frameworks and interdisciplinary, case-driven practitioner guidelines facilitating the spread of TDM activities.
THE LEGAL FRAMEWORK
Directive 2001/29/EC on Copyright in the Information Society
Directive 96/9/EC on the Legal Protection of Databases
[Directive 97/66/EC on data protection and ePrivacy Directive (2002/58/EC)]
Copyright Protection
Protected content mined:
  Text, image, sound, video, collections of data
Scope of protection:
  Right of reproduction
  Right of communication to the public, incl. making available
Copyright Exceptions
Transient and incidental reproduction (mandatory)
Private use
Reproductions by libraries, archives, educational establishments etc.
Educational use and research exception
Press exception
Quotation
Sui generis Database right
Subject matter of protection
  ‘database which shows that there has been qualitatively and/or quantitatively a substantial investment in either the obtaining, verification or presentation of the contents to prevent extraction and/or re-utilization of the whole or of a substantial part, evaluated qualitatively and/or quantitatively, of the contents of that database’.
Sui generis Database right
Right of extraction
  The permanent or temporary transfer of all or a substantial part of the contents of a database to another medium by any means or in any form
Right of re-utilization
  Any form of making available to the public all or a substantial part of the contents of a database by the distribution of copies, by renting, by on-line or other forms of transmission.
Repeated and systematic extraction and/or re-utilization of insubstantial parts of the contents
Sui generis Database right
Case C-202/12, Decision of 19 December 2013 (Innoweb v. Wegener)
  A dedicated meta search engine infringes the database right, since it constitutes “re-utilising” of the “whole or a substantial part” of a database.
Text and Data Mining – 2013
Sui generis Database right
Educational use and research exception (optional)
  (b) in the case of extraction for the purposes of illustration for teaching or scientific research, as long as the source is indicated and to the extent justified by the non-commercial purpose to be achieved;
THE PRACTICE
Varying attitudes of content providers towards allowing TDM, ranging from:
  A strict prohibition
  Only allowed through an API
  Implicitly allowed
  Expressly allowed (Open Access)
Most pressing barriers to TDM
Restrictiveness of legal provisions
Cross-border differences in the law
Lack of clarity in the applicable rules
Problems of interoperability
Conclusion
Imperative to make room for TDM in research
EU competitiveness is at stake
Good timing to review IP laws
Thank you for your attention!
For more information
[email protected]
This presentation is licensed under a Creative Commons Attribution 4.0 Licence