The Project Concept
Language resources and technology refer to all language-based knowledge sources (written or spoken) and the tools related to them.
Examples include:
- Texts of all sorts, such as digitized medieval sources, websites, newspapers, digitized books, etc.
- Multimedia recordings (audio/video) and time series recorded during communication (data glove, eye tracking, etc.)
- Various types of manually or automatically created annotations on texts, media streams, etc.
- Tools such as aligners, speech recognizers, tokenizers, part-of-speech taggers, parsers, manual annotators, viewers, etc.
- Various types of knowledge sources that encapsulate knowledge about resources and languages, such as metadata descriptions, GIS, lexica, concept registries, ontologies, etc.
The CLARIN project is a large-scale pan-European collaborative effort to create and coordinate language resources and technology and to make them available and readily usable. CLARIN offers scholars tools for computer-aided language processing, addressing one or more of the multiple roles language plays (i.e. carrier of cultural content and knowledge, instrument of communication, component of identity and object of study) in the Humanities and Social Sciences.
CLARIN proposes to create a European Resources Infrastructure based on an open European Federation of strong service centres and repositories that jointly provide the whole European Humanities (and Social Sciences) community with:
- knowledge about the existence of language resources,
- coordinated creation of, archiving of, and access to such resources,
- access to services and tools that would allow scholars to operate on such resources,
- bundling of and access to expertise related to specific language processing problems.
CLARIN will be built on the existing national infrastructures and on all the knowledge gathered from European-funded projects in our domain. At the European level, an efficient umbrella organization has to be set up that will be responsible for unification and organization. CLARIN will also link up closely with appropriate infrastructures in other humanities disciplines. CLARIN sees itself as a research infrastructure that will offer specialized resources, tools/services and knowledge, and that will easily join with other complementary initiatives in the humanities area.
