Using Comparable Corpora For Under Resourced Areas Of Machine Translation

Using Comparable Corpora for Under-Resourced Areas of Machine Translation

Author : Inguna Skadiņa,Robert Gaizauskas,Bogdan Babych,Nikola Ljubešić,Dan Tufiş,Andrejs Vasiļjevs
Publisher : Springer
Page : 323 pages
File Size : 50,8 Mb
Release : 2019-02-06
Category : Computers
ISBN : 9783319990040

This book provides an overview of how comparable corpora can be used to overcome the lack of parallel resources when building machine translation systems for under-resourced languages and domains. It presents a wealth of methods and open tools for building comparable corpora from the Web, evaluating comparability and extracting parallel data that can be used for the machine translation task. It is divided into several sections, each covering a specific task such as building, processing, and using comparable corpora, focusing particularly on under-resourced language pairs and domains. The book is intended for anyone interested in data-driven machine translation for under-resourced languages and domains, especially for developers of machine translation systems, computational linguists and language workers. It offers a valuable resource for specialists and students in natural language processing, machine translation, corpus linguistics and computer-assisted translation, and promotes the broader use of comparable corpora in natural language processing and computational linguistics.

Building and Using Comparable Corpora

Author : Serge Sharoff,Reinhard Rapp,Pierre Zweigenbaum,Pascale Fung
Publisher : Springer Science & Business Media
Page : 333 pages
File Size : 40,7 Mb
Release : 2013-12-13
Category : Computers
ISBN : 9783642201288

The 1990s saw a paradigm change in the use of corpus-driven methods in NLP. In the field of multilingual NLP (such as machine translation and terminology mining) this implied the use of parallel corpora. However, parallel resources are relatively scarce: many more texts are produced daily by native speakers of any given language than translated. This situation resulted in a natural drive towards the use of comparable corpora, i.e. non-parallel texts in the same domain or genre. Nevertheless, this research direction has not produced a single authoritative source suitable for researchers and students coming to the field. The proposed volume provides a reference source, identifying the state of the art in the field as well as future trends. The book is intended for specialists and students in natural language processing, machine translation and computer-assisted translation.

Building and Using Comparable Corpora for Multilingual Natural Language Processing

Author : Serge Sharoff,Reinhard Rapp,Pierre Zweigenbaum
Publisher : Springer Nature
Page : 138 pages
File Size : 44,7 Mb
Release : 2023-08-23
Category : Computers
ISBN : 9783031313844

This book provides a comprehensive overview of methods to build comparable corpora and of their applications, including machine translation, cross-lingual transfer, and various kinds of multilingual natural language processing. The authors begin with a brief history of the topic, followed by a comparison to parallel resources and an explanation of why comparable corpora have become more widely used. In particular, comparable corpora provide the basis for the multilingual capabilities of pre-trained models, such as BERT or GPT. The book then focuses on building comparable corpora, aligning their sentences to create a database of suitable translations, and using these sentence translations to produce dictionaries and term banks. Finally, it explains how comparable corpora can be used to build machine translation engines and to develop a wide variety of multilingual applications.

Human Language Technologies

Author : Inguna Skadina,Andrejs Vasiljevs
Publisher : IOS Press
Page : 264 pages
File Size : 40,8 Mb
Release : 2010
Category : Computers
ISBN : 9781607506409

This book contains papers from the Fourth International Conference on Human Language Technologies - the Baltic Perspective (Baltic HLT 2010), held in Riga in October 2010. This conference is the latest in a series which provides a forum for sharing recent advances in human language processing, and promotes cooperation between the computer science and linguistics communities of the Baltic countries and the rest of the world. Bringing together scientists, developers, providers and users, the conference is an opportunity to exchange information, discuss problems, find new synergies, and promote i.

Parallel Corpora for Contrastive and Translation Studies

Author : Irene Doval,M. Teresa Sánchez Nieto
Publisher : John Benjamins Publishing Company
Page : 313 pages
File Size : 47,5 Mb
Release : 2019-03-20
Category : Language Arts & Disciplines
ISBN : 9789027262844

This volume assesses the state of the art of parallel corpus research as a whole, reporting on advances both in recent developments of parallel corpora (with some particular references to comparable corpora as well) and in ways of exploiting them for a variety of purposes. The first part of the book is devoted to new roles that parallel corpora can and should assume in translation studies and in contrastive linguistics, to the usefulness and usability of parallel corpora, and to advances in parallel corpus alignment, annotation and retrieval. There follows an up-to-date presentation of a number of parallel corpus projects currently being carried out in Europe, some of them multimodal, with certain chapters illustrating case studies developed on the basis of the corpora at hand. In most of these chapters, attention is paid to specific technical issues of corpus building. The third part of the book reflects on specific applications and on the creation of bilingual resources from parallel corpora. This volume will be welcomed by scholars, postgraduate and PhD students in the fields of contrastive linguistics, translation studies, lexicography, language teaching and learning, machine translation, and natural language processing.

Corpus Use in Cross-linguistic Research

Author : Marlén Izquierdo,Zuriñe Sanz-Villar
Publisher : John Benjamins Publishing Company
Page : 245 pages
File Size : 55,8 Mb
Release : 2023-11-02
Category : Language Arts & Disciplines
ISBN : 9789027249319

Cross-linguistic research is a fruitful field of language inquiry that has benefited enormously from the use of corpora. As sources of linguistic data of various kinds and as tools for language processing, corpora have shaped the development of cross-linguistic research, enabling both language description and practical applications. This volume contains twelve studies that emphasize the usefulness and usability of parallel corpora in accurately exploring the structure and use of seven under-researched languages and language varieties. The first part emphasizes the role of corpus-based descriptive analyses at the lexicogrammatical and discursive levels, as a first step on the way towards concrete applications like translation or language teaching. The second part focuses on the role of parallel-corpus-based language processing techniques and applications that facilitate professional communication. This book will be of interest to scholars in contrastive linguistics, translation studies, discourse analysis, language teaching, and natural language processing.

Corpora in Translation and Contrastive Research in the Digital Age

Author : Julia Lavid-López,Carmen Maíz-Arévalo,Juan Rafael Zamorano-Mansilla
Publisher : John Benjamins Publishing Company
Page : 353 pages
File Size : 47,9 Mb
Release : 2021-12-15
Category : Language Arts & Disciplines
ISBN : 9789027259684

Corpus-based contrastive and translation research are areas that keep evolving in the digital age, as the range of new corpus resources and tools expands, opening up to different approaches and application contexts. The current book contains a selection of papers which focus on corpora and translation research in the digital age, outlining some recent advances and explorations. After an introductory chapter which outlines language technologies applied to translation and interpreting with a view to identifying challenges and research opportunities, the first part of the book is devoted to current advances in the creation of new parallel corpora for under-researched areas, the development of tools to manage parallel corpora or to serve as an alternative to them, and new methodologies to improve existing translation memory systems. The contributions in the second part of the book address a number of cutting-edge linguistic issues in the area of contrastive discourse studies and translation analysis on the basis of comparable and parallel corpora in several languages such as English, German, Swedish, French, Italian, Spanish, Portuguese and Turkish, thus showcasing the linguistic diversity covered in these recent investigations. Given the multiplicity of topics, methodologies and languages studied in the different chapters, the book will be of interest to a wide audience working in the fields of translation studies, contrastive linguistics and the automatic processing of language.

Human Language Technologies - The Baltic Perspective

Author : A. Utka,G. Grigonytė,J. Kapočiūtė-Dzikienė
Publisher : IOS Press
Page : 276 pages
File Size : 46,8 Mb
Release : 2014-09-12
Category : Computers
ISBN : 9781614994428

In the modern information society, there is an ever-growing need for improved natural language processing and human language technologies. This book presents the proceedings of the Sixth International Conference 'Human Language Technologies – The Baltic Perspective' (Baltic HLT 2014) held in Kaunas, Lithuania in September 2014. The Baltic HLT conferences provide an important forum for gathering and consolidating ideas, and are an opportunity for the Baltic countries to present important research results to an international audience. The book contains 39 long and short papers presented at the conference. These cover a wide range of topics: syntactic analysis, sentiment analysis, co-reference resolution, authorship attribution, information extraction, document clustering, machine translation, corpus and parallel corpus compiling, speech recognition, synthesis and others. The book is divided into three main sections: speech technology, methods in computational linguistics, and preparation of language resources. This book will be of interest to anyone whose work involves the use and application of computational linguistics and related disciplines.

Multilingual Processing in Eastern and Southern EU Languages

Author : Cristina Vertan
Publisher : Cambridge Scholars Publishing
Page : 410 pages
File Size : 46,6 Mb
Release : 2012-04-25
Category : Language Arts & Disciplines
ISBN : 9781443839624

This volume draws attention to many specific challenges of multilingual processing within the European Union, especially after the recent successive enlargements. Most of the languages considered herein are not only ‘less resourced’ in terms of processing tools and training data, but also have features which are different from the well-known international language pairs. The 16 contributions address specific problems and solutions for languages from south-eastern and central Europe in the context of multilingual communication, translation and information retrieval.

Machine Translation with Minimal Reliance on Parallel Resources

Author : George Tambouratzis,Marina Vassiliou,Sokratis Sofianopoulos
Publisher : Springer
Page : 88 pages
File Size : 45,6 Mb
Release : 2017-08-09
Category : Computers
ISBN : 9783319631073

This book provides a unified view of a new methodology for Machine Translation (MT). This methodology extracts information from widely available resources (extensive monolingual corpora) while only assuming the existence of a very limited parallel corpus, and thus has a starting point that differs from that of Statistical Machine Translation (SMT). In this book, a detailed presentation of the methodology principles and system architecture is followed by a series of experiments, where the proposed system is compared to other MT systems using a set of established metrics including BLEU, NIST, Meteor and TER. Additionally, free-to-use code is available that allows the creation of new MT systems. The volume is addressed to both language professionals and researchers. Prerequisites for readers are very limited and include a basic understanding of machine translation as well as of the basic tools of natural language processing.

Chinese Lexical Semantics

Author : Donghong Ji,Guozheng Xiao
Publisher : Springer
Page : 838 pages
File Size : 44,6 Mb
Release : 2013-02-15
Category : Computers
ISBN : 9783642363375

This book constitutes carefully reviewed and revised selected papers from the 13th Chinese Lexical Semantics Workshop, CLSW 2012, held in Wuhan, China, in July 2012. The 67 full papers and 17 short papers presented in this volume were carefully reviewed and selected from 169 submissions. They are organized in topical sections named: applications on natural language processing; corpus linguistics; lexical computation; lexical resources; lexical semantics; new methods for lexical semantics; and other topics.

Computational Linguistics and Intelligent Text Processing

Author : Alexander Gelbukh
Publisher : Springer
Page : 598 pages
File Size : 46,9 Mb
Release : 2013-03-12
Category : Computers
ISBN : 9783642372568

This two-volume set, consisting of LNCS 7816 and LNCS 7817, constitutes the thoroughly refereed proceedings of the 14th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2013, held on Samos, Greece, in March 2013. The total of 91 contributions presented was carefully reviewed and selected for inclusion in the proceedings. The papers are organized in topical sections named: general techniques; lexical resources; morphology and tokenization; syntax and named entity recognition; word sense disambiguation and coreference resolution; semantics and discourse; sentiment, polarity, subjectivity, and opinion; machine translation and multilingualism; text mining, information extraction, and information retrieval; text summarization; stylometry and text simplification; and applications.

Human-Machine Interaction in Translation

Author : Bernadette Sharp,Michael Zock,Michael Carl,Arnt Lykke Jakobsen
Publisher : Samfundslitteratur
Page : 274 pages
File Size : 53,6 Mb
Release : 2011
Category : Human-computer interaction
ISBN : 8759316152

Includes 19 papers which were selected for presentation at the workshop and the text of invited keynote lectures. The workshop provided an attractive interdisciplinary forum for fostering interactions among researchers and practitioners in Natural Language Processing (NLP) working within the paradigm of Cognitive Science (CS).

Corpus Analysis for Language Studies at the University Level

Author : Giedrė Valūnaitė Oleškevičienė,Liudmila Mockienė,Nadežda Stojković
Publisher : Cambridge Scholars Publishing
Page : 176 pages
File Size : 44,6 Mb
Release : 2021-02-08
Category : Language Arts & Disciplines
ISBN : 9781527565944

This book highlights corpora use in teaching foreign languages in university education. It will appeal to both academics and practitioners interested in the process of teaching foreign languages at more advanced levels while applying corpus analysis and building tools for corpus annotation. It provides a detailed case study of analyzing the terminology of constitutional law in both English and Lithuanian as an example to illustrate the possibility of integrating corpus analysis tools into the process of teaching foreign languages in university education. The book reveals that initial linguistic knowledge is essential when teaching and learning foreign languages at more advanced levels while applying corpus annotation. In addition, it shows that, even though the use of new corpus software is perceived as a positive, there are still certain issues to be solved in this regard, such as the constant renewal of public computers in universities and the technical and methodological support for teachers while using corpora tools.

Machine Learning in Translation Corpora Processing

Author : Krzysztof Wolk
Publisher : CRC Press
Page : 264 pages
File Size : 51,8 Mb
Release : 2019-02-25
Category : Computers
ISBN : 9780429590771

This book reviews ways to improve statistical machine speech translation between Polish and English. Research has been conducted mostly on dictionary-based, rule-based, and syntax-based machine translation techniques. Most popular methodologies and tools are not well suited to the Polish language and therefore require adaptation, and language resources are lacking in parallel and monolingual data. The main objective of this volume is to develop an automatic and robust Polish-to-English translation system to meet specific translation requirements and to develop bilingual textual resources by mining comparable corpora.