Similar literature
1.
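Ding Liang (丁亮), Li Ying (李颖), He Yanqing (何彦青). 《情报工程》, 2016, 2(4): 080-088
Statistical machine translation often faces a mismatch between the domain of the training data and that of the text to be translated, which degrades translation performance; domain adaptation has therefore long been a topic of interest to researchers. Using traditional adaptation methods and current machine learning methods as a framework, this paper reviews recent progress in domain adaptation for statistical machine translation, analyzes the strengths and weaknesses of each class of methods, and offers an outlook on future research.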

2.
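Because sentences in patent documents tend to be long, the training corpus for statistical machine translation is split into clauses to obtain bilingual clause sequences, and a strategy combining statistics and rules is then used to generate clause alignments. A bilingual corpus built on simple clauses is used to retrain the statistical machine translation system. This improves, to some extent, the phrase alignment and word alignment of the original bilingual training corpus and allows deeper use of the translation information contained in the parallel corpus. Applied to statistical machine translation of patents and compared experimentally on the NTCIR-9 test set, the approach achieved fairly satisfactory translation results.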

3.
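Bilingual corpora are increasingly important in natural language processing areas such as machine translation, cross-language information retrieval, and the compilation of translation dictionaries. This study uses patent family documents as a source of bilingual corpus material and explores the feasibility of obtaining bilingual corpora from patent families. Taking the acquisition of a Chinese-English bilingual corpus as an example, it proposes an acquisition workflow and studies alignment rules for the mutually translated parts, thereby building a parallel bilingual corpus for the science and technology domain. Finally, it discusses points to note when applying the method and its application prospects.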

4.
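The size of a parallel corpus plays an important role in improving statistical machine translation performance, but building parallel corpora by hand is very costly. To address this problem, this paper proposes a low-cost, high-efficiency method for building parallel corpora: a pivot language serves as a bridge, and existing machine translation technology is combined with active learning to build a large-scale, high-quality parallel corpus for the target language pair. Through a case study that builds a Japanese-Chinese parallel corpus with English as the pivot language, and using mature phrase-based statistical machine translation technology, the paper describes a method for selecting good translations based on automatic translation evaluation, a corpus selection method based on active learning, and the iterative updating and evaluation experiments of the translation system. Experimental results show that the proposed method can build a Japanese-Chinese parallel corpus quickly and effectively improves the performance of the Japanese-Chinese translation system.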

5.
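Military exercise intelligence extraction based on pattern matching   Cited by: 1 (self-citations: 0, citations by others: 1)
Taking military exercise intelligence extraction as the entry point, a pattern-matching approach is used to extract exercise intelligence. At the different stages of information extraction, a hierarchical automatic classification method is used to filter the texts to be processed; a seed-pattern-based bootstrapping method combined with a domain dictionary is used to identify military exercise chunks; and a method based on corpus annotation is used to learn event attribute patterns. Experimental results show that the method is effective within the specific domain and has reached a usable state in an actual engineering project.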

6.
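Research on semi-automatic construction of domain ontologies based on statistical natural language processing techniques   Cited by: 1 (self-citations: 0, citations by others: 1)
Ontology construction is one of the important factors affecting the success of the Semantic Web. Drawing on results from machine learning and natural language processing, this paper attempts to construct ontologies semi-automatically. Using specialized research papers as the corpus, it extracts key concepts with an N-Gram text representation and computes a topicality score (主题度) to obtain domain concepts. An improved hierarchical clustering algorithm is applied to cluster the domain concepts and obtain their hierarchy, and a method combining syntactic analysis with statistics is used to obtain possible subject-predicate-object patterns from the corpus as a reference for domain relations. Taking agricultural history as an example, an experimental system for semi-automatic domain ontology construction was designed and developed. The paper focuses on the implementation of the key techniques in ontology construction: concept acquisition, the construction of hierarchical relations and domain relations, and formalization.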

7.
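Patent-oriented machine translation has in recent years become one of the important application areas of machine translation. This paper proposes a hybrid machine translation system for Chinese-English patent text. The system is built with a rule-based system in the leading role and combines rule-based translation with a phrase-based statistical translation system. In the hybrid system, the rule-based system is mainly responsible for the source-language analysis and transfer stages: it generates the source-language syntactic parse tree and transfer tree and determines the basic syntactic frame of the target language. The statistical translation system then, in the target-language generation stage, searches for suitable translated word forms according to the generated target-language syntactic structure and produces the final candidate translations. In tests with automatic evaluation metrics, the hybrid system outperformed both the standalone rule-based system and the standalone statistical system, demonstrating that the hybrid approach is effective and feasible and can improve translation performance and quality.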

8.
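System combination techniques for machine translation can effectively merge the outputs of different machine translation systems and yield better translation performance, and have therefore become an active topic in machine translation research. This article describes the participation of the Institute of Scientific and Technical Information of China (ISTIC) in the machine translation evaluation of the 7th China Workshop on Machine Translation, in which ISTIC took part in the English-Chinese science and technology domain evaluation task. The article presents the implementation framework and implementation details of ISTIC's machine translation systems and analyzes their performance on the evaluation data. It closes with a discussion of the current state of machine translation system combination methods, together with a summary of and outlook for the combination method used.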

9.
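Zhang Jiajun (张家俊), Zong Chengqing (宗成庆). 《情报工程》, 2017, 3(3): 021-028
Over the past two years, Neural Machine Translation (NMT) models have dominated machine translation research, but Statistical Machine Translation (SMT) remains quite competitive in many application settings, especially in specialized domains. How to use deep learning to raise the level of existing statistical machine translation has become a main concern for researchers. Because the language model is one of the most central modules in statistical machine translation, this paper approaches the problem mainly from the language model side and explores the application of neural network language models in statistical machine translation. It examines both word-based and phrase-based neural network language models; translation experiments on Chinese-to-English and Chinese-to-Japanese show that neural network language models significantly improve the output quality of statistical machine translation.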

10.
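[Purpose/Significance] This article investigates whether incorporating different kinds of semantic knowledge into a machine translation model improves translation quality, and which kind of semantic knowledge has the most pronounced effect, in order to support machine translation research and the preservation and dissemination of China's traditional culture. [Method/Process] The study used 300,000 pairs of carefully processed "Classical Chinese - Modern Chinese" parallel text from the Twenty-Four Histories (《二十四史》) as experimental data. Based on the OpenNMT neural machine translation model, it incorporated word-boundary knowledge, part-of-speech knowledge, entity knowledge, and dependency-syntax knowledge separately into model training, using three different feature-fusion methods. [Result/Conclusion] Fusing different kinds of semantic knowledge with the model affects the translation of classical texts differently: word-boundary, part-of-speech, and entity knowledge each contribute to the machine translation task, with entity knowledge contributing the most, while dependency-syntax knowledge has no evident effect.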

11.
ABSTRACT

The history of the almanac in Croatia is reconstructed through primary research in bibliographic and archival sources. The almanac is a vehicle for knowledge communication in informal contexts, engaging both oral tradition and literary forms traceable to medieval literacy and ways of structuring knowledge. The history of the almanac in Croatia reflects the changing context of the book trade, literacy, and the evolution of language. Four main stages are identified: (1) the beginning of the annual almanac in the seventeenth century; astrological almanacs reflecting the sensibility of the Baroque period; (2) the Enlightenment's stimulation of almanac publishing in the spirit of contemporary secular reforms in agriculture and education; (3) nineteenth- and twentieth-century almanac trade, showing complex and overlapping networks for the production, distribution and appropriation of printed almanacs; (4) roughly the end of World War II, when the almanac slowly moved out of the role of a popular mass medium and into specialized niches represented by regional, diaspora, and religious almanacs.

12.
Some key questions for publishers in today’s market are: Could Amazon’s recent merger and acquisitions strategy create a disruptive or paradigm-shifting business model in the publishing industry? Do their recent actions post mergers and acquisitions illustrate a predictable pattern of behaviour that publishers can strategize around? This paper will explore these questions and look at some of the possible reasons behind Amazon’s business practices and the possible consequences to publishers.

13.
ABSTRACT

German authorities are expecting more than 1 million refugees by the end of 2015. These people come to Germany to seek protection and assistance and to build a new life; it is therefore important to welcome them and to assist them in their integration as soon as possible. This situation creates a versatile and perfectly fitting opportunity for cultural and educational programs, including libraries, which can play a vital role in this integration process. The key to integration is the knowledge of the German language, and the most important challenge now is to teach the necessary language skills to as many asylum-seekers as possible.

14.
Abstract

Many libraries use RSS to syndicate information about their collections to users. A survey of 65 academic libraries revealed their most common use for RSS is to disseminate information about library holdings, such as lists of new acquisitions. Even though typical RSS feeds are ill suited to the task of carrying rich bibliographic metadata, great potential exists for developing applications that can exploit metadata exposed to Web services via RSS. Using the MODS metadata format, entire catalog records can be seamlessly embedded in RSS 2.0 feeds. Existing tools, such as Library of Congress Java toolkits and XSLT stylesheets, can facilitate this process, while a new XSLT stylesheet may be used to create the RSS feeds complete with MODS records. As an example of the added functionality these MODS/RSS feeds can offer, records from a MODS-enriched RSS feed can be ingested into a non-RSS application such as Zotero. As more emerging library technologies use Web services architectures to handle data objects, the ability to syndicate catalog records will become more critical to providing innovative library Web services.

15.
SUMMARY

The Nevada Constitution provides voters with the ability to propose new statutes, amendments to existing statutes, and amendments to the State constitution through the initiative petition process. Voters can also approve or disapprove of existing statutes through the referendum petition process. Both require the circulation of a petition to collect a minimum number of signatures before the Secretary of State will place the measures on the ballot for the general election. As with other topics of law, Nevada has a profound shortage of research resources on initiatives and referenda, but State law and government Web sites provide enough information to allow for significant research.

16.
SUMMARY

This article is a case study of Ariel use at the University of Texas at Austin. The author divides the history of Ariel use into three distinct stages: resistance, full implementation, and enhanced. For each stage the advantages and disadvantages of using Ariel are discussed. The author concludes that using Ariel saves both money and staff time and permits faster delivery of documents than other methods.

17.
SUMMARY

CISTI, the Canada Institute for Scientific and Technical Information, is one of the world's largest document delivery suppliers, and was among the first to be fully automated. The revolutionary IntelliDoc system, developed during the years 1993 to 1995, provided for end-to-end automation of the document delivery process, which has enabled CISTI to improve its service and accommodate growth. From the beginning Ariel was, and remains, an integral part of IntelliDoc. This article describes how Ariel has been integrated into IntelliDoc and into CISTI's services, showing the benefits to CISTI and to its clients.

18.
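A study of library e-lending services from a stakeholder perspective   Cited by: 1 (self-citations: 0, citations by others: 1)
[Purpose/Significance] Taking the stakeholders involved in library e-lending services as the object of study, this article clarifies the relationships among their interests and proposes the main paths toward a library e-lending service system based on a mechanism of social balance. [Method/Process] It describes the current state of library e-books and e-lending, and applies stakeholder theory to analyze the relationships among the interests of authors, providers, libraries, and users. [Result/Conclusion] The study finds that library e-lending involves the commercial interests of authors and providers, the library's right to provide information resources, and the user's right to access knowledge; the different parties stand in a relationship of both conflict and cooperation. The library e-lending service system is not yet mature, which produces a clear imbalance between the supply of and demand for resources. The article recommends: improving the legal framework to safeguard the public interest; supporting the development of the e-book industry and regulating market behavior; establishing standard license agreement templates to protect the interests of all parties; controlling acquisition costs to promote the use of e-book resources; and drawing on the advantages of e-books to promote e-book reading.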

19.
20.
This article is a preliminary summary of the results of a huge study of reading among urban high school students (ages 14–17) in many regions of Russia. It considers such aspects of the topic as: the place of reading in the structure of high school students' life plans and leisure activities; their motives for reading, their preferences, and the amount they read; the level of their culture of reading and information seeking; and the influence of family, peers, libraries, and the Internet on their reading habits.
