Similar Documents
20 similar documents found (search time: 15 ms)
1.
阮倩倩  黄万武 《科教文汇》2011,(35):127-127,175
Because cultural customs and religious beliefs differ across countries, the same color word can carry different meanings. These marked semantic differences bear a deep national-cultural imprint. Language and culture are closely linked: every nation has both its own language and its own culture. As the building material of language, vocabulary often reflects people's social life most directly and concretely. This paper analyzes and compares some basic color words in English and Chinese from the perspective of cultural connotation, so as to facilitate smooth cross-cultural communication.

2.
赵力 《中国科技信息》2007,(22):279-280
Vocabulary is the basic material of language. To write well, one must first use words correctly, so as to express the intended ideas appropriately. In the exercises of novice writers, wrongly written characters, misused characters, and inappropriate word choices are common. This paper illustrates some representative errors in word usage with examples.

3.
A method is introduced to recognize the part-of-speech of English texts using knowledge of linguistic regularities rather than voluminous dictionaries. The algorithm proceeds in two steps. In the first step, information concerning the part-of-speech is extracted from each word of the text in isolation, using morphological analysis as well as the fact that English has a reasonable number of word endings characteristic of part-of-speech. The second step looks at a whole sentence and, using syntactic criteria, assigns the part-of-speech to a single word according to the parts-of-speech and other features of the surrounding words. In particular, those parts-of-speech relevant for automatic indexing of documents, i.e. nouns, adjectives, and verbs, are recognized. An application of this method to a large corpus of scientific text showed that for 84% of the words the part-of-speech was identified correctly and for only 2% definitely wrong; for the remaining words ambiguous assignments were made. Using only word lists of limited extent, the technique may thus be a valuable tool for automatic indexing of documents and automatic thesaurus construction, as well as other kinds of natural language processing.
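The first step described above — guessing part-of-speech from characteristic word endings in isolation — can be sketched as follows. This is a minimal illustration, not the paper's actual rule set; the suffix table here is an assumption chosen for demonstration.

```python
# Illustrative suffix table; the paper's real rules are more extensive.
SUFFIX_POS = {
    "tion": "noun", "ment": "noun", "ness": "noun", "ity": "noun",
    "ous": "adjective", "ful": "adjective", "able": "adjective",
    "ize": "verb", "ify": "verb", "ing": "verb",
}

def guess_pos(word: str) -> str:
    """Guess the part-of-speech of an English word from its ending alone;
    return "ambiguous" when no characteristic ending matches."""
    w = word.lower()
    # Try longer endings first so overlapping suffixes resolve sensibly.
    for suffix in sorted(SUFFIX_POS, key=len, reverse=True):
        if w.endswith(suffix) and len(w) > len(suffix) + 2:
            return SUFFIX_POS[suffix]
    return "ambiguous"
```

In the method itself, words left "ambiguous" here would be passed to the second, sentence-level step for disambiguation by syntactic context.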

4.
刘芳 《科教文汇》2014,(26):123-124
Using corpus-analysis methods, this paper aligns and annotates a bilingual text to build a corpus, retrieves the various categories of culture-loaded words with corpus retrieval software, and analyzes and tallies their translation strategies. The data show that 葛浩文 (Howard Goldblatt) mainly adopts a domestication strategy for the culture-loaded words in 《红高粱家族》 (Red Sorghum), a conclusion consistent both with the mainstream model of translating Chinese literature into English and with the translator's own translation thought. Observation of each category shows that the translator prefers domestication for material culture-loaded words that lack ready equivalents and for social and religious culture-loaded words prone to cultural conflict, while extensively adopting foreignization for linguistic culture-loaded words in order to preserve the original's language features and exotic flavor.

5.
崔楠 《科教文汇》2013,(29):84-85
Vocabulary is the cornerstone of all language skills and the pillar on which the language system rests. Vocabulary teaching is thus central to college English teaching; yet the difficulty of English vocabulary learning lies precisely in learners' mastery of word meaning, which is in turn the foundation of language comprehension and expression. College English vocabulary teaching must both solve learners' word-meaning comprehension problems and face their limited background knowledge of the language. This paper proposes exploiting the iconicity of language in vocabulary teaching, applying the metaphor model, the spreading-activation model, the cultural-cognition model, and the embodiment model to unify word-meaning explanation with background analysis. This helps students better understand polysemy and word-class conversion in English, find the inner connections between word senses when large vocabularies must be mastered, and thereby memorize words more firmly.

6.
刘娇 《科教文汇》2020,(11):130-131
As part of the core course "Comprehensive Japanese", Japanese reading instruction receives due attention, yet its actual teaching effect is unsatisfactory. Teachers focus more on linguistic forms such as words and grammar, and pay insufficient attention to analyzing the functions of language, teaching reading skills, and improving students' reading ability. Form-focused teaching has its place, but it is likely to leave students at a surface understanding of words and sentences; over time, this keeps them from grasping a text as a whole, extracting its main idea, or sorting out its structure. Based on the actual situation of Japanese reading instruction in the through-train secondary-higher vocational programs in Applied Japanese and Business Japanese at the author's school, this paper analyzes and reflects on these problems and argues for introducing "discourse analysis" into secondary vocational Japanese reading instruction: grounding teaching in the text as discourse, attending to language knowledge, and cultivating students' analysis of the functions language performs within discourse.

7.
Arabic is a morphologically rich language that presents significant challenges to many natural language processing applications, because a word often conveys complex meanings decomposable into several morphemes (i.e. prefix, stem, suffix). By segmenting words into morphemes, we can improve the extraction of English/Arabic translation pairs from parallel texts. This paper describes two algorithms, and their combination, for automatically extracting an English/Arabic bilingual dictionary from parallel texts in the Internet archive, using an Arabic light stemmer as a preprocessing step. Before applying the Arabic light stemmer, overall system precision and recall were 88.6% and 81.5% respectively; after applying it to the Arabic documents, precision and recall increased to 91.6% and 82.6% respectively.
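The light stemming preprocessing step can be sketched as below. This is a hedged illustration of the general technique, not the paper's actual stemmer: the affix lists are a small assumed subset, and real light stemmers use longer lists and more careful length constraints.

```python
# Illustrative (incomplete) affix lists, ordered longest-first.
PREFIXES = ["وال", "بال", "كال", "فال", "ال", "لل"]
SUFFIXES = ["ها", "ان", "ات", "ون", "ين", "ة"]

def light_stem(word: str) -> str:
    """Strip at most one leading and one trailing affix,
    keeping a residual stem of at least 3 letters."""
    for p in PREFIXES:
        if word.startswith(p) and len(word) - len(p) >= 3:
            word = word[len(p):]
            break
    for s in SUFFIXES:
        if word.endswith(s) and len(word) - len(s) >= 3:
            word = word[:-len(s)]
            break
    return word
```

Conflating inflected forms to a common light stem is what lets translation-pair extraction match more Arabic occurrences against their English counterparts, which is consistent with the precision/recall gains reported above.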

8.
This paper presents a formalism for the representation of complex semantic relations among concepts of natural language. We define a semantic algebra as a set of atomic concepts together with an ordered set of semantic relations. Semantic trees are a graphical representation of a semantic algebra (comparable to Kantorovic trees for boolean or arithmetical expressions). A semantic tree is an ordered tree with nodes labeled with relation and concept names. We generate semantic trees from natural language texts in such a way that they represent the semantic relations holding among the concepts occurring within that text. This generation process is carried out by a transformational grammar which transforms natural language sentences directly into semantic trees. We present an example for concepts and relations within the domain of computer science, where we have generated semantic trees from definition texts by means of a metalanguage for transformational grammars (a sort of metacompiler for transformational grammars). The semantic trees generated so far serve as thesaurus entries in an information retrieval system.

9.
Word embeddings, which represent words as numerical vectors in a high-dimensional space, are contextualized by generating a unique vector representation for each sense of a word based on the surrounding words and sentence structure. They are typically generated using deep learning models such as BERT, trained on large amounts of text data with self-supervised learning techniques. The resulting embeddings are highly effective at capturing the nuances of language and have been shown to significantly improve the performance of numerous NLP tasks. Word embeddings represent textual records of human thinking, with all the mental relations that we use to produce the succession of sentences that make up texts and discourses. Consequently, the distributed representation of words within embeddings ought to capture the reasoning relations that hold texts together. This paper contributes to the field by proposing a benchmark for the assessment of contextualized word embeddings that probes their capability for true contextualization by inspecting how well they capture resemblance, contrariety, comparability, identity, relations in time and space, causation, analogy, and sense disambiguation. The proposed metrics adopt a triangulation approach, using (1) Hume's reasoning relations, (2) standard analogy, and (3) sense disambiguation. The benchmark has been evaluated against 22 Arabic contextualized embeddings and has proven capable of quantifying their differential performance in terms of these reasoning relations. Evaluation of the target embeddings revealed that they do take context into account and do reasonably well in sense disambiguation, but are weak in identifying converseness, synonymy, complementarity, and analogy. Results also show that the size of an embedding has diminishing returns, because highly frequent language patterns swamp low-frequency patterns. Furthermore, the results suggest that future research should be concerned not so much with the quantity of data as with its quality, and should focus more on the representativeness of the data and on model architecture, design, and training.
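The "standard analogy" probe mentioned above is conventionally implemented with the vector-offset method, sketched here on toy two-dimensional vectors. This is a generic illustration of the probe, not the benchmark's actual code; the example embedding is invented for demonstration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def analogy(emb, a, b, c):
    """Solve a : b :: c : ? by the standard vector-offset method:
    find the vocabulary word closest to (b - a + c)."""
    target = [bi - ai + ci for ai, bi, ci in zip(emb[a], emb[b], emb[c])]
    candidates = [w for w in emb if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(emb[w], target))
```

An embedding that captures the analogy relation should return the expected fourth term; systematic failures on such quadruples are what the benchmark counts as weakness in analogy.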

10.
The purpose of this article is to validate, through two empirical studies, a new method for automatic evaluation of written texts, called Inbuilt Rubric, based on the Latent Semantic Analysis (LSA) technique, which constitutes an innovative and distinct turn with respect to LSA applications so far. The first empirical study seeks evidence of the method's validity in identifying and evaluating the conceptual axes of a text, in a sample of 78 summaries by secondary school students. Results show that the proposed method has a significantly higher degree of reliability than classic LSA methods of text evaluation, and displays very high sensitivity in identifying which conceptual axes are or are not included in each summary. A second study evaluates the method's capacity to interact and provide feedback about quality in a real online system, on a sample of 924 discursive texts written by university students. Results show that students improved the quality of their written texts using this system, and also rated the experience very highly. The final conclusion is that this new method opens a very interesting avenue regarding the role of automatic assessors in identifying the presence/absence and quality of elaboration of relevant conceptual information in texts written by students, with lower time costs than the usual LSA-based methods.
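The core scoring step — checking a student text against each conceptual axis in a semantic space — can be sketched as a cosine comparison per axis. This is a hedged simplification: here the vectors are plain stand-ins, whereas Inbuilt Rubric works with vectors in a trained LSA space, and the threshold is an invented parameter.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def axis_scores(summary_vec, axes):
    """Score a summary vector against each named conceptual-axis vector.
    Axes scoring below some threshold would be flagged as missing."""
    return {name: cosine(summary_vec, vec) for name, vec in axes.items()}
```

A per-axis score, rather than a single holistic similarity, is what allows the system to tell a student which conceptual axes their summary covered and which it omitted.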

11.
Text clustering is a well-known method for information retrieval, and numerous methods for classifying words, documents, or both together have been proposed. Frequently, textual data are encoded using vector models, so the corpus is transformed into a matrix of terms by documents; using this representation, text clustering generates groups of similar objects on the basis of the presence/absence of words in the documents. An alternative way to work with texts is to represent them as a network whose nodes are entities connected by the presence and distribution of the words in the documents. In this work, after summarising the state of the art of text clustering, we present a new network approach to textual data. We undertake text co-clustering using methods developed for social network analysis. Several experimental results are presented to demonstrate the validity of the approach and its advantages compared to existing methods.
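The network view of a corpus described above can be sketched as a bipartite graph linking each document to the words it contains; social-network-analysis methods can then operate on this structure instead of a term-document matrix. This is a minimal sketch of the general representation, assuming pre-tokenized documents, not the paper's actual pipeline.

```python
def bipartite_graph(docs):
    """Build a bipartite document-word network as adjacency sets.
    docs: dict mapping document id -> iterable of words."""
    adjacency = {}
    for doc_id, words in docs.items():
        for w in set(words):
            adjacency.setdefault(doc_id, set()).add(w)
            adjacency.setdefault(w, set()).add(doc_id)
    return adjacency
```

Co-clustering then amounts to finding communities in this graph that mix document nodes and word nodes, grouping documents together with the words that characterize them.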

12.
Coreference resolution of geological entities is an important task in geological information mining. Although existing generic coreference resolution models can handle geological texts, their performance can decline dramatically without sufficient domain knowledge. Due to the high diversity of geological terminology, coreference is intricately governed by the semantic and expressive structure of geological terms. In this paper, a framework called CorefRoCNN, based on RoBERTa and a convolutional neural network (CNN), is proposed for end-to-end coreference resolution of geological entities. First, the fine-tuned RoBERTa language model is used to transform words into dynamic vector representations with contextual semantic information. Second, a CNN-based multi-scale structural feature extraction module for geological terms is designed to capture the invariance of geological terms in length, internal structure, and distribution. Third, we combine the structural features and word embeddings for further determination of coreference relations. In addition, attention mechanisms are used to improve the model's ability to capture valid information in geological texts with long sentences. To validate the effectiveness of the model, we compared it with several state-of-the-art models on the constructed dataset. The results show that our model has the best performance, with an average F1 value of 79.78%, a 1.22% improvement over the second-ranked method.

13.
The Knowledge Map Method and Its Application to Domestic Semantic Web Literature Research
吴正荆  朱晶 《情报科学》2008,26(10):1502-1506
Based on journal papers related to the Semantic Web published between 2001 and 2007 in the 维普 (VIP) database, and on 30 high-frequency keywords identified in domestic Semantic Web research, this study applies co-word analysis and the knowledge-map method, using SPSS and UCINET to draw knowledge maps and analyze the discipline structure of the Semantic Web. It concludes that current Semantic Web research centers on ontology and falls into four main areas: key technologies, knowledge organization, information retrieval, and user services.

14.
While literary works are often treated as museum pieces, an alternative Romantic/Pragmatic aesthetic emphasizes instead the rootedness of all texts in lived experience. This suggests that both literary and scientific texts may be approached as performances that weave together discursive and material elements, giving language to matter, both making, and becoming, "things that talk." Three authors are contrasted: Emerson uses natural objects as metaphors to complete his thought; Thoreau uses natural objects as mediators who enroll him to speak for them in the name of a wider ecology; Humboldt attempts to enroll nonhumans, namely cannibals, into the global civil community by asking them to speak for themselves. The resulting quandary unsettles the Cartesian boundary between human and nonhuman, subject and object; as scholars divided by this boundary, we must multiply our own relations, the better to understand the ties that bind us into the common project of building the Cosmos.

15.
Natural Language Processing (NLP) techniques have been successfully used to automatically extract information from unstructured text through detailed analysis of its content, often to satisfy particular information needs. In this paper, an automatic concept map construction technique, Fuzzy Association Concept Mapping (FACM), is proposed for converting abstracted short texts into concept maps. The approach consists of a linguistic module and a recommendation module. The linguistic module is a text mining method that does not require the user to have any prior knowledge of NLP techniques. It incorporates rule-based reasoning (RBR) and case-based reasoning (CBR) for anaphora resolution, and aims to extract the propositions in a text so as to construct a concept map automatically. The recommendation module is built on fuzzy set theory. It is an interactive process which suggests propositions for further human refinement of the automatically generated concept maps. The suggested propositions are relationships among concepts which are not explicitly found in the paragraphs; this helps to stimulate individual reflection and generate new knowledge. Evaluation was carried out using the Science Citation Index (SCI) abstract database and CNET News as test data, which are well-known sources whose text quality is assured. Experimental results show that the automatically generated concept maps conform to the outputs generated manually by domain experts, since the degree of difference between them is proportionally small. The method gives users the ability to convert scientific and short texts into a structured format which can be easily processed by computer. Moreover, it gives knowledge workers extra time to rethink their written text and to view their knowledge from another angle.
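The linguistic module's goal — extracting propositions as (concept, link, concept) triples — can be illustrated in miniature. This toy is far simpler than FACM's RBR/CBR pipeline: it handles a single "X is a Y" pattern, no anaphora, and the pattern itself is an assumption for demonstration.

```python
import re

# One illustrative proposition pattern; a real system uses many
# patterns plus rule- and case-based reasoning for anaphora.
PATTERN = re.compile(r"(\w+) is an? (\w+)")

def extract_propositions(text: str):
    """Extract (concept, link, concept) triples from short text."""
    return [(a, "is-a", b) for a, b in PATTERN.findall(text)]
```

Each extracted triple corresponds to one labeled edge in the resulting concept map; the recommendation module then proposes additional, implicit edges for human review.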

16.
Parliamentary texts are records of discussions of domestic and international affairs, which reflect national attitudes and development trends in foreign relations. In this paper, a research framework is proposed to analyze foreign relations on the basis of parliamentary texts. First, topic words are extracted from the parliamentary texts, and a co-word network is constructed to represent the correlation structure of the topic words. Basic statistics, network indicators, community detection, visualization of network maps and evolution paths, and a strategic diagram then elucidate deeper characteristics and connotations of foreign relations. A case study on UK-China relations during 2011-2017, using British parliamentary texts, reveals the following findings. Over this period, UK-China relations changed in terms of the topics involved; the topics are greatly unbalanced in distribution but quite concentrated. Five different directions exist, centering on Trade, Human rights, Nuclear, Steel, and Visa. The evolution of topics includes merging and differentiation. A minority of topics exhibit marked continuity and constitute the main focal points discussed each year, such as Economy and Trade. Regarding development trends, themes related to trade and steel remain focal points in UK-China relations. Overall, the framework proposed in this paper proves both effective and feasible, and its application in this case study can foster a deeper understanding of the status and development of UK-China relations.
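The co-word network construction step above can be sketched as follows: each pair of topic words is weighted by the number of texts in which the two co-occur. This is a generic sketch of co-word analysis under the assumption that topic words per text have already been extracted, not the paper's exact procedure.

```python
from itertools import combinations
from collections import Counter

def coword_edges(doc_topics):
    """Build a weighted co-word edge list: doc_topics is an iterable of
    per-text topic-word lists; each edge weight counts the texts in
    which the two topic words co-occur."""
    edges = Counter()
    for topics in doc_topics:
        # Sort so each unordered pair gets a single canonical key.
        for pair in combinations(sorted(set(topics)), 2):
            edges[pair] += 1
    return edges
```

The resulting weighted edge list is the input for the network indicators, community detection, and strategic-diagram analysis described in the abstract.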

17.
《Endeavour》2001,25(3):121-126

18.
马慈君 《科教文汇》2012,(1):128-129
Vocabulary is the foundation of language, and learning and mastering vocabulary is crucial to language learning. Yet vocabulary acquisition continually troubles students. For Bai students learning English as a third language, teachers should raise students' vocabulary levels through phonetic reinforcement, teaching vocabulary-memorization strategies, expanding reading, and infusing cultural knowledge.

19.
Semantic representation reflects the meaning of a text as it may be understood by humans, and thus contributes to facilitating various automated language processing applications. Although semantic representation is very useful for several applications, few models have been proposed for the Arabic language. In that context, this paper proposes a graph-based semantic representation model for Arabic text. The proposed model aims to extract the semantic relations between Arabic words. Several tools and concepts are employed, such as dependency relations, part-of-speech tags, named entities, patterns, and predefined linguistic rules for Arabic. The core idea of the proposed model is to represent the meaning of an Arabic sentence as a rooted acyclic graph. The textual entailment recognition challenge is considered in order to evaluate the ability of the proposed model to enhance other Arabic NLP applications. The experiments were conducted on a benchmark Arabic textual entailment dataset, namely ArbTED. The results prove that the proposed graph-based model is able to enhance the performance of the textual entailment recognition task compared to other baseline models. On average, the proposed model achieved 8.6%, 30.2%, 5.3%, and 16.2% improvements in accuracy, recall, precision, and F-score, respectively.
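The rooted-graph idea can be sketched as an adjacency structure built from (head, relation, dependent) triples. This is an assumed, minimal data layout for illustration; the paper's actual graph schema, relation inventory, and construction rules are richer.

```python
def build_semantic_graph(triples):
    """Build a rooted-graph adjacency map from
    (head, relation, dependent) triples: each head node maps to a
    list of (relation, dependent) edges."""
    graph = {}
    for head, relation, dependent in triples:
        graph.setdefault(head, []).append((relation, dependent))
    return graph
```

With the sentence's main predicate as root, entailment checking can then be framed as comparing or matching the labeled edges of two such graphs.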

20.
邢宝山  李欣欣  王丽 《现代情报》2006,26(12):125-126
Starting from the definition and basic features of subheadings, this paper analyzes the role of subheadings in subject indexing of medical journal literature, namely their modifying and restricting function and their role in expanding the indexing vocabulary through combination with main headings. It points out problems in current subheading use, such as wrong or inappropriate combinations and failure to express concepts correctly, and proposes methods and countermeasures for improving the quality of subheading indexing.
