Similar literature
 Found 20 similar documents; search took 15 ms
1.
How are names for new disciplinary fields coined? Here a new (and fun) way to look at the history of such coinages is proposed, focusing on how phonesthemic tints and taints figure in decisions to adopt one type of suffix rather than another. The most common suffixes used in such coinages ("-logy," "-ics," etc.) convey semantic and evaluative content quite unpredictable from literal (root) meanings alone. Pharmaceutical manufacturers have long grasped the point, but historians have paid little attention to how suffixes of one sort or another become productive. A romp through examples from English shows that certain suffixes have become "hard" or "soft" in consequence of the status of their most prominent carrier disciplines. The "-ics" ending came to signify hardness in consequence of the prestige of physics, for example (with "-metrics" as the arteriosclerosis of suffixes), while lower-status (less "hard") disciplines have developed alternate endings (such as "studies"). Some suffixes are eschewed for their perceived ideologic slant (the "-isms," for example). Historians of science need to think more about the pragmatics of language, a task made easier by information technologies and databases that allow searches for words by suffix and first known use.

2.
Extracting semantic relationships between entities from text documents is a challenging problem in information extraction and is important for deep information processing and management. This paper investigates the incorporation of diverse lexical, syntactic and semantic knowledge in feature-based relation extraction using support vector machines. Our study illustrates that base phrase chunking information is very effective for relation extraction and contributes most of the performance improvement from the syntactic aspect, while the commonly used features from full parsing give limited further enhancement. This suggests that most of the useful information in full parse trees for relation extraction is shallow and can be captured by chunking, and hence that a cheap and robust solution to relation extraction can be achieved without much loss in performance. We also demonstrate how semantic information such as WordNet can be used in feature-based relation extraction to further improve performance. Evaluation on the ACE benchmark corpora shows that effective incorporation of diverse features enables our system to outperform the previously best-reported systems. It also shows that our feature-based system significantly outperforms tree kernel-based systems, which suggests that current tree kernels fail to effectively explore structured syntactic information in relation extraction.

3.
The task of answering complex questions requires inferencing and synthesizing information from multiple documents, and can be seen as a kind of topic-oriented, informative multi-document summarization. In generic summarization, the stochastic, graph-based random walk method for computing the relative importance of textual units (i.e. sentences) has proved very successful. However, the major limitation of the TF*IDF approach is that it only retains the frequency of the words and does not take into account sequence, syntactic and semantic information. This paper presents the impact of syntactic and semantic information in the graph-based random walk method for answering complex questions. Initially, we apply tree kernel functions to compute the similarity between sentences in the random walk framework. Then, we extend our work to incorporate the Extended String Subsequence Kernel (ESSK) to perform the task in a similar manner. Experimental results show the effectiveness of using kernels to include syntactic and semantic information for this task.
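The baseline this abstract refers to can be illustrated with a minimal sketch: sentences are nodes, edge weights are TF*IDF cosine similarities, and a damped random walk ranks the nodes. The function names, the damping factor of 0.85, the smoothed IDF formula, and the fixed iteration count are all assumptions made for illustration; the abstract does not give these details. The `similarity` argument is left pluggable because the paper's contribution is to swap the bag-of-words measure for tree kernels or ESSK.

```python
import math
from collections import Counter


def tfidf_vectors(sentences):
    """Build one sparse TF*IDF vector (a dict) per tokenized sentence."""
    n = len(sentences)
    df = Counter(word for sent in sentences for word in set(sent))
    # Smoothed IDF (an assumption): common words keep a small positive weight.
    idf = {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in df.items()}
    return [{w: tf * idf[w] for w, tf in Counter(sent).items()} for sent in sentences]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def random_walk_scores(items, similarity, damping=0.85, iterations=50):
    """Rank items by a damped random walk over their similarity graph."""
    n = len(items)
    if n == 0:
        return []
    # Pairwise similarities, with self-loops removed.
    sim = [[similarity(items[i], items[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    row_sums = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for j in range(n):
            # Mass flowing into j from every node i that has outgoing edges.
            incoming = sum(scores[i] * sim[i][j] / row_sums[i]
                           for i in range(n) if row_sums[i] > 0)
            updated.append((1 - damping) / n + damping * incoming)
        scores = updated
    return scores


sentences = [
    ["tree", "kernel", "syntax"],
    ["tree", "kernel", "semantics"],
    ["weather", "today"],
]
scores = random_walk_scores(tfidf_vectors(sentences), cosine)
```

Here the third sentence shares no words with the others, so it receives only the baseline mass. Replacing `cosine` with a kernel over parse trees changes which sentences count as similar without touching the walk itself.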

4.
The fundamental idea of the work reported here is to extract index phrases from texts with the help of a single word concept dictionary and a thesaurus containing relations among concepts. The work is based on the fact that, within every phrase, the single words the phrase is composed of are related in a certain well-defined manner, the type of relations holding between concepts depending only on the concepts themselves. Therefore relations can be stored in a semantic network. The algorithm described extracts single word concepts from texts and combines them into phrases using the semantic relations between these concepts, which are stored in the network. The results obtained show that phrase extraction from texts by this semantic method is possible and offers many advantages over other (purely syntactic or statistical) methods concerning preciseness and completeness of the meaning representation of the text. But the results show, too, that some syntactic and morphologic “filtering” should be included for effectivity reasons.
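The combination step described in this abstract can be sketched as follows. This is a toy illustration only: the dictionary entries, the relation labels, the `window` parameter, and the function name are hypothetical and do not come from the paper, which does not spell out its data structures here.

```python
# Hypothetical single word concept dictionary: surface word -> concept.
CONCEPTS = {
    "information": "INFORMATION",
    "retrieval": "RETRIEVAL",
    "system": "SYSTEM",
    "systems": "SYSTEM",
}

# Hypothetical semantic network: ordered concept pair -> relation type.
RELATIONS = {
    ("INFORMATION", "RETRIEVAL"): "object-of",
    ("RETRIEVAL", "SYSTEM"): "function-of",
}


def extract_index_phrases(tokens, concepts, relations, window=2):
    """Pair nearby concept words whose concepts are related in the network."""
    # Step 1: locate every token that maps to a known concept.
    hits = [(i, tok, concepts[tok]) for i, tok in enumerate(tokens) if tok in concepts]
    phrases = []
    # Step 2: combine concept words that lie within `window` tokens of each
    # other and whose concepts are linked by a stored relation.
    for k, (i, word_a, concept_a) in enumerate(hits):
        for j, word_b, concept_b in hits[k + 1:]:
            if j - i > window:
                break
            relation = relations.get((concept_a, concept_b))
            if relation is not None:
                phrases.append((word_a, word_b, relation))
    return phrases


tokens = "modern information retrieval systems are fast".split()
phrases = extract_index_phrases(tokens, CONCEPTS, RELATIONS)
```

In this sketch "information" and "systems" are close enough to be paired but are skipped because no relation between their concepts is stored, which mirrors the abstract's point that the relation type depends only on the concepts themselves.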

5.
吕怡宁 《科教文汇》2014,(26):118-120
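Drawing on the Chinese learner corpus, this paper examines the syntactic, semantic, and pragmatic functions of English "fortune"-type adverbial connectives in textbooks, and studies the cohesive function of these adverbs at the discourse level. We find that the English "fortune"-type adverbial connectives luckily, fortunately, and happily exhibit a variety of syntactic structures, semantic functions, and pragmatic functions. The study aims to support further scholarly research on English "fortune"-type adverbial connectives and to offer guidance for English learners and for the compilation of textbook vocabulary.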

6.
Text summarization is the process of generating a brief version of documents while preserving their fundamental information as much as possible. Although most text summarization research has focused on supervised learning solutions, few datasets have actually been created for summarization tasks, and most existing summarization datasets lack the human-generated goal summaries that are vital for both summary generation and evaluation. Therefore, this study presents a new dataset for abstractive and extractive summarization tasks. The dataset contains academic publications, the abstracts written by their authors, and extracts in two sizes generated by human readers in this research. The resulting extracts were evaluated to ensure the validity of the human extract production process. Moreover, the extractive summarization problem was reinvestigated on the proposed dataset, with the main aim of analyzing the feature vector in order to generate more informative summaries. To that end, a comprehensive syntactic feature space was generated for the proposed dataset, and the impact of these features on the informativeness of the resulting summary was investigated. In addition, the summarization capability of semantic features was examined using GloVe and word2vec embeddings. Finally, the use of an ensemble feature space, corresponding to the joint use of syntactic and semantic features, was proposed with a long short-term memory-based neural network model. The model summaries were evaluated with ROUGE metrics, and the results showed that the proposed ensemble feature space remarkably improved on the use of syntactic or semantic features alone. Additionally, the summaries produced by the proposed approach with ensemble features either clearly outperformed or performed comparably to summaries obtained by state-of-the-art models for extractive summarization.

7.
王初艳 《科教文汇》2014,(34):124-125
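By analyzing randomly sampled English and Chinese existential sentences, this article classifies them syntactically and semantically according to the patterns governing the presence or omission of sentence constituents and according to their semantic features. It aims to help language learners better understand English and Chinese existential sentences and, to some extent, to support related research.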

8.
李哲 《科教文汇》2011,(4):116-116,128
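Chinese "只" and Japanese "だけ" show both similarities and differences on the three planes of syntax, semantics, and pragmatics. Syntactically, their structures are largely the same, with the differences lying mainly in word order. Their semantic categories are largely the same, but their semantic orientation differs considerably. Pragmatically, both serve to restrict and to emphasize.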

9.
10.
苟欣悦 《科教文汇》2012,(19):82-83
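This paper examines the syntactic and semantic use of manner adverbs in Modern Chinese. Drawing on the everyday use of Chinese, and combining routine study of the language with specialist knowledge, it summarizes how manner adverbs in Modern Chinese are used syntactically and semantically. A survey of adverb usage further shows that adverbs in Modern Chinese are widely used in syntactic and semantic terms and have contributed to the development of Modern Chinese.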

11.
This paper describes a state-of-the-art supervised, knowledge-intensive approach to the automatic identification of semantic relations between nominals in English sentences. The system employs a combination of rich and varied sets of new and previously used lexical, syntactic, and semantic features extracted from various knowledge sources such as WordNet and additional annotated corpora. The system ranked first at the third most popular SemEval 2007 Task – Classification of Semantic Relations between Nominals and achieved an F-measure of 72.4% and an accuracy of 76.3%. We also show that some semantic relations are better suited for WordNet-based models than other relations. Additionally, we make a distinction between out-of-context (regular) examples and those that require sentence context for relation identification and show that contextual data are important for the performance of a noun–noun semantic parser. Finally, learning curves show that the task difficulty varies across relations and that our learned WordNet-based representation is highly accurate, so the performance results suggest the upper bound on what this representation can do.

12.
吴明贤 《科教文汇》2012,(1):88-88,114
This paper compares the two adverbs "光" and "就" on the three planes of syntax, semantics, and pragmatics, detailing their similarities and differences in each.

13.
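Objective: To analyze the formation characteristics of medical English vocabulary and then devise corresponding learning strategies that help students in medical schools and related medical programs expand their vocabulary quickly and effectively. Methods: The sources and composition of medical English vocabulary and the regularities of word meaning were analyzed, and the common roots, prefixes, and suffixes for the cardiovascular, digestive, respiratory, and urinary systems were compiled. Results: By mastering a small number of common word-forming elements, students can rapidly expand their medical English vocabulary and thereby improve their medical English proficiency.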

14.
Opinion mining in a multilingual and multi-domain environment such as YouTube requires models to be robust across domains as well as languages, and not to rely on linguistic resources (e.g. syntactic parsers, POS-taggers, pre-defined dictionaries) which are not always available in many languages. In this work, we i) proposed a convolutional N-gram BiLSTM (CoNBiLSTM) word embedding which represents a word with semantic and contextual information over short and long distances; ii) applied the CoNBiLSTM word embedding to predict the type of a comment, its sentiment polarity (positive, neutral or negative) and whether the sentiment is directed toward the product or the video; iii) evaluated the efficiency of our model on the SenTube dataset, which contains comments from two domains (i.e. automobile, tablet) and two languages (i.e. English, Italian). According to the experimental results, CoNBiLSTM generally outperforms the approach using SVM with shallow syntactic structures (STRUCT), the current state of the art in sentiment analysis on the SenTube dataset. In addition, our model is more robust across domains than STRUCT (e.g. a 7.47% difference in performance between the two domains for our model vs. 18.8% for STRUCT).

15.
Extracting semantic relationships between entities from text documents is a challenging problem in information extraction and is important for deep information processing and management. This paper proposes to use the convolution kernel over parse trees together with support vector machines to model syntactic structured information for relation extraction. Compared with linear kernels, tree kernels can effectively and implicitly explore the huge number of syntactic structured features embedded in a parse tree. Our study reveals that the syntactic structured features embedded in a parse tree are very effective in relation extraction and can be well captured by the convolution tree kernel. Evaluation on the ACE benchmark corpora shows that using the convolution tree kernel alone can achieve performance comparable with the previous best-reported feature-based methods. It also shows that our method significantly outperforms two previous dependency tree kernels for relation extraction. Moreover, this paper proposes a composite kernel for relation extraction that combines the convolution tree kernel with a simple linear kernel. Our study reveals that the composite kernel can effectively capture both flat and structured features without extensive feature engineering, and can easily scale to include more features. Evaluation on the ACE benchmark corpora shows that the composite kernel outperforms the previous best-reported methods in relation extraction.
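To make the idea of a composite kernel concrete, here is a minimal sketch that combines a linear kernel over flat feature vectors with a convolution tree kernel in the style of Collins and Duffy's subset-tree kernel. Whether the paper uses exactly this tree kernel, the convex combination with weight `alpha`, the `decay` parameter, and the nested-tuple tree encoding are all assumptions made for illustration; the abstract does not state how the two kernels are combined.

```python
import math


def internal_nodes(tree):
    """Yield every internal node of a tree encoded as (label, child, ...)."""
    yield tree
    for child in tree[1:]:
        if isinstance(child, tuple):
            yield from internal_nodes(child)


def production(node):
    """Return the grammar production at a node: its label plus child labels."""
    return (node[0],) + tuple(c[0] if isinstance(c, tuple) else c for c in node[1:])


def common_subtrees(n1, n2, decay):
    """Weighted count of common subset trees rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    result = decay
    for c1, c2 in zip(n1[1:], n2[1:]):
        if isinstance(c1, tuple) and isinstance(c2, tuple):
            result *= 1.0 + common_subtrees(c1, c2, decay)
    return result


def tree_kernel(t1, t2, decay=1.0):
    """Convolution tree kernel: sum over all pairs of internal nodes."""
    return sum(common_subtrees(a, b, decay)
               for a in internal_nodes(t1) for b in internal_nodes(t2))


def linear_kernel(x, y):
    """Dot product of two flat feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def normalized(kernel):
    """Wrap a kernel so that every instance has unit self-similarity."""
    def wrapped(a, b):
        denom = math.sqrt(kernel(a, a) * kernel(b, b))
        return kernel(a, b) / denom if denom else 0.0
    return wrapped


def composite_kernel(flat_kernel, structured_kernel, alpha=0.5):
    """Convex combination over instances given as (flat_vector, tree) pairs."""
    def combined(a, b):
        return (alpha * flat_kernel(a[0], b[0])
                + (1 - alpha) * structured_kernel(a[1], b[1]))
    return combined


dogs = ("S", ("NP", ("N", "dogs")), ("VP", ("V", "bark")))
cats = ("S", ("NP", ("N", "cats")), ("VP", ("V", "bark")))
kernel = composite_kernel(normalized(linear_kernel), normalized(tree_kernel))
similarity = kernel(([1.0, 0.0], dogs), ([0.0, 1.0], cats))
```

Normalizing each kernel before mixing keeps the two terms on the same scale, which is one reasonable design choice when the raw tree kernel grows quickly with tree size. Adding further flat features only lengthens the vectors passed to `linear_kernel`, which matches the abstract's remark that the composite kernel scales easily to more features.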

16.
Answer selection is the most complex phase of a question answering (QA) system. To solve this task, typical approaches use unsupervised methods such as computing the similarity between query and answer, optionally exploiting advanced syntactic, semantic or logic representations.

17.
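[Purpose/Significance] Entity semantic relation classification is one of the key tasks in information extraction. It turns unstructured text into structured knowledge and underpins the construction of domain ontologies and knowledge graphs and the development of question answering and information retrieval systems. [Method/Process] This paper traces the development of entity semantic relation classification in detail, reviews and summarizes the latest research at home and abroad over the past five years in terms of both technical methods and application areas, and points out the shortcomings of existing research and directions for future work. [Result/Conclusion] Popular deep learning methods dispense with the laborious feature engineering of traditional shallow machine learning methods and learn text features automatically. Experiments show that incorporating lexical and syntactic features into neural network models and introducing attention mechanisms can effectively improve relation classification performance.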

18.
An Introduction to Ontology-Based Information Systems   Cited by: 5 (self-citations: 0, citations by others: 5)
Since Tim Berners-Lee, current W3C chairman, first proposed the concept of the Semantic Web, it has become a hot topic in the area of computer information processing. Ontologies play a key role in the Semantic Web, extending syntactic interoperability to semantic interoperability by providing a source of shared and precisely defined terms. The paper analyzes the requirements of information systems for ontology languages. The currently popular ontology languages are also discussed.

19.
The problem of modelling information systems is studied with focus on predictability. Predictability presupposes discovery and knowledge of empirical laws and theories, which are in the domain of information science. Discovery of such laws and theories goes hand in hand with the development of the capability to measure important variables in that domain. The state-of-the-art of predictive modelling is discussed with respect to syntactic, semantic, and pragmatic criteria, emphasizing the need for concentrated effort in further development of the empirical foundation of information science.

20.
This paper presents a model that incorporates contemporary theories of tense and aspect and develops a new framework for extracting temporal relations between two sentence-internal events, given their tense, aspect, and a temporal connecting word relating the two events. A linguistic constraint on event combination has been implemented to detect incorrect parser analyses and potentially apply syntactic reanalysis or semantic reinterpretation, in preparation for subsequent processing for multi-document summarization. An important contribution of this work is the extension of two different existing theoretical frameworks (Hornstein’s 1990 theory of tense analysis and Allen’s 1984 theory on event ordering) and the combination of both into a unified system for representing and constraining combinations of different event types (points, closed intervals, and open-ended intervals). We show that our theoretical results have been verified in a large-scale corpus analysis. The framework is designed to inform a temporally motivated sentence-ordering module in an implemented multi-document summarization system.
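The event-ordering side of this framework builds on Allen's interval relations, which can be sketched for the simplest case of two closed intervals. This is an illustrative sketch only: the function name, the `(start, end)` tuple encoding, and the relation labels are assumptions, and the sketch does not cover points, open-ended intervals, or the tense and aspect constraints that the paper adds.

```python
def allen_relation(a, b):
    """Name the Allen relation that closed interval a bears to interval b."""
    (a_start, a_end), (b_start, b_end) = a, b
    if a_start >= a_end or b_start >= b_end:
        raise ValueError("each interval needs start < end")
    # Disjoint or touching intervals.
    if a_end < b_start:
        return "before"
    if a_end == b_start:
        return "meets"
    if b_end < a_start:
        return "after"
    if b_end == a_start:
        return "met-by"
    # Intervals sharing one or both endpoints.
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    # Proper containment in either direction.
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    # Only partial overlap remains.
    return "overlaps" if a_start < b_start else "overlapped-by"


relation = allen_relation((1, 4), (2, 6))
```

A sentence-ordering module could use such labels to decide which of two events to present first, for example placing an event that is "before" or "meets" another ahead of it.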
