1.
Word sense disambiguation is important in many areas of natural language processing, including Internet search engines, machine translation, and text mining. However, traditional methods based on case frames are not effective at resolving context ambiguities that require information beyond a single sentence. This paper presents a new scheme for resolving context ambiguities using field association. The scope of case frames is generally restricted to one sentence, whereas the field association scheme can be applied to a set of sentences. The paper proposes a formal disambiguation algorithm that controls the scope over a variable number of sentences containing ambiguities and resolves the ambiguities by calculating the weight of fields. In the experiments, 52 English and 20 Chinese words are disambiguated by using 104,532 Chinese and 38,372 English field association terms. The accuracy of the proposed field association scheme for context ambiguities is 65% higher than that of the case frame method. On the SENSEVAL-2 corpus, the proposed scheme also shows better results than three other known methods, namely UNED-LS-U, IIT-2, and Relative-based.
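The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of choosing a sense by accumulating field weights over several sentences. The dictionary entries, the weights, the sense inventory, and the function names `field_weights` and `disambiguate` are all illustrative assumptions, not the authors' data or method.

```python
from collections import defaultdict

# Hypothetical field association (FA) dictionary: term -> (field, weight).
# These entries are illustrative assumptions, not data from the paper.
FA_TERMS = {
    "loan": ("finance", 1.0),
    "interest": ("finance", 0.5),
    "river": ("geography", 1.0),
    "fishing": ("geography", 0.5),
}

# Hypothetical sense inventory: ambiguous word -> {field: sense label}.
SENSES = {"bank": {"finance": "financial institution", "geography": "river side"}}


def field_weights(sentences):
    """Sum FA-term weights per field over a window of sentences."""
    totals = defaultdict(float)
    for sentence in sentences:
        for raw in sentence.lower().split():
            token = raw.strip(".,;:!?")
            if token in FA_TERMS:
                field, weight = FA_TERMS[token]
                totals[field] += weight
    return dict(totals)


def disambiguate(word, sentences):
    """Return the sense whose field has the largest accumulated weight, or None."""
    totals = field_weights(sentences)
    candidates = SENSES.get(word, {})
    scored = [(totals.get(field, 0.0), sense) for field, sense in candidates.items()]
    if not scored:
        return None
    best_weight, best_sense = max(scored)
    return best_sense if best_weight > 0 else None
```

In this toy setup, the sentence "He went to the bank." alone contains no FA term, so no sense is chosen; adding the following sentence "He needed a loan with low interest." supplies the evidence, which is the motivation for a scope wider than one sentence.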
2.
Uddin Md. Elmarhomy Elsayed Masao Kazuhiro Jun-ichi 《Information processing & management》2007,43(6):1793
Field Association (FA) terms are a limited set of discriminating terms that can specify document fields. The field of a document can be decided efficiently if the document contains many relevant FA terms. An earlier approach built an FA term dictionary using a WWW search engine, but the dictionary contained irrelevant FA terms because that approach extracted terms from whole documents. This paper proposes a new approach that extracts FA terms from passages (portions of a document's text) rather than from whole documents, and that extracts FA terms more accurately than the earlier approach. The proposed approach is evaluated on 38,372 articles from a large tagged corpus. According to the experimental results, the new approach appends about 24% more relevant FA terms to the earlier FA term dictionary and deletes around 32% of the irrelevant FA terms. Moreover, the new approach achieves precision and recall of 98% and 94%, respectively.
3.
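With the continuous development of science and technology worldwide, how to grasp the overall state of technological development, understand its direction, and forecast its trends has become one of the important questions of growing concern to national governments, large and medium-sized enterprises, and research institutions. Against this background, the technology roadmap method emerged and has become one of the core tools in technology assessment and forecasting. Centered on bibliometric methods, this paper combines qualitative and quantitative analysis and introduces expert judgment on top of quantitative data to build a technology roadmapping model suited to describing the current state of emerging technologies at the macro level and forecasting their development trends. In addition, the paper conducts an empirical study in the "electric vehicle" technology field, draws a global technology roadmap for that field, and verifies the validity and scientific soundness of the model.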
4.
《Information processing & management》2023,60(5):103416
Recently, graph neural networks (GNNs) have been widely used in sequential recommendation because of their powerful ability to capture high-order collaborative relations, which greatly improves recommendation performance. However, some existing GNN-based methods fail to make full use of multiple relevant features of nodes and ignore the impact of semantic association between nodes on extracting user preferences. To this end, we propose MASR, a multi-feature fused collaborative attention network that learns the temporal and positional features of nodes and measures the importance of these two features for analyzing the nodes’ dynamic patterns. In addition, we incorporate semantic-enriched contrastive learning into collaborative filtering to enhance the semantic association between nodes and reduce the noise from the structural neighborhood, which has a positive effect on sequential recommendation. Compared with the baseline models, MASR improves performance on the MovieLens, CDs and Beauty datasets by 2.0%, 2.1% and 1.7% respectively, demonstrating its effectiveness in sequential recommendation.
5.
Automatic text classification is the task of organizing documents into pre-determined classes, generally using machine learning algorithms. It is one of the most important methods for organizing and making use of the gigantic amounts of information that exist in unstructured textual format, and it is a widely studied research area of language processing and text mining. In traditional text classification, a document is represented as a bag of words in which the words (in other words, terms) are cut off from their finer context, i.e. their location in a sentence or in a document. Only the broader context of the document is used, with some type of term frequency information in the vector space. Consequently, the semantics of a word that can be inferred from the finer context of its location in a sentence and its relations with neighboring words are usually ignored. However, the meaning of words and the semantic connections between words, documents and even classes are important, since methods that capture semantics generally reach better classification performance. Several surveys have been published that analyze diverse approaches to traditional text classification. Most of these surveys cover the application of different semantic term relatedness methods in text classification to a certain degree. However, they do not specifically target semantic text classification algorithms and their advantages over traditional text classification. To fill this gap, we undertake a comprehensive discussion of semantic text classification vs. traditional text classification. This survey explores the past and recent advancements in semantic text classification and attempts to organize existing approaches under five fundamental categories: domain knowledge-based approaches, corpus-based approaches, deep learning based approaches, word/character sequence enhanced approaches, and linguistic enriched approaches. Furthermore, this survey highlights the advantages of semantic text classification algorithms over traditional text classification algorithms.
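A small illustration of the bag-of-words limitation described in this abstract: a minimal sketch in which `bag_of_words` is a hypothetical helper that keeps only term counts, so two sentences with opposite meanings become indistinguishable.

```python
from collections import Counter


def bag_of_words(text):
    """Map a text to term counts, discarding word order and position."""
    return Counter(text.lower().split())


a = bag_of_words("dog bites man")
b = bag_of_words("man bites dog")

# Both sentences contain the same terms with the same frequencies,
# so their bag-of-words representations are identical.
print(a == b)  # True
```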
6.
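An automatic text classification method based on word context vectors. Cited by: 1 (self-citations: 0, citations by others: 1)
The paper analyzes the shortcomings of traditional automatic text classification methods, the meaning of word context vectors, and their role in automatic classification, and proposes an automatic text classification method based on word context vectors. The method uses word context vectors to generate the class center vectors of the classifier and the text vectors of the texts to be classified, which improves classification quality to some extent.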
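The abstract names the ingredients (word context vectors, class center vectors, text vectors) but not how they are computed. The sketch below is one plausible reading under stated assumptions: context vectors are taken to be co-occurrence counts within a fixed window, a text vector is the sum of its words' context vectors, a class center is the sum of its training text vectors, and cosine similarity picks the class. None of these choices is confirmed by the source, and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def context_vectors(docs, window=2):
    """Build a co-occurrence count vector for every word in the tokenized docs."""
    vectors = defaultdict(Counter)
    for tokens in docs:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def text_vector(tokens, vectors):
    """Represent a text as the sum of the context vectors of its words."""
    total = Counter()
    for word in tokens:
        total.update(vectors.get(word, {}))
    return total


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def train_centroids(labeled_docs, vectors):
    """Sum the text vectors of each class into one class center vector."""
    centroids = defaultdict(Counter)
    for label, tokens in labeled_docs:
        centroids[label].update(text_vector(tokens, vectors))
    return centroids


def classify(tokens, centroids, vectors):
    """Assign the class whose center vector is most similar to the text vector."""
    doc = text_vector(tokens, vectors)
    return max(centroids, key=lambda label: cosine(doc, centroids[label]))


# Toy training data (illustrative only).
training = [
    ("sports", ["team", "wins", "match"]),
    ("sports", ["player", "scores", "goal"]),
    ("finance", ["bank", "raises", "rates"]),
    ("finance", ["market", "loses", "value"]),
]
vectors = context_vectors([tokens for _, tokens in training])
centroids = train_centroids(training, vectors)
```

Because cosine similarity ignores vector length, summing the text vectors of a class gives the same ranking as averaging them.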
7.
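Under the influence of rapidly developing network technology, library science has increasingly emphasized information technology. Through an analysis of automatic text classification technology, the article points out that automatic classification cannot replace library classification for classifying printed books, and that the further development of automatic classification needs the support of library scientists.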
8.
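Reducing the systemic risk that arises between banks and green innovation enterprises has become an urgent problem for China's current economic development. Taking banks and green innovation enterprises as the two types of agents under study, this paper builds an artificial bank-enterprise system and uses simulation experiments to examine the behavior of its key agents. Specifically, it focuses on the construction and updating of balance sheets, the dynamic evolution of agent behavior, the updating of liquid assets, and the default liquidation mechanism, and uses simulation to explore how the form and intensity of green innovation subsidy policies affect the systemic risk and total returns of the bank-enterprise system, so as to validate the theoretical model comprehensively. The significance of the conclusions is that, by focusing on the intrinsic relationship between green innovation subsidy policy and bank-enterprise systemic risk and validating and extending it through computational experiments, the study offers a theoretical reference and new perspectives for promoting the healthy development of China's green innovation enterprises and for helping the government and banks guard against the systemic risk that such enterprises may cause.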
9.
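Based on the degree to which science and technology service organizations are involved in "scientific and technological activities" and "industrial activities", this multi-case study identifies four types of organization and summarizes the corresponding combinations of supporting capabilities. The findings are as follows. First, classified by the degree of symbiosis between science and technology service organizations and different actors along the innovation chain, such organizations fall into forward-symbiotic, backward-symbiotic, two-way symbiotic, and platform-building types. Second, the study summarizes the combinations of leading and auxiliary supporting capabilities required by each type; the leading capabilities are, respectively, "knowledge collaboration capability", "industrial symbiosis capability", "comprehensive collaboration capability", and "information aggregation capability". Third, in an open economy, all types of science and technology service organization need to attach great importance to "international connection capability".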