Found 19 similar documents; search took 171 ms
1.
2.
3.
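This article introduces the basic principles of speech recognition and some principles and methods for implementing speech recognition algorithms on the DSK6713, and describes techniques for implementing speech recognition on a DSP. The system uses Mel-frequency cepstral coefficients (MFCC) as feature parameters and matches speech parameters with dynamic time warping (DTW), an algorithm that is relatively simple and computationally light. The DTW algorithm was first simulated in MATLAB, and the speech recognition technique was then ported to the DSP. Experimental results show that recognition works fairly well for speaker-dependent, small-vocabulary, isolated-word speech.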
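The template-matching step described in this abstract can be sketched as follows. This is a minimal illustration of textbook DTW over sequences of feature vectors, not the article's DSP implementation; the function names `dtw_distance` and `recognize`, the Euclidean local distance, and the unconstrained step pattern are all assumptions made for the example.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    a, b: lists of equal-dimension vectors (for example, MFCC frames).
    Local distance is Euclidean; allowed steps are match, insert, delete.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def recognize(query, templates):
    """Return the label of the stored template closest to the query."""
    return min(templates, key=lambda word: dtw_distance(query, templates[word]))
```

For an isolated-word recognizer, `templates` would map each vocabulary word to one recorded feature sequence from the target speaker, which is why the approach suits speaker-dependent, small-vocabulary tasks.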
4.
5.
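This article describes the main functions of a monitoring system for synchronized radio and television broadcasting, the overall system framework, the main functions of each subsystem, and the key audio-comparison technology, which uses MFCC (Mel-frequency cepstral coefficients) for feature extraction and hidden Markov models (HMM) as the basic speech model.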
6.
7.
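Common speech recognition algorithms have poor noise robustness and cannot meet the needs of music retrieval systems. To address this, this paper optimizes the MFCC speech recognition algorithm for noise robustness. First, the MFCC feature parameters are weighted using the F-ratio method, which estimates the contribution of each feature dimension to the recognition rate, and the relevant components are extracted. Principal component analysis is then applied to reduce the dimensionality of the extracted components, lowering computational complexity, reducing data storage, shortening training time, and ultimately improving recognition efficiency. Simulation results show that the proposed noise-robust MFCC algorithm has good noise robustness and achieves higher accuracy in music retrieval than the conventional MFCC algorithm.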
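The F-ratio weighting step can be illustrated with a small sketch. It assumes the usual definition of the F-ratio for a feature dimension (variance of the per-class means divided by the average within-class variance); the abstract does not give the paper's exact formula, and the function name `f_ratio_weights` and the normalization to a sum of 1 are choices made for this example. The subsequent PCA step is not shown.

```python
from statistics import mean, pvariance


def f_ratio_weights(classes):
    """Per-dimension F-ratio weights, normalized to sum to 1.

    classes: dict mapping a class label to a list of feature vectors.
    F-ratio (assumed definition) = variance of class means
                                   / mean within-class variance.
    """
    dims = len(next(iter(classes.values()))[0])
    ratios = []
    for d in range(dims):
        # Values of dimension d, grouped by class
        per_class = [[vec[d] for vec in vecs] for vecs in classes.values()]
        between = pvariance([mean(values) for values in per_class])
        within = mean([pvariance(values) for values in per_class])
        ratios.append(between / within if within > 0 else 0.0)
    total = sum(ratios)
    return [r / total for r in ratios] if total > 0 else ratios
```

A dimension whose class means are far apart relative to its spread inside each class receives a large weight; a dimension that varies equally in every class receives a weight near zero.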
8.
9.
10.
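Mel-frequency cepstral coefficients (MFCC), which reflect the perceptual characteristics of human hearing, are used as speech feature parameters, and an MFCC-based vector quantization (VQ) recognition method is studied. The recognition rate using MFCC alone is compared with that using MFCC combined with ΔMFCC. Experimental results show that, after cepstral liftering of the speaker feature parameters, combining MFCC with ΔMFCC distinguishes different speakers better.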
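The two feature operations named in this abstract, cepstral liftering and ΔMFCC, can be sketched as below. The abstract does not state which lifter or delta window the authors used, so the sinusoidal lifter with L = 22, the regression window N = 2, the edge handling by frame repetition, and the names `lifter` and `delta` are assumptions for illustration only.

```python
import math


def lifter(frame, L=22):
    """Sinusoidal cepstral lifter: c'_k = (1 + (L/2) * sin(pi*k/L)) * c_k.

    Index k starts at 0 for the first coefficient in the frame (assumed).
    """
    return [(1 + (L / 2) * math.sin(math.pi * k / L)) * c
            for k, c in enumerate(frame)]


def delta(frames, N=2):
    """Regression-based delta coefficients over a window of +/- N frames.

    d_t = sum_{n=1..N} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..N} n^2)
    Frames beyond either end are replaced by the first or last frame.
    """
    T = len(frames)
    denom = 2 * sum(n * n for n in range(1, N + 1))
    out = []
    for t in range(T):
        row = []
        for k in range(len(frames[0])):
            num = sum(n * (frames[min(t + n, T - 1)][k]
                           - frames[max(t - n, 0)][k])
                      for n in range(1, N + 1))
            row.append(num / denom)
        out.append(row)
    return out
```

A combined feature vector is then formed per frame by concatenating the liftered MFCC frame with its delta frame, which adds information about how the spectrum changes over time.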
11.
12.
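With the development of speech recognition technology, isolated-word, small-vocabulary speech recognition systems have come into wide use in daily life. This paper proposes a DSP-based real-time isolated-word speech recognition system and applies dynamic time warping in the recognition algorithm. Based on the characteristics of building control systems and the BACnet network protocol, the system is designed as an embedded subsystem of a BACnet device, bringing speech recognition to building control. Combining fast hardware with an efficient algorithm, the system makes building control more real-time and convenient.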
13.
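By analyzing the characteristics of speech feature parameters and the basic methods of speaker recognition, a voice-controlled lock is implemented on a Sunplus microcontroller, using linear prediction cepstral coefficients for feature extraction and hidden Markov models for modeling. Experimental results show that the system is stable and recognizes well.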
14.
Mohammad Abd-Alrahman Mahmoud Abushariah Raja Noor Ainon Roziati Zainuddin Assal Ali Mustafa Alqudah Moustafa Elshafei Ahmed Othman Omran Khalifa 《Journal of The Franklin Institute》2012,349(7):2215-2242
This paper presents our work towards developing a new speech corpus for Modern Standard Arabic (MSA), which can be used for implementing and evaluating Arabic speaker-independent, large vocabulary, automatic, and continuous speech recognition systems. The speech corpus was recorded by 40 (20 male and 20 female) Arabic native speakers from 11 countries representing three major regions (Levant, Gulf, and Africa). Three development phases were conducted based on the size of training data, Gaussian mixture distributions, and tied states (senones). Based on our third development phase using 11 hours of training speech data, the acoustic model is composed of 16 Gaussian mixture distributions and the state distributions tied to 300 senones. Using three different data sets, the third development phase obtained 94.32% and 8.10% average word recognition correctness rate and average Word Error Rate (WER), respectively, for same speakers with different sentences (testing sentences). For different speakers with same sentences (training sentences), this work obtained 98.10% and 2.67% average word recognition correctness rate and average WER, respectively, whereas for different speakers with different sentences (testing sentences) this work obtained 93.73% and 8.75% average word recognition correctness rate and average WER, respectively.
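The two metrics reported above need not sum to 100%. Under the commonly used definitions, which are assumed here because the abstract does not spell them out, correctness is (N - S - D) / N and WER is (S + D + I) / N, where N is the number of reference words and S, D, I are substitutions, deletions and insertions. Insertions count against WER only, which would explain a pair such as 94.32% and 8.10%. A minimal sketch, with hypothetical function names:

```python
def align_counts(ref, hyp):
    """Return (substitutions, deletions, insertions) of a minimum-edit alignment."""
    n, m = len(ref), len(hyp)
    # table[i][j] = (total_edits, S, D, I) for ref[:i] versus hyp[:j]
    table = [[None] * (m + 1) for _ in range(n + 1)]
    table[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        table[i][0] = (i, 0, i, 0)   # delete every reference word
    for j in range(1, m + 1):
        table[0][j] = (j, 0, 0, j)   # insert every hypothesis word
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c, s, d, k = table[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (c, s, d, k)              # match, no edit
            else:
                diag = (c + 1, s + 1, d, k)      # substitution
            c, s, d, k = table[i - 1][j]
            deletion = (c + 1, s, d + 1, k)
            c, s, d, k = table[i][j - 1]
            insertion = (c + 1, s, d, k + 1)
            # Tuples compare by total edit count first
            table[i][j] = min(diag, deletion, insertion)
    return table[n][m][1:]


def wer_and_correctness(ref_text, hyp_text):
    """Word error rate and word correctness for one sentence pair."""
    ref, hyp = ref_text.split(), hyp_text.split()
    s, d, i = align_counts(ref, hyp)
    n = len(ref)
    return (s + d + i) / n, (n - s - d) / n
```

For the reference "the cat sat on the mat" and the hypothesis "the cat sat on on the hat", the alignment has one substitution and one insertion, so WER is 2/6 while correctness is 5/6.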
15.
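Korean is a minority language in China that is used fairly widely and has a relatively large number of speakers. Speech recognition software for Korean emergency call numbers uses voice commands to place calls, can accurately recognize the number to dial, and can play a vital role in certain situations. With the voice command set restricted to words such as police assistance and fire alarm, the software algorithm part of a Korean emergency-call-number speech recognition system is implemented. After in-depth analysis of each signal-processing step and of the connected-digit problem in Korean, the DTW (dynamic time warping) algorithm is chosen as the core algorithm. Matlab experiments show that the recognition process and algorithm can accurately recognize recorded Korean emergency call numbers.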
16.
《Information processing & management》2020,57(3):102087
With the rapid development of mobile computing and Web technologies, online hate speech has spread increasingly on social network platforms, because it is easy to post any opinion. Previous studies confirm that exposure to online hate speech has serious offline consequences for historically deprived communities. Thus, research on automated hate speech detection has attracted much attention. However, the role of social networks in identifying communities vulnerable to hate is not well investigated. Hate speech can affect all population groups, but some are more vulnerable to its impact than others. For example, for ethnic groups whose languages have few computational resources, it is a challenge to automatically collect and process online texts, not to mention automatic hate speech detection on social media. In this paper, we propose a hate speech detection approach to identify hatred against vulnerable minority groups on social media. Firstly, in the Spark distributed processing framework, posts are automatically collected and pre-processed, and features are extracted using word n-grams and word embedding techniques such as Word2Vec. Secondly, deep learning algorithms for classification such as the Gated Recurrent Unit (GRU), a variant of Recurrent Neural Networks (RNNs), are used for hate speech detection. Finally, hate words are clustered with methods such as Word2Vec to predict the potential target ethnic group for hatred. In our experiments, we use the Amharic language in Ethiopia as an example. Since there was no publicly available dataset for Amharic texts, we crawled Facebook pages to prepare the corpus. Since data annotation could be biased by culture, we recruited annotators from different cultural backgrounds and achieved better inter-annotator agreement.
In our experimental results, feature extraction using word embedding techniques such as Word2Vec performs better in both classical and deep learning-based classification algorithms for hate speech detection, among which GRU achieves the best result. Our proposed approach can successfully identify the Tigre ethnic group as the highly vulnerable community in terms of hatred compared with Amhara and Oromo. As a result, identifying groups vulnerable to hatred is vital to protecting them, by applying an automatic hate speech detection model to remove content that aggravates psychological harm and physical conflicts. This can also support the development of policies, strategies, and tools to empower and protect vulnerable communities.
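As an illustration of the word n-gram features mentioned in this abstract, the sketch below counts unigrams and bigrams in a post. It is not the paper's Spark pipeline: the function name `word_ngrams`, lowercasing, and whitespace tokenization are simplifying assumptions, and real Amharic text would likely need language-specific normalization.

```python
from collections import Counter


def word_ngrams(text, n_max=2):
    """Count word n-grams of length 1..n_max in a whitespace-tokenized text."""
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts
```

Each post's counts can be turned into a sparse feature vector over a fixed vocabulary of n-grams and passed to a classifier, alongside or instead of embedding-based features.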
17.
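Taking advantage of the Sunplus SPCE061A microcontroller's strengths in speech processing, a speech-recognition robot control system based on the SPCE061A is designed and built. After training, the robot can perform a series of entertaining actions in response to the trainer's commands.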
18.
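In noise-robust speech recognition research, models obtained with the parallel model combination (PMC) method can in theory approach the performance of models matched to the noise environment, which has made PMC an important research direction. This paper first proposes MFCC_FWD_BWD, a feature based on forward and backward difference dynamic parameters, which satisfies PMC's requirement that the feature construction matrix be invertible. On this basis, it proposes a new model for PMC, the parallel sub-state hidden Markov model (PSSHMM), in which each state contains parallel sub-states with transitions between them. Experiments show that PSSHMM achieves good recognition results under various noise types and SNRs, and is particularly robust to non-stationary noise.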