Posts by Collection

portfolio

publications

ParsiNorm: A Persian Toolkit for Speech Processing Normalization

2021 7th International Conference on Signal Processing and Intelligent Systems (ICSPIS), 2021

This paper presents a Persian-language pre-processing tool with two separate parts: general normalization and text-to-speech normalization. The general part provides several normalization functions that can be used regardless of the task domain. The text-to-speech part converts each text into the form in which it is read aloud; for example, the date 1996/10/01 is rendered as "the tenth of January nineteen ninety six".

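To make the text-to-speech normalization concrete, here is a minimal sketch of the date-expansion idea. It is an illustrative toy, not the ParsiNorm API: the function names and the assumed year/day/month field order are chosen only to reproduce the example above in English rather than Persian.

```python
# Illustrative toy, not the ParsiNorm API: expands a year/day/month date
# string into spoken English words, mirroring the example above.
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
ORDINAL_EXCEPTIONS = {"one": "first", "two": "second", "three": "third",
                      "five": "fifth", "eight": "eighth", "nine": "ninth",
                      "twelve": "twelfth"}

def two_digits(n: int) -> str:
    """Spell out an integer in the range 0-99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] + (" " + ONES[ones] if ones else "")

def year_to_words(year: int) -> str:
    """Read a four-digit year as two pairs: 1996 -> 'nineteen ninety six'."""
    high, low = divmod(year, 100)
    return f"{two_digits(high)} {two_digits(low)}"

def day_ordinal(day: int) -> str:
    """Turn a day of the month into an ordinal word: 10 -> 'tenth'."""
    head, _, last = two_digits(day).rpartition(" ")
    if last in ORDINAL_EXCEPTIONS:
        last = ORDINAL_EXCEPTIONS[last]
    elif last.endswith("y"):
        last = last[:-1] + "ieth"
    else:
        last += "th"
    return f"{head} {last}".strip()

def normalize_date(text: str) -> str:
    """Expand a 'YYYY/DD/MM' date (field order assumed from the example)."""
    year, day, month = (int(part) for part in text.split("/"))
    return f"the {day_ordinal(day)} of {MONTHS[month - 1]} {year_to_words(year)}"

print(normalize_date("1996/10/01"))  # the tenth of January nineteen ninety six
```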

Using a Pre-Trained Language Model for Context-Aware Error Detection and Correction in Persian language

arXiv, 2024

This paper presents a Persian spell checker called Virastman, which aims to detect and correct non-word and real-word errors in a sentence. A state-of-the-art method based on sequence labeling with BERT detects real-word errors using a small, artificially generated dataset. An unsupervised model based on BERT corrects errors by calculating the probability of each candidate in the sentence (including the detected word); a highly probable candidate is selected as the correct word if conditions based on two thresholds, named alpha and beta, are met. Our experiments across six distinct test sets underscore the notable superiority of our methodology in detecting and correcting real-word and non-word errors compared to the baselines. More specifically, our approach yields an average improvement of 3.41% in error detection and a substantial average improvement of 15% in error correction under the F0.5 metric, surpassing contemporary baselines and establishing our method as the state of the art for error detection and correction.
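The correction step can be pictured as masked-language-model candidate scoring. The snippet below is a hedged sketch, not the paper's implementation: the Persian BERT checkpoint, the single-token-candidate assumption, and the exact alpha/beta decision rule are all stand-ins.

```python
# Minimal sketch of masked-LM candidate scoring with an assumed alpha/beta rule.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL_NAME = "HooshvareLab/bert-fa-base-uncased"  # assumed Persian BERT; any masked LM works
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME).eval()

def candidate_probs(tokens, index, candidates):
    """Masked-LM probability of each candidate word at position `index`.

    Assumes each candidate maps to a single vocabulary token."""
    masked = list(tokens)
    masked[index] = tokenizer.mask_token
    inputs = tokenizer(" ".join(masked), return_tensors="pt")
    mask_pos = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero().item()
    with torch.no_grad():
        probs = model(**inputs).logits[0, mask_pos].softmax(-1)
    return {c: probs[tokenizer.convert_tokens_to_ids(c)].item() for c in candidates}

def correct(tokens, index, candidates, alpha=2.0, beta=1e-3):
    """Replace tokens[index] with the best candidate if it is clearly more probable.

    The alpha/beta rule is an assumption: the winner must beat the original
    word's probability by a factor of alpha and exceed an absolute floor beta;
    the paper's exact conditions may differ."""
    scores = candidate_probs(tokens, index, list(candidates) + [tokens[index]])
    original_p = scores.pop(tokens[index])
    best, best_p = max(scores.items(), key=lambda kv: kv[1])
    if best_p > alpha * original_p and best_p > beta:
        tokens = list(tokens)
        tokens[index] = best
    return tokens
```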

Cost-Effective Development of Custom Wake Word Detection Models for Low-Resource Languages in Embedded Devices

arXiv, 2024

Creating a reliable wake word detection system for custom wake words poses a significant challenge, particularly in low-resource languages where the scarcity of available data sources is a major hurdle. Moreover, collecting an adequately voluminous dataset that includes both positive and negative samples entails substantial financial costs and significant time expenditures. To address this problem, we propose a cost-efficient approach to enrich a small set of collected custom samples. We provide a range of techniques for preprocessing, data augmentation, and noise synthesis to expand the positive samples. In addition, we automatically extract specifically chosen negative samples from an existing speech dataset. The augmented data is then used to train a neural-network-based detector with Mycroft Precise. The results demonstrate production-grade performance, making the approach broadly applicable to embedded devices and custom virtual assistants.
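The augmentation stage can be pictured with a short waveform-level sketch. The transforms below (noise mixing at a random SNR, random time shift, gain perturbation) and their parameter ranges are generic assumptions, not the paper's exact recipe; the resulting variants would then feed a Mycroft Precise training run.

```python
# Generic sketch of waveform-level augmentation; parameters are assumptions.
# Waveforms are float arrays in [-1, 1], assumed sampled at 16 kHz.
import numpy as np

rng = np.random.default_rng(0)

def mix_noise(clean: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Mix `noise` into `clean` so the result has roughly the requested SNR."""
    noise = np.resize(noise, clean.shape)                 # loop/trim to length
    clean_power = np.mean(clean ** 2) + 1e-12
    noise_power = np.mean(noise ** 2) + 1e-12
    scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * noise

def random_shift(wave: np.ndarray, max_shift: int) -> np.ndarray:
    """Shift the waveform by a random number of samples, padding with silence."""
    shift = int(rng.integers(-max_shift, max_shift + 1))
    out = np.zeros_like(wave)
    if shift >= 0:
        out[shift:] = wave[:len(wave) - shift]
    else:
        out[:shift] = wave[-shift:]
    return out

def random_gain(wave: np.ndarray, low_db: float = -6.0, high_db: float = 6.0) -> np.ndarray:
    """Scale the waveform by a random gain drawn in decibels."""
    return wave * 10 ** (rng.uniform(low_db, high_db) / 20)

def augment(sample: np.ndarray, noises: list, n_variants: int = 10) -> list:
    """Produce several augmented copies of one positive wake-word recording."""
    variants = []
    for _ in range(n_variants):
        noise = noises[rng.integers(len(noises))]
        wave = mix_noise(sample, noise, snr_db=float(rng.uniform(0, 20)))
        wave = random_shift(wave, max_shift=1600)         # up to ~0.1 s at 16 kHz
        wave = random_gain(wave)
        variants.append(np.clip(wave, -1.0, 1.0))
    return variants
```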

talks

teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.