Embedded Offline ASR Engine Development For English Language

This project uses Python and Mozilla's open-source DeepSpeech ASR (automatic speech recognition) engine on different platforms, such as the Raspberry Pi 4 (1 GB), Nvidia Jetson Nano, Windows PC, Linux PC, Samsung Galaxy A50, and Huawei P20, to develop a refined ASR engine for the English language with the functionality listed below. Source code, instructions, and API documentation are to be delivered once development is complete. The Deep Speech architecture is an end-to-end trainable, character-level, deep recurrent neural network (RNN): it takes audio features as input and outputs characters directly, that is, the transcription of the audio. Its recurrent layers use LSTM (long short-term memory) cells rather than GRU (gated recurrent unit) cells. The project targets a word error rate (WER) below 6% overall, and below 3% for key phrases and keywords, close to human-level performance for English.
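Since the targets above are stated as word error rates, a minimal sketch of how WER is commonly computed may help when checking deliverables: the word-level edit distance between reference and hypothesis, divided by the number of reference words. The function name and the lowercase, whitespace-based tokenization are assumptions for illustration, not part of DeepSpeech.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumption: case-insensitive comparison, words split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return prev[-1] / len(ref)
```

For example, scoring the hypothesis "turn off the light" against the reference "turn on the light" gives one substitution over four reference words, a WER of 25%.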

1. Use the latest Mozilla DeepSpeech ASR engine, which ships with a .tflite (TensorFlow Lite) model and runs faster than real time on a single core of a Raspberry Pi 4, and build our own audio transcription application with a hot-word detection function.

2. Generate highly accurate confidence scores. It must be possible to retrieve a confidence score for each English word and for each English sentence in a transcription (word level and sentence level), and the scores must be highly reliable.

3. Support a keyword spotting mode that recognizes key phrases in a continuous stream. The user can configure a list of key phrases to search for and specify a detection threshold for each of them. This mode must work reliably on a continuous speech stream and be usable for keyword activation. It should be equivalent to the pocketsphinx -kws and -keyphrase options (the corresponding methods are ps_set_kws and ps_set_keyphrase).

4. Accurately recognize at least a thousand command-and-control phrases that the user can define in a simple text editor, in keyword spotting mode or keyword activation mode (same as requirement 3, with a WER below 3%).

5. Mozilla's DeepSpeech has a word error rate of 6.5% on the LibriSpeech test-clean set. This project aims to improve that to below 6%, and to below 3% for key phrases and keywords, close to human-level performance.

6. Use automatic phoneme alignment, VAD (voice activity detection), and other methods to detect the start time, end time, and position of each recognized phoneme, word, and sentence, with an output data structure that can be read in real time and holds the complete set of phoneme position information. Forced alignment is the process by which orthographic transcriptions are aligned to audio recordings to automatically generate phone-level segmentation, as described at the following link: [login to view URL].

When developing an ASR system, “good initial estimates … are essential” when training Gaussian Mixture Model (GMM) parameters (Rabiner and Juang, 1993, p. 370). Phoneme location information is also critical when building concatenative text-to-speech systems.

7. Implement the CTC (connectionist temporal classification) decoder as an important optimization by integrating an appropriate language model into the decoder.
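As a baseline for requirement 7, here is a minimal sketch of greedy CTC decoding without a language model: take the most likely symbol per frame, collapse consecutive repeats, then drop blanks. This is not DeepSpeech's decoder. The alphabet, the position of the blank symbol, and the function name are assumptions for illustration. A decoder with an integrated language model would replace the per-frame argmax with a beam search that scores candidate prefixes using both acoustic and language model probabilities.

```python
from typing import List, Optional, Sequence

# Assumed character set; the real alphabet depends on the trained model.
ALPHABET = " abcdefghijklmnopqrstuvwxyz'"
# Assumed convention: the CTC blank is the last output index.
BLANK = len(ALPHABET)


def ctc_greedy_decode(frame_probs: Sequence[Sequence[float]]) -> str:
    """Decode per-frame symbol probabilities into text (no language model)."""
    # Step 1: most likely symbol index for each frame.
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in frame_probs]

    # Step 2: collapse consecutive repeats, then drop blanks.
    chars: List[str] = []
    prev: Optional[int] = None
    for idx in best:
        if idx != prev and idx != BLANK:
            chars.append(ALPHABET[idx])
        prev = idx
    return "".join(chars)
```

The blank symbol is what allows doubled letters: the two "l" frames in "hello" must be separated by a blank frame, otherwise they collapse into a single "l".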

Please see the attached Word file, especially the attached articles and video, for the project details. We can discuss the work requirements further in the messenger.
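As a starting point for discussing requirements 3 and 4, the sketch below shows one way a user-editable key phrase list with per-phrase thresholds could be parsed and applied to recognized words. The "phrase /threshold/" line format is modeled on PocketSphinx keyword lists, but here the threshold is assumed to be a minimum word confidence between 0 and 1, and the (word, confidence) input is a hypothetical output format. The function names are illustrative and not part of any library.

```python
import re
from typing import Dict, List, Tuple

# Assumed line format: "turn on the light /0.8/"
_LINE = re.compile(r"^(?P<phrase>.+?)\s*/(?P<threshold>[^/]+)/\s*$")


def parse_keyphrases(text: str) -> Dict[str, float]:
    """Parse one key phrase and its detection threshold per line."""
    phrases: Dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        match = _LINE.match(line)
        if match is None:
            raise ValueError(f"bad key phrase line: {raw!r}")
        phrases[match.group("phrase").lower()] = float(match.group("threshold"))
    return phrases


def spot(
    words: List[Tuple[str, float]], phrases: Dict[str, float]
) -> List[Tuple[str, int, float]]:
    """Return (phrase, start word index, score) for each detected phrase.

    The score of a match is the lowest confidence among its words; a match
    is reported only if that score reaches the phrase's threshold.
    """
    tokens = [word.lower() for word, _ in words]
    hits: List[Tuple[str, int, float]] = []
    for phrase, threshold in phrases.items():
        target = phrase.split()
        n = len(target)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == target:
                score = min(conf for _, conf in words[i:i + n])
                if score >= threshold:
                    hits.append((phrase, i, score))
    return hits
```

Using the weakest word as the phrase score is a deliberately conservative choice; averaging the word confidences would be an equally simple alternative.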

Skills: Python, TensorFlow


About the employer:
(2 reviews) Flushing, United States

Project ID: #24812913

10 freelancers are bidding an average of $494 for this job


Hello, Upon reading the job details I would say that all the required skills Python, C Programming and Tensorflow fall under my skills. I work on freelancer full time and I believe I can do this job if I get all the d...

$750 USD in 23 days
(4 reviews)

hello! i am python developer and scrape expert with rich experiences. i am interested in your scrape project. i have done many scrape project, [login to view URL] this is github li...

$500 USD in 15 days
(5 reviews)

Hello! I would love to help you reach your goals on-time and on-budget. I have extensive experience creating web platforms from simple informational sites, to high-performance Single Page Applications...

$382 USD in 7 days
(1 review)

Hi employer, How are you? I am a senior full stack developer who have career in IT part for 6 years over. I read the job posting carefully and I am absolutely sure that I can do the project very well. I have developed...

$500 USD in 7 days
(1 review)

Hello, I have read your job description carefully and I am confident I can finish your job without fail. I have full experience in R and MATLAB and have expertise in statistical analysis and machine learning data proc...

$500 USD in 7 days
(1 review)

Hey, glad to inform you that have done S-I-M-I-L-A-R project in past. Do you want to check the D-E-M-O ???? Thanks.

$500 USD in 7 days
(0 reviews)

Hello. Hope you are doing well. I have over 8years experiences with ML, AI such as image processing ,face detection, automation by using tensorflow , OpenCV, selenium webdriver. I have rich experiences with several pro...

$500 USD in 7 days
(0 reviews)

Hi, Dear Sir! I've seriously read your post and I have understood what you need. I am sure that I can be the best candidate who can perfectly complete your project (Embedded Offline ASR Engine Development For English L...

$555 USD in 6 days
(0 reviews)

Hi, I am an Electrical, Electronics and Embedded Engineer. A PCB Designer, Arduino/Raspberry Pi, ESP32, ESP8266 and internet of things expert. I read through the job description very carefully and I am absolutely sure...

$250 USD in 7 days
(0 reviews)

Hi, I'm expert in Linux administration and a web developer (network configuration, firewall, Database administration, system configuration, web server deployment, etc. ).also I have a development company and I have a g...

$500 USD in 7 days
(0 reviews)