AI Engineer: Automatic Speech Recognition (ASR)

Ikorodu, Lagos, Nigeria

About the job

Develop and optimize speech‑to‑text systems that support Awarri’s multilingual applications, focusing on Nigerian languages. Design acoustic and language models, integrate them into real‑time pipelines, and ensure robustness across noisy environments and varied accents.

Key Responsibilities

* Design, train, and fine-tune ASR models such as Conformer, Transducer, CTC, or seq2seq.

* Build ASR pipelines that work for low-resource languages with limited labeled data.

* Optimize models for low latency, noise robustness, accented speech, and domain performance.

* Build scalable training and inference pipelines using PyTorch and GPU acceleration.

* Create data preprocessing pipelines for segmentation, normalization, augmentation, and labeling.

* Develop and maintain phoneme lexicons, G2P models, and MFA alignment workflows.

* Build evaluation suites for WER, CER, latency, rare-word recall, and accented speech.

* Diagnose failure modes such as mapping issues and poor alignment, and handle orthographic inconsistencies and dialectal variation in underrepresented languages.

* Deploy ASR models to production with batching, streaming, and quantization optimizations.

* Monitor model drift, quality degradation, and real-time inference performance.

* Collaborate with product, linguistics, and ML teams to deliver production-ready ASR systems.

* Document experiments, model changes, and reproducibility workflows.

Person Profile

* Strong background in speech processing, signal processing, and deep learning.

* Hands-on experience training ASR models end-to-end.

* Skilled with PyTorch, GPU workflows, model optimization, and distributed training.

* Familiar with Conformer, RNN-T, CTC, seq2seq, and modern multilingual models.

* Comfortable with noisy, accented, or low-resource data environments.

* Experience with MFA, phoneme mapping, tokenization, and G2P tools.

* Strong debugging skills for alignment, segmentation, and data quality issues.

* Familiar with Whisper, MMS, wav2vec 2.0, HuBERT, and transfer learning.

* Able to write clean, production-ready code and build reliable ASR pipelines.

* Communicates clearly, flags blockers early, and owns tasks from idea to deployment.

Bonus: experience with African languages or other low-resource languages.
