AI/ML Engineer - Speech, RAG & Fine-Tuning

Rawalpindi, Pakistan

About the job

Job Title: AI/ML Engineer - Speech, RAG & Fine-Tuning

Location: Bahria Town, Phase 7, Rawalpindi
Employment Type: Full-time, Onsite (10 AM - 7 PM)

Job Description:

We are seeking a highly skilled AI/ML Engineer with expertise in speech-to-speech pipelines, open-source models, and LLM fine-tuning. The ideal candidate will work on designing, developing, and deploying cutting-edge speech and language AI solutions, integrating open-source frameworks with advanced fine-tuning methods to deliver production-ready systems.

Key Responsibilities:

Design and implement speech-to-speech pipelines using open-source models (Whisper, Wav2Vec, etc.).
Develop and optimize speech-to-text (STT) and text-to-speech (TTS) systems leveraging Coqui or similar frameworks.
Work with large language models (LLMs) such as LLaMA 2 and LLaMA 3 for NLP applications.
Apply LoRA and other parameter-efficient fine-tuning (PEFT) techniques to customize LLMs for domain-specific tasks.
Build and optimize Retrieval-Augmented Generation (RAG) systems for knowledge-grounded responses.
Develop and integrate agentic AI systems with reasoning and task automation capabilities.
Collaborate with cross-functional teams (data engineers, product managers, software developers) to deliver scalable AI solutions.
Monitor, evaluate, and optimize deployed AI models for accuracy, latency, and efficiency.

Requirements:

Strong experience in AI/ML model development with open-source speech and language models.
Hands-on experience with Whisper, Wav2Vec, and Coqui TTS/STT frameworks.
Proven track record with LLaMA 2, LLaMA 3, or similar LLMs.
Proficiency in parameter-efficient fine-tuning (PEFT) techniques such as LoRA.
Experience in RAG-based systems for knowledge retrieval and contextual response generation.
Familiarity with agentic AI frameworks for building task-oriented agents.
Strong programming skills in Python, including PyTorch and TensorFlow.
Experience with Hugging Face, LangChain, and vector databases (FAISS, Pinecone, Weaviate, etc.).
Knowledge of cloud platforms (AWS, GCP, Azure) and containerization (Docker, Kubernetes).
Strong problem-solving skills and ability to optimize model performance.

Preferred Qualifications:

Master’s or PhD in Computer Science, AI/ML, Data Science, or a related field.
Publications or projects in speech AI, LLM fine-tuning, or agentic AI.
Experience with distributed training and model deployment at scale.

What We Offer:

Lunch provided by the company.
Medical allowance.
Competitive compensation and growth opportunities.
Opportunity to work with state-of-the-art open-source AI models.
Collaborative environment with AI researchers and engineers.
