Senior AI/ML Engineer (GenAI & LLM Systems)
About the Role
TEKHQS is seeking a Senior AI/ML Engineer to design, fine-tune, and deploy production-grade Generative AI and LLM-powered systems. This role is ideal for engineers who have shipped real-world ML systems, understand modern transformer architectures, and can operate across the full ML lifecycle—from data and training to inference and optimization.
You will work on scalable AI platforms, enterprise-grade GenAI solutions, and intelligent systems integrated into Web, ERP, and enterprise workflows. This is a hands-on role with strong ownership and architectural influence.
Key Responsibilities
- Design, fine-tune, and optimize transformer-based models (GPT, LLaMA, Mistral, T5) for production use cases.
- Build and maintain end-to-end GenAI pipelines: data processing, training, evaluation, deployment, and monitoring.
- Implement Retrieval-Augmented Generation (RAG) systems using vector databases and hybrid search.
- Optimize inference for latency, throughput, and cost efficiency.
- Work with multi-modal AI (text, embeddings, images, and audio where applicable).
- Integrate AI services into enterprise applications, ERP systems, and SaaS platforms.
- Collaborate with product, backend, and cloud teams to deliver scalable AI solutions.
- Apply best practices in ML governance, security, and responsible AI.
Required Skills & Experience
Core AI / ML
- Strong experience with PyTorch and transformer architectures.
- Hands-on experience with LLMs, embeddings, fine-tuning (LoRA/QLoRA), and prompt engineering.
- Solid understanding of training vs inference tradeoffs, evaluation metrics, and model behavior.
GenAI & Systems
- Experience with RAG pipelines and vector databases (Pinecone, Weaviate, FAISS, Chroma).
- Familiarity with RLHF concepts (DPO, PPO, reward modeling) is a plus.
- Understanding of tokenization schemes (BPE, SentencePiece, tiktoken).
Model Optimization & Deployment
- Quantization and optimization techniques (GPTQ, AWQ, int8, fp16).
- Model serving using vLLM, Triton, Hugging Face TGI, or similar.
- Experience deploying models on AWS, Azure, or GCP.
Data & Infrastructure
- Distributed training or inference using DeepSpeed, FSDP, or Accelerate.
- Data pipelines using Parquet, WebDataset, or cloud storage.
- CI/CD for ML workflows.
Software Engineering
- Strong Python engineering practices.
- Docker and Kubernetes for ML workloads.
- Experience with monitoring, logging, and profiling ML systems.
Nice to Have
- Experience with ERP-integrated AI solutions (NetSuite, SAP, Dynamics).
- Exposure to multi-agent systems, orchestration frameworks, or AutoGen/LangGraph.
- Open-source contributions or published technical work.
Qualifications
- Bachelor's or Master's degree in Computer Science, AI, Data Science, or a related field.
- 4+ years of professional ML experience, with 3+ years in GenAI/LLMs.
- Proven experience deploying AI systems to production.
About TEKHQS
TEKHQS is a global technology solutions provider headquartered in Lake Forest, California, with a delivery team of 300+ professionals across Pakistan and other regions. We specialize in:
- Web & Mobile Development (Web 2.0)
- Blockchain & Web 3.0 Solutions
- AI/ML & Generative AI Systems
- ERP Services as a certified partner of SAP S/4HANA, Oracle NetSuite, and Microsoft Dynamics 365
We deliver enterprise-grade solutions across implementation, integration, customization, training, support, and staff augmentation.