Sreekar Reddy

Gen AI / ML Engineer · PNC Bank
sreekarreddy2591@gmail.com · (508) 306-1082 · Added Jun 04, 2026

Recruiter Analysis

Sr. Gen AI / ML Engineer

The candidate is a seasoned GenAI and ML engineer with substantial hands-on experience building and shipping production-scale LLM-driven systems in enterprise contexts. Strengths include architecting retrieval-augmented generation pipelines, designing stateful multi-agent orchestration with robust fallback and recovery logic, and operationalizing GenAI with CI/CD, Kubernetes, and observability tooling. They demonstrate practical expertise in prompt lifecycle management, model selection engines, and compliance-focused controls (PII/PHI masking, audit trails). Technical breadth spans modern LLMs (OpenAI, Claude, LLaMA, Gemini), vector databases (FAISS, ChromaDB, Weaviate, Pinecone), and orchestration/monitoring stacks (LangChain ecosystem, LangGraph, LangSmith, Prometheus, Grafana). The candidate also shows experience in multimodal pipelines (speech, OCR, vision) and integration with enterprise data platforms (Snowflake, BigQuery, Azure SQL). From a hiring perspective, they are well-suited for senior engineering or architect roles that require ownership of GenAI platforms and delivery to product teams. Areas to probe in interviews: depth of hands-on fine-tuning with QLoRA/SFT at scale, concrete performance/latency trade-offs made in production, and examples of measurable business impact tied to deployments. Overall, a pragmatic implementer with strong platform and orchestration skills for enterprise GenAI initiatives.

Technical Profile

Senior GenAI/ML engineer with 10+ years delivering production-grade LLM systems, agentic orchestration, RAG pipelines, and MLOps across enterprise environments. Deep Python full-stack background with hands-on experience in LangChain, LangGraph, FastAPI, Streamlit, vector DBs (FAISS, Weaviate, ChromaDB), and cloud-native deployments on Azure/AWS/GCP.
Projects demonstrate consistent delivery of enterprise-grade GenAI capabilities: multi-agent orchestration with state management and fallback logic, scalable RAG pipelines indexing millions of documents, multimodal ingestion (voice and OCR), and robust observability and evaluation tooling. The work goes beyond prototypes into operations, including CI/CD, Kubernetes-based deployment, model selection engines, and integrations with enterprise data systems. There is a strong emphasis on prompt lifecycle management, compliance controls, and model governance, which aligns with regulated enterprise requirements.
The core stack centers on Python; LangChain, AutoGen, and LangGraph for orchestration; FastAPI and Streamlit for APIs and experimentation UIs; vector DBs (FAISS, ChromaDB, Weaviate, Pinecone) for retrieval; and cloud MLOps using Kubernetes, Docker, MLflow, and Terraform across Azure/AWS/GCP. Observability and evaluation rely on LangSmith, PromptLayer, TruLens, Prometheus, and Grafana.

Skills

Primary
Generative AI / LLM engineering · Agentic / multi-agent orchestration · Retrieval-Augmented Generation (RAG) · Prompt engineering and prompt lifecycle management · MLOps and model deployment · Python full-stack engineering
Secondary
NLP and semantic search · Multimodal (voice, OCR, vision) pipelines · Model evaluation and observability · Security and compliance for AI (PII/PHI redaction) · Recommendation systems and personalization
Frameworks
FastAPI · Django · Streamlit · React · SQLAlchemy
Databases
PostgreSQL · Redis · Azure SQL · BigQuery · Snowflake · FAISS · ChromaDB · Weaviate · Pinecone · Azure Cognitive Search
Cloud
Azure · AWS · GCP · Azure ML Studio · GCP Vertex AI · AWS SageMaker

Work Experience

Gen AI / ML Engineer

PNC Bank
Designed and delivered enterprise GenAI platforms and agentic systems embedded into merchandising and business workflows. Built production-grade RAG pipelines, multi-agent orchestration graphs, and full-stack AI experimentation frameworks. Deployed scalable inference services and MLOps pipelines across cloud providers with observability, compliance controls, and prompt lifecycle management.
GPT-4 · GPT-4o · Claude 3 · LLaMA 3 · Gemini · LangChain · LangGraph · AutoGen · LangSmith · PromptLayer · TruLens · FAISS · ChromaDB · Weaviate · Pinecone · Azure Cognitive Search · FastAPI · Streamlit · PostgreSQL · Redis · vLLM · Kubernetes · Docker · Helm · MLflow · GitHub Actions · Terraform · Prometheus · Grafana · Azure ML Studio · GCP Vertex AI · Azure SQL · BigQuery · Snowflake · OpenAI embeddings · Unstructured.io · Whisper · OCR
  • Designed GenAI copilots for merchandising platforms supporting product comparison, pricing rules, and supplier dispute resolution.
  • Built RAG pipelines using FAISS, ChromaDB, Azure Cognitive Search and vector embeddings for high-confidence retrieval.
  • Developed dynamic orchestration graphs with LangGraph for stateful agents, tool use, fallback handling, and persistent memory.
  • Implemented evaluation and observability pipelines using LangSmith, TruLens, PromptLayer, Prometheus and Grafana.
  • Deployed inference microservices with FastAPI, vLLM, Kubernetes, Docker, Helm and CI/CD automation.
  • Engineered prompt management systems with versioning, temperature control, fallback logic and policy enforcement for auditability.
  • Integrated agents with enterprise data sources (Redis, Azure SQL, BigQuery, Snowflake) and external SaaS via secured API tool wrappers.
  • Implemented compliance controls (PII/PHI masking, prompt guards) and partnered with governance and infosec teams.
  • Built full-stack experimentation frameworks (Streamlit + FastAPI + PostgreSQL) for prompt testing and model comparison.
  • Constructed neural and hybrid search pipelines combining cosine-similarity vector retrieval with BM25 keyword scoring.
  • Developed model selection and dynamic routing engines to switch LLMs based on scoring, latency and cost thresholds.

Projects

GenAI Copilots for Merchandising

Lead GenAI Engineer
Embedded GenAI copilots into merchandising workflows to assist with product comparison, pricing rules, and supplier dispute resolution, and to generate actionable insights for analysts.
LangGraph · LangChain · FastAPI · Streamlit · PostgreSQL · Redis · GPT-4 · Claude 3 · ChromaDB · FAISS
  • Agent orchestration with tool invocation and persistent memory
  • RAG-based grounding for responses
  • Prompt versioning and lifecycle controls
  • Fallback routing across models based on cost and latency
  • Streamlit UI for analyst experimentation and analytics

Enterprise RAG & Vector Search Pipeline

Architect / Engineer
Built production RAG pipelines to ingest, chunk and embed millions of documents using Unstructured.io and OpenAI/Azure embeddings, indexing into vector stores for fast retrieval.
Unstructured.io · OpenAI embeddings · Azure embeddings · FAISS · Weaviate · ChromaDB · Pinecone · Azure Cognitive Search
  • Document ingestion and chunking
  • Embedding generation and indexing
  • Hybrid retrieval (cosine similarity + BM25)
  • Relevance scoring and retrieval tuning
  • Integration with LLM prompt pipelines

Multimodal Agents (Voice, OCR, Vision)

Senior Engineer
Developed multimodal agents that ingest voice, scanned documents and images using Whisper, OCR and Gemini Vision to extract, summarize and analyze enterprise data.
Whisper · Gemini Vision · OCR · LangChain · FastAPI
  • Speech-to-text ingestion and summarization
  • OCR extraction from PDFs and scanned forms
  • Multimodal grounding and summarization
  • Integration with agent workflows and RAG

AI Observability & Evaluation Platform

Engineer
Built pipelines and dashboards for prompt/response observability, model scoring, auto-evaluation and monitoring of hallucination/grounding metrics.
LangSmith · PromptLayer · TruLens · Prometheus · Grafana · PostgreSQL
  • Completion scoring across multiple dimensions (grounding, hallucination, coherence)
  • Real-time dashboards for model usage and routing
  • Logging, trace-based debugging and token cost monitoring
  • Red-teaming and human-in-the-loop review integration