Enterprise GenAI Engineer with 4+ years designing and deploying production-grade LLM systems for regulated enterprise workflows. I build AI-assisted finance and compliance systems with human-in-the-loop controls, auditability, and cost-aware architecture — all backed by a strong cloud and backend foundation.
Currently at Jio Platforms (Reliance), shipping LLM-assisted financial decision workflows with deterministic fallbacks, RAG over policy documents with role-based access, and a centralized LLM evaluation/governance framework with prompt versioning, drift detection, hallucination guards, and rollback.
Deep focus on retrieval quality (hybrid search + reranking), structured LLM I/O (JSON mode, Pydantic), trace-level observability (LangSmith / Langfuse-style), and semantic-cache-driven cost control.
Designed and deployed enterprise-grade backend systems for finance, procurement, and corporate services workflows, improving response times through architectural and performance optimizations.
Built automated financial decision workflows integrating ML-assisted classification with deterministic validation rules and SAP-based approval systems — auditable, human-in-the-loop processing for compliance-sensitive expense proposals.
Architected a scalable financial management platform that reduced procurement cycle times, designed for future commercialization and backed by strict data-integrity guarantees.
Developed event-driven services using RabbitMQ and Node.js to power real-time dashboards and notifications, improving information delivery while maintaining system reliability.
Shipped production GenAI surfaces — a RAG-grounded policy assistant and a centralized LLM Ops platform — to internal users, integrating prompt versioning, hallucination guards, and trace-level observability into the standard enterprise release lifecycle.
Implemented RBAC across internal platforms, enforcing strict data boundaries, and collaborated cross-functionally to deliver secure, scalable systems used company-wide.
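The pattern of ML-assisted classification gated by deterministic validation rules can be sketched as below. The field names, approval limit, and confidence cutoff are illustrative assumptions, not the production schema or thresholds:

```python
from dataclasses import dataclass

# Hypothetical expense proposal; field names are illustrative, not the real schema.
@dataclass
class ExpenseProposal:
    amount: float
    category: str
    ml_label: str         # classifier's suggested decision
    ml_confidence: float  # classifier's confidence in [0, 1]

def route_proposal(p: ExpenseProposal,
                   auto_approve_limit: float = 5000.0,
                   confidence_floor: float = 0.9) -> str:
    """Deterministic rules run first; the ML suggestion is trusted only
    above a confidence floor, and everything else goes to a human."""
    # Deterministic validation: hard rules no model output can override.
    if p.amount <= 0:
        return "reject"
    if p.amount > auto_approve_limit:
        return "human_review"  # high-value items always get a human approver
    # ML-assisted path: accept the classifier only when it is confident.
    if p.ml_confidence >= confidence_floor and p.ml_label == "approve":
        return "auto_approve"
    return "human_review"
```

In an auditable, compliance-sensitive deployment, every routed decision would also be written to an audit trail alongside the inputs that produced it.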
Enterprise Financial Policy Assistant
- Designed an LLM-assisted financial analysis system grounded in internal policy documents using RAG, constraining responses to approved enterprise knowledge.
- Implemented role-based document retrieval, metadata filtering, and structured prompt templates to prevent unauthorized access and reduce hallucinations.
- Improved retrieval quality with hybrid search (BM25 + dense vectors over pgvector) and a reranking stage; tuned recursive + semantic chunking to keep grounding tight on adversarial finance queries.
- Streamed structured JSON outputs (Pydantic-validated) directly into the approval console via SSE, with token-level streaming for sub-second perceived latency on long answers.
- Introduced confidence-based routing and human-in-the-loop approval flows with full audit trails of prompts, retrieved context, model outputs, and decisions.
- Evaluated reliability through curated test cases and production feedback, monitoring correctness, latency, and token usage to optimize cost-performance.
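The hybrid-search fusion step above can be sketched with reciprocal rank fusion (RRF), one common way to merge BM25 and dense-vector rankings before a reranking stage; the document IDs are illustrative, and the project's actual fusion method is not specified here:

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking.
    Each document scores 1 / (k + rank) in every list it appears in;
    k = 60 is the constant commonly used in the RRF literature."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Illustrative rankings from a BM25 pass and a dense-vector pass.
bm25_hits = ["policy_07", "policy_03", "policy_12"]
dense_hits = ["policy_07", "policy_09", "policy_03"]
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
```

The fused candidate list would then feed the reranking stage, which re-scores each query-document pair before the context is handed to the generator.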
LLM Evaluation, Monitoring & Governance
- Built a centralized framework for testing, monitoring, and governing LLM behavior across enterprise applications.
- Implemented prompt and configuration versioning, treating GenAI behavior as code with traceable changes and rollback capability.
- Designed automated response-quality scoring pipelines using LLM-as-a-Judge rubrics (correctness, faithfulness, policy compliance) plus heuristic checks to detect hallucinations, retrieval failures, and drift.
- Instrumented end-to-end trace logging across prompt → retrieval → generation → output (LangSmith / Langfuse-style spans) for post-hoc replay and root-cause analysis on failed runs.
- Monitored latency, token usage, and error rates, using semantic caching and per-tenant budgets to enforce cost and reliability targets for production GenAI systems.
