Dennis K

AI Infrastructure Engineer

I'm an AI Infrastructure Engineer specializing in building and scaling the backend systems that power large-scale machine learning. I bridge the gap between AI development and production by orchestrating distributed GPU clusters, tuning low-latency inference engines, and hardening core data infrastructure.

What I do:

Architecting high-performance GPU and AI clusters
Optimizing serving engines for low-latency production inference
Scaling distributed vector databases and data pipelines
Automating MLOps and infrastructure workflows for reliability

Services

GPU Cluster Orchestration

I provision and optimize multi-node GPU clusters using Kubernetes, managing resource allocation, VRAM utilization, and multi-tenant isolation.

High-Performance Inference Serving

I set up and tune serving engines like vLLM, TensorRT-LLM, and Triton Inference Server to minimize time to first token (TTFT) and maximize token throughput for production workloads.

Vector DB & RAG Storage Systems

I deploy and scale distributed vector databases like Weaviate, Milvus, or Qdrant, tuning indexing strategies and retrieval latency.

AI Infrastructure Pipelines (MLOps)

I build robust CI/CD and data engineering pipelines, automating model-weight distribution, checkpointing, and dynamic cluster autoscaling.

Distributed Training Infra

I architect infrastructure for model training and fine-tuning, configuring data-parallel and model-parallel jobs with Ray and DeepSpeed.

Compute & Cost Monitoring

I implement full-stack observability to track GPU metrics, prompt-cache hit rates, latency, and cloud compute spend.
