Running LLM OpenAI Open Source Model with vLLM and GPU NVIDIA L4

Running the openai/gpt-oss-20b model locally on an NVIDIA L4 GPU. This model can also run on a consumer RTX-series GPU with ~16GB of VRAM. I divided the guide into two parts: running manually and running in a container, both on Ubuntu 24.04 LTS.

Preparation

Installing drivers and dependencies

    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
    sudo dpkg -i cuda-keyring_1.1-1_all.deb && rm -rf cuda-keyring_1.1-1_all.deb
    sudo apt update && sudo apt install -y \
      linux-headers-$(uname -r) \
      libnvidia-compute-580 nvidia-dkms-580-open \
      datacenter-gpu-manager-4-cuda-all \
      datacenter-gpu-manager-exporter \
      cuda-toolkit nvtop build-essential

We need a host reboot to apply the GPU driver. ...
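After the reboot, a quick check like the following can confirm that the driver is loaded and that the card reports enough VRAM. This is a minimal sketch, not part of the original steps: the `vram_sufficient` helper and the 15000 MiB threshold (roughly the ~16GB mentioned above, with some slack) are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Assumed threshold in MiB: roughly 16GB with some slack (hypothetical value).
MIN_VRAM_MIB=15000

# Succeeds when the given VRAM size (in MiB) meets the assumed threshold.
vram_sufficient() {
  local total_mib="$1"
  [ "$total_mib" -ge "$MIN_VRAM_MIB" ]
}

if command -v nvidia-smi >/dev/null 2>&1; then
  # With "nounits", memory.total is printed as a bare number in MiB.
  total=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n 1)
  if vram_sufficient "$total"; then
    echo "GPU OK: ${total} MiB of VRAM"
  else
    echo "GPU below the assumed threshold: ${total} MiB of VRAM"
  fi
else
  echo "nvidia-smi not found; the driver may not be installed or loaded yet"
fi
```

If `nvidia-smi` is missing or reports an error, the driver installation or the reboot is the first thing to recheck.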

December 15, 2025 · 4 min · 681 words · Viki Pranata

Kubernetes Cluster RKE2 with Cilium eBPF CNI

Preparation

I used three VM nodes for this project, each with 8 cores, 8GB of memory, and 80GB of container storage, running Rocky Linux 9.6 with RKE2 v1.32.5+rke2r1 and Cilium v1.17.3.

Node Hostname     vCPU    Memory  Storage  PrivateNet    Node Roles
knode01master01   8 Core  8GB     80GB     172.16.0.211  Control-plane
knode01master02   8 Core  8GB     80GB     172.16.0.212  Control-plane
knode01master03   8 Core  8GB     80GB     172.16.0.213  Control-plane
knode01worker01   8 Core  8GB     80GB     172.16.0.211  Worker
knode01worker02   8 Core  8GB     80GB     172.16.0.212  Worker
knode01worker03   8 Core  8GB     80GB     172.16.0.213  Worker

All operations use the root user, so be careful when running commands!

This step is executed on all nodes ...

June 16, 2025 · 4 min · 686 words · Viki Pranata

Lightweight Kubernetes Cluster with Multi Masters K3S and CRI-O

Preparation

I used three VM nodes for this home lab project, each with 4 cores, 4GB of memory, and 20GB of container storage, running Rocky Linux 9.5 with Kubernetes v1.32.5+k3s1 and CRI-O v1.32.

Node Hostname  vCPU    Memory  Storage  PrivateNet
litekube01     4 Core  4GB     20GB     172.16.0.111
litekube02     4 Core  4GB     20GB     172.16.0.112
litekube03     4 Core  4GB     20GB     172.16.0.113

All operations use the root user, so be careful when running commands!

This step is executed on all nodes ...

June 6, 2025 · 3 min · 610 words · Viki Pranata