Licensed for use in conjunction with basebox only.
Server Preparation Guide
Latest Update: February 10, 2026
Document Version: 2.1.4
Quick Start Guide
For experienced system administrators preparing a server immediately:
- Install Ubuntu 24.04 LTS Server (Download)
- Install NVIDIA Drivers: sudo ubuntu-drivers autoinstall && sudo reboot
- Install CUDA Toolkit 13.0 (see Step 3 for details)
- Install Docker + NVIDIA Container Toolkit (see Steps 4-5)
- Install Kubernetes 1.33+ with GPU Operator (see Steps 6-7)
Contact basebox support: support@basebox.ai for deployment assistance
Note: This guide provides the recommended path for new installations. If you have existing infrastructure or different requirements, contact support@basebox.ai
TLDR
Recommended Operating System: Ubuntu 24.04 LTS
Key Requirements:
- Ubuntu 24.04 LTS Server (the recommended version for new installations)
- NVIDIA GPU with CUDA support (Compute Capability 7.0+)
- NVIDIA drivers compatible with CUDA 13.0+
- CUDA Toolkit 13.0 (always install the latest version)
- Docker Engine 24.0+ or containerd 2.0+
- Kubernetes 1.33+ with GPU Operator (required for production deployments)
- Minimum 8 CPU cores, 16GB RAM, 500GB storage (varies by deployment size)
- Network access for container image registry during installation phase (air-gapped deployments supported via offline image transfer - contact support@basebox.ai for details)
Note: Ubuntu 22.04 LTS is supported for existing systems. For new installations, always use Ubuntu 24.04 LTS.
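The Compute Capability requirement above can be verified once the NVIDIA driver is installed. A minimal sketch (the `compute_cap` query field is assumed to be available, which requires a reasonably recent driver; the sample value is illustrative):

```shell
# Hedged check: does the GPU's Compute Capability meet the 7.0 minimum?
# On a real host, capture the value with:
#   CC=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n1)
cc_ok() { awk -v cc="$1" 'BEGIN { exit !(cc + 0 >= 7.0) }'; }

CC="8.6"   # sample value; replace with the nvidia-smi query result above
if cc_ok "$CC"; then
  echo "Compute Capability $CC: supported"
else
  echo "Compute Capability $CC: below the 7.0 minimum"
fi
```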
Why Ubuntu 24.04 LTS:
- Best NVIDIA CUDA documentation and support
- Most comprehensive documentation for GPU workloads
- Longest support lifecycle (until April 2029)
- Extensive Docker and Kubernetes community resources
- Industry standard for GPU-accelerated workloads
Not Supported: Microsoft Windows (vLLM requires Linux; Windows Server lacks native GPU container support)
Verified Software Versions
This guide was last tested with the following software versions:
| Component | Version Tested | Latest Check | Last Verified |
|---|---|---|---|
| Ubuntu | 24.04 LTS | Download | Feb 2026 |
| NVIDIA Driver | 580.126.09 | Check | Feb 2026 |
| CUDA Toolkit | 13.0.0 | Check | Feb 2026 |
| Docker | 29.2.1 | Check | Feb 2026 |
| containerd | 1.7.x | Bundled with Docker | Feb 2026 |
| Kubernetes | 1.33.7 | Check | Feb 2026 |
| Helm | 3.20.0 | Check | Feb 2026 |
| Calico CNI | 3.31.3 | Check | Feb 2026 |
| GPU Operator | Latest | Auto-updated via Helm | Feb 2026 |
⚠️ Important ⚠️: These versions change frequently. Always check the official documentation links before installation.
Introduction
This guide provides step-by-step instructions for preparing a new server machine for basebox on-premise deployment. The recommendations are based on our experience deploying GPU-accelerated AI inference workloads using vLLM 0.14.0, Docker, and Kubernetes in production environments.
basebox is a Rust-based backend with Vue frontend that requires GPU acceleration for LLM inference. We recommend Kubernetes deployment (with Helm) for all production deployments. This guide focuses on preparing the operating system and foundational software stack.
Documentation Approach:
This guide provides simplified, step-by-step instructions tailored for basebox deployments. It consolidates the essential steps and configurations specific to basebox requirements to streamline installation for our customers. Each section links to the official documentation from NVIDIA, Docker, Kubernetes, and Ubuntu, which remains the authoritative source and should be consulted first for comprehensive details, latest updates, troubleshooting, and advanced configurations.
🚨 CRITICAL: AI Infrastructure Evolves Rapidly
The CUDA, Kubernetes, Docker, and GPU software ecosystems release updates frequently—sometimes weekly. This guide provides a proven installation path tested in February 2026, but software versions may have changed since publication.
ALWAYS check official documentation first:
- Each step includes official documentation links at the top
- Verify latest stable versions before running commands
- Check compatibility matrices (CUDA ↔ Driver ↔ GPU model)
- Test in a non-production environment first if possible
Our guide consolidates essential steps to streamline installation for basebox requirements, but official documentation is authoritative for latest versions, troubleshooting, and advanced configurations.
When in doubt: Contact support@basebox.ai for current version recommendations.
Important Notes:
- Requirements may vary based on your specific deployment configuration
- Air-gapped deployment is supported - contact support@basebox.ai for offline installation procedures
- Hardware specifications depend on your GPU models and workload requirements
- If your environment differs from these recommendations, contact support@basebox.ai
Pre-Installation Checklist
Before beginning installation, verify you have:
Hardware Inventory:
- NVIDIA GPU(s) installed and recognized by BIOS (run lspci | grep -i nvidia after OS installation)
- GPU Compute Capability 7.0+ (V100, T4, RTX 20xx, A100, L4, H100, etc.)
- Minimum 8 CPU cores (16+ recommended for production)
- Minimum 16GB RAM (32GB+ recommended, 64GB+ for large models)
- Minimum 500GB storage (1TB+ recommended for model storage)
- Adequate power supply and cooling for GPUs
Network Requirements:
- Internet connectivity during installation (or offline installation packages prepared)
- Required ports accessible (22/TCP for SSH, 6443/TCP for K8s API if applicable)
- Access credentials for server management
Access and Security:
- Physical or remote access to server (IPMI, iLO, or similar)
- BeyondTrust PRA or similar privileged access management configured (if applicable)
- Backup plan for existing data (if not a fresh installation)
Installation Media:
- Ubuntu 24.04 LTS Server ISO downloaded
- USB drive or DVD for installation (if not using remote installation)
Verification Commands (run after OS installation):
# Check CPU cores
lscpu | grep "^CPU(s):"
# Check RAM
free -h
# Check storage
df -h
# Check GPU detection
lspci | grep -i nvidia
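The CPU and RAM checks above can also be scripted against this guide's minimums. A hedged sketch (thresholds are this guide's baseline of 8 cores and 16 GB RAM; note that `free -g` rounds down, so a 16 GB host may report 15, which is why RAM is compared in megabytes):

```shell
# Hedged preflight sketch: compare detected resources against the guide's minimums.
meets_min() { [ "$1" -ge "$2" ]; }

CORES=$(nproc)
RAM_MB=$(free -m | awk '/^Mem:/ {print $2}')
meets_min "$CORES" 8      && echo "CPU: $CORES cores OK"  || echo "CPU: only $CORES cores (minimum 8)"
meets_min "$RAM_MB" 15000 && echo "RAM: ${RAM_MB}MB OK"   || echo "RAM: only ${RAM_MB}MB (minimum ~16GB)"
```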
Issues or Different Requirements? Contact support@basebox.ai before proceeding.
Required Network Access
For successful installation, the server must have network access to the following domains and URLs. Ensure your firewall allows outbound HTTPS (443/TCP) connections to these endpoints:
Package Repositories (Installation Phase)
Ubuntu Package Repositories:
- archive.ubuntu.com - Ubuntu package repository
- security.ubuntu.com - Ubuntu security updates
Docker:
- download.docker.com - Docker Engine packages and GPG keys
- *.docker.com - Docker registry and documentation
Kubernetes:
- pkgs.k8s.io - Kubernetes packages (kubelet, kubeadm, kubectl)
- raw.githubusercontent.com - Kubernetes manifests and scripts
NVIDIA:
- developer.download.nvidia.com - CUDA Toolkit packages
- developer.nvidia.com - CUDA downloads and documentation
- nvidia.github.io - NVIDIA Container Toolkit and GPU Operator repositories
- helm.ngc.nvidia.com - NVIDIA Helm chart repository
Helm:
- raw.githubusercontent.com/helm/helm - Helm installation scripts
- get.helm.sh - Helm binary downloads
Container Registries (Operation Phase):
- registry-1.docker.io - Docker Hub (for base images)
- gcr.io - Google Container Registry (for some Kubernetes images)
- k8s.gcr.io / registry.k8s.io - Kubernetes container images
- quay.io - Container registry (for some Kubernetes components)
Network Plugins:
- raw.githubusercontent.com/projectcalico - Calico CNI manifests
Testing Network Connectivity
Before installation, verify access to critical endpoints:
# Test DNS resolution
nslookup pkgs.k8s.io
nslookup download.docker.com
nslookup developer.download.nvidia.com
# Test HTTPS connectivity
curl -I https://pkgs.k8s.io
curl -I https://download.docker.com
curl -I https://developer.download.nvidia.com
# Test specific endpoints
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.33/deb/Release.key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg
Note: For air-gapped environments, contact support@basebox.ai for offline installation procedures and image transfer requirements.
Recommended Operating System: Ubuntu 24.04 LTS
Ubuntu 24.04 LTS (Noble Numbat)
- Release: April 2024
- Support until: April 2029
- Download: Ubuntu 24.04 LTS Server
- Architecture: x86_64
- Latest LTS release with extended support period
- Best NVIDIA CUDA support and documentation
- Most comprehensive ecosystem for GPU workloads
Note: Ubuntu 22.04 LTS is supported for existing systems that are already running basebox. For all new installations, use Ubuntu 24.04 LTS.
Step-by-Step Installation Guide
Note: Each step below includes official documentation links at the beginning. We recommend reviewing the official documentation first for comprehensive information, then using our simplified steps for basebox-specific installation. Our guide consolidates essential steps to streamline the process for basebox customers.
Step 1: Install Ubuntu Server
Official Documentation:
Short:
1. Download Ubuntu Server ISO
   - Download Ubuntu 24.04 LTS Server
   - Choose the server edition (not desktop) for minimal resource usage
2. Create Installation Media
   - Write the ISO to a USB drive using tools like Rufus (Windows) or dd (Linux)
   - Boot from the installation media
3. Installation Options
   - Select "Ubuntu Server" (minimal installation)
   - Choose "Use an entire disk" for automatic partitioning (or custom if needed)
   - Set up user account and hostname
   - Enable OpenSSH server during installation (recommended for remote access)
4. Post-Installation
   - Proceed to Step 2 to install NVIDIA drivers
Step 2: Install NVIDIA GPU Drivers
Official Documentation:
Prerequisites:
- NVIDIA GPU installed and detected: lspci | grep -i nvidia
- Internet connection for driver downloads
NVIDIA Driver Installation (Recommended Method):
# Install latest NVIDIA drivers automatically
sudo ubuntu-drivers autoinstall
# Reboot to load drivers
sudo reboot
# Verify installation
nvidia-smi
Verification:
# Check driver version (should be 550+ for CUDA 13.0 support)
nvidia-smi
# Expected output shows:
# - Driver Version: 550.xx.xx or higher
# - CUDA Version: 13.0+
# - GPU(s) listed with memory information
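The 550+ driver minimum can be checked mechanically rather than by eye. A hedged sketch using GNU sort's version ordering (the sample value comes from this guide's tested configuration; on a real host, query it from nvidia-smi as shown in the comment):

```shell
# Hedged helper: version_ge A B succeeds when version A >= version B (sort -V).
version_ge() { [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]; }

# On a real host: DRIVER=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)
DRIVER="580.126.09"   # sample value from this guide's tested configuration
if version_ge "$DRIVER" "550.0"; then
  echo "Driver $DRIVER meets the 550+ minimum"
else
  echo "Driver $DRIVER is too old - rerun: sudo ubuntu-drivers autoinstall"
fi
```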
Step 3: Install CUDA Toolkit
Install CUDA Toolkit 13.0:
Always install CUDA 13.0, the latest version. CUDA 12.9 is supported as a fallback if 13.0 is unavailable.
Official Documentation:
# Add NVIDIA CUDA repository
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
# Install CUDA Toolkit 13.0
sudo apt-get install -y cuda-toolkit-13-0
# Set environment variables (add to ~/.bashrc)
echo 'export PATH=/usr/local/cuda-13.0/bin${PATH:+:${PATH}}' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-13.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' >> ~/.bashrc
source ~/.bashrc
# Verify installation
nvcc --version
Fallback Option (CUDA 12.9):
Use CUDA 12.9 if CUDA 13.0 is unavailable in your repository:
sudo apt-get install -y cuda-toolkit-12-9
echo 'export PATH=/usr/local/cuda-12.9/bin${PATH:+:${PATH}}' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-12.9/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' >> ~/.bashrc
source ~/.bashrc
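Whichever toolkit you installed, the reported release can be parsed and checked against this guide's recommendation. A hedged sketch (the `NVCC_OUT` sample string stands in for real `nvcc --version` output, as shown in the comment):

```shell
# Hedged check: extract the release number from `nvcc --version` output.
cuda_release() { grep -o 'release [0-9]*\.[0-9]*' | head -n1 | awk '{print $2}'; }

# On a real host: NVCC_OUT=$(nvcc --version)
NVCC_OUT="Cuda compilation tools, release 13.0, V13.0.88"   # sample output
REL=$(printf '%s' "$NVCC_OUT" | cuda_release)
case "$REL" in
  13.0) echo "CUDA $REL - recommended version" ;;
  12.9) echo "CUDA $REL - supported fallback" ;;
  *)    echo "CUDA $REL - not recommended; see Step 3" ;;
esac
```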
Step 4: Install Docker Engine
Requirements: Docker Engine 24.0+ (with containerd 2.0+ for Kubernetes 1.35+)
Official Documentation:
# Remove old versions
sudo apt-get remove docker docker-engine docker.io containerd runc
# Add Docker's official GPG key
sudo apt-get update
sudo apt-get install -y ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
# Add Docker repository
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
# Install Docker Engine
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
# Start and enable Docker
sudo systemctl start docker
sudo systemctl enable docker
# Add current user to docker group (optional, for non-root access)
sudo usermod -aG docker $USER
# Log out and back in for group changes to take effect
# Verify installation
docker --version
sudo docker run hello-world
Step 5: Install NVIDIA Container Toolkit
Note: If you are installing Kubernetes with GPU Operator (Step 6-7), you can skip this step as the GPU Operator will automatically install and configure the NVIDIA Container Toolkit.
Official Documentation:
Required for Docker deployments:
# Configure the production repository (generic DEB repository)
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
# Install NVIDIA Container Toolkit
sudo apt-get install -y nvidia-container-toolkit
# Configure Docker to use NVIDIA runtime
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
# Verify GPU access in containers
sudo docker run --rm --gpus all nvidia/cuda:12.9.0-base-ubuntu22.04 nvidia-smi
Step 6: Install Helm (Required for Kubernetes deployment)
Official Documentation:
Install Helm (Required for Kubernetes deployment):
# Download Helm
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
# Verify installation
helm version
Step 7: Install Kubernetes (Required for Production)
Install Kubernetes 1.33+:
Kubernetes is required for all production deployments. Install Kubernetes 1.33 or later for optimal stability and feature support.
Official Documentation:
# Install required packages
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl gpg
# Add Kubernetes repository (for 1.33)
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.33/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.33/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
# Install Kubernetes components
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
# Disable swap (required for Kubernetes)
sudo swapoff -a
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
# Set kernel parameters for Kubernetes (required before kubeadm init)
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
# Configure containerd for Kubernetes (required before kubeadm init)
# The default containerd config may have CRI plugin disabled
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml > /dev/null
sudo systemctl restart containerd
sudo systemctl enable containerd
# Verify containerd is running
sudo systemctl status containerd
# Initialize Kubernetes cluster (on master node)
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
# Set up kubeconfig for regular user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Install network plugin (Calico example)
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.31.3/manifests/calico.yaml
# Install NVIDIA GPU Operator (Production-ready with tolerations)
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
# Create values file with tolerations for control-plane nodes
cat <<EOF > gpu-operator-values.yaml
# Tolerations for Node Feature Discovery components
node-feature-discovery:
  master:
    tolerations:
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
  worker:
    tolerations:
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
  gc:
    tolerations:
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
# Tolerations for the GPU Operator itself
operator:
  tolerations:
    - key: node-role.kubernetes.io/control-plane
      operator: Exists
      effect: NoSchedule
EOF
# Install with custom values
helm install --wait --timeout 15m --generate-name \
-n gpu-operator --create-namespace \
-f gpu-operator-values.yaml \
nvidia/gpu-operator
# Verify installation
kubectl get pods -n gpu-operator
kubectl get daemonsets -n gpu-operator
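Once the daemonsets are up, GPU scheduling can be smoke-tested with a one-off pod that requests a single GPU. A minimal sketch (pod name and CUDA image tag are illustrative; any CUDA base image that includes nvidia-smi works):

```yaml
# gpu-test-pod.yaml - minimal smoke test requesting one GPU
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:12.9.0-base-ubuntu22.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1
```

Apply it with kubectl apply -f gpu-test-pod.yaml, check kubectl logs gpu-test for nvidia-smi output, then clean up with kubectl delete pod gpu-test.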
Alternative: Simple installation (without tolerations - for multi-node clusters with dedicated worker nodes):
If you have a multi-node cluster with dedicated worker nodes (no control-plane taint on worker nodes), you can use the simpler installation:
helm install --wait --timeout 15m --generate-name \
-n gpu-operator --create-namespace \
nvidia/gpu-operator
Note: The command above shows the standard installation for most setups. If you have:
- Pre-installed NVIDIA drivers (from Step 2)
- Air-gapped/offline environments
- MIG-enabled GPUs
- Custom security requirements
...you may need different GPU Operator configuration. Contact support@basebox.ai for guidance on your specific setup.
Step 8: System Configuration
Configure system settings for optimal performance:
Note: If you installed Kubernetes in Step 7, swap disabling and kernel parameters are already configured. You can skip those steps below.
# Increase shared memory size (required for model loading)
sudo mkdir -p /dev/shm
# Add entry to /etc/fstab (check if it already exists first)
if ! grep -q "/dev/shm" /etc/fstab; then
echo 'tmpfs /dev/shm tmpfs defaults,size=64g 0 0' | sudo tee -a /etc/fstab
fi
# Reload systemd to pick up the new fstab entry
sudo systemctl daemon-reload
# Unmount and remount /dev/shm to apply new size
sudo umount /dev/shm 2>/dev/null || true
sudo mount /dev/shm
# Verify the new size (should show your configured size, e.g., 64G)
df -h /dev/shm
# Disable swap (required for Kubernetes - already done in Step 7 if following steps in order)
sudo swapoff -a
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
# Set kernel parameters for Kubernetes (already done in Step 7 if following steps in order)
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
# Configure firewall (if using UFW)
sudo ufw allow 22/tcp # SSH
sudo ufw allow 6443/tcp # Kubernetes API (if master)
sudo ufw allow 10250/tcp # Kubelet API
sudo ufw allow 80/tcp # http
sudo ufw allow 443/tcp # https
sudo ufw enable
Step 9: Verification Checklist
Verify all components are installed correctly:
# Check Ubuntu version
lsb_release -a
# Check NVIDIA driver
nvidia-smi
# Check CUDA toolkit
nvcc --version
# Check Docker
docker --version
sudo docker run --rm --gpus all nvidia/cuda:12.9.0-base-ubuntu22.04 nvidia-smi
# Check Kubernetes (if installed)
kubectl version --client
kubectl get nodes
# Check GPU Operator (if Kubernetes installed)
kubectl get pods -n gpu-operator
# Check system resources
free -h
df -h
lscpu
All checks passing? Proceed to deployment. Issues? See Troubleshooting section or contact support@basebox.ai
Tested Configuration
This guide has been fully tested and verified on the following hardware configuration:
System Specifications
| Component | Specification | Status |
|---|---|---|
| Operating System | Ubuntu 24.04.3 LTS (Noble) | ✅ Verified |
| CPU | Intel Core i7-11800H @ 2.30GHz (8 cores, 16 threads) | ✅ Verified |
| Memory | 15 GB RAM | ✅ Verified |
| Storage | 468 GB NVMe SSD (262 GB available) | ✅ Verified |
| GPU | NVIDIA GeForce RTX 3050 Laptop GPU (4 GB VRAM) | ✅ Verified |
| Shared Memory | 64 GB /dev/shm | ✅ Verified |
Verified Software Stack
| Component | Version | Verification Command | Status |
|---|---|---|---|
| NVIDIA Driver | 580.126.09 | nvidia-smi | ✅ Verified |
| CUDA Toolkit | 13.0.88 | nvcc --version | ✅ Verified |
| Docker | 29.2.1 | docker --version | ✅ Verified |
| Kubernetes | 1.33.7 | kubectl version --client | ✅ Verified |
| GPU Operator | Latest (via Helm) | kubectl get pods -n gpu-operator | ✅ Verified |
Installation Verification Results
All components successfully installed and verified:
✅ Ubuntu 24.04.3 LTS - Confirmed via lsb_release -a
✅ NVIDIA Driver 580.126.09 - GPU detected and operational
✅ CUDA 13.0.88 - Compiler tools verified
✅ Docker 29.2.1 - Container runtime functional
✅ Kubernetes 1.33.7 - Single-node cluster (control-plane) running
✅ GPU Operator - All pods running:
- node-feature-discovery-gc: Running
- node-feature-discovery-master: Running
- node-feature-discovery-worker: Running
- gpu-operator: Running
✅ System Resources - Adequate for basebox deployment
✅ Shared Memory - 64 GB configured and mounted
Test Date
Tested: February 9, 2026
Test Environment: Single-node Kubernetes cluster with GPU Operator installed using production-ready configuration (with tolerations for control-plane taint).
Alternative Operating Systems
Ubuntu 24.04 LTS is the recommended operating system for basebox. The following table summarizes our evaluation of alternative operating systems for basebox deployments:
| Operating System | Description | Advantages | Comparison with Ubuntu | Recommendation |
|---|---|---|---|---|
| Ubuntu 24.04 LTS | Latest LTS release (April 2024) | Best NVIDIA CUDA documentation and support; most comprehensive GPU workload documentation; longest support lifecycle (until April 2029); extensive Docker/Kubernetes community resources; industry standard for GPU-accelerated workloads | N/A (baseline) | ✅ Recommended - Use for all new installations |
| Ubuntu 22.04 LTS | Previous LTS release (April 2022) | Stable, well-tested release; good NVIDIA CUDA support; support until April 2027 | Older release with shorter support lifecycle; less recent CUDA toolkit packages; fewer community resources for latest features | ⚠️ Supported - Only for existing systems already running basebox |
| Debian Stable | Ubuntu's parent distribution | More conservative package versions; smaller default footprint; strong security focus; no commercial ties | Less NVIDIA-specific documentation; older CUDA toolkit packages; smaller GPU workload community; fewer Docker/Kubernetes tutorials | ⚠️ Supported - Requires additional configuration. Not officially tested. Contact support if needed |
| Red Hat Enterprise Linux (RHEL) | Enterprise Linux distribution | Commercial support from Red Hat; security hardening and compliance certifications; 10-year support lifecycle; Red Hat OpenShift integration; partner-validated GPU configurations | Licensing costs; less accessible documentation; more complex installation; fewer NVIDIA driver packages | ⚠️ May work - Consider if you require commercial support contracts or have existing RHEL infrastructure |
| SUSE Linux Enterprise Server (SLES) | Enterprise Linux distribution | Enterprise support available; strong security features; long-term support | Less NVIDIA ecosystem support; smaller community resources; higher cost; more complex setup | ⚠️ May work - Consider only if you have specific SLES requirements or existing SLES infrastructure |
| Rocky Linux / AlmaLinux | RHEL-compatible alternatives | Community-supported RHEL alternatives; no licensing costs; RHEL compatibility | Less NVIDIA-specific documentation; limited GPU workload community; not extensively tested with basebox | ⚠️ May work - Not officially tested. Requires thorough testing in non-production environment |
| Microsoft Windows Server | Windows-based server OS | Enterprise Windows integration; familiar Windows administration | vLLM requires Linux (not supported on Windows); limited GPU container passthrough; Docker uses Hyper-V virtualization overhead; reduced Kubernetes GPU support | ❌ Not Supported - vLLM requires Linux. Use Ubuntu 24.04 LTS instead |
Notes:
- Ubuntu 24.04 LTS is the officially supported and recommended distribution for new installations
- Alternative Linux distributions may work but require additional configuration, testing, and troubleshooting
- If you must use an alternative distribution, follow these steps:
  1. Verify NVIDIA CUDA Toolkit support for your distribution version
  2. Ensure Docker and Kubernetes packages are available
  3. Test thoroughly in a non-production environment first
  4. Contact support@basebox.ai for assistance with non-standard configurations
- Microsoft Windows Server is not supported due to vLLM's Linux requirement
Troubleshooting Decision Tree
Problem: GPU Not Detected
START: Run `nvidia-smi`
│
├─ ERROR: "command not found"
│ └─ SOLUTION: NVIDIA drivers not installed
│ └─ ACTION: Go to Step 2 and install drivers
│
├─ ERROR: "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver"
│ └─ SOLUTION: Driver installation incomplete or incorrect
│ └─ ACTION:
│ 1. Run: sudo apt-get purge nvidia-*
│ 2. Reinstall: sudo ubuntu-drivers autoinstall
│ 3. Reboot: sudo reboot
│
├─ ERROR: "No devices were found"
│ └─ SOLUTION: GPU not recognized by system
│ └─ ACTION:
│ 1. Check physical connection
│ 2. Verify in BIOS: GPU should be visible
│ 3. Check: lspci | grep -i nvidia
│ 4. If not visible, hardware issue - check power/seating
│
└─ SUCCESS: GPU information displayed
└─ NEXT: Verify CUDA version is 13.0 (or 12.9 as fallback)
Problem: CUDA Version Mismatch
START: Run `nvcc --version`
│
├─ ERROR: "command not found"
│ └─ SOLUTION: CUDA Toolkit not installed
│ └─ ACTION: Go to Step 3 and install CUDA Toolkit
│
├─ VERSION: CUDA 11.x or 12.0-12.8
│ └─ SOLUTION: CUDA version too old for vLLM 0.14.0
│ └─ ACTION:
│ 1. Remove old CUDA: sudo apt-get purge cuda-*
│ 2. Install CUDA 13.0 (see Step 3)
│ 3. Update environment variables
│ 4. Verify: nvcc --version
│
├─ VERSION: CUDA 13.0
│ └─ SUCCESS: Recommended version installed
│
├─ VERSION: CUDA 12.9
│ └─ ACCEPTABLE: Fallback version (use if 13.0 unavailable)
│
└─ ERROR: CUDA version doesn't match nvidia-smi
└─ SOLUTION: Driver/toolkit mismatch
└─ ACTION:
1. Check driver supports CUDA version: nvidia-smi
2. Install compatible driver (550+): see Step 2
3. Reboot
Problem: Docker Can't Access GPU
START: Run `sudo docker run --rm --gpus all nvidia/cuda:12.9.0-base-ubuntu22.04 nvidia-smi`
│
├─ ERROR: "docker: Error response from daemon: could not select device driver"
│ └─ SOLUTION: NVIDIA Container Toolkit not installed
│ └─ ACTION: Go to Step 5 and install toolkit
│
├─ ERROR: "docker: unknown flag: --gpus"
│ └─ SOLUTION: Docker version too old
│ └─ ACTION:
│ 1. Check version: docker --version
│ 2. Upgrade Docker: sudo apt-get install --upgrade docker-ce
│ 3. Restart: sudo systemctl restart docker
│
├─ ERROR: Permission denied or similar
│ └─ SOLUTION: NVIDIA runtime not configured
│ └─ ACTION:
│ 1. Configure runtime: sudo nvidia-ctk runtime configure --runtime=docker
│ 2. Restart Docker: sudo systemctl restart docker
│ 3. Verify: cat /etc/docker/daemon.json
│
└─ SUCCESS: nvidia-smi output displayed in container
└─ NEXT: Proceed with basebox deployment
Problem: Kubernetes GPU Not Available
START: Run `kubectl get nodes -o json | jq '.items[].status.allocatable'`
│
├─ NO GPU RESOURCES SHOWN
│ └─ SOLUTION: GPU Operator not installed or not working
│ └─ ACTION:
│ 1. Check pods: kubectl get pods -n gpu-operator
│ 2. If missing: Install GPU Operator (see Step 7)
│ 3. Check logs: kubectl logs -n gpu-operator <pod-name>
│ 4. Verify node labels: kubectl get nodes --show-labels
│
├─ ERROR: Pods in CrashLoopBackOff
│ └─ SOLUTION: Configuration issue
│ └─ ACTION:
│ 1. Check driver version: nvidia-smi
│ 2. Verify containerd config: containerd config dump | grep nvidia
│ 3. Check GPU Operator docs for compatibility
│ 4. Contact support: support@basebox.ai
│
└─ SUCCESS: nvidia.com/gpu resources shown
└─ NEXT: Deploy basebox workloads
Problem: kubeadm init fails with "unknown service runtime.v1.RuntimeService"
START: Run `sudo kubeadm init --pod-network-cidr=10.244.0.0/16`
│
├─ ERROR: "unknown service runtime.v1.RuntimeService"
│ └─ SOLUTION: containerd CRI plugin is disabled in configuration
│ └─ ACTION:
│ 1. Generate proper containerd config:
│ sudo mkdir -p /etc/containerd
│ containerd config default | sudo tee /etc/containerd/config.toml
│ 2. Restart containerd:
│ sudo systemctl restart containerd
│ sudo systemctl enable containerd
│ 3. Verify containerd is running:
│ sudo systemctl status containerd
│ 4. Retry kubeadm init:
│ sudo kubeadm init --pod-network-cidr=10.244.0.0/16
│
├─ ERROR: "admin.conf: No such file or directory" (after failed init)
│ └─ SOLUTION: kubeadm init failed, so admin.conf was not created
│ └─ ACTION:
│ 1. Fix containerd configuration (see above)
│ 2. If cluster was partially initialized, reset first:
│ sudo kubeadm reset
│ 3. Retry kubeadm init after fixing containerd
│
└─ SUCCESS: Cluster initialized successfully
└─ NEXT: Continue with kubeconfig setup and network plugin installation
Problem: Network/Connectivity Issues
START: Basic connectivity check
│
├─ Can't access package repositories
│ └─ ACTION:
│ 1. Check internet: ping -c 4 8.8.8.8
│ 2. Check DNS: nslookup ubuntu.com
│ 3. Check firewall: sudo ufw status
│ 4. For air-gapped: Contact support@basebox.ai for offline setup
│
├─ Can't pull Docker images
│ └─ ACTION:
│ 1. Check Docker daemon: systemctl status docker
│ 2. Check proxy settings if behind corporate proxy
│ 3. Verify network access to registry
│ 4. For air-gapped: Contact support@basebox.ai
│
└─ Kubernetes nodes not communicating
└─ ACTION:
1. Check firewall rules for K8s ports
2. Verify pod network: kubectl get pods -A
3. Check CNI plugin: kubectl get pods -n kube-system
Problem: GPU Operator Installation Issues
START: Install GPU Operator with Helm
│
├─ WARNING: "node-role.kubernetes.io/master is deprecated"
│ └─ SOLUTION: Harmless deprecation warning
│ └─ ACTION: Safely ignore - GPU Operator chart uses old label
│ └─ NOTE: Kubernetes deprecated master label in favor of control-plane
│
├─ ERROR: Any installation failure (timeout, resources exist, etc.)
│ └─ FIRST STEP: Clean up previous failed installation
│ └─ ACTION (RECOMMENDED - Most reliable cleanup):
│ 1. Delete namespace (removes all resources):
│ kubectl delete namespace gpu-operator
│ 2. Recreate namespace:
│ kubectl create namespace gpu-operator
│ 3. Retry installation with proper values file
│ └─ ALTERNATIVE (if namespace deletion doesn't work):
│ 1. List releases: helm list -n gpu-operator
│ 2. Uninstall: helm uninstall <release-name> -n gpu-operator
│ 3. Manually clean resources if needed:
│ kubectl delete all --all -n gpu-operator
│ kubectl delete sa --all -n gpu-operator
│
├─ ERROR: "context deadline exceeded"
│ └─ SOLUTION: Installation timed out waiting for pods
│ └─ CAUSES:
│ - GPU Operator images are large (several GB)
│ - Multiple images need to be pulled
│ - Network connectivity issues
│ - Nodes may not be ready
│ └─ ACTION:
│ 1. Clean up first: kubectl delete namespace gpu-operator && kubectl create namespace gpu-operator
│ 2. Increase timeout: Add --timeout 15m to helm install
│ 3. Or install without --wait and monitor manually:
│ helm install --generate-name -n gpu-operator --create-namespace nvidia/gpu-operator
│ 4. Monitor: kubectl get pods -n gpu-operator -w
│ 5. Check logs: kubectl describe pods -n gpu-operator | grep -A5 "Events"
│
├─ ERROR: "ServiceAccount exists and cannot be imported"
│ └─ SOLUTION: Previous installation attempt left resources behind
│ └─ ACTION (RECOMMENDED):
│ 1. Delete namespace: kubectl delete namespace gpu-operator
│ 2. Recreate namespace: kubectl create namespace gpu-operator
│ 3. Retry installation
│ └─ ALTERNATIVE:
│ 1. List releases: helm list -n gpu-operator
│ 2. Uninstall: helm uninstall <release-name> -n gpu-operator
│ 3. If still stuck, use namespace deletion method above
│
├─ ERROR: Pods stuck in "Pending" state
│ └─ ERROR MESSAGE: "0/1 nodes are available: 1 node(s) had untolerated taint"
│ └─ SOLUTION: Pods need tolerations for control-plane taint
│ └─ ACTION:
│ 1. Get release name: helm list -n gpu-operator -q | head -1
│ 2. Create values file with tolerations (see Step 7 for full example)
│ 3. Upgrade: helm upgrade <release-name> -n gpu-operator -f gpu-operator-values.yaml nvidia/gpu-operator
│ 4. Verify: kubectl get pods -n gpu-operator
│ └─ ALTERNATIVE (single-node dev only):
│ Remove taint: kubectl taint nodes --all node-role.kubernetes.io/control-plane-
│
└─ SUCCESS: All pods running
└─ VERIFY:
1. Check pods: kubectl get pods -n gpu-operator
2. Check node labels: kubectl get nodes -L nvidia.com/gpu.product
3. Test GPU: kubectl run gpu-test --image=nvidia/cuda:12.9.0-base-ubuntu22.04 --rm -it --restart=Never -- nvidia-smi
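For reference, the tolerations fix described above corresponds to a values file along these lines. This is a sketch: the key names follow the upstream NVIDIA GPU Operator chart and should be verified against your chart version; Step 7 carries the full tested example.

```yaml
# gpu-operator-values.yaml - tolerate the control-plane taint (sketch)
operator:
  tolerations:
    - key: node-role.kubernetes.io/control-plane
      operator: Exists
      effect: NoSchedule
daemonsets:
  tolerations:
    - key: node-role.kubernetes.io/control-plane
      operator: Exists
      effect: NoSchedule
```

Apply it with `helm upgrade <release-name> -n gpu-operator -f gpu-operator-values.yaml nvidia/gpu-operator`, as in the troubleshooting steps above.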
When to Contact Support
Contact support@basebox.ai if:
- Installation fails after following troubleshooting steps
- Your hardware configuration is non-standard
- You need air-gapped/offline installation assistance
- You encounter errors not covered in this guide
- You need guidance on custom configurations
- Your organization has specific security or compliance requirements
Provide the following information when contacting support:
# System Information
uname -a
lsb_release -a
lscpu | grep "Model name"
free -h
df -h
# GPU Information
lspci | grep -i nvidia
nvidia-smi
# Software Versions
nvcc --version
docker --version
kubectl version --client  # if applicable
# Logs (if applicable)
sudo journalctl -u docker -n 100
kubectl logs -n gpu-operator <pod-name>  # if applicable
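The command list above can be bundled into a single script that writes one shareable file. A sketch, with `command -v` guards so that missing tools are reported rather than aborting the run (the file name support-info.txt is arbitrary):

```shell
#!/bin/sh
# Collect the diagnostics listed above into support-info.txt for basebox support.
# Tools that may be absent (nvidia-smi, nvcc, docker, kubectl) are guarded.
OUT=support-info.txt
{
  echo "# System Information"
  uname -a
  command -v lsb_release >/dev/null 2>&1 && lsb_release -a 2>/dev/null
  lscpu | grep "Model name"
  free -h
  df -h
  echo "# GPU Information"
  command -v lspci >/dev/null 2>&1 && lspci | grep -i nvidia
  command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi || echo "nvidia-smi: not installed"
  echo "# Software Versions"
  command -v nvcc    >/dev/null 2>&1 && nvcc --version    || echo "nvcc: not installed"
  command -v docker  >/dev/null 2>&1 && docker --version  || echo "docker: not installed"
  command -v kubectl >/dev/null 2>&1 && kubectl version --client || echo "kubectl: not installed"
} > "$OUT"
echo "Wrote $OUT"
```

Attach the resulting file to your support request along with the relevant journalctl/kubectl logs.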
Next Steps: Deployment
After completing server preparation, proceed with basebox deployment:
Kubernetes Deployment - Recommended Production Method
For all production deployments, use Kubernetes:
Kubernetes deployment is covered in the Kubernetes/Helm Installation Guide. That guide provides detailed instructions for:
- Using Helm charts to deploy all basebox services
- Configuring GPU resource allocation per service
- Setting up ingress, TLS certificates, and networking
- Production-grade monitoring and logging
Prerequisites:
- Kubernetes cluster (1.33+) with GPU Operator installed
- Helm 3.x installed
- Access to basebox container image registry during installation
- For air-gapped deployments: Contact support@basebox.ai for offline image transfer procedures
- Storage class configured for persistent volumes
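Before starting the Helm deployment, the tool-level prerequisites above can be sanity-checked with a short script. A sketch (the cluster-side checks only run when kubectl is available; the report file name is arbitrary):

```shell
#!/bin/sh
# Check deployment prerequisites; save a report to share with support if needed.
{
  for tool in kubectl helm docker; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: found"
    else
      echo "$tool: MISSING - install before deploying"
    fi
  done
} | tee prereq-report.txt
# Cluster-side checks (only meaningful once kubectl can reach the cluster):
if command -v kubectl >/dev/null 2>&1; then
  kubectl version 2>/dev/null          # expect client and server 1.33+
  kubectl get storageclass 2>/dev/null # expect a default storage class
fi
```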
Benefits:
- Automatic scaling and load balancing
- Rolling updates with zero downtime
- Resource isolation and security
- Production-grade monitoring and logging
- Multi-node cluster support
Docker-Based Deployment
For development/testing:
- Contact support@basebox.ai for Docker-based deployment guidance
- Suitable for single-server development environments
- Not recommended for production workloads
GPU Configuration Examples
basebox supports various GPU configurations. After server preparation, you'll configure GPU allocation based on your hardware:
Single GPU (1x H100, 1x L4, etc.)
- Allocate GPU to inference service
- Run RAG server in CPU mode
- Suitable for: Testing, low-volume production
Multi-GPU (4x H100, 4x L40S, 2x RTX 4090, 2x L40)
- Allocate all GPUs to inference service for tensor parallelism
- Run RAG server in CPU mode
- Suitable for: Production workloads, high concurrency
Mixed Configuration
- Dedicate primary GPUs to inference
- Optionally allocate one GPU to RAG for OCR workloads
- Suitable for: Production with document processing requirements
Note: Specific GPU allocation and model selection will be covered in the basebox deployment guide. Contact support@basebox.ai for guidance on optimal configuration for your hardware.
See: LLM recommendations for further details.
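In a Kubernetes deployment, the allocations above translate into GPU resource limits on the inference service's container spec. A minimal sketch (the surrounding pod spec is omitted; the GPU count is the only knob illustrated here):

```yaml
# Fragment of a container spec: reserve GPUs for the inference service
resources:
  limits:
    nvidia.com/gpu: 4   # all four GPUs for tensor parallelism; use 1 for single-GPU
# The RAG server's spec simply omits nvidia.com/gpu to run in CPU mode.
```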
Security Considerations
Initial Server Hardening:
# Configure automatic security updates
sudo apt-get install -y unattended-upgrades
sudo dpkg-reconfigure -plow unattended-upgrades
# Configure firewall
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow ssh
sudo ufw enable
# Disable root login via SSH (edit /etc/ssh/sshd_config)
# Set: PermitRootLogin no
sudo nano /etc/ssh/sshd_config
sudo systemctl restart ssh   # Ubuntu's unit is ssh.service, not sshd
# Set up SSH key authentication (disable password auth if desired)
# Set in /etc/ssh/sshd_config: PasswordAuthentication no
# Install fail2ban for intrusion prevention
sudo apt-get install -y fail2ban
sudo systemctl enable fail2ban
sudo systemctl start fail2ban
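After the edits above, the relevant sshd_config lines should read as follows (a fragment, not a complete file; verify key-based login works before disabling password authentication):

```text
# /etc/ssh/sshd_config - relevant lines only
PermitRootLogin no
PasswordAuthentication no
```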
Container Security:
- Use non-root users in containers
- Scan container images for vulnerabilities
- Limit container capabilities
- Use read-only filesystems where possible
- Implement network policies in Kubernetes
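A default-deny ingress policy is a common starting point for the Kubernetes network policies mentioned above. A sketch (the namespace name is a placeholder, not a basebox convention; add explicit allow-rules per service afterwards):

```yaml
# Sketch: deny all ingress traffic in a namespace by default
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: basebox        # placeholder - use your deployment namespace
spec:
  podSelector: {}           # selects every pod in the namespace
  policyTypes:
    - Ingress
```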
Access Control:
- Use BeyondTrust PRA or similar privileged access management
- Implement least-privilege access principles
- Enable audit logging
- Regular security updates
Maintenance and Updates
Regular Maintenance Tasks:
# Update system packages (monthly or as needed)
sudo apt update && sudo apt upgrade -y
# Update NVIDIA drivers (when needed, test in non-production first)
sudo apt update
sudo apt install --only-upgrade nvidia-driver-550
# Update Docker (when new versions available)
sudo apt update
sudo apt install --only-upgrade docker-ce docker-ce-cli containerd.io
# Update Kubernetes (if installed, follow upgrade path carefully)
# See: https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
# Clean up unused packages and images
sudo apt autoremove -y
docker system prune -a --volumes # Use with caution - removes unused data
Monitoring:
- Monitor GPU temperature and utilization: nvidia-smi -l 1
- Check system resources: htop, df -h, free -h
- Monitor Docker containers: docker stats
- Monitor Kubernetes (if used): kubectl top nodes, kubectl top pods
Monitoring Tools:
- Prometheus + Grafana for metrics
- NVIDIA Data Center GPU Manager (DCGM) for GPU monitoring
- Kubernetes Dashboard for K8s cluster monitoring
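For lightweight GPU logging without DCGM, nvidia-smi's query mode can append periodic samples to a CSV file. A sketch intended to be run from cron or a systemd timer (the field list and file name are choices, not requirements):

```shell
#!/bin/sh
# Append one CSV sample of GPU metrics; schedule via cron for continuous logging.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=timestamp,name,utilization.gpu,temperature.gpu,memory.used \
             --format=csv,noheader >> gpu-stats.csv
  tail -n 1 gpu-stats.csv
else
  echo "nvidia-smi not found - install NVIDIA drivers first"
fi
```

For production monitoring, prefer the DCGM exporter with Prometheus + Grafana as listed above.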
Support and Resources
Official Documentation:
- NVIDIA CUDA Documentation
- NVIDIA Driver Documentation
- Docker Documentation
- Kubernetes Documentation
- vLLM Documentation
- Ubuntu Server Documentation
basebox Support:
- Email: support@basebox.ai
- For: Deployment assistance, air-gapped installations, custom configurations, troubleshooting
Conclusion
This guide provides the recommended path for preparing a server for basebox deployment. Ubuntu 24.04 LTS is the recommended operating system for new installations, offering the best balance of NVIDIA CUDA support, Docker/Kubernetes compatibility, security, and maintainability for GPU-accelerated AI inference workloads.
Ubuntu 22.04 LTS is supported for existing systems. Alternative Linux distributions may work but require additional configuration and are not officially supported. Microsoft Windows Server is not suitable for basebox due to vLLM's Linux requirement and limited GPU container support.
Key Takeaways:
- Ubuntu 24.04 LTS is the recommended OS for new installations
- CUDA 13.0 required (12.9 fallback)
- Kubernetes 1.33+ required for production deployments
- GPU Compute Capability 7.0+ required
- Air-gapped deployments supported with assistance
Ready to Deploy?
After completing server preparation, contact support@basebox.ai to proceed with basebox deployment. Our team will guide you through the Kubernetes installation process (the recommended production method).
Document Version History:
- v2.1.4 (February 10, 2026): Fixed Helm GPU Operator installation, added Helm troubleshooting, and recorded the precise test environment.
- v2.1.3 (February 09, 2026): Fixed Kubernetes Step 7, updated CNI versions, added the version matrix at the beginning and the troubleshooting section.
- v2.1.2 (February 06, 2026): Install Helm before Kubernetes; reference official documentation first.
- v2.1.1 (February 05, 2026): Updated step 5 - use nvidia global repository
- v2.1 (February 03, 2026): Reviewed, cleaned, simplified and polished
- v2.0 (February 03, 2026): Updated CUDA requirements, Kubernetes versions, added Quick Start and Troubleshooting sections
- v1.0 (Initial release): Basic server preparation guide
Feedback: Send documentation feedback to support@basebox.ai