Building a machine, setting up a homelab, and self-hosting apps has always intrigued me. I wasn’t certain about the exact steps, I had a million questions — what hardware should I use, what Large Language Model (LLM) can I run, what memory is needed, how to setup backups, etc. Last week, I finally setup my homelab that is integrated with a GPU to run LLMs locally, all under €400.
With the help from my (new) friends Claude and GPT, I was able to set things up. All running and functioning as expected. In this blog, I will share the whole experience and I hope you enjoy reading this.
The Plan
I already had a Raspberry Pi 4 sitting at home from a previous project, so the plan was a hybrid setup:
- Raspberry Pi 4 (already owned) — always-on lightweight services like Home Assistant and n8n
- GPU workstation (new) — heavy services like Immich, Plausible/ClickHouse, and Ollama for local LLMs
I originally planned to buy an Intel N100 mini PC as a middle step, but finding a cheap workstation made it redundant and saved me ~€200. Sometimes the best plan is the one that simplifies itself.
Sourcing Components in Germany
The Workstation — Dell Precision T3620 (€89.59)
I spent a couple of evenings browsing eBay.de, Kleinanzeigen, and refurbished shops like it-versand.com, MJtronics, and AfB. The Dell Precision T3620 stood out:
- i5-6600 (4C/4T) — adequate since LLM inference is GPU-bound
- 16GB DDR4 RAM
- 256GB 2.5” SATA SSD
- Full PCIe x16 slot — essential for a proper GPU
- Standard ATX PSU bay — can swap in a real PSU
- M.2 NVMe slot on the motherboard for future SSD upgrades
Found it on eBay.de for €89.59 — much cheaper than the typical OptiPlex towers going for €150–250.
The GPU — ASUS TUF RTX 3060 12GB (€223.34)
The RTX 3060 12GB is the sweet spot for budget LLM inference:
- 12GB VRAM runs 7B–8B parameter models comfortably (you can see the performance here)
- Ampere architecture with Tensor Cores (useful for QLoRA fine-tuning)
- Excellent driver support on Linux
I spent days watching eBay auctions. A few tips for GPU shopping in Germany:
- RTX 3060 12GB prices hovered around €220–230 on eBay.de in March 2026
- Kleinanzeigen is cheapest but riskier — always do local pickup with cash, or use “Sicher bezahlen” for shipped items. Never bank transfer to strangers.
Secured an ASUS TUF for €223.34.
One thing I didn’t account for: the TUF is a triple-fan card at 300mm long. The Dell case only fits ~220mm. More on this later.
The PSU — Deepcool PL650D 650W (~€50)
The T3620 comes with a proprietary Dell PSU that can’t power an RTX 3060. I needed a proper ATX PSU.
The Deepcool PL650D checked every box: 650W, 80+ Bronze, ATX 3.1 compliant, PCIe 6+2 pin connector, and a native 12V-2x6 (PCIe 5.1) connector for future GPU upgrades. €50 from Alternate.de.
PSU buying advice: Don’t cheap out — the PSU powers everything. Avoid cheap brands and stick with trusted brands. 80+ Bronze minimum.
The Adapter Cable — 24-pin to 8-pin ATX (~€8)
This is the non-obvious part. Dell uses a proprietary 8-pin motherboard power connector — not the standard 24-pin ATX. You need an adapter cable with a 12VSB booster that converts 5V standby to the 12V Dell motherboards expect. Found one on Amazon.de that explicitly lists Precision T3620 compatibility.
The Final Bill
| Component | Price |
|---|---|
| Dell Precision T3620 | €89.59 |
| ASUS TUF RTX 3060 OC V2 12GB | €223.34 |
| Deepcool PL650D 650W PSU | ~€50 |
| 24-pin to 8-pin adapter cable | ~€8 |
| Total | ~€381 |
Assembly: My First PC Build
This was my first time building a PC. I’ve always been a software person — hardware felt intimidating. But honestly, it’s a lot like Lego once you get past the fear of breaking something expensive.
Step 1: Test Before Modifying
Booted the T3620 as-is before changing anything. It came with Windows pre-installed and worked fine. A small relief — good to know the motherboard, RAM, and SSD are healthy before I start swapping parts.
The 256GB SSD turned out to be a 2.5” SATA drive in the drive cage, not an M.2 stick on the motherboard. This matters — the drive cage sits exactly where the GPU needs to go.
Step 2: The Case Problem
Remember how I mentioned the TUF is 300mm long? The Dell case fits ~220mm GPUs. The drive cage is in the way. The option was to remove the drive cage. It’s held by rivets, not screws — you need a drill. I didn’t have one (welcome to apartment life in Berlin).
The real problem: Even with the drive cage removed, the display ports wouldn’t align with the case bracket openings. The case is simply too small for a 300mm card.
If I had to do it again, I’d pick a shorter RTX 3060 model. These all fit the T3620 without modification:
| Model | Length |
|---|---|
| Gainward Ghost | 201mm |
| EVGA XC Gaming | 201mm |
| Palit Dual | 209mm |
| Inno3D Twin X2 | 210mm |
| Zotac Twin Edge | 222mm |
| MSI Ventus 2X | 235mm |
My solution: Run it outside the case, test bench style. I placed the motherboard on top of the Dell case and assembled everything in the open. I am not thrilled about the aesthetics. Not pretty, but functional — and honestly, there’s something satisfying about seeing all the components exposed and working.
Step 3: Connect Everything
Since we’re running open-air, the PSU just sits next to the motherboard. Three connections:
- 24-pin to 8-pin adapter → PSU’s 24-pin output through the adapter to the motherboard’s 8-pin socket (marked ATX_SYS)
- 4+4 pin CPU cable → from PSU to motherboard’s 4-pin socket near the CPU (the cable splits — use one half)
- PCIe 6+2 pin cable → from PSU directly to the GPU’s power connector
All connectors are keyed — they physically only fit one way. If you have to force it, it’s wrong.
Step 4: First Boot
Powered on. Got a “Front I/O cable failure” warning — expected since the front panel cables are disconnected in the open-air setup. Press F1 to skip it.
The system booted successfully. It works! I may have done a little fist pump.
Installing Ubuntu Server
This machine’s job is to be a headless Linux server, so I skipped Windows entirely. I am also not a huge fan of Windows, and I would prefer Linux any day.
Creating the Bootable USB
From my MacBook, I flashed Ubuntu Server 24.04.2 LTS onto a USB stick. First attempt with dd got stuck — the status=progress flag caused issues on macOS. Used balenaEtcher instead — it just works.
Lesson learned: If
ddhangs on macOS, don’t waste time debugging it. Use Etcher.
Boot from USB by pressing F12 for the boot menu, selecting UEFI Boot, then choosing Ubuntu Server with HWE Kernel (HWE = Hardware Enablement — better driver support for newer GPUs).
Make sure that the machine is connected to the internet. T3620 doesn’t have a WiFi module, you need to connect it with an ethernet cable.
Installer Choices That Matter
A few non-obvious decisions during the Ubuntu install:
- Third-party drivers: search and install all — the installer detected the RTX 3060 and queued NVIDIA drivers
- Storage: entire disk with LVM, no encryption — LVM allows easy resizing later
- LVM size: resize to max — the default only uses ~100GB of a 235GB disk. Always resize
ubuntu-lvto use the full disk, or expand later withlvextend. - OpenSSH: enabled — essential for headless management
- Featured snaps: none — we’ll install everything via Docker
NVIDIA Drivers
The third-party driver installer didn’t fully set up NVIDIA. After rebooting, nvidia-smi returned “command not found”. Quick fix:
sudo apt update && sudo apt install -y nvidia-driver-590sudo rebootAfter reboot:
$ nvidia-smi+-------------------------------------------------------------------------+| NVIDIA-SMI 590.48.01 Driver Version: 590.48.01 CUDA Version: 13.1 ||-------------------------------+------------------------+----------------+| GPU Name Persistence-M| Bus-Id Disp.A | GPU-Util || 0 NVIDIA GeForce RTX 3060 | 00000000:01:00.0 On | 0% || 0% 46C P8 14W / 170W| 17MiB / 12288MiB | Default |+-------------------------------+------------------------+----------------+RTX 3060 detected. 12GB VRAM. CUDA 13.1. 46°C idle. This got me more excited!
SSH Setup
From the MacBook:
ssh-copy-id username@<ip_address>ssh username@<ip_address>No more monitor and keyboard needed.
Docker, GPU Passthrough & Ollama
Docker + NVIDIA Container Toolkit
Docker keeps every service isolated. But by default, containers can’t see the GPU. The NVIDIA Container Toolkit fixes that:
# Install Dockercurl -fsSL https://get.docker.com | sudo shsudo usermod -aG docker $USER
# Install NVIDIA Container Toolkitcurl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \ | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \ | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \ | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt update && sudo apt install -y nvidia-container-toolkitsudo nvidia-ctk runtime configure --runtime=dockersudo systemctl restart dockerVerify GPU access inside Docker:
docker run --rm --gpus all nvidia/cuda:12.6.3-base-ubuntu24.04 nvidia-smiIf you see your RTX 3060 inside the container, the full stack works: Linux -> NVIDIA driver -> Docker -> GPU passthrough. This was the moment where weeks of research and planning finally came together.
Running Ollama
Ollama wraps llama.cpp and makes running local LLMs trivially easy — pull models like Docker images, get an OpenAI-compatible API, and it integrates with tools like n8n and Open WebUI.
docker run -d --gpus all --name ollama \ -p 11434:11434 \ -v ollama:/root/.ollama \ --restart unless-stopped \ ollama/ollamaWhat Models Fit in 12GB VRAM?
With 12GB on the RTX 3060, models up to ~14B parameters fit comfortably. Here’s what I’m running:
| Model | Size | VRAM | Use Case |
|---|---|---|---|
| Qwen 3.5 8B | 8B | ~6GB | General purpose — newest generation |
| GPT-OSS 20B | 20B | ~12GB | OpenAI’s first open-weight model — tight fit (worth it?) |
| DeepSeek-R1 14B | 14B | ~9GB | Chain-of-thought reasoning |
| Ministral 3 8B | 8B | ~6GB | Designed to run on the edge |
| Qwen 3 VL 8B | 8B | ~6GB | Vision language model |
docker exec ollama ollama pull qwen3.5:8bdocker exec ollama ollama pull gpt-oss:20bdocker exec ollama ollama pull deepseek-r1:14bYou can only run one model at a time (each needs most of the VRAM), but switching is instant since they’re cached on disk. Here’s the performance for both Qwen3.5:9B and Ministral 3:8B
$ docker exec ollama ollama run qwen3.5:9b "Count from 1 to 20" --verbose⠴ Thinking... Process:
1. **Analyze the Request:** The user wants me to count from 1 to 20. This is a simple, straightforward task.
2. **Determine the Output:** I need to list the numbers from 1 to 20 in sequential order.
3. **Draft the Response:** 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20.
4. **Review and Refine:** Ensure the formatting is clear (e.g., using commas or newlines). A simple comma-separated list or a numbered list works well. I'll use a comma-separated list for compactness or newlines for clarity. Let's go with a clear, comma-separated list.
5. **Final Output Generation:** 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20.
6. **Check Constraints:** No special constraints. Just count.
7. **Execute:** Produce the text.cw...done thinking.
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20.total duration: 23.400164695sload duration: 14.548057324sprompt eval count: 18 token(s)prompt eval duration: 54.099668msprompt eval rate: 332.72 tokens/seval count: 391 token(s)eval duration: 8.521913011seval rate: 45.88 tokens/sFor Ministral:
$ docker exec ollama ollama run ministral-3:8b "Count from 1 to 20" --verboseHere you go:
1, 2, 3, 4, 5,6, 7, 8, 9, 10,11, 12, 13, 14, 15,16, 17, 18, 19, 20.
total duration: 14.728132586sload duration: 13.097138179sprompt eval count: 561 token(s)prompt eval duration: 304.198791msprompt eval rate: 1844.19 tokens/seval count: 72 token(s)eval duration: 1.216488265seval rate: 59.19 tokens/sOpen WebUI — Your Private ChatGPT
While I can access these models via the CLI, and I wanted a UI. Ollama has an app, but I find it limiting. Open WebUI gives you a full ChatGPT-like experience on top of your local models: chat history, model switching, file uploads, and multi-user support. You can add integrations and tools, configure model parameters, and a lot more. I am still exploring it.
docker run -d --name open-webui --network host \ -v open-webui:/app/backend/data \ -e OLLAMA_BASE_URL=http://localhost:11434 \ --restart unless-stopped \ ghcr.io/open-webui/open-webui:mainI added it to Caddy as a reverse proxy and set up a DNS rewrite on my router (AdGuard Home: ollama.lan -> <ip_address>). Now every device on my network — phone, laptop, tablet — can open http://ollama.lan and chat with local models.
Tip: Don’t use
.localfor custom domain names — macOS and iOS intercept.localfor mDNS (Bonjour) and it won’t resolve through your router’s DNS. Use.lanor.homeinstead.
Lessons Learned
-
Measure your GPU before buying. A 300mm card won’t fit a Dell workstation case without major modification. Cards under 235mm fit the T3620 without changes.
-
Dell uses proprietary power connectors. You need a 24-pin to 8-pin adapter with a 12VSB booster. Make sure it explicitly lists your Dell model.
-
Test before you modify. Boot the machine stock first to verify everything works before swapping components.
-
Running outside the case is fine. You don’t need a case to test if everything works. Motherboard on a flat surface, components connected — it boots. This might not be sustainable for long time, but for the time being, it works.
-
Ubuntu’s LVM default wastes disk space. Always resize the root volume to use the full disk during installation.
-
The HWE kernel matters. Select the HWE kernel at boot for better hardware support, especially for newer GPUs.
-
Don’t cheap out on the PSU. It powers everything. Stick to known brands, 80+ Bronze minimum.
-
ddon macOS can be unreliable. Use balenaEtcher instead of fighting withddflags.
What’s Next
- Deploy the full service stack: Immich (with ML), Plausible Analytics, n8n
- Set up the Raspberry Pi 4 with Home Assistant
- Test QLoRA fine-tuning on the RTX 3060
- Find a proper ATX case (or embrace the open-air test bench life)
A GPU-powered homelab running local LLMs, for under €400 in Germany. If you’re in a similar situation — curious about self-hosting, intimidated by hardware, working with a tight budget — I hope this helps you take the leap. The hardest part was just starting. Not bad for a first-time PC build.