Portainer Guide for UnderstandTech

Overview

Portainer provides a web-based interface for managing the UnderstandTech Docker stack on DGX systems. This guide covers the essential features you'll need to monitor, troubleshoot, and manage the platform.

Full Portainer documentation: https://docs.portainer.io/

Access: https://localhost:9443

What You'll Learn:

  • Viewing and managing containers (start, stop, restart, scale)

  • Accessing container logs and consoles

  • Managing the UnderstandTech stack

  • Basic troubleshooting


Table of Contents

  1. First-Time Setup

  2. Dashboard Overview

  3. Managing Containers

  4. Viewing Logs

  5. Console Access

  6. Container Stats

  7. Managing Volumes

  8. Common Tasks

  9. Troubleshooting


1. First-Time Setup

Portainer is installed as part of the UnderstandTech deployment process. If you need to reinstall or check installation status, refer to the main deployment documentation.

Initial Access

  1. Open your browser and navigate to: https://localhost:9443

  2. On your first visit, you'll see a security warning about the self-signed SSL certificate. This is expected and safe for internal use.

  3. Click "Advanced" → "Proceed to localhost"

Creating Admin Account

On first access, create your administrator account:

  1. Username: Choose an admin username (avoid "admin" for security)

  2. Password: Minimum 12 characters with uppercase, lowercase, numbers, and symbols

  3. Click Create user

Important: Save these credentials securely - password recovery requires database access.

Connect to Docker Environment

After creating your account:

  1. Select Get Started with Docker

  2. Portainer auto-detects the local Docker socket

  3. Click Connect

You're now ready to manage the UnderstandTech platform.


2. Dashboard Overview

Main Dashboard

After login, you'll see the Portainer home screen showing an overview of your Docker environment.

Clicking on the 'local' environment will take you to the main dashboard:

Main Navigation:

  • Home - Environment overview

  • Containers - View and manage all containers

  • Images - View downloaded Docker images

  • Networks - View Docker networks

  • Volumes - View persistent storage volumes

  • Stacks - Manage the UnderstandTech stack


3. Managing Containers

Navigate to Containers in the left sidebar.

Understanding the Container List

The UnderstandTech platform consists of these containers:

| Container | Purpose | Typical Status |
|---|---|---|
| ut-caddy | Reverse proxy/HTTPS | healthy |
| ut-frontend | React web interface | healthy |
| ut-api | Main backend API | healthy |
| ut-api-customer | Partner API | healthy |
| ut-workers-1, ut-workers-2 | Background job processing | running |
| ut-workers-customer-1, ut-workers-customer-2 | REST API background job processing | running |
| ut-llm | GPU-accelerated LLM service | healthy |
| ut-mongodb | Database | healthy |
| ut-redis | Cache and task queue | healthy |

Status Indicators:

  • Green dot: Container is running and healthy

  • Orange dot: Container is running but unhealthy (check logs)

  • Red dot: Container is stopped

Basic Container Operations

Starting/Stopping Containers:

  1. Locate the container in the list

  2. Click the appropriate icon:

    • Play icon (▶): Start stopped container

    • Stop icon (■): Stop running container

    • Restart icon (↻): Restart container

  3. Confirm the action if prompted

When to restart containers:

  • After configuration changes (for example, adding API keys for public LLMs in the Admin page requires restarting the API container)

  • When troubleshooting issues

  • If a container appears unhealthy

Viewing Container Details:

Click on a container name to see:

  • Logs - Recent output (see Viewing Logs)

  • Inspect - Full configuration (JSON)

  • Stats - Real-time CPU, memory and network usage

  • Console - Terminal access (see Console Access)

Scaling Workers

The UnderstandTech platform uses worker containers for background job processing, handling operations like AI assistant training and creation, workflow processing, etc. You can scale workers up or down based on workload.

To scale workers:

  1. Open the .env file of your deployment.

  2. Set the WORKER_REPLICAS and WORKER_CUSTOMER_REPLICAS variables to the desired values, then save the file (Ctrl+S).

  3. Restart the stack so the new replica counts take effect.
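
Steps 2 and 3 can also be done from a shell. The following is a minimal sketch, assuming the deployment directory holds both the .env file and the Compose file and that the stack is managed with Docker Compose. The helper name set_env_var and the replica counts are illustrative and not part of the platform.

```shell
# Hypothetical helper: set KEY=VALUE in an env file, replacing an existing
# line for KEY or appending one if the key is absent.
set_env_var() {
    file=$1 key=$2 value=$3
    if grep -q "^${key}=" "$file"; then
        # Rewrite the existing assignment via a temp file (portable across sed variants).
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "${file}.tmp" && mv "${file}.tmp" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Example (run from the deployment directory; the counts are illustrative):
#   set_env_var .env WORKER_REPLICAS 4
#   set_env_var .env WORKER_CUSTOMER_REPLICAS 2
#   docker compose up -d    # assumed restart step: Compose re-reads .env on "up"
```

Whether docker compose up -d is enough depends on how the Compose file consumes these variables, so check the result on the Containers page afterwards.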


4. Viewing Logs

Logs are essential for troubleshooting and monitoring the UnderstandTech platform.

Accessing Container Logs

From Containers page:

  1. Navigate to Containers

  2. Click the Logs icon (document icon) next to any container

Or from Container details:

  1. Click on container name

  2. Select Logs tab

Log Viewer Features

Essential Controls:

| Feature | Purpose |
|---|---|
| Auto-refresh | Toggle ON for live log streaming |
| Search | Find specific text in logs |
| Lines | How many log lines to display (100-2000) |
| Timestamps | Show when each log entry occurred |
| Download | Save logs as text file |
| Wrap lines | Toggle line wrapping for long lines |

Common Log Viewing Tasks

  1. View live logs (like docker logs -f):

    • Toggle Auto-refresh ON

    • Logs update in real-time

  2. Search for errors:

    • Enter "error" in the Search box

    • Matching lines are highlighted

    • Use arrows to navigate between matches

  3. Check recent issues:

    • Set Lines to 500 or 1000

    • Enable Timestamps

    • Scroll through recent activity

  4. Download logs for analysis:

    • Click Download button

    • Saves as <container-name>-logs.txt

Searching Logs Effectively

Common search terms:

  • error - Find error messages

  • exception - Find Python/Java exceptions

  • failed - Find failed operations

  • connection - Find connection issues

  • mongodb - Find database-related logs

  • gpu - Find GPU-related messages (in ut-llm)

Pro tip: Use the Filter logs toggle to show ONLY matching lines when you have a specific search term.


5. Console Access

The console provides direct terminal access to containers for advanced troubleshooting.

Accessing Container Console:

  1. Navigate to Containers

  2. Click the Console icon (terminal) next to the container, or click the container name and select the Console tab

Before Connecting:

Select the appropriate shell:

| Container Type | Shell to Use |
|---|---|
| ut-api, ut-workers, ut-mongodb, ut-llm | /bin/bash |
| ut-frontend, ut-redis | /bin/sh |

Connecting:

  1. Select shell command from dropdown

  2. Click Connect

  3. Terminal opens in browser

Common Console Tasks

Check service health:

Test connectivity:

View environment:
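
The commands for these three tasks are not shown above. As a sketch, the snippet below groups console checks built from the hostnames and health URL used elsewhere in this guide (mongodb, redis, http://llm:8000/health). The function names are hypothetical, and ping or curl may be missing from minimal images.

```shell
# Hypothetical helpers for use inside a container console.

# Service health: query the LLM health endpoint; -f makes curl fail on HTTP errors.
check_llm_health() {
    curl -fsS http://llm:8000/health
}

# Connectivity: send one ping to each internal hostname and report the result.
check_connectivity() {
    for host in mongodb redis; do
        if ping -c 1 "$host" >/dev/null 2>&1; then
            echo "$host reachable"
        else
            echo "$host unreachable"
        fi
    done
}

# Environment: list variables in sorted order for easier scanning.
show_environment() {
    env | sort
}
```

Paste the function you need into the console and call it by name, for example check_connectivity.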

Disconnect: Click Disconnect button


6. Container Stats

Portainer provides real-time monitoring of container resource usage through the Stats view. This is invaluable for understanding performance characteristics and identifying resource bottlenecks.

Accessing Container Stats

  1. Navigate to Containers

  2. Click on a container name

  3. Select the Stats tab

Understanding the Stats Dashboard

The stats view shows four key metrics updated in real-time:

Memory Usage

The memory graph shows how much RAM the container is currently using. For UnderstandTech containers, you'll typically see:

  • ut-api: 500-700MB during normal operation

  • ut-llm: Several GB (depends on loaded model size)

  • ut-mongodb: 1-2GB with working dataset

  • ut-redis: 50-200MB

  • Workers: 300-500MB per worker

If you see memory usage climbing steadily over time without dropping, that could indicate a memory leak in the application that should be investigated.

CPU Usage

Shows the percentage of CPU resources being consumed. The graph displays spikes when the container is processing requests. The LLM service will show the highest CPU usage during inference, while most other services stay below 5% during idle periods. Sustained high CPU usage (above 80%) might indicate the container is overloaded and needs scaling.

Network Usage

Displays network I/O split between received (RX) and transmitted (TX) data. This helps identify containers handling heavy traffic. The API services will show higher network usage during peak request times, while internal services like Redis and MongoDB have relatively low external network activity since they primarily communicate within the Docker network.

I/O Usage (Disk)

Shows disk read and write operations. MongoDB shows the highest I/O activity due to database operations, while the LLM service has bursts of read activity when loading models. Sustained high write activity on non-database containers might indicate excessive logging that should be investigated.

The Process List

At the bottom of the stats page, you'll see a list of processes running inside the container. This is similar to running top or ps inside the container, but accessible without opening a console. The process list shows:

  • PID: Process ID

  • Command: The actual command being executed

  • CPU %: Per-process CPU usage

  • Memory: Memory consumed by that process


7. Managing Volumes

Volumes provide persistent storage for containers. Unlike the container filesystem (which gets wiped when the container is removed), volumes persist data across container restarts and recreations. Understanding volume management is crucial for maintaining your UnderstandTech deployment.

Viewing Volumes

Navigate to Volumes in the left sidebar. You'll see a list of all volumes on the system, showing:

  • Name: Volume identifier

  • Stack: Which stack created it (if any)

  • Driver: Storage driver (usually local)

  • Mount point: Where it's stored on the host

  • Created: Timestamp

  • Ownership: Who owns it

UnderstandTech Volumes

The platform uses several named volumes for persistent data:

| Volume Name | Purpose | Typical Size | Critical? |
|---|---|---|---|
| ut-mongodb-data | Database files | 5-50GB+ | Yes - contains all platform data |
| ut-mongodb-backup | Automated backups | 2-20GB | Yes - backup copies |
| ut-redis-data | Redis persistence | 100MB-1GB | Moderate - can be rebuilt |
| ut-llm-ollama | Ollama configuration | 50MB | No - just config |
| ut-llm-models | Downloaded LLM models | 10-100GB | Moderate - can redownload |
| ut-uploads-data | User-uploaded files | Varies | Yes - user data |
| ut-caddy-data | TLS certificates | 10MB | Moderate - certs can regenerate |
| ut-caddy-config | Caddy configuration | 1MB | No - just config |

The Anonymous Volume Problem

When you run docker compose down followed by docker compose up, the named volumes are reused, but any anonymous volumes (for example, those declared by a VOLUME instruction in an image) are created fresh for the new containers, and the old ones are left behind. These anonymous volumes appear in your volume list with long hexadecimal names and show as "Unused" (orange badge in Portainer).

This happens because of how Docker handles volume lifecycle. The volumes themselves persist (which is good - your data is safe), but they're no longer attached to any container (which is wasteful - they're just taking up disk space).

Each time you recreate the stack, you might accumulate unused volumes. If you do this frequently during testing or troubleshooting, you can end up with dozens of orphaned volumes consuming disk space. On a DGX system with limited SSD storage, this can become a real problem.

How to identify orphaned volumes:

In the Portainer volumes list, look for:

  • Volumes with hexadecimal names (not human-readable like ut-mongodb-data)

  • Orange "Unused" badges

  • Recent creation dates that match when you recently recreated containers

Cleaning Up Unused Volumes

Before deleting anything, verify the volume is truly unused. Click on the volume name to see its details. If it shows "Used by: 0 containers" and doesn't contain critical data, it's safe to remove.

To remove unused volumes:

  1. In the Volumes list, check the boxes next to unused volumes

  2. Click the Remove button at the top

  3. Confirm the deletion

Bulk cleanup approach:

If you have many unused volumes, you can clean them all at once from the command line:

The docker volume prune command removes volumes not currently in use by any container; on Docker Engine 23.0 and later it removes only anonymous volumes unless you pass --all. It prompts for confirmation and shows how much space was reclaimed.
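
The cleanup described above matches Docker's volume prune subcommand. A small sketch follows; the wrapper name is hypothetical, and listing the dangling volumes first is an added precaution rather than part of the original instructions.

```shell
# Hypothetical wrapper: list dangling (unreferenced) volumes first, then prune.
cleanup_unused_volumes() {
    echo "Volumes not referenced by any container:"
    docker volume ls --filter dangling=true
    # Prompts for confirmation before deleting anything.
    docker volume prune
}
```

Review the listed names before answering the confirmation prompt, and stop if any ut- volume appears in the list.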

Caution: Never delete a volume that's marked as part of the understandtech stack unless you intend to lose that data. The named volumes (ut-mongodb-data, ut-llm-models, etc.) should generally be left alone unless you're intentionally wiping data.

Preventing Volume Accumulation

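The best way to avoid orphaned volumes is to avoid unnecessary stack recreation. Instead of running docker compose down followed by docker compose up, recreate the containers in place.

Recreating in place replaces the containers without removing volumes. If you do need to fully tear down the stack (for example, when upgrading or troubleshooting deep issues), run docker compose down without the -v flag so the volumes are kept.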

Only use docker compose down -v if you intentionally want to delete volumes and lose data.
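
The commands this section refers to are not shown. One plausible reading, assuming the stack is managed with Docker Compose from the deployment directory, is sketched below. The function names are hypothetical and the original commands may have differed.

```shell
# Hypothetical helpers illustrating the approaches described above.

# Preferred: recreate containers in place; existing volumes stay attached.
recreate_stack() {
    docker compose up -d --force-recreate
}

# Full teardown that keeps volumes: plain "down" (no -v), then start again.
teardown_keep_volumes() {
    docker compose down && docker compose up -d
}
```

Neither helper passes -v, so named volumes such as ut-mongodb-data are left untouched in both cases.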

Monitoring Volume Disk Usage

Volumes can grow quite large, especially ut-mongodb-data and ut-llm-models. Periodically check total volume disk usage:

This shows disk usage for images, containers, volumes, and build cache. If volumes are consuming too much space, investigate which volumes are largest:

This one-liner shows the size of each volume, helping you identify storage hogs.
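
The two commands referred to above are not shown. The summary described matches docker system df; the original per-volume one-liner is unknown, so the sketch below uses the verbose form of the same command as a stand-in. The function name is hypothetical.

```shell
# Hypothetical helper: overall summary first, then the verbose breakdown.
show_volume_usage() {
    docker system df       # totals for images, containers, volumes, build cache
    docker system df -v    # verbose: adds a per-volume SIZE column
}
```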


8. Common Tasks

Restarting a Service

When a service is misbehaving:

  1. Go to Containers

  2. Find the container (e.g., ut-api)

  3. Click Restart icon (↻)

  4. Wait for container to come back up

Checking Container Health

Quick health check:

  1. Containers page shows status dots

  2. Green = healthy, Orange = unhealthy

  3. Click container name → Stats for resource usage

Investigating Errors

When something goes wrong:

  1. Check container logs for errors:

    1. Search for "error", "exception", "failed"

    2. Enable timestamps to see when issues occurred

  2. Use console to check connectivity:

    1. ping mongodb, ping redis

    2. curl http://llm:8000/health

  3. Verify environment variables are set correctly

  4. Check resource usage in Stats tab

Scaling for Performance

If workers can't keep up with jobs:

  1. Check Redis queue length

  2. If queue is backing up, scale workers (see Scaling Workers)

  3. Monitor CPU/memory in Stats tab
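
The queue-length check in step 1 could be sketched as follows, assuming the job queue is stored as a Redis list in the ut-redis container. The function name is hypothetical, and the queue key is deployment-specific and not given in this guide, so it is passed in as an argument.

```shell
# Hypothetical helper: print the number of pending entries in a Redis list.
# The queue key is deployment-specific and must be supplied by the caller.
redis_queue_length() {
    queue_key=$1
    docker exec ut-redis redis-cli LLEN "$queue_key"
}

# Example with a placeholder key name:
#   redis_queue_length my-queue
```

A number that keeps growing between checks suggests the workers are falling behind.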


9. Troubleshooting

Cannot Access Portainer

Symptom: Browser cannot reach https://understand.local:9443

Solutions:

  1. Try direct IP: https://<DGX-IP>:9443

  2. Check if Portainer is running

  3. If stopped, start it

  4. Check logs
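
The checks in this list can be run from a shell on the DGX host. This sketch assumes the Portainer container is named portainer, which this guide does not confirm; set PORTAINER_NAME if your deployment uses another name. The function names are hypothetical.

```shell
# Assumption: the Portainer container is named "portainer"; adjust if yours differs.
PORTAINER_NAME=${PORTAINER_NAME:-portainer}

# Show the container's name and status; -a includes stopped containers.
portainer_status() {
    docker ps -a --filter "name=${PORTAINER_NAME}" --format '{{.Names}}: {{.Status}}'
}

# Start the container if it is stopped.
portainer_start() {
    docker start "$PORTAINER_NAME"
}

# Print the most recent log lines to look for startup errors.
portainer_logs() {
    docker logs --tail 50 "$PORTAINER_NAME"
}
```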

Container Won't Start

Check logs for clues:

  1. Click container name → Logs tab

  2. Look for error messages at the end of logs

  3. Common issues:

    • Port already in use

    • Missing environment variables

    • Volume mount errors

    • Image pull failures

Console Won't Connect

Symptom: Console fails to open or disconnects immediately

Try different shell:

  • For Alpine containers: Use /bin/sh

  • For Ubuntu containers: Use /bin/bash

If it still fails: The container may not include a shell (minimal images)

Logs Not Showing

Symptom: Log viewer is empty

Solutions:

  1. Wait a few seconds - container might be starting

  2. Check if container is running (green status)

  3. Try refreshing the page

  4. Check via CLI:
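
A minimal sketch of the CLI check in step 4, run on the DGX host. The function name and the default of 100 lines are illustrative choices.

```shell
# Hypothetical helper: print the last N log lines (default 100) with timestamps.
container_logs() {
    name=$1
    lines=${2:-100}
    docker logs --tail "$lines" --timestamps "$name"
}

# Example:
#   container_logs ut-api 200
```

If the CLI shows output while the Portainer viewer stays empty, the problem is more likely in the viewer than in the container.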

Stack Update Fails

Symptom: Error when updating stack

Check:

  1. YAML syntax in Editor tab

  2. All required environment variables are set

  3. Image names are correct

  4. No conflicting port mappings


10. Quick Reference

Common Container Operations
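
The original reference commands are not shown here. As a stand-in, this sketch maps the Portainer actions covered in this guide to their docker CLI equivalents. The wrapper name ut and its subcommands are hypothetical; only the underlying docker commands are standard.

```shell
# Hypothetical wrapper mapping Portainer actions to docker CLI calls.
ut() {
    action=${1:-}
    [ $# -gt 0 ] && shift
    case $action in
        list)               docker ps -a --filter "name=ut-" ;;  # all ut-* containers
        start|stop|restart) docker "$action" "$@" ;;             # one or more containers
        stats)              docker stats --no-stream "$@" ;;     # one-shot resource snapshot
        follow)             docker logs -f "$1" ;;               # live logs for one container
        *)
            echo "usage: ut {list|start|stop|restart|stats|follow} [container...]" >&2
            return 1
            ;;
    esac
}

# Examples:
#   ut list
#   ut restart ut-api
#   ut follow ut-llm
```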

Useful Console Commands

Getting Help

  • Portainer Docs: https://docs.portainer.io/

  • UnderstandTech Docs: See main platform documentation

  • Logging Guide: See ut-logging-guide.md

