Portainer Guide for UnderstandTech

Overview

Portainer provides a web-based interface for managing the UnderstandTech Docker stack on DGX systems. This guide covers the essential features you'll need to monitor, troubleshoot, and manage the platform.

Full Portainer documentation: https://docs.portainer.io/

Access: https://localhost:9443

What You'll Learn:

Viewing and managing containers (start, stop, restart, scale)
Accessing container logs and consoles
Managing the UnderstandTech stack
Basic troubleshooting

Table of Contents

1. First-Time Setup
2. Dashboard Overview
3. Managing Containers
4. Viewing Logs
5. Console Access
6. Container Stats
7. Managing Volumes
8. Common Tasks
9. Troubleshooting

1. First-Time Setup

Portainer is installed as part of the UnderstandTech deployment process. If you need to reinstall or check installation status, refer to the main deployment documentation.

Initial Access

Open your browser and navigate to: https://localhost:9443
First visit: You'll see a security warning about the self-signed SSL certificate
This is expected and safe for internal use
Click "Advanced" → "Proceed to localhost"
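
If the page does not load at all, a quick check from a terminal on the DGX host can show whether Portainer is answering. The following is a minimal sketch, assuming curl is installed on the host; the function name portainer_http_code is illustrative and not part of Portainer.

```shell
# Print the HTTP status code returned by the Portainer UI.
# -k accepts the self-signed certificate; curl prints 000 when it cannot connect.
portainer_http_code() {
  curl -k -s -o /dev/null -w '%{http_code}' --max-time 5 "${1:-https://localhost:9443}"
}

# Example: portainer_http_code https://localhost:9443
```

Any code other than 000 means the web server is reachable, so a remaining problem is more likely in the browser than in the deployment.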

Creating Admin Account

On first access, create your administrator account:

Username: Choose an admin username (avoid "admin" for security)
Password: Minimum 12 characters with uppercase, lowercase, numbers, and symbols
Click Create user

Important: Save these credentials securely - password recovery requires database access.

Connect to Docker Environment

After creating your account:

Select Get Started with Docker
Portainer auto-detects the local Docker socket
Click Connect

You're now ready to manage the UnderstandTech platform.

2. Dashboard Overview

Main Dashboard

After login, you'll see the Portainer home screen showing an overview of your Docker environment.

Clicking on the 'local' environment will take you to the main dashboard:

Main Navigation:

Home - Environment overview
Containers - View and manage all containers
Images - View downloaded Docker images
Networks - View Docker networks
Volumes - View persistent storage volumes
Stacks - Manage the UnderstandTech stack

3. Managing Containers

Navigate to Containers in the left sidebar.

Understanding the Container List

The UnderstandTech platform consists of these containers:

| Container | Purpose | Typical Status |
| --- | --- | --- |
| ut-caddy | Reverse proxy/HTTPS | healthy |
| ut-frontend | React web interface | healthy |
| ut-api | Main backend API | healthy |
| ut-api-customer | Partner API | healthy |
| ut-workers-1, ut-workers-2 | Background job processing | running |
| ut-workers-customer-1, ut-workers-customer-2 | REST API background job processing | running |
| ut-llm | GPU-accelerated LLM service | healthy |
| ut-mongodb | Database | healthy |
| ut-redis | Cache and task queue | healthy |

Status Indicators:

Green dot: Container is running and healthy
Orange dot: Container is running but unhealthy (check logs)
Red dot: Container is stopped
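
The same status information is available from the command line. The sketch below approximates the dot colors from the status text that docker ps prints. The function names status_color and list_ut_status are hypothetical, and the ut- name filter assumes the container names listed in the table above.

```shell
# Map a `docker ps` status string to the dot color described above.
status_color() {
  case "$1" in
    *"(unhealthy)"*) echo orange ;;
    Up*)             echo green ;;
    *)               echo red ;;
  esac
}

# List every ut-* container with its color, for example "ut-api green".
list_ut_status() {
  docker ps -a --filter name=ut- --format '{{.Names}}|{{.Status}}' |
    while IFS='|' read -r name status; do
      echo "$name $(status_color "$status")"
    done
}
```

This is only an approximation: a container that is still starting its health check is reported as green here.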

Basic Container Operations

Starting/Stopping Containers:

Locate the container in the list
Click the appropriate icon:
Play icon (▶): Start stopped container
Stop icon (■): Stop running container
Restart icon (↻): Restart container
Confirm the action if prompted

When to restart containers:

After configuration changes (for example, adding API keys for public LLMs in the Admin page requires restarting the API container)
When troubleshooting issues
If container appears unhealthy
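
A restart can also be done from the host shell. The sketch below restarts a container and then polls its health status until it settles. The function name restart_and_wait and the default of 30 attempts are assumptions for illustration; containers without a health check report "none" and are treated as ready.

```shell
# Restart a container, then poll until it reports "healthy".
# Containers that define no health check report "none".
restart_and_wait() (
  name=$1
  tries=${2:-30}
  docker restart "$name" >/dev/null || exit 1
  while [ "$tries" -gt 0 ]; do
    health=$(docker inspect --format '{{if .State.Health}}{{.State.Health.Status}}{{else}}none{{end}}' "$name")
    case "$health" in
      healthy|none) echo "$name: $health"; exit 0 ;;
    esac
    tries=$((tries - 1))
    sleep 2
  done
  echo "$name: still $health" >&2
  exit 1
)

# Example: restart_and_wait ut-api
```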

Viewing Container Details:

Click on a container name to see:

Logs - Recent output (see Viewing Logs)
Inspect - Full configuration (JSON)
Stats - Real-time CPU, memory and network usage
Console - Terminal access (see Console Access)

Scaling Workers

The UnderstandTech platform uses worker containers for background job processing, handling operations like AI assistant training and creation, workflow processing, etc. You can scale workers up or down based on workload.

To scale workers:

Open the .env file of your deployment.
Set the WORKER_REPLICAS and WORKER_CUSTOMER_REPLICAS variables to the desired value, and save it (Ctrl+S).
Restart the stack:

cd understand-tech

# If the stack is already up
docker compose down

docker compose up -d
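
The .env edit can also be scripted instead of done in an editor. The helper below is a sketch under the assumption that the file uses plain KEY=VALUE lines; set_env_var is a hypothetical name, and the replica counts in the example are arbitrary.

```shell
# Set KEY=VALUE in an env file: replace the existing line or append a new one.
# Intended for simple values such as integers (the value must not contain "|").
set_env_var() (
  file=$1 key=$2 value=$3
  if grep -q "^${key}=" "$file"; then
    # Write to a temp file first so a failed sed never truncates the original.
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
  else
    echo "${key}=${value}" >> "$file"
  fi
)

# Example (run from the understand-tech directory, then restart the stack):
#   set_env_var .env WORKER_REPLICAS 4
#   set_env_var .env WORKER_CUSTOMER_REPLICAS 2
```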

4. Viewing Logs

Logs are essential for troubleshooting and monitoring the UnderstandTech platform.

Accessing Container Logs

From Containers page:

Navigate to Containers
Click the Logs icon (document icon) next to any container

Or from Container details:

Click on container name
Select Logs tab

Log Viewer Features

Essential Controls:

| Feature | Purpose |
| --- | --- |
| Auto-refresh | Toggle ON for live log streaming |
| Search | Find specific text in logs |
| Lines | How many log lines to display (100-2000) |
| Timestamps | Show when each log entry occurred |
| Download | Save logs as text file |
| Wrap lines | Toggle line wrapping for long lines |

Common Log Viewing Tasks

View live logs (like docker logs -f):

Toggle Auto-refresh ON
Logs update in real-time

Search for errors:

Enter "error" in the Search box
Matching lines are highlighted
Use arrows to navigate between matches

Check recent issues:

Set Lines to 500 or 1000
Enable Timestamps
Scroll through recent activity

Download logs for analysis:

Click Download button
Saves as <container-name>-logs.txt

Searching Logs Effectively

Common search terms:

error - Find error messages
exception - Find Python/Java exceptions
failed - Find failed operations
connection - Find connection issues
mongodb - Find database-related logs
gpu - Find GPU-related messages (in ut-llm)

Pro tip: Use the Filter logs toggle to show ONLY matching lines when you have a specific search term.
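
The same kind of search works from the host shell when the browser is not convenient. This is a small sketch that combines three of the search terms above into one case-insensitive filter; the function name filter_log_errors is illustrative.

```shell
# Keep only lines that mention one of the search terms, ignoring case.
filter_log_errors() {
  grep -iE 'error|exception|failed'
}

# Example: scan the last hour of ut-api logs.
# 2>&1 is needed because containers often write errors to stderr.
#   docker logs --since 1h --timestamps ut-api 2>&1 | filter_log_errors
```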

5. Console Access

The console provides direct terminal access to containers for advanced troubleshooting.

Accessing Container Console:

Navigate to Containers
Click Console icon (terminal) next to container
Or click container name → Console tab

Before Connecting:

Select the appropriate shell:

| Container Type | Shell to Use |
| --- | --- |
| ut-api, ut-workers, ut-mongodb, ut-llm | /bin/bash |
| ut-frontend, ut-redis | /bin/sh |

Connecting:

Select shell command from dropdown
Click Connect
Terminal opens in browser
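
The equivalent from the host shell is docker exec. The sketch below picks the shell automatically by probing the container, so you do not need to remember which image ships which shell. The function name pick_shell is hypothetical, and it assumes the image includes the test utility.

```shell
# Print the first shell that exists in the container: bash if present, else sh.
pick_shell() (
  for sh in /bin/bash /bin/sh; do
    if docker exec "$1" test -x "$sh" 2>/dev/null; then
      echo "$sh"
      exit 0
    fi
  done
  exit 1
)

# Example: docker exec -it ut-redis "$(pick_shell ut-redis)"
```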

Common Console Tasks

Check service health:

# From ut-api
curl http://localhost:8501/health

# From ut-llm
nvidia-smi  # Verify GPU access

Test connectivity:

# Check MongoDB connection
ping mongodb
curl -v mongodb:27017

# Check Redis
redis-cli -h redis ping

View environment:

env | grep MONGODB
env | grep REDIS

Disconnect: Click Disconnect button

6. Container Stats

Portainer provides real-time monitoring of container resource usage through the Stats view. This is invaluable for understanding performance characteristics and identifying resource bottlenecks.

Accessing Container Stats

Navigate to Containers
Click on a container name
Select the Stats tab

Understanding the Stats Dashboard

The stats view shows four key metrics updated in real-time:

Memory Usage

The memory graph shows how much RAM the container is currently using. For UnderstandTech containers, you'll typically see:

ut-api: 500-700MB during normal operation
ut-llm: Several GB (depends on loaded model size)
ut-mongodb: 1-2GB with working dataset
ut-redis: 50-200MB
Workers: 300-500MB per worker

If you see memory usage climbing steadily over time without dropping, that could indicate a memory leak in the application that should be investigated.

CPU Usage

Shows the percentage of CPU resources being consumed. The graph displays spikes when the container is processing requests. The LLM service will show the highest CPU usage during inference, while most other services stay below 5% during idle periods. Sustained high CPU usage (above 80%) might indicate the container is overloaded and needs scaling.
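
For a quick check across all containers at once, docker stats gives the same numbers on the command line. The filter below is a sketch that flags containers above a CPU threshold in a single snapshot; the function name high_cpu is illustrative, and one snapshot does not by itself show sustained load.

```shell
# Read "name|cpu%" lines and print the containers above a CPU threshold.
high_cpu() {
  awk -F'|' -v limit="${1:-80}" '{ cpu = $2; sub(/%/, "", cpu); if (cpu + 0 > limit) print $1, $2 }'
}

# Example: one snapshot of all running containers.
#   docker stats --no-stream --format '{{.Name}}|{{.CPUPerc}}' | high_cpu 80
```

Running the example a few times, a minute apart, gives a rough idea of whether the load is sustained.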

Network Usage

Displays network I/O split between received (RX) and transmitted (TX) data. This helps identify containers handling heavy traffic. The API services will show higher network usage during peak request times, while internal services like Redis and MongoDB have relatively low external network activity since they primarily communicate within the Docker network.

I/O Usage (Disk)

Shows disk read and write operations. MongoDB shows the highest I/O activity due to database operations, while the LLM service has bursts of read activity when loading models. Sustained high write activity on non-database containers might indicate excessive logging that should be investigated.

The Process List

At the bottom of the stats page, you'll see a list of processes running inside the container. This is similar to running top or ps inside the container, but accessible without opening a console. The process list shows:

PID: Process ID
Command: The actual command being executed
CPU %: Per-process CPU usage
Memory: Memory consumed by that process

7. Managing Volumes

Volumes provide persistent storage for containers. Unlike the container filesystem (which gets wiped when the container is removed), volumes persist data across container restarts and recreations. Understanding volume management is crucial for maintaining your UnderstandTech deployment.

Viewing Volumes

Navigate to Volumes in the left sidebar. You'll see a list of all volumes on the system, showing:

Name: Volume identifier
Stack: Which stack created it (if any)
Driver: Storage driver (usually local)
Mount point: Where it's stored on the host
Created: Timestamp
Ownership: Who owns it

UnderstandTech Volumes

The platform uses several named volumes for persistent data:

| Volume Name | Purpose | Typical Size | Critical? |
| --- | --- | --- | --- |
| ut-mongodb-data | Database files | 5-50GB+ | Yes - contains all platform data |
| ut-mongodb-backup | Automated backups | 2-20GB | Yes - backup copies |
| ut-redis-data | Redis persistence | 100MB-1GB | Moderate - can be rebuilt |
| ut-llm-ollama | Ollama configuration | 50MB | No - just config |
| ut-llm-models | Downloaded LLM models | 10-100GB | Moderate - can redownload |
| ut-uploads-data | User-uploaded files | Varies | Yes - user data |
| ut-caddy-data | TLS certificates | 10MB | Moderate - certs can regenerate |
| ut-caddy-config | Caddy configuration | 1MB | No - just config |

The Anonymous Volume Problem

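When you run docker compose down followed by docker compose up, Docker can leave anonymous volumes behind. An anonymous volume is created whenever an image declares a volume path that the compose file does not map to a named volume. These anonymous volumes appear in your volume list with long hexadecimal names and show as "Unused" (orange badge in Portainer).

This happens because docker compose down removes the containers but not their anonymous volumes, and the next docker compose up creates fresh ones instead of reattaching the old ones. Your named volumes and their data are unaffected (which is good - your data is safe), but the leftover anonymous volumes are no longer attached to any container (which is wasteful - they're just taking up disk space).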

Each time you recreate the stack, you might accumulate unused volumes. If you do this frequently during testing or troubleshooting, you can end up with dozens of orphaned volumes consuming disk space. On a DGX system with limited SSD storage, this can become a real problem.

How to identify orphaned volumes:

In the Portainer volumes list, look for:

Volumes with hexadecimal names (not human-readable like ut-mongodb-data)
Orange "Unused" badges
Recent creation dates that match when you recently recreated containers
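
The first two criteria can be checked from the command line as well. The sketch below lists unused volumes and keeps only those whose names look anonymous, meaning exactly 64 hexadecimal characters; the function name anonymous_names is illustrative.

```shell
# Keep only names that look like anonymous volumes: exactly 64 hex characters.
anonymous_names() {
  grep -E '^[0-9a-f]{64}$'
}

# Example: unused volumes that are also anonymous.
#   docker volume ls --filter dangling=true --format '{{.Name}}' | anonymous_names
```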

Cleaning Up Unused Volumes

Before deleting anything, verify the volume is truly unused. Click on the volume name to see its details. If it shows "Used by: 0 containers" and doesn't contain critical data, it's safe to remove.

To remove unused volumes:

In the Volumes list, check the boxes next to unused volumes
Click the Remove button at the top
Confirm the deletion

Bulk cleanup approach:

If you have many unused volumes, you can clean them all at once from the command line:

docker volume prune

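This command removes volumes that are not currently in use by any container. It prompts for confirmation and reports how much space was reclaimed. On Docker 23.0 and later it removes only anonymous volumes by default; older versions also remove unused named volumes, so make sure the stack is running before you prune.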

Caution: Never delete a volume that's marked as part of the understandtech stack unless you intend to lose that data. The named volumes (ut-mongodb-data, ut-llm-models, etc.) should generally be left alone unless you're intentionally wiping data.

Preventing Volume Accumulation

The best way to avoid orphaned volumes is to avoid unnecessary stack recreation. Instead of:

docker compose down
docker compose up -d

Use:

docker compose up -d --force-recreate

This recreates the containers in place and carries their existing volumes over to the new containers. If you do need to fully tear down the stack (for example, when upgrading or troubleshooting deep issues), use plain docker compose down, which keeps named volumes by default:

docker compose down          # Removes containers but keeps volumes
docker compose up -d         # Recreates containers using existing volumes

Only use docker compose down -v if you intentionally want to delete volumes and lose data.

Monitoring Volume Disk Usage

Volumes can grow quite large, especially ut-mongodb-data and ut-llm-models. Periodically check total volume disk usage:

docker system df -v

This shows disk usage for images, containers, volumes, and build cache. If volumes are consuming too much space, investigate which volumes are largest:

docker volume ls -q | xargs docker volume inspect --format '{{.Name}}: {{.Mountpoint}}' | \
  xargs -I {} sh -c 'echo {} $(du -sh $(echo {} | cut -d: -f2) 2>/dev/null | cut -f1)'

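This one-liner prints each volume's name, mount point, and size, helping you identify storage hogs. Run it as root (or with sudo), because the volume mount points under the Docker data directory are usually not readable by regular users.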

8. Common Tasks

Restarting a Service

When a service is misbehaving:

Go to Containers
Find the container (e.g., ut-api)
Click Restart icon (↻)
Wait for container to come back up

Checking Container Health

Quick health check:

Containers page shows status dots
Green = healthy, Orange = unhealthy
Click container name → Stats for resource usage

Investigating Errors

When something goes wrong:

Check container logs for errors:
1. Search for "error", "exception", "failed"
2. Enable timestamps to see when issues occurred
Use console to check connectivity:
1. ping mongodb, ping redis
2. curl http://llm:8000/health
Verify environment variables are set correctly
Check resource usage in Stats tab

Scaling for Performance

If workers can't keep up with jobs:

Check Redis queue length:

# From console
redis-cli -h redis LLEN rq:queue:ut-api

If queue is backing up, scale workers (see Scaling Workers)
Monitor CPU/memory in Stats tab
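
The queue check can be wrapped in a small function that also tells you whether the backlog is above a limit. This is a sketch only: the function name queue_backlog and the default limit of 100 jobs are arbitrary assumptions, not values taken from the platform.

```shell
# Print the queue length; succeed only when it is within the limit.
queue_backlog() (
  limit=${1:-100}
  len=$(redis-cli -h redis LLEN rq:queue:ut-api) || exit 2
  echo "queue length: $len"
  [ "$len" -le "$limit" ]
)

# Example (from a container console):
#   queue_backlog 100 || echo "consider scaling workers"
```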

9. Troubleshooting

Cannot Access Portainer

Symptom: Browser cannot reach https://understand.local:9443

Solutions:

Try direct IP: https://<DGX-IP>:9443
Check if Portainer is running:

docker ps | grep portainer

If stopped, start it:

docker start portainer

Check logs:

docker logs portainer

Container Won't Start

Symptom: Container status stays red or immediately stops

Check logs for clues:

Click container name → Logs tab
Look for error messages at the end of logs
Common issues:
Port already in use
Missing environment variables
Volume mount errors
Image pull failures
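
The container's exit code often narrows down the cause before you read the logs. The helper below is a sketch based on common shell and Docker exit-code conventions; explain_exit_code is a hypothetical name, and the messages are hints rather than definitive diagnoses.

```shell
# Translate a container exit code into a likely cause.
explain_exit_code() {
  case "$1" in
    0)   echo "exited normally" ;;
    126) echo "command found but not executable" ;;
    127) echo "command not found in image" ;;
    137) echo "killed with SIGKILL (often out of memory)" ;;
    143) echo "terminated with SIGTERM" ;;
    *)   echo "application error, check the logs" ;;
  esac
}

# Example:
#   code=$(docker inspect --format '{{.State.ExitCode}}' ut-api)
#   explain_exit_code "$code"
```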

Console Won't Connect

Symptom: Console fails to open or disconnects immediately

Try different shell:

For Alpine containers: Use /bin/sh
For Ubuntu containers: Use /bin/bash

If still fails: Container may not include a shell (minimal images)

Logs Not Showing

Symptom: Log viewer is empty

Solutions:

Wait a few seconds - container might be starting
Check if container is running (green status)
Try refreshing the page
Check via CLI:

docker logs ut-api --tail 50

Stack Update Fails

Symptom: Error when updating stack

Check:

YAML syntax in Editor tab
All required environment variables are set
Image names are correct
No conflicting port mappings
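
The first two checks can be run from the host before touching the stack in Portainer. The sketch below assumes it is run from the directory that holds the compose file and .env; the function name validate_stack is illustrative.

```shell
# Validate the compose file and its variable interpolation without starting anything.
validate_stack() {
  if docker compose config --quiet; then
    echo "compose file OK"
  else
    echo "compose file has errors" >&2
    return 1
  fi
}

# Example: cd understand-tech && validate_stack
```

When validation fails, docker compose prints the reason (such as a YAML syntax error) to the terminal.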

10. Quick Reference

Common Container Operations

# Via Portainer UI:
Containers → Select container → Click icon

Start:    ▶ (play icon)
Stop:     ■ (stop icon)  
Restart:  ↻ (restart icon)
Logs:     📄 (document icon)
Console:  >_ (terminal icon)

Useful Console Commands

# Check service health
curl http://localhost:8501/health

# Test connectivity
ping mongodb
ping redis

# Check environment
env | grep MONGO
env | grep REDIS

# Redis queue length
redis-cli -h redis LLEN rq:queue:ut-api

Getting Help

Portainer Docs: https://docs.portainer.io/
UnderstandTech Docs: See main platform documentation
Logging Guide: See ut-logging-guide.md
