Run Dossier

Training run for SD 2.1. Mari

21 metrics · 86,340 train · 184 val

crashed · Apr 21, 17:43 · Duration: 41h 10m · Python 3.11.7
train_joint.py --config /cluster/home/drothenpiele/models/stable_diffusion_mari/mari/.euler-launches/euler_launch_32571115-81e6-41ae-b149-e6b444d88590/joint_vkitti.yaml
lr 4e-5 · batch 14 · epochs 200 · precision fp16 · model stable-diffusion-2-1 · seed 42 · wd 0 · grad accum 1 · warmup 0.05000 · ema no · grad clip 1
ID 64330173
mari/runs/2026-04-21_17-43-24_0cfb
Error
BrokenPipeError: [Errno 32] Broken pipe
Metrics 21/21
depth.train (4)
  diag (3): depth_abs_rel, depth_delta1, depth_mae
  loss (1): depth
rgb.train (11)
  diag (2): image_psnr, image_ssim
  loss (7): dehaze, preserve, total, visibility, visibility_aux, visibility_rank, visibility_tv
  stat (2): visibility_mean, visibility_std
sys.train (6): gpu_mem_total_gb, gpu_mem_used_gb, gpu_mem_util_pct, gpu_util_pct, lr, visibility_aux_scale

Charts (plots not captured; each chart shows raw and smoothed traces for run 2026-04-21_17-43-24_0cfb)

Depth Abs Rel · kind=diag · val
Depth Delta1 · kind=diag · val
Depth Mae · kind=diag · val
Depth · kind=loss · train
Image Psnr · dB · kind=diag · val
Image Ssim · kind=diag · val
Dehaze · kind=loss · train
Preserve · kind=loss · train
Total · kind=loss · train
Visibility · kind=loss · train
Visibility Aux · kind=loss · train
Visibility Rank · kind=loss · train
Visibility Tv · kind=loss · train
Visibility Mean · kind=stat · train, val
Visibility Std · kind=stat · train, val
Gpu Mem Total Gb · train, val
Gpu Mem Used Gb · train, val
Gpu Mem Util Pct · train, val
Gpu Util Pct · train, val
Lr · train
Visibility Aux Scale · train

No output snapshots found for this run.

Outputs are generated during training and saved to outputs/epoch_N_step_M/ directories.
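
To check by hand whether a run directory contains any snapshots, a minimal sketch along these lines may help. It assumes only the outputs/epoch_N_step_M/ naming stated above; the function name find_snapshots and the run_dir argument are hypothetical and not part of Euler View or the training script.

```python
import re
from pathlib import Path

# Assumed naming scheme, taken from the page text: outputs/epoch_N_step_M/
SNAPSHOT_DIR = re.compile(r"^epoch_(\d+)_step_(\d+)$")


def find_snapshots(run_dir):
    """Return (epoch, step, path) tuples for snapshot directories, sorted by epoch then step."""
    outputs = Path(run_dir) / "outputs"
    if not outputs.is_dir():
        # No outputs/ directory at all: nothing was saved for this run.
        return []
    found = []
    for child in outputs.iterdir():
        match = SNAPSHOT_DIR.match(child.name)
        if child.is_dir() and match:
            found.append((int(match.group(1)), int(match.group(2)), child))
    return sorted(found, key=lambda item: (item[0], item[1]))
```

An empty list from this helper would correspond to the "No output snapshots found" state shown above, whether the outputs/ directory is missing or simply has no matching subdirectories.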

epoch 1 / step 2790 Checkpoint #3038
4/21/2026, 6:02:11 PM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-1
epoch 3 / step 5580 Checkpoint #3040
4/21/2026, 7:32:34 PM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-3
epoch 5 / step 8370 Checkpoint #3071
4/21/2026, 9:07:37 PM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-5
epoch 7 / step 11160 Checkpoint #3078
4/22/2026, 7:33:48 AM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-7
epoch 9 / step 13950 Checkpoint #3079
4/22/2026, 7:33:48 AM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-9
epoch 11 / step 16740 Checkpoint #3080
4/22/2026, 7:33:48 AM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-11
epoch 13 / step 19530 Checkpoint #3081
4/22/2026, 7:33:48 AM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-13
epoch 15 / step 22320 Checkpoint #3082
4/22/2026, 7:33:48 AM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-15
epoch 17 / step 25110 Checkpoint #3083
4/22/2026, 7:33:48 AM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-17
epoch 19 / step 27900 Checkpoint #3093
4/22/2026, 8:23:01 AM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-19
epoch 21 / step 30690 Checkpoint #3104
4/22/2026, 9:44:18 AM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-21
epoch 23 / step 33480 Checkpoint #3138
4/22/2026, 12:14:54 PM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-23
epoch 25 / step 36270 Checkpoint #3151
4/22/2026, 1:38:19 PM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-25
epoch 27 / step 39060 Checkpoint #3191
4/22/2026, 3:02:16 PM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-27
epoch 29 / step 41850 Checkpoint #3207
4/22/2026, 5:52:14 PM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-29
epoch 31 / step 44640 Checkpoint #3223
4/23/2026, 8:53:18 AM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-31
epoch 33 / step 47430 Checkpoint #3224
4/23/2026, 8:53:18 AM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-33
epoch 35 / step 50220 Checkpoint #3225
4/23/2026, 8:53:18 AM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-35
epoch 37 / step 53010 Checkpoint #3226
4/23/2026, 8:53:18 AM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-37
epoch 39 / step 55800 Checkpoint #3227
4/23/2026, 8:53:18 AM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-39
epoch 41 / step 58590 Checkpoint #3228
4/23/2026, 8:53:18 AM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-41
epoch 43 / step 61380 Checkpoint #3229
4/23/2026, 8:53:18 AM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-43
epoch 45 / step 64170 Checkpoint #3230
4/23/2026, 8:53:18 AM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-45
epoch 47 / step 66960 Checkpoint #3231
4/23/2026, 8:53:18 AM
/cluster/scratch/drothenpiele/euler_train/mari/checkpoints/joint-dehaze-metric-2026-04-21-17-43-24-0cfb/checkpoint-47
Producer launch exports are available. Manage launch-owned exports here for quick reference.

Inherited Launch Exports

These exports are published by the run's producer launch.

Published

The producer launch does not publish any exports yet.

Run-Owned Exports

Publish direct filesystem paths here.

Published

No run-owned exports are published yet.

Raw Artifacts

Run-owned exports are typically direct paths, so there are no captured artifacts to publish from here.

Euler View - ML Experiment Monitor