Run Dossier

Training run for SD 2.1. Mari

18 metrics · 6,146 train · 0 val

crashed · Mar 29, 17:29 · Duration: 28h 6m · Python 3.11.7
train_joint.py --config /cluster/home/drothenpiele/models/stable_diffusion_mari/mari/.euler-launches/euler_launch_cc857c6e-8903-4c75-a115-6db03d3f3dfb/joint_vkitti.yaml
lr 3e-5 · batch 1 · epochs 200 · precision fp16 · model stable-diffusion-2-1 · seed 42 · wd 0.01000 · grad accum 16 · warmup 0.05000 · ema no · grad clip 1
ID 61717257 · Job sd_train_concat · Part gpuhe.120h · Node eu-g6-080 · CPU 8 · GPU 1
mari/runs/2026-03-29_17-29-29_9d56
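
For quick reference, the settings above (batch 1, grad accum 16, GPU 1) imply how many samples feed each optimizer step. The sketch below assumes standard gradient accumulation with plain data parallelism; the function name is hypothetical and is not taken from train_joint.py.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_gpus):
    """Number of samples that contribute to one optimizer step.

    Assumes every GPU processes `per_device_batch` samples per forward pass
    and gradients are accumulated for `grad_accum_steps` passes before stepping.
    """
    return per_device_batch * grad_accum_steps * num_gpus


# Values listed for this run: batch 1, grad accum 16, GPU 1.
print(effective_batch_size(per_device_batch=1, grad_accum_steps=16, num_gpus=1))  # 16
```

Under these assumptions, each optimizer step for this run averages gradients over 16 samples.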
Error
BadZipFile: Caught BadZipFile in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/cluster/home/drothenpiele/.cache/venv/mari/lib/python3.11/site-packages/torch/utils/d...
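
BadZipFile is raised by Python's zipfile module when a file cannot be read as a valid zip archive. The traceback above is truncated, so the offending file is not known. A minimal sketch for locating damaged archives before relaunching follows; it assumes the dataset stores samples as .zip or .npz files under one root directory, and both the function name and the file patterns are hypothetical.

```python
import zipfile
from pathlib import Path


def find_bad_archives(root, patterns=("*.zip", "*.npz")):
    """Return (path, reason) pairs for archives under `root` that fail an integrity check."""
    bad = []
    for pattern in patterns:
        for path in sorted(Path(root).rglob(pattern)):
            try:
                with zipfile.ZipFile(path) as archive:
                    # testzip() reads every member and returns the name of the
                    # first one with a bad CRC or header, or None if all pass.
                    first_bad = archive.testzip()
                if first_bad is not None:
                    bad.append((path, f"bad CRC in member {first_bad!r}"))
            except (zipfile.BadZipFile, OSError) as exc:
                # Truncated or non-zip files fail when the archive is opened.
                bad.append((path, f"{type(exc).__name__}: {exc}"))
    return bad
```

Running such a scan over the dataset directory once, outside the DataLoader, reports every unreadable file in a single pass instead of failing on the first one partway through training.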
Metrics 18/18
gpu (4)
  mem (3)
    total/gb
    used/gb
    util/pct
  util/pct
val/visibility (2)
  mean
  std
visibility (7)
  aux (2)
    loss
    scale
  loss
  mean
  rank/loss
  std
  tv/loss
dehaze/loss
depth/loss
loss
lr
preserve/loss

Charts (18 panels; each plots the smoothed and raw train traces of run 2026-03-29_17-29-29_9d56):
Loss, Dehaze Loss, Depth Loss, Gpu Mem Total Gb, Gpu Mem Used Gb, Gpu Mem Util Pct, Gpu Util Pct, Lr, Preserve Loss, Val/Visibility Mean, Val/Visibility Std, Visibility Aux Loss, Visibility Aux Scale, Visibility Loss, Visibility Mean, Visibility Rank Loss, Visibility Std, Visibility Tv Loss

No output snapshots found for this run.

Outputs are generated during training and saved to outputs/epoch_N_step_M/ directories.
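
To check from a shell whether any snapshots exist, the directory naming described above can be parsed directly. This is a minimal sketch that assumes N and M are non-negative integers and that the snapshot directories sit directly under an outputs/ folder; the helper name is hypothetical.

```python
import re
from pathlib import Path

# Matches directory names such as "epoch_3_step_1200".
SNAPSHOT_DIR = re.compile(r"^epoch_(\d+)_step_(\d+)$")


def list_snapshots(outputs_root):
    """Return (epoch, step, path) tuples for snapshot directories, ordered by epoch then step."""
    root = Path(outputs_root)
    if not root.is_dir():
        return []  # No outputs folder at all, so no snapshots.
    found = []
    for child in root.iterdir():
        match = SNAPSHOT_DIR.match(child.name)
        if child.is_dir() and match:
            found.append((int(match.group(1)), int(match.group(2)), child))
    # Sort numerically so that epoch_10 comes after epoch_2.
    return sorted(found, key=lambda item: (item[0], item[1]))
```

An empty result for a run's outputs folder corresponds to the "No output snapshots found" state shown here.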

Producer launch exports are available. Manage launch-owned exports here for quick reference.

Inherited Launch Exports

These exports are published by the run's producer launch.

Published

The producer launch does not publish any exports yet.

Run-Owned Exports

Publish direct filesystem paths here.

Published

No run-owned exports are published yet.

Raw Artifacts

Run-owned exports are typically direct paths, so there are no captured artifacts to publish from here.

Euler View - ML Experiment Monitor