
Run Dossier

Training run for SD 2.1. Mari

18 metrics · 10,882 train · 0 val

crashed · Mar 28, 22:27 · Duration: 46h 8m · Python 3.11.7
train_joint.py --config /cluster/home/drothenpiele/models/stable_diffusion_mari/mari/.euler-launches/euler_launch_b9f3e227-1dfe-40f5-93c2-db857a03e8c3/joint_vkitti.yaml
lr 3e-5 · batch 1 · epochs 200 · precision fp16 · model stable-diffusion-2-1 · seed 42 · wd 0.01000 · grad accum 16 · warmup 0.05000 · ema no · grad clip 1
ID 61670094 · Job sd_train_concat · Part gpuhe.120h · Node eu-g6-059 · CPU 8 · GPU 1
mari/runs/2026-03-28_22-27-56_caf4
Error
BadZipFile: Caught BadZipFile in DataLoader worker process 1.
Original Traceback (most recent call last):
  File "/cluster/home/drothenpiele/.cache/venv/mari/lib/python3.11/site-packages/torch/utils/d...
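
The traceback above is truncated, so the archive that triggered the BadZipFile is not identified here. One way to narrow it down before relaunching is to scan the dataset for archives that fail a zip integrity check. The following is a minimal sketch, assuming the training data is stored as `.zip` or `.npz` files under a single directory; the function name `find_bad_archives` and the suffix list are hypothetical and not taken from `train_joint.py`.

```python
import zipfile
from pathlib import Path


def find_bad_archives(root, suffixes=(".zip", ".npz")):
    """Return (path, reason) pairs for archives under root that fail a zip check.

    Assumption: the dataset uses zip-based containers (.zip or .npz).
    Adjust `suffixes` to whatever the data loader actually opens.
    """
    bad = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        try:
            with zipfile.ZipFile(path) as archive:
                # testzip() reads every member and returns the name of the
                # first one with a bad CRC or header, or None if all pass.
                corrupt_member = archive.testzip()
        except (zipfile.BadZipFile, OSError) as exc:
            # The file could not be opened as a zip archive at all.
            bad.append((path, f"unreadable: {exc}"))
            continue
        if corrupt_member is not None:
            bad.append((path, f"corrupt member: {corrupt_member}"))
    return bad
```

Calling `find_bad_archives` on the dataset root and printing each returned pair lists candidate files to re-download or regenerate. A full scan reads every archive once, so it may take a while on a large dataset.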
Metrics 18/18
gpu 4
  mem 3
    total/gb
    used/gb
    util/pct
  util/pct
val/visibility 2
  mean
  std
visibility 7
  aux 2
    loss
    scale
  loss
  mean
  rank/loss
  std
  tv/loss
dehaze/loss
depth/loss
loss
lr
preserve/loss

Charts: 18 panels, each showing two series for run 2026-03-28_22-27-56_caf4, train (smoothed) and train (raw).

Panels: Loss, Dehaze Loss, Depth Loss, Gpu Mem Total Gb, Gpu Mem Used Gb, Gpu Mem Util Pct, Gpu Util Pct, Lr, Preserve Loss, Val/Visibility Mean, Val/Visibility Std, Visibility Aux Loss, Visibility Aux Scale, Visibility Loss, Visibility Mean, Visibility Rank Loss, Visibility Std, Visibility Tv Loss

No output snapshots found for this run.

Outputs are generated during training and saved to outputs/epoch_N_step_M/ directories.
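
To check from a shell or notebook whether a run has written any snapshots, the directory pattern above can be matched directly. This is a minimal sketch, assuming `N` and `M` in `epoch_N_step_M` are plain non-negative integers; the helper name `list_snapshots` is hypothetical and not part of the monitor.

```python
import re
from pathlib import Path

# Matches directory names such as "epoch_3_step_1200".
# Assumption: N and M are written as plain decimal integers.
SNAPSHOT_RE = re.compile(r"^epoch_(\d+)_step_(\d+)$")


def list_snapshots(outputs_dir):
    """Return (epoch, step, path) tuples for snapshot directories, sorted by epoch then step."""
    outputs = Path(outputs_dir)
    if not outputs.is_dir():
        return []  # no outputs directory means no snapshots yet
    found = []
    for child in outputs.iterdir():
        match = SNAPSHOT_RE.match(child.name)
        if child.is_dir() and match:
            found.append((int(match.group(1)), int(match.group(2)), child))
    return sorted(found, key=lambda item: (item[0], item[1]))
```

An empty list corresponds to the "No output snapshots found" state. Sorting numerically rather than by name keeps `epoch_10_step_0` after `epoch_2_step_0`.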

Producer launch exports are available. Manage launch-owned exports here for quick reference.

Inherited Launch Exports

These exports are published by the run's producer launch.

Published

The producer launch does not publish any exports yet.

Run-Owned Exports

Publish direct filesystem paths here.

Published

No run-owned exports are published yet.

Raw Artifacts

Run-owned exports are typically direct paths, so there are no captured artifacts to publish from here.

Euler View - ML Experiment Monitor