Backpropagation-pretrained Diffusion
Backpropagation-based ONNX implementation for the latent checkpoint.
- Input
  - 7455-dim embedding
- Encoder
  - 115 x Diffusion with 24 heads
- Output
  - ROUGE-L projection
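
As a quick sanity check on the dimensions listed above, the sketch below records them in a small dataclass and computes how the embedding would divide across the heads. The class name `ModelSpec`, its field names, and the `head_split` helper are hypothetical and are not part of the checkpoint or of any ONNX API; only the three numbers come from the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """Values copied from the architecture list; names are hypothetical."""

    embed_dim: int = 7455   # "7455-dim embedding"
    num_layers: int = 115   # "115 x Diffusion"
    num_heads: int = 24     # "24 heads"

    def head_split(self) -> tuple[int, int]:
        """Return (per-head width, leftover dims) for an even split across heads."""
        return divmod(self.embed_dim, self.num_heads)


spec = ModelSpec()
head_dim, leftover = spec.head_split()
print(f"per-head width: {head_dim}, leftover dims: {leftover}")
```

A nonzero leftover means the embedding width is not a multiple of the head count, so an implementation that splits the embedding evenly across heads would need a different internal width. The listing does not say how this is handled.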
Training config
optimizer=Adagrad, lr=0.599, scheduler=plateau, warmup=1550
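
The config line can be read programmatically. The sketch below parses it into a dictionary and shows one possible interpretation of `warmup=1550`: a linear ramp of the learning rate over the first 1550 steps. Both the linear shape and the unit (steps rather than epochs or samples) are assumptions, since the config does not state them; `parse_config` and `warmup_lr` are hypothetical helper names.

```python
def parse_config(line: str) -> dict[str, str]:
    """Split 'key=value, key=value' into a dict of strings."""
    return dict(item.strip().split("=", 1) for item in line.split(","))


def warmup_lr(step: int, base_lr: float, warmup_steps: int) -> float:
    """Linear warmup (an assumption): ramp from 0 to base_lr, then hold."""
    if warmup_steps <= 0:
        return base_lr
    return base_lr * min(1.0, step / warmup_steps)


cfg = parse_config("optimizer=Adagrad, lr=0.599, scheduler=plateau, warmup=1550")
base_lr = float(cfg["lr"])
warmup_steps = int(cfg["warmup"])

# Print the learning rate at a few points during and after warmup.
for step in (0, 775, 1550, 3000):
    print(step, warmup_lr(step, base_lr, warmup_steps))
```

The plateau scheduler is not modeled here. In a typical setup it would take over after warmup and lower the rate when a monitored metric stops improving.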