Batch random-forest Attention-UNet
Batch-based vLLM implementation for METEOR regression.
- Input
  - 2248-dim embedding
- Encoder
  - 82 x Attention-UNet with 10 heads
- Output
  - BLEU projection
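
The architecture values above can be captured in a small config object. This is a minimal sketch, not the project's actual code: the class and field names (`ModelConfig`, `embed_dim`, `num_blocks`, `num_heads`) are hypothetical. The helper also checks whether the embedding width divides evenly across the heads, on the assumption that heads split the embedding as in standard multi-head attention; the source does not say how the 10 heads partition the 2248 dimensions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Values copied from the architecture list; all names are hypothetical."""
    embed_dim: int = 2248   # input embedding width
    num_blocks: int = 82    # Attention-UNet blocks in the encoder
    num_heads: int = 10     # attention heads


def per_head_dim(cfg: ModelConfig) -> float:
    """Width each head would get under an even split (an assumption)."""
    return cfg.embed_dim / cfg.num_heads


def splits_evenly(cfg: ModelConfig) -> bool:
    """True when embed_dim is an exact multiple of num_heads."""
    return cfg.embed_dim % cfg.num_heads == 0


cfg = ModelConfig()
print(per_head_dim(cfg))   # 224.8
print(splits_evenly(cfg))  # False
```

With the listed values, 2248 is not a multiple of 10, so an even per-head split is not possible. If the implementation relies on one, either the head count or the embedding width may need to be confirmed against the actual code.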
Training config
optimizer=RMSprop, lr=0.243, scheduler=exponential, warmup=1031
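
To make the training config concrete, here is a minimal sketch of the learning rate it might produce over time. Several details are assumptions, since the source does not state them: `warmup=1031` is read as a count of optimizer steps, the warmup is taken to be linear, the exponential decay is applied once per step, and the decay factor `gamma=0.999` is a placeholder. `TrainConfig` and `lr_at` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    optimizer: str = "RMSprop"
    base_lr: float = 0.243
    scheduler: str = "exponential"
    warmup_steps: int = 1031   # ASSUMPTION: warmup is measured in steps
    gamma: float = 0.999       # ASSUMPTION: decay factor is not given in the source


def lr_at(step: int, cfg: TrainConfig = TrainConfig()) -> float:
    """Learning rate at a zero-based step: linear warmup, then exponential decay."""
    if step < cfg.warmup_steps:
        # Ramp linearly so the last warmup step reaches base_lr.
        return cfg.base_lr * (step + 1) / cfg.warmup_steps
    # After warmup, multiply by gamma once per step.
    return cfg.base_lr * cfg.gamma ** (step - cfg.warmup_steps)


for s in (0, 500, 1030, 1031, 5000):
    print(s, lr_at(s))
```

Under these assumptions the rate climbs to 0.243 at the end of warmup and decays from there. A different `gamma`, or a warmup counted in epochs, would change the curve.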