Whale: Efficient Giant Model Training over Heterogeneous GPUs

Whale is a distributed training framework for giant models, designed to train them efficiently on heterogeneous GPUs.