arXiv DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining

Name
DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining
Home page
https://yiyibooks.cn/arxiv/2305.10429v4/index.html
Original URL
https://arxiv.org/abs/2305.10429
Description
... the performance of language models (LMs) ...