arXiv: Are we really making much progress? Revisiting, benchmarking, and refining heterogeneous graph neural networks