Tensor Language Model Revolutionizes Tensor Compilation with Generative Scheduling
A new Tensor Language Model (TLM) makes tensor compilation more efficient through generative scheduling. Traditional tensor compilers find a good schedule by iterative search at compile time. TLM instead draws on knowledge learned offline to generate an efficient tensor program directly, which significantly reduces compilation time. The same approach lets it adapt quickly to new hardware targets and workloads, making it a valuable tool for machine learning deployment. In benchmarks against standard frameworks, TLM delivered lower latency and shorter compile times.
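
The contrast between iterative search and generative scheduling can be illustrated with a toy sketch. Nothing below is taken from TLM itself: the cost function `toy_cost`, the tile-size space `TILE_CHOICES`, and the helpers `iterative_search`, `train_offline`, and `generate_schedule` are all hypothetical, and a nearest-neighbour lookup stands in for the trained language model. The only point is the shape of the trade-off: search pays for many measurements at compile time, while a generative scheduler pays that cost once, offline, and then emits a schedule in a single step.

```python
from itertools import product

# Hypothetical search space: candidate tile sizes for the two output
# dimensions of a matrix multiplication.
TILE_CHOICES = (8, 16, 32, 64, 128)


def toy_cost(shape, tiles, cache_elems=8192):
    """Hypothetical analytic cost standing in for an on-device measurement."""
    m, n, k = shape
    tm, tn = tiles
    tiles_m = -(-m // tm)  # ceiling division
    tiles_n = -(-n // tn)
    # Padded compute: tiles that do not divide the shape waste work.
    compute = (tiles_m * tm) * (tiles_n * tn) * k
    # Memory-traffic proxy: every output tile loads (tm + tn) * k elements.
    traffic = tiles_m * tiles_n * (tm + tn) * k
    # Penalize output tiles that exceed the assumed cache capacity.
    penalty = 4.0 if tm * tn > cache_elems else 1.0
    return (compute + traffic) * penalty


def iterative_search(shape):
    """Traditional approach: evaluate every candidate at compile time."""
    best_tiles, best_cost, measurements = None, float("inf"), 0
    for tiles in product(TILE_CHOICES, repeat=2):
        cost = toy_cost(shape, tiles)
        measurements += 1
        if cost < best_cost:
            best_tiles, best_cost = tiles, cost
    return best_tiles, measurements


def train_offline(training_shapes):
    """Offline phase: run the expensive search once per training workload."""
    return {shape: iterative_search(shape)[0] for shape in training_shapes}


def generate_schedule(knowledge, shape):
    """Compile-time phase: emit a schedule in one step, with no measurements.

    A nearest-neighbour lookup is used here as a stand-in for a learned model.
    """
    nearest = min(
        knowledge,
        key=lambda s: sum(abs(a - b) for a, b in zip(s, shape)),
    )
    return knowledge[nearest]


if __name__ == "__main__":
    knowledge = train_offline([(128, 128, 128), (256, 256, 256), (512, 512, 512)])
    workload = (300, 300, 300)

    searched, measurements = iterative_search(workload)
    generated = generate_schedule(knowledge, workload)

    print("search:   ", searched, "after", measurements, "measurements")
    print("generated:", generated, "after 0 measurements")
```

In this sketch the generated schedule can never beat the exhaustive search under the toy cost, since the search is optimal over the same space; what it saves is the per-workload measurement cost at compile time.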