INT8 Quantization Archive - 每时AI

Meituan Provides a Lossless INT8 Deployment of the Full-Size DeepSeek R1, Built on SGLang

March 6, 2025, 19:00, by GiantPandaCV

.co/meituan/DeepSeek-R1-Block-INT8/tree/main/infer

December 27, 2024, 08:00, by 极市干货

Preface: I have previously written four articles on Nvidia's official project Faster Transformer

December 15, 2024, 20:00; November 21, 2024, 23:00, by 极市干货

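This article explains how to use TensorRT to accelerate a quantized model produced with PyTorch's Eager Mode quantization API, covering the quantization steps, repairing the ONNX model graph, and building and validating the TensorRT engine.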