Apple and Nvidia have partnered to speed up large-scale AI deployment using Apple’s ReDrafter technology. Introduced by Apple in November 2024, ReDrafter accelerates text generation in large language models (LLMs), which can reduce operational costs for the organizations that run them.
According to TechRadar, ReDrafter is a speculative decoding method: a small recurrent neural network (RNN) drafts candidate tokens, and beam search and dynamic tree attention are used to explore and verify multiple candidate continuations. Apple’s tests indicate that the technique generates 2.7 times as many tokens per second as standard autoregressive decoding.
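The draft-and-verify idea behind speculative decoding can be illustrated with a toy example. The sketch below is not ReDrafter itself: it uses plain greedy drafting with no RNN, beam search, or tree attention, and the two "models" (`target_next`, `draft_next`) are hypothetical stand-in functions invented for illustration. It only shows why accepting several drafted tokens per verification pass reduces the number of expensive model passes without changing the output.

```python
from typing import List, Tuple

def target_next(context: List[int]) -> int:
    """Toy stand-in for the large model: next token from the last two tokens."""
    return (context[-1] + context[-2]) % 10

def draft_next(context: List[int]) -> int:
    """Toy stand-in for the cheap draft model: usually right, sometimes wrong."""
    guess = (context[-1] + context[-2]) % 10
    # Deliberately wrong at every fourth position to simulate draft errors.
    return (guess + 1) % 10 if len(context) % 4 == 0 else guess

def autoregressive_decode(prompt: List[int], n_new: int) -> List[int]:
    """Baseline: one target-model call per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(target_next(out))
    return out

def speculative_decode(prompt: List[int], n_new: int, k: int = 4) -> Tuple[List[int], int]:
    """Greedy speculative decoding; returns (tokens, verification passes)."""
    out = list(prompt)
    passes = 0
    while len(out) - len(prompt) < n_new:
        # 1. Draft k tokens with the cheap model.
        draft: List[int] = []
        for _ in range(k):
            draft.append(draft_next(out + draft))
        # 2. Verify. A real system scores all k positions in one batched
        #    forward pass of the target model; here we count it as one pass.
        passes += 1
        accepted: List[int] = []
        for tok in draft:
            expected = target_next(out + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                accepted.append(expected)  # target's correction ends this round
                break
        else:
            # Every drafted token matched; the same pass yields one extra token.
            accepted.append(target_next(out + accepted))
        out.extend(accepted)
    return out[: len(prompt) + n_new], passes

if __name__ == "__main__":
    prompt = [1, 1]
    baseline = autoregressive_decode(prompt, 20)
    tokens, passes = speculative_decode(prompt, 20, k=4)
    print("same output:", tokens == baseline)
    print("target passes:", passes, "vs", 20, "for the baseline")
```

Because every accepted token is checked against the target model, the output matches ordinary greedy decoding; the gain comes only from needing fewer target-model passes.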
By incorporating ReDrafter into Nvidia’s TensorRT-LLM platform, the collaboration broadens the range of applications, enhancing LLM inference on Nvidia GPUs, which are commonly used in production settings. Nvidia has revised existing operators and introduced new ones within TensorRT-LLM to align with ReDrafter’s algorithms, facilitating performance optimization for large-scale models.
Beyond raw speed, ReDrafter reduces latency for end users and lets the same workload run on fewer GPUs. That lowers both computational costs and energy consumption, an important benefit for organizations running AI at scale.
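A rough back-of-the-envelope calculation shows how a throughput gain translates into GPU count. The workload figures below are hypothetical, and the sketch assumes the 2.7x per-request speedup carries over directly to aggregate throughput per GPU, which may not hold in practice once batching and memory limits are taken into account.

```python
import math

def gpus_needed(load_tokens_per_s: float,
                per_gpu_tokens_per_s: float,
                speedup: float = 1.0) -> int:
    """GPUs required to serve a load, given per-GPU throughput and a speedup factor."""
    return math.ceil(load_tokens_per_s / (per_gpu_tokens_per_s * speedup))

if __name__ == "__main__":
    load = 100_000    # hypothetical aggregate demand, tokens per second
    per_gpu = 1_000   # hypothetical baseline throughput of one GPU
    print("baseline GPUs:", gpus_needed(load, per_gpu))
    print("with 2.7x speedup:", gpus_needed(load, per_gpu, speedup=2.7))
```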
Although the current focus is on Nvidia’s infrastructure, Apple is open to the possibility of extending ReDrafter technology to competing GPUs from AMD or Intel in the future. Such an expansion could enable the AI industry to exploit machine learning capabilities across various platforms while diminishing reliance on a single vendor.
Nvidia commented on the partnership, stating, “This collaboration has made TensorRT-LLM more powerful and flexible, allowing the LLM community to create more advanced models and deploy them easily for performance.” The company also expressed enthusiasm about the new features and about the next generation of models that will build on TensorRT-LLM to improve LLM workloads.
With ReDrafter integrated into TensorRT-LLM, developers and organizations of all sizes can run LLM applications more efficiently. Faster and cheaper inference of this kind is likely to shape how machine learning models are deployed in production.