Apple has disclosed details of a collaboration with NVIDIA to speed up large language models (LLMs) using a new text generation technique that delivers considerable speed improvements for AI applications.
In November, Apple open-sourced ReDrafter (Recurrent Drafter), a technique that combines beam search and dynamic tree attention to speed up text generation. Beam search explores several candidate text sequences at once and looks for better results, while tree attention removes redundant overlaps among those sequences, which makes processing them more efficient.
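To make the overlap idea concrete, here is a minimal Python sketch. It is not Apple's or NVIDIA's implementation; it assumes that "overlaps" means prefixes shared between beam candidates, and the helper name `count_trie_nodes` and the sample candidates are made up for illustration. It compares how many token positions would be processed if every candidate were handled separately against how many remain once shared prefixes are merged into a tree.

```python
def count_trie_nodes(candidates):
    """Count unique token positions after merging shared prefixes.

    Hypothetical helper for illustration only: each distinct prefix
    corresponds to one node in a prefix tree built from the candidates.
    """
    prefixes = set()
    for seq in candidates:
        for end in range(1, len(seq) + 1):
            prefixes.add(tuple(seq[:end]))
    return len(prefixes)


# Three made-up beam candidates that share leading tokens.
candidates = [
    ["the", "cat", "sat"],
    ["the", "cat", "ran"],
    ["the", "dog", "sat"],
]

# Processing each candidate independently touches every token.
flat_tokens = sum(len(seq) for seq in candidates)

# Merging shared prefixes leaves fewer positions to process.
unique_nodes = count_trie_nodes(candidates)

print(f"flat: {flat_tokens} tokens, merged tree: {unique_nodes} nodes")
```

In this toy case the three candidates contain nine tokens in total but only six distinct prefixes, so a tree-shaped layout would avoid recomputing the shared "the" and "the cat" positions.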
Apple has now integrated the technique into NVIDIA's TensorRT-LLM framework, which accelerates large language models running on NVIDIA GPUs. The result is lower latency for users, along with reduced GPU usage and power consumption. Interested developers can find more information on the NVIDIA Developer Blog and the Apple website.