Ara2 Runs Generative AI On Edge
Author: Anand Joshi
The new Ara2 neural-processor unit (NPU) from Kinara, whose name means "edge" in Hindi, provides an 8× increase in performance over the first-generation chip and can run large-language-model workloads. The chip targets retail, warehouse robotics, industrial, and PC/laptop markets.
The new chip enhances the compute element to support more data types and delivers an estimated 20 TOPS at less than 6 W TDP. It includes 4 MB of SRAM and supports 16 GB of DRAM to accommodate large language models (LLMs). The chip is currently sampling and is scheduled to enter production in H2 2024.
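As a rough illustration of what these figures imply, the following Python sketch works through two back-of-the-envelope calculations: the minimum power efficiency implied by 20 TOPS at under 6 W, and the weight footprint of a model at several precisions compared with 16 GB of DRAM. The 7-billion-parameter model size and the 16-, 8-, and 4-bit precisions are assumptions chosen for illustration, not Kinara specifications, and the estimate counts weights only (no activations, KV cache, or runtime overhead).

```python
# Back-of-the-envelope arithmetic from the published Ara2 figures.
# The 7B-parameter model and the bit widths below are illustrative
# assumptions, not vendor specifications.

ARA2_TOPS = 20.0      # estimated peak throughput from the article
ARA2_TDP_W = 6.0      # upper bound on TDP ("less than 6 W")
ARA2_DRAM_GB = 16.0   # supported DRAM capacity, treated as decimal GB


def tops_per_watt(tops: float, watts: float) -> float:
    """Throughput per watt for a given peak rating and power budget."""
    return tops / watts


def weight_footprint_gb(params_billions: float, bits_per_weight: int) -> float:
    """Storage for model weights only, in decimal gigabytes."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Because TDP is below 6 W, 20 / 6 is a lower bound on peak efficiency.
efficiency_floor = tops_per_watt(ARA2_TOPS, ARA2_TDP_W)
print(f"Peak efficiency: at least {efficiency_floor:.1f} TOPS/W")

# Hypothetical 7B-parameter model at three assumed weight precisions.
for bits in (16, 8, 4):
    size_gb = weight_footprint_gb(7, bits)
    fits = size_gb <= ARA2_DRAM_GB
    print(f"7B weights at {bits}-bit: {size_gb:.1f} GB, fits in DRAM: {fits}")
```

Under these assumptions, even 16-bit weights for a 7B-parameter model fit within 16 GB, although lower precisions leave far more room for activations and other runtime data.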
Kinara, headquartered in Los Altos, California, was formerly known as Deep Vision. It was founded in 2014 by CTO Rehan Hameed and chief architect Wajahat Qadeer to commercialize work they began at Stanford University. The company has raised $52 million in funding over two rounds.
The company sampled the first Ara1 silicon in Q2 2019. It has not named any customers, although it says it has gained traction in the cashierless retail checkout market and counts a top-10 retailer as a customer. The company plans to sell USB, M.2, and PCIe form-factor cards in addition to chips.