Apple is moving from its established Integrated Fan-Out Package-on-Package (InFO-PoP) design to Wafer-Level Multi-Chip Module (WMCM) packaging for its upcoming A20 Pro chipset. The change targets the thermal and bandwidth constraints that have held back on-device AI performance.

The PoP design stacks DRAM directly on top of the processor die, which creates thermal bottlenecks during intensive tasks that load both the processor and the memory. Even with cooling solutions such as vapor chambers, the stacked layout constrains heat dissipation and data throughput, limiting the efficiency of machine learning workloads.

With the A20 Pro’s WMCM approach, Apple plans to physically separate the DRAM from the chipset die instead of stacking one on the other. This separation reduces thermal stress on both components and improves heat dissipation, especially under AI workloads. Early leaks also point to a larger Neural Engine and a bigger vapor chamber, which would further improve AI capability and temperature management.

A memory upgrade could provide another boost. The A20 Pro might move to 96-bit LPDDR6 RAM, which would offer higher bandwidth than the previous LPDDR5X standard, though this awaits official confirmation.
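To put the rumored memory change in rough numbers, theoretical peak bandwidth is the bus width in bytes multiplied by the transfer rate. The short Python sketch below applies that formula. The 96-bit LPDDR6 width comes from the report above, but the 64-bit LPDDR5X baseline and both data rates (8533 MT/s and 10667 MT/s) are illustrative assumptions, not confirmed specifications for any Apple chip.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_mt_s: float) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes).

    bus_width_bits: total memory bus width in bits
    data_rate_mt_s: transfer rate in mega-transfers per second
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * data_rate_mt_s * 1e6 / 1e9


# Hypothetical configurations for comparison only. The baseline bus width
# and both data rates are assumptions, not figures from Apple.
configs = {
    "LPDDR5X, 64-bit @ 8533 MT/s": (64, 8533),
    "LPDDR6, 96-bit @ 10667 MT/s": (96, 10667),
}

for label, (width_bits, rate_mt_s) in configs.items():
    print(f"{label}: {peak_bandwidth_gb_s(width_bits, rate_mt_s):.1f} GB/s")
```

Under these assumptions, the wider bus and the higher transfer rate together would nearly double the theoretical peak. Sustained real-world throughput is always lower than this ceiling.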

This packaging evolution comes amid a broader industry shift, driven by the rise of AI applications that require faster, more thermally efficient chip designs. While Apple has been cautious about fully committing to extensive on-device AI, this packaging refinement suggests a readiness to support more demanding AI features.

Similar packaging improvements have also appeared in Apple’s M5 Pro and M5 Max processors, indicating a company-wide move away from the performance ceiling imposed by legacy PoP designs. The change parallels moves by competitors such as Samsung, which is upgrading its Exynos line to cope with similar demands.

AI growth has accelerated this packaging change, but the inherent limitations of PoP packaging likely influenced Apple's decision as well. The industry trend points toward modular, thermally optimized chip architectures as a foundational step in sustaining the next generation of AI-driven mobile devices.