Production AI inferencing imposes demanding server, accelerator, and infrastructure requirements, yet it is often deployed on expensive, proprietary data center infrastructure. A scale-out inferencing design instead calls for making optimal use of general-purpose compute elements combined with networking, security, and storage elements. Leveraging the innovative Yosemite server design, we show how inferencing infrastructure can be made more efficient, performant, secure, and open.