Michael’s retrospectives are always interesting, and this one has lots of good stuff. He details the pain of building a production system on top of Debian (Raspberry Pi). He started out using Ansible to install everything, but that proved not to be a good fit. Ansible is great for small-scale server deployments (I use it), but once you get to the production line, it is too slow. They are now in the process of switching to Debian packages, but it appears this is not simple either.
With Embedded Linux, there are many ways to do things. But one pattern I see over and over again is people not using Yocto because it is too hard. I agree, it is hard: when things don’t work, debugging can be nearly impossible. However, the alternatives don’t look all that attractive either. At least Yocto was built from the ground up with the goal of building custom Embedded Linux images. And when you are shipping products based on Embedded Linux, that is exactly what you are doing. There are no shortcuts. You can pay now or pay later, take your pick. If you want to scale, you have to do the hard work of setting up build and deployment systems that scale. Yocto has excellent tooling for building custom kernels and your custom applications. The Yoe distribution adds sane defaults, a simple but robust update mechanism, etc. Once you get a Yocto build set up, you can easily build production images for years with a single command. Yocto has high initial development costs, but ongoing production costs are much lower.
One similarity I’ve noticed between Ansible and Yocto (both written in Python): when things are not working, the error messages come in the form of a long, hard-to-decipher Python stack trace, on the scale of the errors from heavily templated C++ code that won’t compile. This suggests Python is probably not the best language for implementing tooling like this. It may have been the best option at the time, but there are likely better ways to do things now.