A Hadoop distribution can be a combination of 25+ open source projects. Enterprise adoption brings varied workloads and environments, with vectors such as operating system, JDK, database, security, Ranger authorization, encryption, TDE, and so on. Ensuring quality for such a complex stack and its many combinations can be overwhelming.
In this talk we will cover the technologies involved in automated validation of the stack. Our testing journey begins with the ingestion of commits from Apache and reaches the finish line when we GA the stack distribution. Along the way, we will walk through how quality is established at stages such as commit, nightly testing, pre-prod, and readiness. We will go over the challenges we face in catering to several major, maintenance, and hot-fix releases at the same time, how we tackled them with the YARN on YARN infrastructure, how we use test methodologies to bring efficiencies, and how LOG AI comes to the rescue. We will conclude the talk with a case study of an end-to-end workflow test.