This article was delivered to subscribers of The Design Brief, LEI’s newsletter devoted to improving organizations’ innovation capability.
Software is eating the world. That means the software industry cannot keep producing the estimated average of 10 defects per 1,000 lines of code.
One solution is to adopt the standards of the aerospace industry. The software engineering team for the space shuttle produced one defect in 400,000 lines of code, a defect rate 4,000 times lower than the industry average. But at what cost? Their processes include writing four pages of specifications for every ten lines of code. This is not a scalable approach to writing the volume of software the world requires.
Looking for alternative solutions to achieve quality and scale, we stumbled upon Sadao Nomura’s book The Toyota Way of Dantotsu Radical Quality Improvement, where he describes his incredible quality achievements at Toyota Logistics & Forklift (TL&F).
Sadao Nomura had been at Toyota Motor Corporation since 1965 when executives assigned him to improve quality at TL&F in 2006. It was not his first time leading quality improvement programs; he had successfully turned around a GM plant in Australia and helped Toyota South Africa achieve the quality levels Toyota HQ needed to authorize global export. He did everything from the inside, building strong relationships with teams over many years.
At TL&F, he served as an advisor to seven plants across five countries, most of which the company had recently acquired. He started the typical lean way, by going to the gemba frequently to see for himself. He captured his problem-solving insights on A3s and shared them with management. But no change happened. It seemed no one was paying attention to his advice. It didn’t help that the plants’ quality was already relatively good compared to industry standards.
Nomura tried twice more to share his wisdom without success. After the third attempt failed and a year had passed, he changed his strategy to make sure quality would become a priority for everyone. With support from headquarters, he created a program called “Dantotsu Quality Activities.” Dantotsu is a Japanese term that means “extreme,” “radical,” or “unparalleled.” The program aimed to motivate and train workers to achieve the ambitious goal of halving defects every year. Through relentless adherence to dantotsu activities, the teams would reach the three-year target of reducing defects by 88 percent, since three consecutive halvings leave one-eighth of the original defects.
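The 88 percent target follows from compounding the yearly goal. A minimal sketch of that arithmetic, in which the function names are illustrative and not taken from Nomura’s book:

```python
def remaining_fraction(years: int, yearly_factor: float = 0.5) -> float:
    """Fraction of the original defects left after `years` of compounding.

    yearly_factor=0.5 encodes the goal of halving defects every year.
    """
    return yearly_factor ** years


def reduction_percent(years: int, yearly_factor: float = 0.5) -> float:
    """Cumulative reduction, in percent, after `years` of compounding."""
    return (1 - remaining_fraction(years, yearly_factor)) * 100


# Three halvings leave one-eighth of the defects: an 87.5 percent
# reduction, which rounds to the 88 percent target.
three_year_reduction = reduction_percent(3)
```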
Teams on the ground had seen quality programs fail before, so they only half-trusted this new one. However, they realized the need for something different, and Nomura’s dantotsu approach was decidedly different. By obsessively focusing on improving quality, he finally brought change to the factories. After eight years, the seven plants had reduced defects by 91 to 98 percent. Raymond Corporation, the U.S. plant, won the “Best Plant Award” from Industry Week magazine.
Nomura’s story is about improving quality in manufacturing plants. Nonetheless, it inspired Theodo to adopt a right-first-time approach in software engineering. Here are three ideas we took from the book and transposed to software.
A new approach to measuring defects
The way to improve quality is straightforward: decrease the number of defects. But the simplest way to report fewer defects is to expend less effort looking for them. To avoid this trap, Nomura categorizes defects by detection stage and emphasizes detecting them as early as possible in production rather than simply reducing their count. We applied this to software engineering by assigning each defect to one of the following detection stages:
- Stage A if it was detected by the developer in a final review before pushing the code;
- Stage B if it was detected by someone else on the team or by the continuous integration pipeline before reaching an internal customer;
- Stage C if it was detected after reaching an internal customer (product owner, QA, etc.) and before pushing to production;
- Stage D if it was detected after pushing to production, where it could have affected an external customer, and before receiving a complaint;
- Stage E if it resulted in a customer complaint.
This categorization provides a healthy target. Teams strive to detect defects in stages A and B, where they are easier and cheaper to fix, so that fewer defects reach stages D and E, where they affect end users. This is known as a shift-left approach.
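The five stages can be captured in a few lines of code. The sketch below is only an illustration of the idea, not Theodo’s actual tooling: the `classify` flags and the `early_detection_rate` metric are hypothetical names introduced here.

```python
from enum import Enum
from typing import Iterable


class DetectionStage(Enum):
    """Detection stages A to E as described in the list above."""
    A = "developer's final review, before pushing the code"
    B = "teammate or CI pipeline, before an internal customer"
    C = "internal customer, before production"
    D = "in production, before any complaint"
    E = "customer complaint"


def classify(pushed: bool, reached_internal_customer: bool,
             in_production: bool, customer_complained: bool) -> DetectionStage:
    """Map how far a defect had travelled when it was found to a stage.

    The four flags are a hypothetical simplification for this sketch.
    The latest point the defect reached decides its stage.
    """
    if customer_complained:
        return DetectionStage.E
    if in_production:
        return DetectionStage.D
    if reached_internal_customer:
        return DetectionStage.C
    if pushed:
        return DetectionStage.B
    return DetectionStage.A


def early_detection_rate(stages: Iterable[DetectionStage]) -> float:
    """Share of defects caught at stage A or B (0.0 if there are none)."""
    stages = list(stages)
    if not stages:
        return 0.0
    early = sum(1 for s in stages if s in (DetectionStage.A, DetectionStage.B))
    return early / len(stages)
```

Tracking the share of defects caught at stages A and B, rather than the raw defect count, rewards teams for looking harder instead of looking less.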
Systematically analyzing defects
By having a systematic approach to analyzing the defects they produce, teams can quickly identify the source of quality problems and how to prevent them. It also helps the team leaders frame the quality challenge as a learning opportunity. Analyzing defects reveals knowledge gaps that can then be addressed with training.
Nomura’s book has significantly improved our approach to analyzing defects, including settling an old debate about whether to focus on preventing a defect or detecting it earlier. Nomura’s answer is straightforward: we should analyze both how the team could have prevented the defect and how they could have detected it earlier.
Adoption of weak point management
By systematically analyzing defects, teams start to see patterns and identify categories of causes. Nomura calls these “weak points.” Once teams clarify weak points, they can choose one to address and eradicate once and for all.
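One way to surface weak points is to record each analysis in a consistent shape and count how often each cause recurs. The sketch below assumes a hypothetical record format. The field names and the cause categories are invented for illustration and do not come from Nomura’s book or from Theodo’s standard.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class DefectAnalysis:
    """One analyzed defect. All field names are hypothetical."""
    summary: str
    cause_category: str      # label the team assigns to the root cause
    prevention: str          # how the defect could have been prevented
    earlier_detection: str   # how it could have been detected earlier


def rank_weak_points(analyses: Iterable[DefectAnalysis]) -> List[Tuple[str, int]]:
    """Return cause categories ordered from most to least frequent."""
    return Counter(a.cause_category for a in analyses).most_common()


# Made-up sample records, only to show the shape of the output.
sample = [
    DefectAnalysis("CI run failed, passed on retry", "flaky test",
                   "isolate shared state between tests", "rerun suite locally"),
    DefectAnalysis("Nightly build red without code change", "flaky test",
                   "pin the test runner version", "alert on intermittent failures"),
    DefectAnalysis("Wrong rounding on invoice total", "unclear specification",
                   "agree on rounding rule up front", "add a boundary-value test"),
]

weak_points = rank_weak_points(sample)
```

The category at the top of the ranking is a natural candidate for the one weak point the team chooses to eradicate next.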
For example, we had been suffering for quite some time from intermittent failures in the automated testing of our code. These were not related to underlying issues in our code but to deeper issues in the Jest open-source testing library we were using. The teams had dismissed them as “unavoidable flakiness.” Since it was a known problem affecting many of our teams and others worldwide, we decided to investigate further. It was a hard problem to solve. But with focused effort, we devised a permanent fix, which we contributed to the open-source library. The problem is now permanently solved not only for our teams but for all the library’s other users, not to mention the energy savings that would result from preventing millions of wasted CPU cycles.
After two years of deploying such learnings across Theodo, 80 percent of our projects now measure the number of defects categorized by detection stages A to E. We have refined a standard for effectively analyzing those defects to help tech leads adopt it within their teams. And we are working on making defect analysis part of the team’s routine to accelerate their learning and identify the recurring problems that would benefit from an organizational solution.
A few teams are even experimenting with systematic defect analysis at Stage A. Teams mark a code contribution as defective if it fails at the first human check. This is an unusual approach, as engineers typically code iteratively, with multiple rounds of writing code and visually checking that it works. But those teams decided to aim for right-first-time code. The results are promising. One team built a medical application with 6,000 lines of code and delivered only two defects in production. That is 30 times fewer than the 60 defects the industry average would predict, without having to document every line of code over hundreds of pages.
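The comparisons with the industry average can be checked with a short calculation. This is a sketch using only the figures quoted in this article, and the function names are illustrative.

```python
INDUSTRY_DEFECTS_PER_KLOC = 10  # estimated industry average cited in this article


def expected_defects(lines_of_code: int,
                     defects_per_kloc: float = INDUSTRY_DEFECTS_PER_KLOC) -> float:
    """Defects a codebase of this size would contain at the given rate."""
    return defects_per_kloc * lines_of_code / 1000


def improvement_factor(actual_defects: int, lines_of_code: int) -> float:
    """How many times fewer defects than the industry average would predict."""
    return expected_defects(lines_of_code) / actual_defects


# Medical application: 2 defects in 6,000 lines, against 60 expected.
medical_app = improvement_factor(actual_defects=2, lines_of_code=6_000)

# Space shuttle software: 1 defect in 400,000 lines, against 4,000 expected.
space_shuttle = improvement_factor(actual_defects=1, lines_of_code=400_000)
```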
We are still early in our journey of transposing Sadao Nomura’s book to software, but seeing such dramatic improvements has been inspiring. It is another example of how lean is an indispensable source of learning – no matter the industry – when it comes to achieving quality at scale.