This paper presents a new heuristic algorithm, called extended rough set theory, for reduct selection in rough set theory (RST) applications. The algorithm selects reducts quickly and efficiently, especially when the problem size is large. It derives the rules and identifies the most significant features simultaneously, which is unique and useful in solving quality control problems. A detailed comparison between traditional statistical methods, the RST approach, and the extended RST approach is presented. The developed algorithm is applied to an industrial case study involving quality control of printed circuit boards (PCBs). The case study addresses one of the common quality problems faced in PCB manufacturing, namely, solder ball defects. Several features that cause solder ball defects were identified, and the features that significantly impact the quality were considered in this case study. Two experiments with equal and unequal weights were conducted and the results were compared. The end result of the extended RST investigation is a set of decision rules that shows the causes of the solder ball defects. The rules help to discriminate between good and bad parts and thus to predict defective PCBs. A large sample of 3,568 PCBs was used to derive the set of rules. Results from the extended RST are very encouraging compared to statistical approaches. The rules derived from the data set provide an indication of how to effectively study this problem in further investigations. This paper forms the basis for solving many other similar problems that occur in manufacturing and service industries.
In modern manufacturing environments, vast amounts of data are collected in database management
systems and data warehouses from all involved areas, such as product and process design, assembly,
materials planning and control, order entry and scheduling, maintenance, recycling, and so on. Many
knowledge-based components have also been added to (semi-)automate certain steps. Examples are expert systems for decision support, intelligent scheduling systems for concurrent production, fuzzy
controllers, etc. A persistent problem is the gathering of required expert knowledge to implement
knowledge-based components. Data mining provides some solutions to this knowledge acquisition
problem. Data mining is the process of extracting and refining knowledge from large databases (Berry
and Linoff 1997; Dhar and Stein 1997; Cheung et al. 1996). It is a process that uses a variety of data
analysis tools to discover patterns and relationships in data. The extracted information can be used to
predict, classify, model, and summarize the data being mined. Data mining has been applied successfully in a wide range of scenarios within manufacturing environments. Fault diagnosis is one area where data mining is frequently applied. Error rates in a manufacturing process are used as input to extract knowledge that assists engineers. Identifying patterns that indicate the potential failure of a component or machine is another potential application of data mining. Other relevant areas include machine maintenance, process and quality control, and process analysis. Texas Instruments has isolated faults during semiconductor manufacturing using automated discovery from wafer tracking databases (Saxena 1993). Associations are created to identify interrelationships among processing steps, which can isolate faults during the manufacturing processes. Apte, Weiss, and Grout (1993) used five classification methods to predict defects in hard drive manufacturing.
In this paper, a data mining technique using a new heuristic algorithm for reduct selection in rough set
theory (RST), called extended RST, will be applied to solve a quality control problem in the printed circuit board (PCB) manufacturing process. A review of the literature suggests that RST has not been widely applied to quality control problems, thus making this research novel. In RST, each object is characterized by attributes, and RST discovers the dependencies between them. Compared to the usual statistical tools with a population-based approach, RST uses an approach based on an individual object model, which provides a very good tool for analyzing quality control problems (Kusiak 2001). While promising, the RST approach is computationally intensive in selecting the required reducts. The proposed extended rough set theory approach addresses this challenge of reduct selection by using a weighted approach. Furthermore, the extended RST is able to identify "defective" and "significant factors" simultaneously, which is unique and useful in solving quality control problems. The
efficiency of this algorithm is tested with an industrial case study.
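To make the notion of a reduct concrete, the following minimal sketch enumerates reducts of a small, consistent decision table by exhaustive search and then chooses among them using attribute weights. The table, the attribute names (`paste`, `profile`, `stencil`), the weights, and the scoring rule in `pick_reduct` are all illustrative assumptions; they are not taken from the case study, and the sketch is not the extended RST heuristic itself.

```python
from itertools import combinations

# Hypothetical toy decision table; attribute names and values are
# illustrative assumptions, not data from the PCB case study.
ATTRS = ["paste", "profile", "stencil"]
ROWS = [
    ({"paste": "A", "profile": "fast", "stencil": "thin"}, "good"),
    ({"paste": "A", "profile": "slow", "stencil": "thick"}, "defect"),
    ({"paste": "B", "profile": "fast", "stencil": "thick"}, "defect"),
    ({"paste": "B", "profile": "slow", "stencil": "thin"}, "good"),
]


def is_consistent(rows, attrs):
    """Return True if objects that agree on attrs always share a decision."""
    seen = {}
    for conditions, decision in rows:
        key = tuple(conditions[a] for a in attrs)
        if seen.setdefault(key, decision) != decision:
            return False
    return True


def find_reducts(rows, attrs):
    """Exhaustively list the minimal attribute subsets that keep the table consistent."""
    found = []
    for size in range(1, len(attrs) + 1):
        for subset in combinations(attrs, size):
            if any(set(r) <= set(subset) for r in found):
                continue  # contains a smaller reduct, so it is not minimal
            if is_consistent(rows, subset):
                found.append(subset)
    return found


def pick_reduct(reducts, weights):
    """Assumed scoring rule: highest mean attribute weight, shorter reduct on ties."""
    return max(reducts, key=lambda r: (sum(weights[a] for a in r) / len(r), -len(r)))


reducts = find_reducts(ROWS, ATTRS)
print(reducts)  # [('stencil',), ('paste', 'profile')]
# Unequal weights favor the reduct built from the heavier attributes.
print(pick_reduct(reducts, {"paste": 0.5, "profile": 0.3, "stencil": 0.2}))
# Equal weights tie on the mean, so the shorter reduct is chosen.
print(pick_reduct(reducts, {"paste": 1.0, "profile": 1.0, "stencil": 1.0}))
```

The exhaustive search examines up to 2^n attribute subsets for n attributes, which illustrates why reduct selection becomes computationally intensive as the problem size grows and why a heuristic is attractive.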
This paper presents a new heuristic algorithm for reduct selection in RST, called extended RST. This
algorithm is applied to a quality control problem in PCB manufacturing. Several features that cause solder ball defects were identified, and the features that
significantly impact the quality were considered in this case study. Two experiments with equal and unequal weights were conducted and the results were compared. The end result of the extended RST investigation is a set of decision rules that is very easy to comprehend. Quality engineers should apply these elicited decision rules to solve the solder ball problem. For example, high-accuracy rules should be used to distinguish good and defective PCBs. Moreover, quality engineers should focus their efforts on controlling the significant features. Specifically, conditions of the features (factors) that result in solder-ball-free boards need to be maintained and given more attention. A detailed comparison between traditional statistical methods and the rough set theory approach was presented. It is concluded that RST is best suited for analysis of imprecise (noisy) data; at the pre-processing stage, handling such data could include removing them, averaging to fill in missing data items, and so on. RST also avoids the costly experimentation process involved in conducting a DOE analysis. Furthermore, the knowledge extracted from the rules can be integrated with expert systems and other Web-based application software. The rules derived from the data set provide an indication of how to study this problem further and pave a path for effective further investigation.
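The distinction between high-accuracy rules and weaker ones can be illustrated with a short sketch that scores an IF-THEN decision rule against inspection records. The records, the field names (`stencil`, `profile`, `quality`), and the helper `rule_quality` are hypothetical and serve only to show how support and accuracy might be computed; they do not reproduce the rules or the data of the case study.

```python
# Hypothetical inspection records; field names and values are assumptions.
RECORDS = [
    {"stencil": "thick", "profile": "fast", "quality": "defect"},
    {"stencil": "thick", "profile": "slow", "quality": "defect"},
    {"stencil": "thick", "profile": "fast", "quality": "good"},
    {"stencil": "thin", "profile": "fast", "quality": "good"},
    {"stencil": "thin", "profile": "slow", "quality": "good"},
]


def rule_quality(records, conditions, decision, target="quality"):
    """Return (support, accuracy) of the rule IF conditions THEN decision.

    support  = number of records matching both the conditions and the decision
    accuracy = support divided by the number of records matching the conditions
    """
    matched = [r for r in records if all(r[k] == v for k, v in conditions.items())]
    if not matched:
        return 0, 0.0
    support = sum(1 for r in matched if r[target] == decision)
    return support, support / len(matched)


# IF stencil = thin THEN good: every matching record agrees with the rule.
print(rule_quality(RECORDS, {"stencil": "thin"}, "good"))
# IF stencil = thick THEN defect: one of three matching boards is good.
print(rule_quality(RECORDS, {"stencil": "thick"}, "defect"))
```

Under these assumed definitions, a rule with accuracy 1.0 and reasonable support would be a natural candidate for screening boards, whereas a rule with lower accuracy points to features whose conditions deserve closer study.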