Comparison of statistical and neural networks-based methods in analysis of significance and interaction of manufacturing processes parameters
Marcin Perzyk, Jacek Kozłowski
Warsaw University of Technology.
DOI:
https://doi.org/10.7494/cmms.2006.2.0100
Abstract:
Owing to the development of computer techniques, many manufacturing companies collect and store large amounts of data on designs, products, equipment, materials, manufacturing processes, etc. These data can be a source of valuable information. Extracting useful knowledge from such data using intelligent and partly automated techniques is called data mining. Until now, data mining has been used primarily in business; applications to manufacturing and design problems are rare. Important problems that can be solved by extracting knowledge from the recorded past data of a manufacturing company include detection of the causes of deteriorating product quality, prediction of machine breakdowns, and indication of optimal or critical process parameters and their combinations. These problems can be solved by determining relative significance factors of the input variables as well as interaction coefficients between them. Many data mining tasks can be performed using different methods. In general, complex problems about which no knowledge is available require models of the learning-system type. Statistical methods can be used for less complex tasks and for those in which at least the general character of the dependencies is known (e.g. reasonable assumptions can be made about their linearity and about the occurrence of interactions between variables). The purpose of the present work was to compare some statistical methods (ANOVA, contingency tables, polynomial approximation) with neural network-based methods. The general methodology employed in this research is based on simulated data sets containing assumed hidden relationships between variables. The various types of significance factors were also evaluated for some industrial problems related to the influence of alloying components and process parameters on the tensile strength of ductile cast iron.
It was found that the best performance was exhibited by relative significance factors for single parameters, as well as interaction coefficients, based on interrogation of trained neural networks. They were calculated according to special procedures developed by the authors. The results obtained for the industrial data sets confirmed the good properties of significance factors of that type. The ANOVA-based factors, utilized in some commercial data mining software, substantially underestimate the significance of less important variables, while the contingency-based factors overestimate it. The ANOVA-based factors also exhibit much higher sensitivity to noise in the data. Although the performance of some techniques discussed in the project is satisfactory, much further work is needed. The main goals include development of procedures and tools for cleaning and preparing raw data, further analysis of the behavior of various data mining methods, development of improved definitions of significance and interaction of variables (e.g. detection of synergistic action of more than two variables), and development of software oriented toward manufacturing problems.
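The simulated-data methodology described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the hidden relationship (`y = 3*x1 + 0.5*x2`, with `x3` irrelevant), the noise level, the number of levels and the use of eta-squared (between-level sum of squares divided by total sum of squares) as an ANOVA-type significance factor are all assumptions chosen for illustration, as are the function names.

```python
import numpy as np


def simulate(n=5000, noise=0.1, seed=0):
    """Simulated data set with an assumed hidden relationship (illustrative only)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n, 3))
    # Assumed relationship: x1 strong, x2 weak, x3 has no influence at all.
    y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, noise, size=n)
    return X, y


def eta_squared(x, y, n_levels=5):
    """Share of the variance of y explained by x after binning x into levels."""
    # Quantile-based edges give levels of roughly equal size.
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_levels + 1))
    levels = np.digitize(x, edges[1:-1])  # integer level index 0 .. n_levels-1
    grand_mean = y.mean()
    ss_total = np.sum((y - grand_mean) ** 2)
    ss_between = sum(
        np.sum(levels == g) * (y[levels == g].mean() - grand_mean) ** 2
        for g in range(n_levels)
    )
    return ss_between / ss_total


def relative_significance(X, y):
    """ANOVA-type factors scaled so that the most significant variable equals 1."""
    eta = np.array([eta_squared(X[:, j], y) for j in range(X.shape[1])])
    return eta / eta.max()


X, y = simulate()
factors = relative_significance(X, y)
print("relative ANOVA-type factors (x1, x2, x3):", np.round(factors, 4))
print("assumed coefficient ratio x2/x1:", 0.5 / 3.0)
```

Because eta-squared measures explained variance, it scales with the square of an effect, so in this sketch the weak variable `x2` receives a relative factor well below its assumed coefficient ratio of 1/6. That is in line with the underestimation of less important variables reported for ANOVA-based factors, although it does not reproduce the paper's experiments.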
Cite as:
Perzyk, M., Kozłowski, J. (2006). Comparison of statistical and neural networks-based methods in analysis of significance and interaction of manufacturing processes parameters. Computer Methods in Materials Science, 6(2), 81 – 93. https://doi.org/10.7494/cmms.2006.2.0100
Keywords:
Data mining, Manufacturing processes, Significance of parameters, Statistical methods, Artificial neural networks