PD ISO/TR 16705:2016

Statistical methods for implementation of Six Sigma — Selected illustrations of contingency table analysis

BSI Standards Publication

PUBLISHED DOCUMENT

National foreword

This Published Document is the UK implementation of ISO/TR 16705:2016.

The UK participation in its preparation was entrusted to Technical Committee MS/6, Methodologies for business process improvement using statistical methods.

A list of organizations represented on this committee can be obtained on request to its secretary.

This publication does not purport to include all the necessary provisions of a contract. Users are responsible for its correct application.

© The British Standards Institution 2016. Published by BSI Standards Limited 2016

ISBN 978 0 580 74469 3

ICS 03.120.30

Compliance with a British Standard cannot confer immunity from legal obligations.

This Published Document was published under the authority of the Standards Policy and Strategy Committee on 31 August 2016.

Amendments issued since publication

Date    Text affected

TECHNICAL REPORT ISO/TR 16705

First edition 2016-08-15

Méthodes statistiques pour l’implémentation de Six Sigma — Exemples sélectionnés d’application de l’analyse de tableau de contingence

Reference number ISO/TR 16705:2016(E)

COPYRIGHT PROTECTED DOCUMENT

© ISO 2016, Published in Switzerland

All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission.
Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.

ISO copyright office
Ch. de Blandonnet 8 • CP 401
CH-1214 Vernier, Geneva, Switzerland
Tel. +41 22 749 01 11
Fax +41 22 749 09 47

[email protected]
www.iso.org

Contents

Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Symbols and abbreviated terms
5 General description of contingency table analysis
    5.1 Overview of the structure of contingency table analysis
    5.2 Overall objectives of contingency table analysis
    5.3 List attributes of interest
    5.4 State a null hypothesis
    5.5 Sampling plan
    5.6 Process and analyse data
        5.6.1 Chi-squared test
        5.6.2 Linear trend test
        5.6.3 Correspondence analysis
    5.7 Conclusions
6 Description of Annexes A through D
Annex A (informative) Distribution of number of technical issues found after product release to the field
Annex B (informative) People’s perception about contented life
Annex C (informative) Customer satisfaction research on a brand of beer
Annex D (informative) Proportions of nonconforming parts of production lines
Bibliography

Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

The procedures used to develop this document and those intended for its further maintenance are described in the ISO/IEC Directives, Part 1. In particular the different approval criteria needed for the different types of ISO documents should be noted.
This document was drafted in accordance with the editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of any patent rights identified during the development of the document will be in the Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents).

Any trade name used in this document is information given for the convenience of users and does not constitute an endorsement.

For an explanation on the meaning of ISO specific terms and expressions related to conformity assessment, as well as information about ISO’s adherence to the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT) see the following URL: www.iso.org/iso/foreword.html.

The committee responsible for this document is ISO/TC 69, Applications of statistical methods, Subcommittee SC 7, Applications of statistical and related techniques for the implementation of Six Sigma.

Introduction

The Six Sigma and international statistical standards communities share a philosophy of continuous improvement and many analytical tools. The Six Sigma community tends to adopt a pragmatic approach driven by time and resource constraints. The statistical standards community arrives at rigorous documents through long-term international consensus. The disparities in time pressures, mathematical rigor, and statistical software usage have inhibited exchanges, synergy, and mutual appreciation between the two groups.
The present document takes one specific statistical tool (Contingency Table Analysis), develops the topic somewhat generically (in the spirit of International Standards), then illustrates it through the use of several detailed and distinct applications. The generic description focuses on the commonalities across studies designed to assess the association of categorical variables.

The Annexes containing illustrations not only follow the basic framework, but also identify the nuances and peculiarities in the specific applications. Each example offers at least one “wrinkle” to the problem, which is generally the case for real applications in Six Sigma and other fields.

1 Scope

This document describes the necessary steps for contingency table analysis and the method to analyse the relation between categorical variables (including nominal variables and ordinal variables).

This document provides examples of contingency table analysis. Several illustrations from different fields, each with a different emphasis, show the procedures of contingency table analysis using different software applications.

In this document, only two-dimensional contingency tables are considered.

2 Normative references

There are no normative references in this document.

3 Terms and definitions

For the purposes of this document, the terms and definitions given in ISO 3534-1 and ISO 3534-2 and the following apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:

— IEC Electropedia: available at http://www.electropedia.org/
— ISO Online browsing platform: available at http://www.iso.org/obp

3.1
categorical variable
variable with the measurement scale consisting of a set of categories

3.2
nominal data
variable with a nominal scale of measurement
[SOURCE: ISO 3534-2:2006, 1.1.6]

3.3
ordinal data
variable with an ordinal scale of measurement
[SOURCE: ISO 3534-2:2006, 1.1.7]

3.4
contingency table
tabular representation of categorical data, which shows frequencies for particular combinations of values of two or more discrete random variables

Note 1 to entry: A table that cross-classifies two variables is called a “two-way contingency table;” the one that cross-classifies three variables is called a “three-way contingency table.” A two-way table with r rows and c columns is also named “r × c table.”

EXAMPLE Let n items be classified by categorical variables X and Y with levels $X_1$, $X_2$ and $Y_1$, $Y_2$, respectively. The number of items with both attribute $X_i$ and attribute $Y_j$ is $n_{ij}$. Then, a 2 × 2 table is as follows.
Table 1 — 2 × 2 contingency table

Variable X    Variable Y
              $Y_1$       $Y_2$
$X_1$         $n_{11}$    $n_{12}$
$X_2$         $n_{21}$    $n_{22}$

3.5
p-value
probability of observing the observed test statistic value or any other value at least as unfavorable to the null hypothesis
[SOURCE: ISO 3534-1:2006, 1.49]

4 Symbols and abbreviated terms

$H_0$          null hypothesis
$H_a$          alternative hypothesis
$\chi^2$       Chi-square statistic
$G^2$          likelihood-ratio statistic
n              total of all cell counts
r × c table    contingency table with r rows and c columns
DF             degrees of freedom

5 General description of contingency table analysis

5.1 Overview of the structure of contingency table analysis

This document provides general guidelines on the design, conduct, and analysis of contingency table analysis and illustrates the steps with distinct applications given in Annexes A through D. Each of these examples follows the basic structure given in Table 2.

Table 2 — Basic steps for contingency table analysis

1    State the overall objective
2    List attributes of interest
3    State a null hypothesis
4    Sampling plan
5    Process and analyse data
6    Accept or reject the null hypothesis (Conclusions)

Contingency table analysis is used to assess the association of two or more categorical variables. This document focuses on two-way contingency table analysis, which only considers the relation of two categorical variables. Particular methods for the analysis of three or more categorical variables are not included in this document.

The steps given in Table 2 provide general techniques and procedures for contingency table analysis.
Each of the six steps is explained in general in 5.2 to 5.7.

5.2 Overall objectives of contingency table analysis

Contingency table analysis can be employed in Six Sigma¹⁾ projects in the “Analyse” phase of DMAIC methodologies, and is often used in sample surveys, social science, medical research, and similar fields. Unlike the usual statistical methods, which focus on continuous variables, contingency table analysis mainly handles categorical data, including nominal data and ordinal data. It is needed when the observed values are frequencies of particular combinations of conditions rather than continuous measurements from equipment.

The primary motivation of this method is to test the association of categorical variables, including the following situations:

a) to assess whether an observed frequency distribution differs from a theoretical distribution;
b) to assess the independence of two categorical variables;
c) to assess the homogeneity of several distributions of the same type;
d) to assess the trend association of observations on ordinal variables;
e) to assess extensive association between levels of categorical variables.

5.3 List attributes of interest

This document considers the association of two categorical variables based on the observed frequency of the characteristic corresponding to combinations of different levels of attributes of interest. If the association between a quantitative variable and a categorical variable is of interest (e.g. cup size versus surface decoration), it is necessary to divide the quantitative data into ordinal classes (e.g. small, medium, large).

5.4 State a null hypothesis

The analysis in this document aims to determine whether the row variable and the column variable are independent.
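As a minimal sketch of what independence implies for the cell counts, the following Python fragment computes the expected frequencies under the null hypothesis and the Pearson chi-square statistic for a 2 × 2 table. The counts, the function names, and the choice of Python are illustrative assumptions and are not taken from this document; the test itself is described in 5.6.1, and the formula used here is the standard Pearson form.

```python
# Hypothetical counts, for illustration only (not data from this document).
# Rows: gender (male, female); columns: shopping, no shopping.
observed = [
    [30, 20],
    [45, 5],
]


def expected_under_independence(table):
    """Expected count for cell (i, j): row total i * column total j / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_under_independence(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


expected = expected_under_independence(observed)
chi2 = pearson_chi_square(observed)
# Degrees of freedom for an r x c table: (r - 1) * (c - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

print("expected counts:", expected)
print("chi-square statistic:", chi2)
print("degrees of freedom:", df)
```

The larger the statistic is relative to the chi-square distribution with DF degrees of freedom, the stronger the evidence against independence of the row and column variables.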
The null hypothesis for the chi-square test is $H_0$: the row variable and the column variable are independent; the alternative hypothesis is $H_a$: the row variable and the column variable are not independent.

5.5 Sampling plan

In the sampling plan for contingency table analysis, the variables and their levels should be determined first. For two-way contingency tables, there are four possible sampling plans to generate the tables.

a) The total cell count n is not fixed.
b) The total cell count n is fixed, but neither the row marginal totals nor the column marginal totals are fixed.
c) The total cell count n is fixed, and either the row marginal totals or the column marginal totals are fixed.
d) The total cell count n is fixed, and both the row marginal totals and the column marginal totals are fixed.

¹⁾ Six Sigma is the trademark of a product supplied by Motorola, Inc. This information is given for the convenience of users of this document and does not constitute an endorsement by ISO of the product named. Equivalent products may be used if they can be shown to lead to the same results.

The aforementioned four sampling plans correspond to different purposes of categorical data analysis.

Case a) is random sampling in which all cell frequencies are independent. For example, the number of customers entering a supermarket during the day is a random variable. The customers are divided into four classes based on their gender and whether they are shopping or not (male/shopping, male/no shopping, female/shopping, female/no shopping). These four numbers form a contingency table.

Case b) is applicable to a sample survey where the sample size is fixed.

Case c) usually arises in comparative studies.
For example, when conducting research on the relationship between lung cancer and smoking, a group of patients with lung cancer and a group of healthy people of similar age, gender, and other physical condition are chosen. The total number of people in each group is fixed.

Case d) arises in attribute agreement analysis, which is usually used to test whether the results from two measurement systems are consistent with each other. For attribute agreement analysis, one can refer to ISO 14468.

The calculated statistics of the test of independence for the first three cases are the same.

Randomization is very important when sampling for experiments. The observations in each cell are made on a random sample. When it is inconvenient or difficult to attain adequate samples, one should pay close attention to any confounding factors that may affect the results of the analysis.

Table 3 shows a two-way contingency table with r levels of variable X and c levels of variable Y. The observed frequency of each combination of the two variables is $n_{ij}$ ($i = 1, \ldots, r$; $j = 1, \ldots, c$).

Table 3 — Layout of a generic r × c continge