# ASTM G16 - 13 (2019)

Designation: G16 - 13 (Reapproved 2019)

Standard Guide for Applying Statistics to Analysis of Corrosion Data¹

This standard is issued under the fixed designation G16; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

## 1. Scope

1.1 This guide covers and presents briefly some generally accepted methods of statistical analyses which are useful in the interpretation of corrosion test results.

1.2 This guide does not cover detailed calculations and methods, but rather covers a range of approaches which have found application in corrosion testing.

1.3 Only those statistical methods that have found wide acceptance in corrosion testing have been considered in this guide.

1.4 The values stated in SI units are to be regarded as standard. No other units of measurement are included in this standard.

1.5 This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.

## 2. Referenced Documents

2.1 ASTM Standards:²

E178 Practice for Dealing With Outlying Observations

E691 Practice for Conducting an Interlaboratory Study to Determine the Precision of a Test Method

G46 Guide for Examination and Evaluation of Pitting Corrosion

IEEE/ASTM SI 10 American National Standard for Use of the International System of Units (SI): The Modern Metric System

## 3. Significance and Use

3.1 Corrosion test results often show more scatter than many other types of tests because of a variety of factors, including the fact that minor impurities often play a decisive role in controlling corrosion rates.
Statistical analysis can be very helpful in allowing investigators to interpret such results, especially in determining when test results differ from one another significantly. This can be a difficult task when a variety of materials are under test, but statistical methods provide a rational approach to this problem.

3.2 Modern data reduction programs in combination with computers have allowed sophisticated statistical analyses on data sets with relative ease. This capability permits investigators to determine if associations exist between many variables and, if so, to develop quantitative expressions relating the variables.

3.3 Statistical evaluation is a necessary step in the analysis of results from any procedure which provides quantitative information. This analysis allows confidence intervals to be estimated from the measured results.

## 4. Errors

4.1 Distributions: In the measurement of values associated with the corrosion of metals, a variety of factors act to produce measured values that deviate from expected values for the conditions that are present. Usually the factors which contribute to the error of measured values act in a more or less random way, so that the average of several values approximates the expected value better than a single measurement. The pattern in which data are scattered is called its distribution, and a variety of distributions are seen in corrosion work.

4.2 Histograms: A bar graph called a histogram may be used to display the scatter of the data. A histogram is constructed by dividing the range of data values into equal intervals on the abscissa axis and then placing a bar over each interval of a height equal to the number of data points within that interval.
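
The histogram construction described in 4.2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not part of the guide: the function name `histogram_counts` and the mass-loss values are hypothetical, and the largest value is assigned to the last interval by convention.

```python
def histogram_counts(values, n_intervals):
    """Count the values falling in each of n_intervals equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; the range cannot be divided")
    width = (hi - lo) / n_intervals
    counts = [0] * n_intervals
    for v in values:
        # The largest value lies on the upper edge, so it is placed in the last interval.
        index = min(int((v - lo) / width), n_intervals - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(n_intervals + 1)]
    return edges, counts


# Hypothetical mass losses (mg) from replicate coupons; illustrative values only.
mass_loss = [12, 15, 15, 17, 18, 18, 19, 20, 20, 21, 22, 22, 24, 25, 27, 28]

edges, counts = histogram_counts(mass_loss, 4)
for left, right, count in zip(edges, edges[1:], counts):
    # Each bar is drawn as a row of '#' whose length equals the count.
    print(f"{left:5.1f} to {right:5.1f}: {'#' * count}")
```

With only sixteen points, four intervals are about as many as this data set supports, which illustrates why histograms are seldom practical for small corrosion data sets.
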
The number of intervals should be few enough so that almost all intervals contain at least three points; however, there should be a sufficient number of intervals to facilitate visualization of the shape and symmetry of the bar heights. Twenty intervals are usually recommended for a histogram. Because so many points are required to construct a histogram, it is unusual to find data sets in corrosion work that lend themselves to this type of analysis.

¹ This guide is under the jurisdiction of ASTM Committee G01 on Corrosion of Metals and is the direct responsibility of Subcommittee G01.05 on Laboratory Corrosion Tests. Current edition approved Feb. 15, 2019. Published February 2019. Originally approved in 1971. Last previous edition approved in 2013 as G16 - 13. DOI: 10.1520/G0016-13R19.

² For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard's Document Summary page on the ASTM website.

Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States

4.3 Normal Distribution: Many statistical techniques are based on the normal distribution. This distribution is bell-shaped and symmetrical. Use of analysis techniques developed for the normal distribution on data distributed in another manner can lead to grossly erroneous conclusions. Thus, before attempting data analysis, the data should either be verified as being scattered like a normal distribution, or a transformation should be used to obtain a data set which is approximately normally distributed.
Transformed data may be analyzed statistically and the results transformed back to give the desired results, although transforming the data back can create problems because the resulting confidence intervals are generally not symmetrical.

4.4 Normal Probability Paper: If the histogram is not confirmatory in terms of the shape of the distribution, the data may be examined further to see if they are normally distributed by constructing a normal probability plot as described below (1).³

4.4.1 It is easiest to construct a normal probability plot if normal probability paper is available. This paper has one linear axis, and one axis which is arranged to reflect the shape of the cumulative area under the normal distribution. In practice, the "probability" axis has 0.5 or 50 % at the center, a number approaching 0 % at one end, and a number approaching 1.0 or 100 % at the other end. Marks for equal increments of probability are spaced close together near the center and progressively farther apart toward the ends. A normal probability plot may be constructed as follows with normal probability paper.

NOTE 1: Data that plot approximately on a straight line on the probability plot may be considered to be normally distributed. Deviations from a normal distribution may be recognized by the presence of deviations from a straight line, usually most noticeable at the extreme ends of the data.

4.4.1.1 Number the data points starting at the largest negative value and proceeding to the largest positive value. The numbers of the data points thus obtained are called the ranks of the points.

4.4.1.2 Arrange the data in order, $y_{(1)}, y_{(2)}, y_{(3)}, \ldots$ (these values are called the order statistics), and plot each point on the normal probability paper so that the linear axis reflects the value of the datum, while the probability axis location is calculated by subtracting 0.5 from the rank of that point and dividing by the total number of points in the data set.

NOTE 2: Occasionally two or more identical values are obtained in a set of results.
In this case, each point may be plotted, or a composite point may be located at the average of the plotting positions for all the identical values.

4.4.2 If normal probability paper is not available, the location of each point on the probability plot may be determined as follows:

4.4.2.1 Mark the probability axis using linear graduations from 0.0 to 1.0.

4.4.2.2 For each point, subtract 0.5 from the rank and divide the result by the total number of points in the data set. This is the area to the left of that value under the standardized normal distribution. The cumulative distribution function is the number, always between 0 and 1, that is plotted on the probability axis.

4.4.2.3 The value of the data point defines its location on the other axis of the graph.

4.5 Other Probability Paper: If the histogram is not symmetrical and bell-shaped, or if the probability plot shows nonlinearity, a transformation may be used to obtain a new, transformed data set that may be normally distributed. Although it is sometimes possible to guess at the type of distribution by looking at the histogram, and thus determine the exact transformation to be used, it is usually just as easy to use a computer to calculate a number of different transformations and to check each for the normality of the transformed data. Some transformations based on known non-normal distributions, or that have been found to work in some situations, are listed as follows:

$$y = \log x \qquad y = \exp x \qquad y = \sqrt{x} \qquad y = x^{2} \qquad y = \frac{1}{x} \qquad y = \sin^{-1}\sqrt{x/n}$$

where:

$y$ = transformed datum,

$x$ = original datum, and

$n$ = number of data points.

Time to failure in stress corrosion cracking usually is best fitted with a log x transformation (2, 3).

Once a set of transformed data is found that yields an approximately straight line on a probability plot, the statistical procedures of interest can be carried out on the transformed data.
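
The plotting positions of 4.4.2.2 and the screening of candidate transformations described in 4.5 can be sketched in Python as follows. This is an illustration under stated assumptions rather than a procedure from the guide: the function names and the times to failure are hypothetical, each plotting position is mapped through the inverse normal distribution function to imitate the probability-paper axis, and the correlation coefficient of the resulting plot is used as a numerical stand-in for judging straightness by eye.

```python
import math
from statistics import NormalDist


def plotting_positions(n):
    """Return (rank - 0.5) / n for ranks 1..n, the positions described in 4.4.2.2."""
    return [(rank - 0.5) / n for rank in range(1, n + 1)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def straightness(values):
    """Correlate the ordered data with the normal quantiles of their plotting positions.

    Passing each position through the inverse normal CDF plays the role of the
    probability-paper axis; a result near 1 indicates a nearly straight plot.
    """
    ordered = sorted(values)
    quantiles = [NormalDist().inv_cdf(p) for p in plotting_positions(len(ordered))]
    return pearson(ordered, quantiles)


# Candidate transformations taken from the list in 4.5 (positive data assumed).
TRANSFORMATIONS = {
    "none": lambda x: x,
    "log x": math.log10,
    "sqrt x": math.sqrt,
    "x^2": lambda x: x ** 2,
    "1/x": lambda x: 1.0 / x,
}

# Hypothetical times to failure (h); illustrative values, not test results.
times = [18, 25, 31, 40, 52, 66, 85, 120, 170, 260, 410, 900]

for name, transform in TRANSFORMATIONS.items():
    score = straightness([transform(t) for t in times])
    print(f"{name:>6}: straightness = {score:.3f}")
```

The transformation whose score is closest to 1 gives the straightest plot. The plot itself should still be inspected, because departures from normality tend to show up at the extreme ends of the data, where a single summary number can hide them.
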
Results, such as predicted data values or confidence intervals, must be transformed back using the reverse transformation.

4.6 Unknown Distribution: If there are insufficient data points, or if, for any other reason, the distribution type of the data cannot be determined, then two possibilities exist for analysis:

4.6.1 A distribution type may be hypothesized based on the behavior of similar types of data. If this distribution is not normal, a transformation may be sought which will normalize that particular distribution. See 4.5 above for suggestions. Analysis may then be conducted on the transformed data.

4.6.2 Statistical analysis procedures that do not require any specific data distribution type, known as non-parametric methods, may be used to analyze the data. Non-parametric tests do not use the data as efficiently as methods that assume a specific distribution.

4.7 Extreme Value Analysis: In the case of determining the probability of perforation by a pitting or cracking mechanism, the usual descriptive statistics for the normal distribution are not the most useful. In this case, Guide G46 should be consulted for the procedure (4).

4.8 Significant Digits: IEEE/ASTM SI 10 should be followed to determine the proper number of significant digits when reporting numerical results.

4.9 Propagation of Variance: If a calculated value is a function of several independent variables and those variables have errors associated with them, the error of the calculated value can be estimated by a propagation of variance technique. See Refs (5) and (6) for details.

³ The boldface numbers in parentheses refer to a list of references at the end of this standard.

4.10 Mistakes: Mistakes either in carrying out an experiment or in calculations are not a characteristic of the population and can preclude statistical treatment of data or lead to erroneous conclusions if included in the analysis.
Sometimes mistakes can be identified by statistical methods by recognizing that the probability of obtaining a particular result is very low.

4.11 Outlying Observations: See Practice E178 for procedures for dealing with outlying observations.

## 5. Central Measures

5.1 It is accepted practice to employ several independent replicate measurements of any experimental quantity to improve the estimate of precision and to reduce the variance of the average value. If it is assumed that the processes operating to create error in the measurement are random in nature and are as likely to overestimate the true unknown value as to underestimate it, then the average value is the best estimate of the unknown value in question. The average value is usually indicated by placing a bar over the symbol representing the measured variable.

NOTE 3: In this standard, the term "mean" is reserved to describe a central measure of a population, while "average" refers to a sample.

5.2 If processes operate to exaggerate the magnitude of the error either in overestimating or underestimating the correct measurement, then the median value is usually a better estimate.

5.3 If the processes operating to create error affect both the probability and magnitude of the error, then other approaches must be employed to find the best estimation procedure. A qualified statistician should be consulted in this case.

5.4 In corrosion testing, it is generally observed that average values are useful in characterizing corrosion rates. In cases of penetration from pitting and cracking, failure is often defined as the first through penetration, and in these cases average penetration rates or times are of little value. Extreme value analysis has been used in these cases; see Guide G46.

5.5 When the average value is calculated and reported as the only result in experiments where several replicate runs were made, information on the scatter of data is lost.

## 6. Variability Measures

6.1 Several measures of distribution variability are available which can be useful in estimating confidence intervals and making predictions from the observed data. In the case of normal distribution, a number of procedures are available and can be handled with computer programs. These measures include the following: variance, standard deviation, and coefficient of variation. The range is a useful non-parametric estimate of variability and can be used with both normal and other distributions.

6.2 Variance: Variance, $\sigma^2$, may be estimated for an experimental data set of $n$ observations by computing the sample estimated variance, $S^2$, assuming all observations are subject to the same errors:

$$S^2 = \frac{\sum d^2}{n - 1} \qquad (1)$$

where:

$d$ = the difference between the average and the measured value,

$n - 1$ = the degrees of freedom available.

Variance is a useful measure because it is additive in systems that can be described by a normal distribution; however, the dimensions of variance are square of units. A procedure known as analysis of variance (ANOVA) has been developed for data sets involving several factors at different levels in order to estimate the effects of these factors. (See Section 9.)

6.3 Standard Deviation: Standard deviation, $\sigma$, is defined as the square root of the variance. It has the property of having the same dimensions as the average value and the original measurements from which it was calculated and is generally used to describe the scatter of the observations.

6.3.1 Standard Deviation of the Average: The standard deviation of an average, $S_{\bar{x}}$, is different from the standard deviation of a single measured value, but the two standard deviations are related as in Eq 2:

$$S_{\bar{x}} = \frac{S}{\sqrt{n}} \qquad (2)$$

where:

$n$ = the total number of measurements which were used to calculate the average value.

When reporting standard deviation calculations, it is important