Designation: G169 − 01 (Reapproved 2013)

Standard Guide for Application of Basic Statistical Methods to Weathering Tests¹

This standard is issued under the fixed designation G169; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

1. Scope

1.1 This guide covers elementary statistical methods for the analysis of data common to weathering experiments. The methods are for decision making, in which the experiments are designed to test a hypothesis on a single response variable. The methods work for either natural or laboratory weathering.

1.2 Only basic statistical methods are presented. There are many additional methods which may or may not be applicable to weathering tests that are not covered in this guide.

1.3 This guide is not intended to be a manual on statistics, and therefore some general knowledge of basic and intermediate statistics is necessary. The text books referenced at the end of this guide are useful for basic training.

1.4 This guide does not provide a rigorous treatment of the material. It is intended to be a reference tool for the application of practical statistical methods to real-world problems that arise in the field of durability and weathering. The focus is on the interpretation of results. Many books have been written on introductory statistical concepts and statistical formulas and tables. The reader is referred to these for more detailed information. Examples of the various methods are included. The examples show typical weathering data for illustrative purposes, and are not intended to be representative of specific materials or exposures.

2.
Referenced Documents

2.1 ASTM Standards:²
E41 Terminology Relating To Conditioning
G113 Terminology Relating to Natural and Artificial Weathering Tests of Nonmetallic Materials
G141 Guide for Addressing Variability in Exposure Testing of Nonmetallic Materials

2.2 ISO Documents:
ISO 3534/1 Vocabulary and Symbols – Part 1: Probability and General Statistical Terms³
ISO 3534/3 Vocabulary and Symbols – Part 3: Design of Experiments³

3. Terminology

3.1 Definitions—See Terminology G113 for terms relating to weathering, Terminology E41 for terms relating to conditioning and handling, ISO 3534/1 for terminology relating to statistics, and ISO 3534/3 for terms relating to design of experiments.

3.2 Definitions of Terms Specific to This Standard:

3.2.1 arithmetic mean; average—the sum of values divided by the number of values. ISO 3534/1

3.2.2 blocking variable—a variable that is not under the control of the experimenter (for example, temperature and precipitation in exterior exposure), and is dealt with by exposing all samples to the same effects.

3.2.2.1 Discussion—The term “block” originated in agricultural experiments in which a field was divided into sections or blocks having common conditions such as wind, proximity to underground water, or thickness of the cultivatable layer. ISO 3534/3

3.2.3 correlation—in weathering, the relative agreement of results from one test method to another, or of one test specimen to another.

3.2.4 median—the midpoint of ranked sample values.
In samples with an odd number of data, this is simply the middle value, otherwise it is the arithmetic average of the two middle values.

3.2.5 nonparametric method—a statistical method that does not require a known or assumed sample distribution in order to support or reject a hypothesis.

3.2.6 normalization—a mathematical transformation made to data to create a common baseline.

3.2.7 predictor variable (independent variable)—a variable contributing to change in a response variable, and essentially under the control of the experimenter. ISO 3534/3

3.2.8 probability distribution (of a random variable)—a function giving the probability that a random variable takes any given value or belongs to a given set of values. ISO 3534/1

3.2.9 random variable—a variable that may take any of the values of a specified set of values and with which is associated a probability distribution.

3.2.9.1 Discussion—A random variable that may take only isolated values is said to be “discrete.” A random variable which may take any value within a finite or infinite interval is said to be “continuous.” ISO 3534/1

3.2.10 replicates—test specimens with nominally identical composition, form, and structure.

3.2.11 response variable (dependent variable)—a random variable whose value depends on other variables (factors). Response variables within the context of this guide are usually property measurements (for example, tensile strength, gloss, color, and so forth). ISO 3534/3

¹ This guide is under the jurisdiction of ASTM Committee G03 on Weathering and Durability and is the direct responsibility of Subcommittee G03.93 on Statistics. Current edition approved June 1, 2013. Published June 2013. Originally approved in 2001. Last previous edition approved in 2008 as G169 – 01 (2008)ε1. DOI: 10.1520/G0169-01R13.

² For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at [email protected]. For Annual Book of ASTM Standards volume information, refer to the standard’s Document Summary page on the ASTM website.

³ Available from American National Standards Institute, 11 W. 42nd St., 13th Floor, New York, NY 10036.

Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States

4. Significance and Use

4.1 The correct use of statistics as part of a weathering program can greatly increase the usefulness of results. A basic understanding of statistics is required for the study of weathering performance data. Proper experimental design and statistical analysis strongly enhance decision-making ability. In weathering, there are many uncertainties brought about by exposure variability, method precision and bias, measurement error, and material variability.
Statistical analysis is used to help decide which products are better, which test methods are most appropriate to gauge end use performance, and how reliable the results are.

4.2 Results from weathering exposures can show differences between products or between repeated testing. These results may show differences which are not statistically significant. The correct use of statistics on weathering data can increase the probability that valid conclusions are derived.

5. Test Program Development

5.1 Hypothesis Formulation:

5.1.1 All of the statistical methods in this guide are designed to test hypotheses. In order to apply the statistics, it is necessary to formulate a hypothesis. Generally, the testing is designed to compare things, with the customary comparison being:

Do the predictor variables significantly affect the response variable?

Taking this comparison into consideration, it is possible to formulate a default hypothesis that the predictor variables do not have a significant effect on the response variable. This default hypothesis is usually called Ho, or the Null Hypothesis.

5.1.2 The objective of the experimental design and statistical analysis is to test this hypothesis within a desired level of significance, usually an alpha level (α). The alpha level is the probability below which we reject the null hypothesis. It can be thought of as the probability of rejecting the null hypothesis when it is really true (that is, the chance of making such an error). Thus, a very small alpha level reduces the chance of making this kind of error in judgment. Typical alpha levels are 5 % (0.05) and 1 % (0.01). The x-axis value on a plot of the distribution corresponding to the chosen alpha level is generally called the critical value (cv).

5.1.3 The probability that a random variable X is greater than the critical value for a given distribution is written P(X > cv).
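As a minimal numerical illustration of the alpha level and critical value in 5.1.2 and the tail probability in 5.1.3, the following Python sketch assumes a standard normal distribution and an upper-tail rejection region. The choice of distribution, the observed value of 2.1, and all names are assumptions made for this sketch only and are not taken from this guide.

```python
from statistics import NormalDist

# Assumption for illustration: the test statistic follows a standard
# normal distribution and the rejection region is the upper tail.
distribution = NormalDist(mu=0.0, sigma=1.0)
alpha = 0.05

# Critical value: the x-axis value whose upper-tail area equals alpha.
critical_value = distribution.inv_cdf(1.0 - alpha)


def upper_tail_probability(x):
    """Return P(X > x) for the assumed distribution."""
    return 1.0 - distribution.cdf(x)


observed = 2.1  # hypothetical observed statistic, not from this guide
p_value = upper_tail_probability(observed)
reject_null = p_value < alpha  # reject when the tail area is below alpha

print(f"critical value: {critical_value:.3f}")
print(f"P(X > {observed}): {p_value:.4f}")
print(f"reject null hypothesis: {reject_null}")
```

In this sketch the tail probability is evaluated at the hypothetical observed statistic; any observed value beyond the critical value gives a tail probability below the chosen alpha level.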
This probability is often called the “p-value.” In this notation, the null hypothesis can be rejected if P(X > cv) < α.

[...]

Variances            t       Degrees of Freedom   P(X > cv)
Separate variances   3.116   4.9                  0.036
Pooled variances     3.000   6.0                  0.024

P(X > cv) indicates the probability that a Student’s t-distributed random variable is greater than the cv, that is, the area under the tail of the t-distribution to the right of Point t. Since this value in either case is below a pre-chosen alpha level of 0.05, the result is significant. Note that this result would not be significant at an alpha level of 0.01.

FIG. 1 Selecting a Method

TABLE 2 STUDENT’S t-TEST EXAMPLE
Color Change   Formula
1.000          A
1.200          A
1.100          A
0.900          A
1.100          A
1.300          B
1.400          B
1.200          B

6.3 ANOVA:

6.3.1 Analysis of Variance (ANOVA) performs comparisons like the t-Test, but for an arbitrary number of predictor variables, each of which can have an arbitrary number of levels. Furthermore, each predictor variable combination can have any number of replicates. Like all the methods in this guide, ANOVA works on a single response variable. The predictor variables must be discrete. See Table 3.

6.3.2 The ANOVA can be thought of in a practical sense as an extension of the t-Test to an arbitrary number of factors and levels. It can also be thought of as a linear regression model whose predictor variables are restricted to a discrete set. Here is the example cited in the t-Test, extended to include an additional formula, and another factor. The new factor is to test whether the resulting formulation is affected by the technician who prepared it. There are two technicians and three formulas under consideration.

6.3.3 This example also illustrates that one need not have identical numbers of replicates for each sample.
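The unequal replication noted above can be tallied directly from the Table 3 data. The following Python sketch counts the observations in each formula and technician combination and computes the mean color change for each formula. The tuple layout and variable names are assumptions made for this sketch; it does not reproduce the ANOVA calculation itself.

```python
from collections import Counter, defaultdict

# Table 3 data: (color change, formula, technician)
table3 = [
    (1.000, "A", "Elmo"),
    (1.100, "A", "Elmo"),
    (1.100, "A", "Homer"),
    (0.900, "A", "Homer"),
    (1.300, "B", "Elmo"),
    (1.400, "B", "Judd"),
    (1.200, "B", "Homer"),
    (0.700, "C", "Elmo"),
    (0.600, "C", "Homer"),
]

# Number of replicates in each formula/technician combination.
replicates = Counter((formula, tech) for _, formula, tech in table3)

# Mean color change for each formula (the means that ANOVA compares).
by_formula = defaultdict(list)
for value, formula, _ in table3:
    by_formula[formula].append(value)
formula_means = {f: sum(v) / len(v) for f, v in by_formula.items()}

for combination, count in sorted(replicates.items()):
    print(combination, count)
for formula, mean in sorted(formula_means.items()):
    print(formula, round(mean, 3))
```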
In this example, there are two replicates per factor combination for Formula A, but no replication appears for the other formulas.

Analysis of Variance
Response variable: COLOR CHANGE

Source       Sum of Squares   Degrees of Freedom   Mean square   F Ratio   P(X > cv)
Formula      0.483            2                    0.241         16.096    0.025
Technician   0.005            1                    0.005         0.333     0.604
Error        0.045            3                    0.015         -         -

6.3.4 Assuming an alpha level of 0.05, the analysis indicates that the formula resulted in a significant difference in color change means, but the technician did not. This is evident from the probability values in the final column. Values below the alpha level allow rejection of the null hypothesis.

6.4 Linear Regression:

6.4.1 Linear regression is essentially an ANOVA in which the factors can take on continuous values. Since discrete factors can be set up as belonging to a subset of some larger continuous set, linear regression is a more general method. It is in fact the most general method considered in this guide. See Table 4.

6.4.2 The most elementary form of linear regression is easy to visualize. It is the case in which we have one predictor variable and one response variable. The easy way to think of the predictor variable is as an x-axis value of a two-dimensional plot. For each predictor variable level, we can plot the corresponding measurement (response variable) as a value on the ordinate axis. The idea is to see how well we can fit a line to the points on the plot. See Table 5.

6.4.3 For example, the following experiment looks at the effect of an impact modifying ingredient level on impact strength after one year of outdoor weathering in Arizona.

6.4.4 The plot of ingredient level versus retained impact strength, with a linear fit and 95 % confidence bands, is shown in Fig. 2.

6.4.5 This example illustrates the use of replicates at one of the levels. It is a good idea to test replicates at the levels that are thought to be important or desirable. The analysis indicates a good linear fit.
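The fit statistics for this example can be checked with a short calculation. The following Python sketch applies ordinary least squares to the Table 4 data and computes the regression and residual sums of squares, R², and the F ratio. The function name `simple_linear_fit` and the dictionary keys are illustrative assumptions for this sketch and are not part of this guide.

```python
def simple_linear_fit(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    ss_regression = sxy ** 2 / sxx     # variability explained by the line
    ss_residual = syy - ss_regression  # variability left unexplained
    return {
        "slope": slope,
        "intercept": intercept,
        "ss_regression": ss_regression,
        "ss_residual": ss_residual,
        "r_squared": ss_regression / syy,
        # F ratio with 1 regression and n - 2 residual degrees of freedom
        "f_ratio": ss_regression / (ss_residual / (n - 2)),
    }


# Table 4 data: modifier level and impact retention after exposure
modifier_level = [0.005, 0.01, 0.02, 0.02, 0.03, 0.04, 0.05]
impact_retention = [0.535, 0.6, 0.635, 0.62, 0.68, 0.754, 0.79]

fit = simple_linear_fit(modifier_level, impact_retention)
for name, value in fit.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions the computed R² rounds to 0.976 and the F ratio to about 205, in line with the regression analysis reported for this example.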
We see this from the R² value (squared multiple R) of 0.976. The R² value is the fraction of the variability of the response variable explained by the regression model, and so indicates the degree of fit to the model.

6.4.6 The analysis of variance indicates a significant relationship between modifier level and retained impact strength in this test (the probability level is well below an alpha level of 5 %).

Linear Regression Analysis
Response Variable: Impact Retention (%)
Number of Observations: 7
Multiple R: 0.988
Squared Multiple R: 0.976

Source       Degrees of Freedom   Sum of Squares   F Ratio   P(X > cv)
Regression   1                    0.0464           205.1     less than 0.0001
Residual     5                    0.0011           -         -

6.4.7 Regression can be easily generalized to more than one factor, although the data gets difficult to visualize since each factor adds an axis to the plot (it is not so easy to view multidimensional data sets). It can also be adapted to nonlinear models. A common technique for achieving this is to transform data so that it is linear. Another way is to use nonlinear least squares methods, which are beyond the scope of this guide. Regression can also be extended to cover mixed continuous and discrete factors. It should be noted that most spreadsheet and elementary data analysis applications can perform fairly sophisticated regression analysis.

TABLE 3 ANOVA EXAMPLE
Color Change   Formula   Technician
1.000          A         Elmo
1.100          A         Elmo
1.100          A         Homer
0.900          A         Homer
1.300          B         Elmo
1.400          B         Judd
1.200          B         Homer
0.700          C         Elmo
0.600          C         Homer

TABLE 4 REGRESSION EXAMPLE
Modifier Level   Impact Retention After Exposure
0.005            0.535
0.01             0.6
0.02             0.635
0.02             0.62
0.03             0.68
0.04             0.754
0.05             0.79

TABLE 5 PATHOLOGICAL LINEAR REGRESSION EXAMPLE
x      v
0.01   0.029979
0.02   0.054338
0.03   0.088581
0.04   0.082415
0.05   0.126631
0.06   0.073464
0.07   0.123222
0.08   0.097003
0.09   0.099728
0.75   0.805909
0.86   0.865667

6.4.8 Another use of regression is to compare two predictor random variables at a number of levels for each.
For example, results from one exposure test can be plotted against the results from another exposure. If the points fall on a line, then one could conclude that the tests are “in agreement.” This is called correlation. The usual statistic in a linear correlation analysis is R², which is a measure of deviation from the model (a straight line). R² values near one indicate good agreement with the model, while those near zero indicate poor agreement. This type of analysis is different from the approaches suggested above which were constructed to test whether one random variable depended somehow on others. It should be noted, however, that correlation can always be phrased in ANOVA-like terms. The correlation example included for the Spearman rank correlation method illustrates this. The observations then make up a response random variable. Correlation on absolute results is not recommended in weathering testing. Instead, relative data (ranked data) often provide more meaningful analysis (see Spearman’s rank correlation).

6.4.9 Regression/correlation can lead to misleadingly high R² values when the x-axis values are not well-spaced. Consider the following example, which contains a cluster of data that does not exhibit a good linear fit, along with a few outliers. Due to the large spread in the x-axis values, the clustered data appears almost as a single data point, resulting in a high R² value. (See Fig. 3.)

Linear Regression Analysis
Number of Observations: 11
Multiple R: 0.997
Squared Multiple R: 0.994

Source       Degrees of Freedom   Sum of Squares   F Ratio   P(X > cv)
Regression   1                    0.9235           1509      less than 0.0001
Resi