Documenting the linearity of an assay method has become a required practice in the clinical laboratory. Existing approaches to this procedure can involve extensive calculations and some ambiguity in interpretation. On the other hand, we know that the precision of most automated and semi-automated methods is very high, and that significant deviations of the squared correlation coefficient from 1.000 are usually the result of non-linearity of those methods. Because of this, we evaluated the use of the squared correlation coefficient as a screening method for linearity of automated and semi-automated assays in two different laboratories.

Fifty-one different automated and semi-automated assays were evaluated at the University of Pittsburgh Medical Center and twenty-eight at the Magee-Women's Hospital. The squared correlation coefficients were calculated using a spreadsheet regression function, and the linearity plots were evaluated by three clinical pathologists as clinically acceptable or non-acceptable. The coefficients were then ranked in ascending order and plotted against this evaluation. A clear breakpoint at an R2 value of 0.9970 was found, with values above this considered acceptable and values below this considered non-acceptable.

We propose that the squared correlation coefficient can be used as a screening test for evaluating the linearity of automated and semi-automated tests. It is readily available in spreadsheet functions, easy to calculate, and conceptually simple. Using this screen, approximately 90% of all tests in our laboratories can be shown to be sufficiently linear for clinical use. We also suggest an algorithm which we have found useful for evaluation of methods that do not meet this screening criterion.


Validating and documenting the linearity of a laboratory method has now become a required practice in the clinical laboratory. While clinically significant nonlinearity in a method is readily recognized by most laboratory workers, a precise definition that most workers would accept presents unexpected difficulties. Two approaches to this problem involve extensive calculations and some ambiguities in interpretation, ultimately leaving the decision to the clinical judgment of the laboratory worker. The format recommended by NCCLS depends upon a comparison between the errors about a linear regression and the errors of replication. One difficulty of this approach is that very precise methods detect minor degrees of nonlinearity, and the worker must still decide whether to accept or reject the method based on clinical experience and not statistical criteria. A second approach uses a series of fitted polynomials. Although it avoids the precision term in the denominator, it depends upon the significance of the slopes in the higher order best-fit equations that are calculated. Again, after these extensive calculations, one is left to make a clinical judgment on how much deviation from the first order curve can be accepted before considering the non-linearity significant.


Since most automated and semi-automated methods are now extremely precise, particularly if between-run error is eliminated, we here consider the hypothesis that measuring only the degree of scatter around the regression would be an adequate indicator of non-linearity in most cases. Tetrault (Tetrault G. Evaluation of Assay Linearity, Clin. Chem 1990; 36:585-6) first suggested using the correlation coefficient as a measure of linearity; however, this approach has not been documented by data and analysis. This idea has a number of attractive features to recommend its routine use in the clinical laboratory. First, the squared correlation coefficient is a built-in calculation in most spreadsheets, listed as part of the regression calculation. Second, the R2 takes values from 0 to 1.000, where 1.000 indicates values close to the first order regression, as expected in a linear method, while values significantly below 1.000 would suggest nonlinearity in a precise method. Third, the squared correlation statistic is readily understood by most laboratory workers. The question to be answered was whether there is a particular R2 value which corresponds to what is considered significant nonlinearity by a panel of experienced laboratory professionals. To answer this question we have calculated R2 on a number of common laboratory assays in two different laboratories and compared the values with the clinical judgment of three experienced laboratory professionals.


We calculated the squared correlation coefficient for the linearity standards available for the most commonly used automated and semi-automated tests in the clinical laboratory. We specifically excluded from this analysis manual methods or research methods.

CAP-provided material for linearity studies was used for most of the studies (CAP, Northfield, IL). Commercially available material was used in some instances (Document TDM I Set, Casco Standards, Portland, ME). When purchased materials were not available, we used high and low patient samples diluted appropriately. An example of this dilution scheme has been outlined by Emancipator and Kroll. Most linearity samples were analyzed at five concentrations in duplicate, resulting in an R2 with eight degrees of freedom. In some cases four concentrations were used, but the specimens were analyzed in triplicate, giving an R2 with 10 degrees of freedom. In instances where the Document TDM I Set was used, the manufacturer's instructions were followed, with duplicate values obtained for at least 7 different concentrations.
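The degrees of freedom quoted above are consistent with a two-parameter (slope and intercept) fit, where the residual degrees of freedom equal the number of measurements minus two. The following is a minimal sketch of that arithmetic; the function name `residual_df` is hypothetical and not part of the original study.

```python
def residual_df(levels: int, replicates: int) -> int:
    """Residual degrees of freedom for a straight-line fit.

    Assumes every concentration level is measured the same number of
    times and that two parameters (slope and intercept) are estimated.
    """
    n_points = levels * replicates
    if n_points < 3:
        raise ValueError("need at least three measurements for a line fit")
    return n_points - 2


# Five concentrations in duplicate, as in most of the studies.
five_in_duplicate = residual_df(levels=5, replicates=2)

# Four concentrations in triplicate, as in some of the studies.
four_in_triplicate = residual_df(levels=4, replicates=3)
```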

The assayed results in each case were entered into a spreadsheet (Quattro Pro, Borland), with the concentration obtained from the analysis on the Y axis and the proportion of the highest linearity sample on the X axis. For example, a linearity standard prepared with 3 parts of the low standard and 2 parts of the high standard was plotted as 0.4. For each set of data we obtained the squared correlation coefficient, provided as part of the programmed regression calculation in the program. The plots of these data were presented to three chemical pathologists for their clinical evaluation of the linearity of each curve over the specified range for each assay. The correlation coefficients were not made available to the reviewers at this time. The reviewers were asked to judge each curve as showing either clinically acceptable linearity or clinically unacceptable linearity. If the reviewers' judgments were not in full agreement, we classified the curve as borderline acceptable.
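For readers without the original spreadsheet, the calculation described above can be sketched in Python. This is an illustration only: the function names `high_fraction` and `r_squared` and the measured values are hypothetical, and the R2 is computed as the squared Pearson correlation, which equals the R2 of an ordinary least-squares straight-line fit.

```python
def high_fraction(low_parts: float, high_parts: float) -> float:
    """X-axis value: proportion of the high standard in the mixture."""
    total = low_parts + high_parts
    if total <= 0:
        raise ValueError("mixture must contain some material")
    return high_parts / total


def r_squared(x: list[float], y: list[float]) -> float:
    """Squared correlation coefficient of y against x."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have equal length of at least 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical data: five mixtures of low and high standard, in duplicate.
mixtures = [(4, 0), (3, 1), (2, 2), (1, 3), (0, 4)]  # (low parts, high parts)
x = [high_fraction(lo, hi) for lo, hi in mixtures for _ in range(2)]
y = [50, 52, 290, 285, 525, 530, 760, 770, 1000, 1005]  # made-up results

print(f"R2 = {r_squared(x, y):.4f}")
```

The duplicates are entered as separate points rather than averaged, so the scatter between replicates contributes to the R2 along with any curvature.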


In this study the linearity classifications made by the different pathologists were always in agreement. Typical examples of curves that were considered linear and nonlinear are presented in Fig. 1a and Fig. 1b. For these automated procedures replicate values were usually within five percent of each other and curvature could be easily detected. The plot of the AST concentrations from approximately 1000 to 50 units on the Kodak Ektachem was typical of an acceptable linear study (Fig. 1a). In contrast, Fig. 1b shows the plot for serum calcium, with an R2 of 0.9922, suggesting nonlinearity at the higher end of the regression line. We should emphasize that classifying an assay as nonlinear over the defined range does not necessarily mean that the assay is not clinically useful over this range (see discussion).

We have ranked 51 methods by ascending squared correlation coefficient in Table 1, along with the ranges evaluated for linearity and the classification by the reviewing pathologists. The squared correlation coefficients were plotted against rank in Fig. 2a. A clear break in this plot can be seen at an R2 of 0.9970, with values below this point associated with curves that were considered nonlinear by the reviewers. To ascertain whether this specific breakpoint could be a relevant indicator in other laboratories, we repeated the study in a different hospital (Table 2 and Fig. 2b). The breakpoint of 0.9970 appeared to be valid for this laboratory as well.
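The ranking step can be reproduced outside a spreadsheet. The sketch below sorts assays by ascending R2 and reports the largest jump between neighbouring values. Note the assumptions: in the study the break was identified by inspecting the plot, not by a numerical rule, so the largest-gap heuristic, the function names, and the assay labels and values are all hypothetical illustrations.

```python
def rank_ascending(r2_by_assay: dict[str, float]) -> list[tuple[str, float]]:
    """Return (assay, R2) pairs sorted from lowest to highest R2."""
    return sorted(r2_by_assay.items(), key=lambda item: item[1])


def largest_gap(ranked: list[tuple[str, float]]):
    """Return the two neighbouring entries separated by the largest R2 jump.

    This is a simple heuristic stand-in for visually locating the break
    in a plot of R2 against rank.
    """
    if len(ranked) < 2:
        raise ValueError("need at least two assays")
    gaps = [(ranked[i + 1][1] - ranked[i][1], i) for i in range(len(ranked) - 1)]
    _, i = max(gaps)
    return ranked[i], ranked[i + 1]


# Made-up values, not taken from Table 1 or Table 2.
example = {
    "assay A": 0.9990,
    "assay B": 0.9915,
    "assay C": 0.9978,
    "assay D": 0.9901,
    "assay E": 0.9995,
}

ranked = rank_ascending(example)
below, above = largest_gap(ranked)
print(f"largest break lies between {below} and {above}")
```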


Approximately 90% of automated and semi-automated procedures in two clinical chemistry laboratories were considered acceptably linear by three reviewing pathologists and gave squared correlation coefficients of 0.9970 or greater. The seven procedures that showed R2 values of less than 0.9970 were all independently judged to be nonlinear by the reviewers. Thus, it appears that for automated, high precision tests, the squared correlation coefficient can be used as a good indicator of linearity. We should add, however, that the few tests judged nonlinear by the reviewers gave R2 values of 0.9922 or below. We do not know how well the reviewers would agree for squared correlation coefficients between 0.9930 and 0.9970. Tetrault had recommended a cut-off at an R2 of 0.9950 for a test to be considered acceptably linear. From our data we would consider values above 0.9970 as acceptably linear and values below 0.9930 as unacceptable. Intermediate values would have to be considered borderline and subjected to more detailed studies.
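The screening rule suggested by these data can be written as a three-way classification. The sketch below is a minimal illustration; the function and constant names are hypothetical, and treating a value of exactly 0.9970 as acceptable and exactly 0.9930 as borderline is an assumption about boundary handling.

```python
ACCEPT_AT_OR_ABOVE = 0.9970  # breakpoint observed in both laboratories
REJECT_BELOW = 0.9930        # lower bound of the untested intermediate zone


def screen_linearity(r2: float) -> str:
    """Classify an assay's squared correlation coefficient.

    Intended only for precise automated or semi-automated methods,
    where scatter about the regression mainly reflects nonlinearity.
    """
    if not 0.0 <= r2 <= 1.0:
        raise ValueError("R2 must lie between 0 and 1")
    if r2 >= ACCEPT_AT_OR_ABOVE:
        return "acceptable"
    if r2 < REJECT_BELOW:
        return "unacceptable"
    return "borderline"


# The serum calcium example reported in the text (R2 of 0.9922).
print(screen_linearity(0.9922))
```

A result of "borderline" or "unacceptable" is a prompt for further study of the method and its usable range, not a verdict that the method is clinically unusable.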

A classification of "non-acceptable" indicates that further work and professional input are needed to define the acceptable and useful range of the method. It does not necessarily mean that the method is not useful in the clinical laboratory.

Proposed Algorithm for Tests with Low Correlation Coefficients

For methods not attaining the R2 value of 0.9970, we propose a series of additional steps to define the clinical usefulness of the procedure.
