Agreement Between Devices
We often need to evaluate the performance of a quantitative medical device by comparing it with an existing quantitative medical device or an existing laboratory instrument. The goal is typically to show that a new device offers some advantage (e.g., improved accuracy or faster response) while producing results that are clinically equivalent to those of the existing device or an accepted laboratory system. Such a comparison may be necessary when submitting a 510(k) application to the Food and Drug Administration, where it is commonly described as demonstrating substantial equivalence. The comparison is carried out by measuring each sample on both analytical platforms, so that every sample yields a pair of results. There are several ways to analyze the data generated by such a comparison, but let's start with two wrong, but unfortunately common, ways to do it.
First, do not rely on the correlation coefficient to evaluate the agreement between two sets of results. The correlation coefficient measures the strength of the linear relationship between the results from the two platforms; it does not measure whether the results agree. If the values reported by one device are about twice the values reported by the other, the correlation between the two sets of results will still be a respectable number close to 1, even though the two devices are clearly not clinically equivalent.
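The doubling scenario can be checked numerically. The following is a minimal sketch using made-up readings (the `existing` values and the small `deviation` terms are hypothetical and chosen only for illustration); it computes the Pearson correlation by hand so that no third-party library is needed.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical readings from the existing device (arbitrary units).
existing = [4.1, 5.6, 7.2, 9.8, 12.5, 15.3, 18.9, 22.4]

# Hypothetical new device: reports about twice the existing value,
# plus a small made-up deviation on each sample.
deviation = [0.2, -0.1, 0.3, -0.2, 0.1, -0.3, 0.2, -0.1]
new = [2 * old + d for old, d in zip(existing, deviation)]

r = pearson_r(existing, new)
mean_diff = sum(nw - old for old, nw in zip(existing, new)) / len(existing)

print(f"correlation r = {r:.4f}")
print(f"mean difference (new - existing) = {mean_diff:.2f}")
```

With these illustrative numbers the correlation comes out very close to 1, while the average difference between the two devices is about as large as a typical reading. The correlation coefficient alone gives no hint of that disagreement.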
Second, and closely related, do not regress the results of the new device on the results of the existing device (or the accepted laboratory system) and then use the coefficient of determination (R²) as a measure of agreement between the two analytical platforms. In a simple linear regression, R² is just the square of the correlation coefficient, so it has the same limitation. You can have a large fixed bias and/or a large proportional bias, either of which makes the two platforms clinically non-equivalent, and still obtain a respectable (close to 1) coefficient of determination.
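To see how R² can hide both kinds of bias, here is a small sketch under stated assumptions: the readings are hypothetical, and the new device is constructed with a fixed bias of 3 units and a proportional bias of 20%. The helper `ols_fit` is an illustrative name, not part of any library; it performs an ordinary least-squares fit by hand.

```python
def ols_fit(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot

# Hypothetical readings from the existing device (arbitrary units).
existing = [4.1, 5.6, 7.2, 9.8, 12.5, 15.3, 18.9, 22.4]

# Hypothetical new device with a fixed bias (+3.0 units) and a
# proportional bias (20% too high), plus small made-up deviations.
deviation = [0.2, -0.1, 0.3, -0.2, 0.1, -0.3, 0.2, -0.1]
new = [3.0 + 1.2 * old + d for old, d in zip(existing, deviation)]

slope, intercept, r_squared = ols_fit(existing, new)

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"R^2 = {r_squared:.4f}")
```

In this example R² is close to 1 even though every reading from the new device is off by several units. Perfect agreement would correspond to a slope near 1 and an intercept near 0, and R² says nothing about either.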