How do you ensure the quality of measurements and derived results?

Questions regarding quality of measurement can take many forms, including:

  • Do tests produce consistent results?
  • Are the test results accurate in measuring properties of interest?
  • Are the test results useful?

We expect similar results when a test is repeated on the same person. To ensure consistent testing, test protocols are designed for simplicity, and all tests include automated error checking to confirm that test subjects follow the protocol. “Reliability”, a term from clinical research, describes this consistency. We perform reliability testing and analysis on relevant result measures to confirm that they exhibit test/retest reliability from a statistical perspective, using intraclass correlation coefficient (ICC) calculations.
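As an illustration of how a test/retest ICC can be computed, here is a minimal sketch in Python. The text above does not say which ICC form is used, so this example assumes ICC(2,1) (two-way random effects, absolute agreement, single measurement); the function name `icc_2_1` and the sample data are hypothetical.

```python
from statistics import mean


def icc_2_1(scores):
    """ICC(2,1) for a table of repeated measurements.

    scores: list of rows, one row per person, one column per test session.
    Assumption: ICC(2,1) is used here for illustration only; other ICC
    forms exist and may be more appropriate for a given study design.
    """
    n = len(scores)      # number of people tested
    k = len(scores[0])   # number of repeated sessions per person

    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Two-way ANOVA sums of squares: between people, between sessions, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical data: five people, each tested in two sessions.
retest_scores = [
    [41.2, 40.8],
    [35.0, 36.1],
    [48.7, 47.9],
    [39.4, 39.9],
    [44.1, 43.5],
]
print(f"ICC(2,1) = {icc_2_1(retest_scores):.3f}")
```

Values near 1 indicate that most of the variation in the scores comes from differences between people rather than from session-to-session noise.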

“Validity”, also a term from clinical research, refers to how accurately a measurement represents some target property. For example, if a test were intended to measure body temperature through some new technique, its validity could be assessed by comparing its readings to a trusted temperature measure on a set of test subjects. Sparta uses ground truth data provided by our users to compute the validity of test results where applicable, typically through area under the ROC curve (AUC) calculations on test data sets.
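To make the AUC idea concrete, here is a minimal sketch in Python, not a description of the actual implementation. It assumes binary ground truth labels (1 = outcome occurred, 0 = it did not) and uses the pairwise-comparison definition of AUC: the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as half. The function name `roc_auc` and the sample data are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: ground truth, 1 for positive cases and 0 for negative cases.
    scores: test results, where higher is assumed to mean "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: four test results with user-provided ground truth.
truth = [0, 0, 1, 1]
results = [0.10, 0.40, 0.35, 0.80]
print(roc_auc(truth, results))  # 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups. This pairwise version is quadratic in the number of cases, which is fine for small data sets; a rank-based computation is the usual choice for larger ones.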

The ultimate goal of a test measurement system is to provide useful insights into the people who are tested. Usefulness, however, is not defined solely by specific thresholds of reliability and validity. For example, in many cases test results are used simply to understand differences between people, before those differences have been related to any verifiable outcome. In early phases, accuracy expectations for a prediction measure should often be low, and should then grow as more data and outcomes are collected over time.


All of our published research can be found in the Publications section of our website.