Terminology Standardization for Healthcare Data
As life-sciences organizations set out to build clinical data repositories that aggregate data across patients’ hospital visits, the underlying patient data and terminologies must be standardized. This step is vital to ensure uniformity when reporting and analyzing information. In simple terms, terminology standardization ensures that like is compared with like, so that differences in local naming do not distort reports or trend analyses.
To illustrate, consider gender. The following are a few examples of the variants that could be in use across healthcare facilities.
While aggregating data in a repository, these values must be converted to a single set of values recognized as the standard codes for gender. Let’s assume the standard codes for gender defined within the repository are M, F, T & Others. All the above variants then need to be converted into one of these four values. The values on the right-hand side of the table below would be the standardized codes to be implemented.
|Local terminology||Standard code|
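A mapping like this is essentially a lookup table. The sketch below is a minimal illustration in Python, not a definitive implementation: the four standard codes come from the assumption above, while the local variants and the names `GENDER_MAP` and `standardize_gender` are hypothetical and would differ for each source system.

```python
# Standard codes assumed in the article: M, F, T, Others.
# The local variants on the left are hypothetical examples only.
GENDER_MAP = {
    "m": "M",
    "male": "M",
    "f": "F",
    "female": "F",
    "t": "T",
    "transgender": "T",
    "other": "Others",
    "others": "Others",
}


def standardize_gender(raw):
    """Return the standard code for a local value, or None if it is unmapped."""
    if raw is None:
        return None
    # Ignore case and surrounding whitespace before the lookup.
    return GENDER_MAP.get(str(raw).strip().lower())
```

Returning `None` for an unknown value, rather than guessing, keeps unmapped entries visible so they can be reviewed instead of silently landing in the wrong category.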
We would then be able to compare like with like, i.e. all values for male would be represented in the repository as M and can therefore be used for statistical reporting. Sounds simple? Well, it’s not always so simple, especially once we start looking at clinical data such as labs, problems, and so on. Even in the gender example above, there could be other complexities, for example numerical values instead of alphabetical ones (1, 2, 3, 4, etc.) and other entries such as Unknown or Undeclared. How do we map these values to the standard M, F, T & Others?
That’s where human intervention is needed. A reviewer can decipher the mapping from the context of the values associated with the data, look at the statistical significance of the divergent values (i.e. whether the volumes are large enough to affect the end report), or correlate the values by checking how they are represented and displayed in the EMR application that captures them.
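The volume check can be supported with a simple count of the values that have no mapping yet. The following Python sketch assumes the data is available as a plain list of raw values; the function name `unmapped_report` and the sample values are hypothetical, and what share counts as significant is left to the reviewer.

```python
from collections import Counter


def unmapped_report(values, mapping):
    """Count values with no mapping and their share of all records.

    Returns {normalized_value: (count, share_of_total)}, most frequent first.
    """
    normalized = [str(v).strip().lower() for v in values]
    total = len(normalized)
    unmapped = Counter(v for v in normalized if v not in mapping)
    return {
        value: (count, count / total)
        for value, count in unmapped.most_common()
    }


# Hypothetical sample: two known variants plus numeric and "Unknown" entries.
sample = ["M", "F", "1", "Unknown", "male", "1"]
known = {"m": "M", "f": "F", "male": "M"}
report = unmapped_report(sample, known)
```

Here `report` lists "1" and "unknown" with their counts and shares, which gives a human reviewer a starting point for deciding which divergent values matter for the end report.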
Applying the same logic to clinical data (let’s take glucose tests as an example), the variants of glucose test names found in healthcare records could be as follows (this is just an illustrative example):
|Glucose point of care|
With some analysis, one can determine that all these records can be mapped to the ‘Serum Glucose’ test. However, without first mapping these tests to a standard reference test name (Serum Glucose in the above example), it’s difficult to generate any kind of report from the data repository on the glucose levels of a population.
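As a rough sketch of what that mapping step could look like in Python: local test names are normalized and looked up against a reference name, and results are then grouped under that reference. Only "Glucose point of care" and "Serum Glucose" come from this article; the other variants, the sample results, and all function names are hypothetical.

```python
import re

STANDARD_TEST = "Serum Glucose"

# Local test name -> standard reference name.
LOCAL_TO_STANDARD = {
    "glucose point of care": STANDARD_TEST,  # variant named in the article
    "glucose": STANDARD_TEST,                # hypothetical variant
    "glucose, serum": STANDARD_TEST,         # hypothetical variant
}


def normalize(name):
    """Collapse repeated whitespace, trim, and lowercase a test name."""
    return re.sub(r"\s+", " ", name).strip().lower()


def map_test_name(name):
    """Return the standard test name, or None if the local name is unmapped."""
    return LOCAL_TO_STANDARD.get(normalize(name))


def group_results(records):
    """Group (test_name, value) pairs by standard name; keep unmapped ones aside."""
    grouped, unmapped = {}, []
    for name, value in records:
        standard = map_test_name(name)
        if standard is None:
            unmapped.append((name, value))
        else:
            grouped.setdefault(standard, []).append(value)
    return grouped, unmapped


# Hypothetical results from different facilities.
grouped, unmapped = group_results([
    ("Glucose  Point of Care", 98),
    ("GLUCOSE", 110),
    ("Hemoglobin", 13.5),
])
```

Once the results sit under a single reference name, a population-level glucose report becomes a query over one group, while the unmapped list shows what still needs a mapping decision.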
The above were simplistic illustrations of the challenge; terminology variations in healthcare are significantly more complex. In our subsequent articles we will explore these and other terminology mapping concepts in greater detail, and look at mapping to industry standards such as the SNOMED & LOINC code sets.
Dr Kunalsen Sawant, a former employee of Atos Syntel, is a medical physician with more than a decade of experience working with healthcare and clinical IT systems (EMRs).