Evaluation of the RESTful API for molar to mass concentration conversion using UCUM and LOINC

Sam Tomioka

May 5, 2019

1. Introduction

The verification of scientific units and the conversion from reported units to standard units have always been challenging for Data Science, for several reasons:

  1. A lookup table is needed that contains all possible input and output units for each measurement, the measurement names (e.g. Glucose, Weight, ...), conversion factors, molar weights, etc.
  2. The measurement names in the lookup table and in the incoming data must match.
  3. The incoming units must be present in the lookup table.
  4. Maintenance of the lookup table must be synchronized with standard terminology updates.
  5. Careful medical review is required in addition to laborious Data Science review, and more...

Despite these challenges, the lookup table approach is the norm at many companies for unit verification and conversion. Consideration was given to a more systematic approach that does not require the lab test names[1], but some unit conversions depend on the molar weight and/or ion valence of the specific lab test, so that approach does not solve the problem. Regulatory agencies require sponsors to use standardized units for reporting and analysis[2]. The PMDA requires SI units for all reporting and analysis[3,4]. These differences in requirements force us to maintain region-specific conversions for some measurements, which adds complexity.

The approach Jozef Aerts discussed uses the RestAPI available through the Unified Code for Units of Measure (UCUM) Resources, which are maintained by the US National Library of Medicine (NLM)[5]. The obvious benefit is that we can potentially eliminate the maintenance of the lab conversion lookup table. Here is how UCUM describes itself:

The Unified Code for Units of Measure (UCUM) is a code system intended to include all units of measures being contemporarily used in international science, engineering, and business. The purpose is to facilitate unambiguous electronic communication of quantities together with their units. The focus is on electronic communication, as opposed to communication between humans. A typical application of The Unified Code for Units of Measure are electronic data interchange (EDI) protocols, but there is nothing that prevents it from being used in other types of machine communication.

UCUM is an ISO 11240 compliant standard and has been used in ICSR E2B submissions to regulators that adopted ICH E2B(R3). FDA requires UCUM codes for eVAERS ICSR E2B(R3) submissions, and for dosage strength in both the content of product labeling and Drug Establishment Registration and Drug Listing. UCUM codes have also been adopted by HL7 FHIR.

1-1 Introduction for mol-mass/mass-mol conversion

Jozef Aerts announced an updated RESTful API that takes the molecular weight of the analyte into account when converting between molar and mass concentrations. This additional functionality would facilitate the conversion of lab results and the verification of the standardized lab results and LOINC codes provided by the vendors.

Although CDISC released a downloadable CDISC UNIT and UCUM mapping xlsx file, this evaluation does not use it because the CDISC UNIT codelist does not cover all reported units used by the clinical laboratory, bioanalytical, and PK vendors. Regular expressions, along with the UCUM unit validity service, were used to convert and verify the units provided by the lab vendors. In the future, this will be done with an encoder-decoder or transformer sequence-to-sequence model, an approach that has demonstrated near-perfect generation of ISO 8601 dates from numerous date formats.

An initial evaluation was done on the RestAPI available through the Unified Code for Units of Measure (UCUM) Resources, and the findings are summarized in Section 2-1. The second evaluation was completed on the test version of the RestAPI provided by Jozef Aerts at xml4pharma.

2 Findings

2-1 Prior Work

Previously, the production version of the RestAPI provided by the US National Library of Medicine was evaluated. See here for more detail.

6458 laboratory records were used to test the UCUM RestAPI. These records are from one ongoing clinical trial with a standard set of clinical laboratory tests. Out of 6458 records, 321 records were identified as incorrect conversions. Out of 322 findings, 169 were false positives, caused by the API not accounting for ion valence in mEq to molar unit conversion.

Table 1: Number of samples, and the results
Records
Total Records 6458
Identified as incorrect conversion 321
True Positive 153
False Positive 169

2142 records were returned as errors. Of these, 120 records were errors caused by categorical results being reported even though a unit was given. For the other 2022 records, the source and target units did not have the same property. Most of these were caused by the lack of mass-mol conversion; the rest appeared to be correct, but medical judgement would be necessary.

Table 2: Summary of Error Messages
Type of Error Records
ERROR: unexpected result: Error: Source and Target unit do not seem to belong to the same property 2022
ERROR: unexpected result: NEGATIVE is not a numeric value 119
ERROR: unexpected result: Negative is not a numeric value 1

Overall, this approach worked for the majority of the 6458 records; however, a few improvements are required from NLM/NIH to fully utilize this RestAPI.

  1. Need for mass-mol conversions *
  2. Need to account for valence of ion with respect to mEq to molar unit conversion
  3. Need for an option to specify molar weight or LOINC code for accurate unit conversion with respect to molar unit.

2-2 Findings on Updated UCUM Conversion (May 2019)

A total of 419103 laboratory records were obtained from 17 clinical trials. The following steps were taken to reduce the number of records for the evaluation of a test version of the UCUM Conversion API.

  1. Removal of records with missing units before and after UCUM unit conversion
  2. Removal of character type results such as ['Negative','None','Trace',...]
  3. Removal of records that do not require conversion. For example, the records with LBORRESU==LBSTRESU were removed.
  4. Removal of duplicate records

17384 records of laboratory results were used for the evaluation. This evaluation does not cover the use of MOLWEIGHT.

Table 3 below summarizes the number of records after each step.

Table 3: Number of samples after each data cleaning step
- Number of Records
Number of studies 17
Input data 419103
After removal of missing units before UCUM unit conversion 276144
After removal of character results 255937
After removal of records that do not require conversion 172205
After removal of duplicate records* 17550
After removal of missing units after UCUM unit conversion 17384
Note: *Dropped duplicates except for the first occurrence.

2-2-1 Conversion Results

There were 2620 records from 27 tests where the LBSTRESN and UCUM conversion results did not match. Observed differences are plotted in Section 4-1.

Out of 27 tests, PHOS, TSH, and MG had a large difference between the LBSTRESN and the UCUM conversion. PHOS (n=72) and TSH (n=139) had true positive findings. One test, MG (n=20), had false positive findings. In Figure 1, the left light colored bars show the LBSTRESN, and the right dark colored bars show the difference between LBSTRESN and the value returned from the UCUM conversion. The details are discussed in Section 4-1, and Table 4 summarizes the findings on these 3 tests.

Figure 1: LBSTRESN and Difference between LBSTRESN and Return Value of UCUM Conversion




Table 4: Findings from the UCUM Conversion
LBTESTCD Source of Issue My Note
MG UCUM API ion valence is ignored in conversion
PHOS Input Data mass-molar conversion was done incorrectly by the lab vendor
TSH Input Data mass-molar conversion was done incorrectly by the lab vendor

2-2-2 Error Messages from UCUM Conversion

7389 records were returned with error messages from UCUM Conversion, as shown in Figure 2. Table 5 summarizes the types of errors received.

  • The most frequent error was ERROR: invalid double for Molecular Weight value = null, which turned out to be a true positive finding. It occurs when the m.w. or LOINC code is missing and the conversion requires the m.w.
  • There were 9 kinds of ERROR: No MW value for the LOINC code **xxxxxxx** is available or the LOINC code is invalid. One of the errors was due to the invalid LOINC code 15153-0, but the rest appear to be due to a missing m.w. in the LOINC database. It would be helpful for our verification purposes if this error message were split by condition (1. invalid LOINC code, 2. missing MW in LOINC).
  • The Creatinine Clearance (mL/min/1.73 m2) conversion failed with ERROR: number of annotations in source and target is different. The surface area 1.73 m2 was added as an annotation to the source unit, but the target unit did not include the same annotation. This will be solved with http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/121/from/mL/min/%7B1.73_m2%7D/to/mL/s/%7B1.73_m2%7D
  • Several conversions failed with ERROR: unexpected result: Error: Source and Target unit do not seem to belong to the same property. See Table 6 for more detail.

Figure 2: Error Messages by LBTESTCD
Table 5: Summary of Error Messages
Message My Note Sample Call
ERROR: invalid double for Molecular Weight value = null True positive finding http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/0.3/from/mg/dL/to/umol/L
ERROR: No MW value for the LOINC code 13457-7 is available or the LOINC code is invalid Valid LOINC, No m.w from LOINC http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/103/from/mg/dL/to/mmol/L/LOINC/13457-7
ERROR: No MW value for the LOINC code 15153-0 is available or the LOINC code is invalid Invalid LOINC http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/0.3/from/mg/dL/to/umol/L/LOINC/15153-0
ERROR: No MW value for the LOINC code 18262-6 is available or the LOINC code is invalid Valid LOINC, No m.w from LOINC http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/118/from/mg/dL/to/mmol/L/LOINC/18262-6
ERROR: No MW value for the LOINC code 1968-7 is available or the LOINC code is invalid Valid LOINC, No m.w from LOINC http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/0.1/from/mg/dL/to/umol/L/LOINC/1968-7
ERROR: No MW value for the LOINC code 3094-0 is available or the LOINC code is invalid Valid LOINC, No m.w from LOINC http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/10/from/mg/dL/to/mmol/L/LOINC/3094-0
ERROR: No MW value for the LOINC code 35192-4 is available or the LOINC code is invalid Valid LOINC, No m.w from LOINC http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/1.07/from/mg/dL/to/umol/L/LOINC/35192-4
ERROR: No MW value for the LOINC code 35197-3 is available or the LOINC code is invalid Valid LOINC, No m.w from LOINC http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/85/from/mg/dL/to/mmol/L/LOINC/35197-3
ERROR: No MW value for the LOINC code 35217-9 is available or the LOINC code is invalid Valid LOINC, No m.w from LOINC http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/42/from/mg/dL/to/mmol/L/LOINC/35217-9
ERROR: No MW value for the LOINC code 35234-4 is available or the LOINC code is invalid Valid LOINC, No m.w from LOINC http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/12/from/mg/dL/to/mmol/L/LOINC/35234-4
ERROR: number of annotations in source and target is different ?? http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/121/from/mL/min/%7B1.73_m2%7D/to/mL/s
ERROR: unexpected result: Error: Source and Target unit do not seem to belong to the same property Why the error is not 'ERROR: invalid double for Molecular Weight value = null' http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/12.9/from/uU/mL/to/pmol/L
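
The error messages in Table 5 follow a few fixed patterns, so they can be grouped automatically before review. The sketch below is one possible way to do that; the category names are hypothetical labels chosen for illustration, and it assumes that a successful response does not start with "ERROR", which is not confirmed by the API documentation.

```python
import re

# Message fragments are taken from Table 5; the category names on the
# right are hypothetical labels, not part of the API.
_RULES = [
    (re.compile(r"invalid double for Molecular Weight"), "missing_mw_or_loinc"),
    (re.compile(r"No MW value for the LOINC code (\S+)"), "loinc_invalid_or_no_mw"),
    (re.compile(r"number of annotations in source and target is different"), "annotation_mismatch"),
    (re.compile(r"do not seem to belong to the same property"), "property_mismatch"),
    (re.compile(r"is not a numeric value"), "non_numeric_result"),
]


def classify_error(message):
    """Return (category, loinc_code_or_None) for one response string."""
    # Assumption: anything that does not start with "ERROR" is a result.
    if not message.startswith("ERROR"):
        return ("ok", None)
    for pattern, category in _RULES:
        match = pattern.search(message)
        if match:
            # Only the LOINC rule has a capture group.
            code = match.group(1) if pattern.groups else None
            return (category, code)
    return ("other", None)
```

The LOINC codes collected under `loinc_invalid_or_no_mw` still have to be looked up to tell an invalid code from a missing m.w., since the message itself does not distinguish the two.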

The conversions listed in Table 6 returned the following error message.

ERROR: unexpected result: Error: Source and Target unit do not seem to belong to the same property

Table 6: List of unit conversions that received the error
LBTESTCD LBORRESU LBSTRESU
INSULIN uU/mL pmol/L
BASO G/L 10*9/L
EOS G/L 10*9/L
LYM G/L 10*9/L
MONO G/L 10*9/L
NEUT G/L 10*9/L
PLAT G/L 10*9/L
RBC T/L 10*12/L
WBC G/L 10*9/L
LYMAT G/L 10*9/L
MYCY G/L 10*9/L
ALP %5BIU%5D/L U/L
ALT %5BIU%5D/L U/L
AST %5BIU%5D/L U/L
CK %5BIU%5D/L U/L
INSULIN u%5BIU%5D/mL pmol/L
  • G or Giga per liter is often used in the hematology panel and is equivalent to 10*9 as per ucum-essence.xml, but the conversion was not successful.
  • IU and U are equivalent, but the conversion failed.
  • T and 10*12 are equivalent as per ucum-essence.xml, but the conversion failed.
[Images: ucum-essence.xml entries for G, for U and IU, and for T]

Source:ucum-essence.xml

Thought

This approach has potential and can be used for any units (PK, Lab, ECG, Vital Signs, etc). The addition of mass-mol/mol-mass conversion is a great addition and very useful to verify the results obtained from the vendors.

A few improvements would allow us to use this API to its full potential. As previously discussed, the valence of an ion in mEq to molar unit conversion was one of the issues. Some LOINC-based conversions were not performed because the m.w. was missing. Conversions for units with G, T, IU, and U were not successful. 'ERROR: No MW value for the LOINC code xxxxxxx is available or the LOINC code is invalid' could be split into two errors, one per condition, for verification purposes. Otherwise, one needs to look up the LOINC code to confirm whether the code is invalid or the m.w. is missing.

Overall, the tool was very useful, and we found conversion issues caused by two vendors that affect many clinical trials. We will implement this in SAS for programmers, and in Python for automated checking, using the production release by NIH.

Something to note:

The units used in the API call have to be compliant with the UCUM specification. In addition, URL encoding has to be applied to some special characters. URL encoding can be found here.
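
As an illustration of the encoding step, the sketch below builds a conversion URL from unencoded UCUM units with the standard library. The function name `build_transform_url` is hypothetical; the path layout (`ucumtransform/<value>/from/<unit>/to/<unit>` with an optional `/LOINC/<code>` suffix) is inferred from the sample calls in Table 5 and is an assumption about the service rather than its documented contract.

```python
from urllib.parse import quote


def build_transform_url(base, value, source_unit, target_unit, loinc=None):
    """Build a conversion URL from unencoded UCUM units (hypothetical helper)."""
    # "/" and "*" stay literal because the sample calls use them unencoded;
    # characters such as "[", "]", "{", "}" and "%" are percent-encoded.
    src = quote(source_unit, safe="/*")
    dst = quote(target_unit, safe="/*")
    url = f"{base}/ucumtransform/{value}/from/{src}/to/{dst}"
    if loinc is not None:
        url += f"/LOINC/{loinc}"
    return url
```

Note that the helper expects unencoded units such as `[IU]/L`; passing a unit that is already encoded (for example `%5BIU%5D/L`) would encode the percent sign a second time.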

3 Scripts

3-1 Initialization

In [1]:
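import boto3
import botocore
import re
import os
import pickle
# np, plt and sns are used in Section 4; imported explicitly here
import numpy as np
import pandas as pd
import pandas_profiling as pp
import pixiedust as px
import matplotlib.pyplot as plt
import seaborn as sns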
# Visual
%matplotlib inline

# my utilities
from lib.ucum import *

bucket='snvn-sagemaker-1' #data bucket
s3 = boto3.resource('s3')
url='http://xml4pharmaserver.com:8080/UCUMService2/rest'
Pixiedust database opened successfully
Pixiedust version 1.1.15

3-2 Copy data to notebook

In [2]:
KEY=os.path.join('mldata','Sam','data','project','pool','lb.sas7bdat') 
os.makedirs('data', exist_ok=True)

try:
    s3.Bucket(bucket).download_file(KEY, os.path.join('data','sdtm_lb.sas7bdat'))
except botocore.exceptions.ClientError as e:
    if e.response['Error']['Code'] == "404":
        print("The object does not exist.")
    else:
        raise 

3-3 Retain records for verification

In [3]:
rlb=pd.read_sas('data/raw_lb.sas7bdat',encoding='latin')
print('Number of studies: ',len(list(set(rlb['STUDYID']))))
df=rlb[['LBTESTCD','LBTEST','LBORRES','LBORRESU','LBSTRESU','LBSTRESN','LBLOINC']]
#df=df[df['LBORRESU']!='LBSTRESU'] #since we don't need to verify
print('Number of records in the input: ',df.shape)

#Cleaning Input Data
df1=df.copy()
df1.dropna(axis=0, subset=['LBORRESU'], inplace=True)
df1.dropna(axis=0, subset=['LBSTRESU'], inplace=True)
#df1['ge']=df1['LBORRES'].str.findall(r'<|>=')
#df1['LBORRES']=df1['LBORRES'].str.replace(r'<','')
#df1['LBORRES']=df1['LBORRES'].str.replace(r'>=','')
print('Removed records with missing units: ', df1.shape)
#Remove records not needed for verification
#1. results contain character
df1['LBORRES']=df1['LBORRES'].str.replace(r'[a-zA-Z ]+','',regex=True) #regex=True is required in recent pandas
df1=df1[df1['LBORRES']!='']
df1.dropna(axis=0, subset=['LBSTRESN'], inplace=True)

print('Removed records with character results: ',df1.shape)
#2. both units are the same
df1=df1[df1['LBORRESU']!=df1['LBSTRESU']]
print('Removed records do not require conversion: ',df1.shape)
df1.to_csv('lb.csv')
Number of studies:  17
Number of records in the input:  (419103, 7)
Removed records with missing units:  (276144, 7)
Removed records with character results:  (255937, 7)
Removed records do not require conversion:  (172205, 7)

Let's see what units have been used in the input datasets

In [4]:
bar_hm(df1,'Units found in the input data (counts)')
Out[4]:
(<module 'matplotlib.pyplot' from '/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/matplotlib/pyplot.py'>,
 <matplotlib.axes._subplots.AxesSubplot at 0x7f4ab1788d68>)

3-4 Define regular expression used to convert the input units to UCUM

In [5]:
#Regular expressions. --- update this based on raw data
#Raw strings keep regex escapes such as \A from being read as string escapes
patterns = [(r"%", "%25"),
            (r"\A[xX]?10[^E]", "10*"),
            (r"IU", "%5BIU%5D"),
            (r"\Anan", ""),
            (r"\ANONE", ""),
            (r"\A[rR][Aa][Tt][Ii][Oo]", ""),
            (r"\ApH", ""),
            (r"Eq[l]?", "eq"),
            (r"\ATI/L", "T/L"),
            (r"\AGI/L", "G/L"),
            (r"V/V", "L/L"),
            (r"[a-z]{0,4}/HPF", "/%5BHPF%5D"),
            (r"[a-z]{0,4}/LPF", "/%5BLPF%5D"),
            (r"fraction of 1", "1"),
            (r"sec", "s"),
            (r"1.73m2", "%7B1.73_m2%7D")
            ]

3-5 Functions used

  1. cleanlist: a helper function that applies the elements of patterns to produce a UCUM-conformant unit
  2. orresu2ucum: a function that takes the dataframe containing LBORRESU and LBSTRESU and applies cleanlist
  3. ucumVerify: a function that takes the list of units, verifies their conformance to UCUM using isValidUCUM, and returns a list with either True or False for each unit.
  4. convert_unit: a function that takes the dataframe containing LBORRESU and LBSTRESU in UCUM and returns a dataframe of the records where the original LBSTRESN is not equal to the LBSTRESN from the UCUM-LHC Converter
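
The implementation of these helpers lives in lib.ucum and is not shown in this notebook. As a rough idea of what the pattern-application step might look like, here is a minimal sketch that applies a subset of the Section 3-4 rules in order with `re.sub`. The function name `to_ucum` is hypothetical and is not the actual cleanlist code.

```python
import re

# A subset of the substitution rules from Section 3-4, as raw strings.
PATTERNS = [
    (r"%", "%25"),
    (r"\A[xX]?10[^E]", "10*"),
    (r"IU", "%5BIU%5D"),
    (r"Eq[l]?", "eq"),
    (r"\AGI/L", "G/L"),
]


def to_ucum(unit, patterns=PATTERNS):
    """Apply each (regex, replacement) pair in list order (hypothetical helper)."""
    for pattern, replacement in patterns:
        unit = re.sub(pattern, replacement, unit)
    return unit
```

Order matters in this design: the `%` rule runs before the `IU` rule, so the percent signs introduced by `%5BIU%5D` are not encoded a second time.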

3-6 Verify converted UCUM

In [6]:
dfconverted, ucumlist=orresu2ucum(df1,patterns)
ucumVerify(ucumlist, url)
Out[6]:
['g/dL = true',
 'mg/dL = true',
 'ng/mL = true',
 '10*3/uL = true',
 '%25 = true',
 '10*6/uL = true',
 '10*3/mm3 = true',
 'meq/L = true',
 'mL/min = true',
 'ng/L = true',
 'ng/dL = true',
 'u%5BIU%5D/mL = true',
 '10*3/uL = true',
 '10*6/uL = true',
 'uU/mL = true',
 'mg/L = true',
 'meq/L = true',
 'G/L = true',
 'ug/L = true',
 'T/L = true',
 'm%5BIU%5D/mL = true',
 '10*3/uL = true',
 '10*6/uL = true',
 'pg/mL = true',
 '%5BIU%5D/L = true',
 'U/L = true',
 '10*4/uL = true',
 '/uL = true',
 'L/L = true',
 'm%5BIU%5D/L = true',
 '/%5BLPF%5D = true',
 '/%5BLPF%5D = true',
 '/%5BHPF%5D = true',
 'mL/min/%7B1.73_m2%7D = true',
 'u%5BIU%5D/mL = true',
 'g/L = true',
 'umol/L = true',
 'mmol/L = true',
 '10*9/L = true',
 '10*12/L = true',
 'mL/s = true',
 '1 = true',
 'pmol/L = true',
 'nmol/L = true',
 'ukat/L = true',
 '%5BIU%5D/mL = true',
 '/%5BLPF%5D = true',
 '/%5BHPF%5D = true']
In [7]:
bar_hm(dfconverted,'UCUM found in the input data (counts)')
Out[7]:
(<module 'matplotlib.pyplot' from '/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/matplotlib/pyplot.py'>,
 <matplotlib.axes._subplots.AxesSubplot at 0x7f4ab1a8fba8>)

3-7 Verify Conversion

In [8]:
nodupdf=df1.drop_duplicates(subset=['LBTESTCD','LBORRESU','LBORRES','LBSTRESN','LBSTRESU'], keep='first')
print('Removed duplicate records: ', nodupdf.shape[0])
Removed duplicate records:  17550
In [9]:
if os.path.isfile('output/findings.pickle'):
    # Reuse cached results; each with-block closes its file handle
    with open('output/findings.pickle', mode='rb') as handle:
        findings=pickle.load(handle)
    with open('output/full.pickle', mode='rb') as handle:
        full=pickle.load(handle)
    with open('output/response.pickle', mode='rb') as handle:
        response=pickle.load(handle)
else:
    findings,full,response=convert_unit(nodupdf, url, patterns,loinconly=0)

In [10]:
with open('output/findings.pickle', 'wb') as handle:
    pickle.dump(findings, handle, protocol=pickle.HIGHEST_PROTOCOL)
with open('output/full.pickle', 'wb') as handle:
    pickle.dump(full, handle, protocol=pickle.HIGHEST_PROTOCOL)
with open('output/response.pickle', 'wb') as handle:
    pickle.dump(response, handle, protocol=pickle.HIGHEST_PROTOCOL)

4 Output Issues

4-1 Discrepancies between LBSTRESN and Conversion based on UCUM

In [11]:
findings[(findings['fromucum'].notnull())]
Out[11]:
LBTESTCD LBTEST LBORRES LBORRESU LBSTRESU LBSTRESN LBLOINC checklist fromucum response
1 BILI Bilirubin 0.4 mg/dL umol/L 6.00 1975-2 http://xml4pharmaserver.com:8080/UCUMService2/... 6.841431 None
2 CREAT Creatinine 0.96 mg/dL umol/L 85.00 2160-0 http://xml4pharmaserver.com:8080/UCUMService2/... 84.865629 None
4 GLUC Glucose 92 mg/dL mmol/L 5.10 2345-7 http://xml4pharmaserver.com:8080/UCUMService2/... 5.106685 None
6 URATE Urate 4.5 mg/dL mmol/L 0.27 3084-1 http://xml4pharmaserver.com:8080/UCUMService2/... 0.267679 None
21 BILI Bilirubin 0.6 mg/dL umol/L 10.00 1975-2 http://xml4pharmaserver.com:8080/UCUMService2/... 10.262147 None
22 CREAT Creatinine 1.01 mg/dL umol/L 89.00 2160-0 http://xml4pharmaserver.com:8080/UCUMService2/... 89.285714 None
24 GLUC Glucose 88 mg/dL mmol/L 4.90 2345-7 http://xml4pharmaserver.com:8080/UCUMService2/... 4.884656 None
26 URATE Urate 5.0 mg/dL mmol/L 0.30 3084-1 http://xml4pharmaserver.com:8080/UCUMService2/... 0.297421 None
38 BILI Bilirubin 0.2 mg/dL umol/L 4.00 1975-2 http://xml4pharmaserver.com:8080/UCUMService2/... 3.420716 None
39 CREAT Creatinine 0.87 mg/dL umol/L 77.00 2160-0 http://xml4pharmaserver.com:8080/UCUMService2/... 76.909477 None
42 URATE Urate 4.4 mg/dL mmol/L 0.26 3084-1 http://xml4pharmaserver.com:8080/UCUMService2/... 0.261730 None
54 BILI Bilirubin 0.6 mg/dL umol/L 11.00 1975-2 http://xml4pharmaserver.com:8080/UCUMService2/... 10.262147 None
55 GLUC Glucose 113 mg/dL mmol/L 6.30 2345-7 http://xml4pharmaserver.com:8080/UCUMService2/... 6.272342 None
56 URATE Urate 3.9 mg/dL mmol/L 0.23 3084-1 http://xml4pharmaserver.com:8080/UCUMService2/... 0.231988 None
65 CREAT Creatinine 0.86 mg/dL umol/L 76.00 2160-0 http://xml4pharmaserver.com:8080/UCUMService2/... 76.025460 None
66 GLUC Glucose 86 mg/dL mmol/L 4.80 2345-7 http://xml4pharmaserver.com:8080/UCUMService2/... 4.773641 None
67 URATE Urate 6.2 mg/dL mmol/L 0.37 3084-1 http://xml4pharmaserver.com:8080/UCUMService2/... 0.368802 None
78 CREAT Creatinine 0.81 mg/dL umol/L 72.00 2160-0 http://xml4pharmaserver.com:8080/UCUMService2/... 71.605375 None
79 GLUC Glucose 117 mg/dL mmol/L 6.50 2345-7 http://xml4pharmaserver.com:8080/UCUMService2/... 6.494371 None
91 BILI Bilirubin 0.4 mg/dL umol/L 7.00 1975-2 http://xml4pharmaserver.com:8080/UCUMService2/... 6.841431 None
92 CREAT Creatinine 0.95 mg/dL umol/L 84.00 2160-0 http://xml4pharmaserver.com:8080/UCUMService2/... 83.981612 None
93 GLUC Glucose 85 mg/dL mmol/L 4.70 2345-7 http://xml4pharmaserver.com:8080/UCUMService2/... 4.718133 None
95 URATE Urate 5.7 mg/dL mmol/L 0.34 3084-1 http://xml4pharmaserver.com:8080/UCUMService2/... 0.339060 None
103 CREAT Creatinine 1.05 mg/dL umol/L 93.00 2160-0 http://xml4pharmaserver.com:8080/UCUMService2/... 92.821782 None
105 GLUC Glucose 95 mg/dL mmol/L 5.30 2345-7 http://xml4pharmaserver.com:8080/UCUMService2/... 5.273208 None
107 URATE Urate 6.7 mg/dL mmol/L 0.40 3084-1 http://xml4pharmaserver.com:8080/UCUMService2/... 0.398544 None
116 BILI Bilirubin 0.3 mg/dL umol/L 5.00 1975-2 http://xml4pharmaserver.com:8080/UCUMService2/... 5.131073 None
117 CREAT Creatinine 1.03 mg/dL umol/L 91.00 2160-0 http://xml4pharmaserver.com:8080/UCUMService2/... 91.053748 None
120 URATE Urate 4.9 mg/dL mmol/L 0.29 3084-1 http://xml4pharmaserver.com:8080/UCUMService2/... 0.291472 None
134 BILI Bilirubin 0.2 mg/dL umol/L 3.00 1975-2 http://xml4pharmaserver.com:8080/UCUMService2/... 3.420716 None
... ... ... ... ... ... ... ... ... ... ...
17229 NEUTLE Neutrophils/Leukocytes 56.3 %25 1 0.56 770-8 http://xml4pharmaserver.com:8080/UCUMService2/... 0.563000 None
17235 NEUTLE Neutrophils/Leukocytes 56.7 %25 1 0.57 770-8 http://xml4pharmaserver.com:8080/UCUMService2/... 0.567000 None
17248 BASOLE Basophils/Leukocytes 0.1 %25 1 0.00 706-2 http://xml4pharmaserver.com:8080/UCUMService2/... 0.001000 None
17249 NEUTLE Neutrophils/Leukocytes 64.9 %25 1 0.65 770-8 http://xml4pharmaserver.com:8080/UCUMService2/... 0.649000 None
17252 NEUTLE Neutrophils/Leukocytes 49.1 %25 1 0.49 770-8 http://xml4pharmaserver.com:8080/UCUMService2/... 0.491000 None
17254 LYMLE Lymphocytes/Leukocytes 33.8 %25 1 0.34 736-9 http://xml4pharmaserver.com:8080/UCUMService2/... 0.338000 None
17259 NEUTLE Neutrophils/Leukocytes 63.5 %25 1 0.64 770-8 http://xml4pharmaserver.com:8080/UCUMService2/... 0.635000 None
17264 LYMLE Lymphocytes/Leukocytes 14.3 %25 1 0.14 736-9 http://xml4pharmaserver.com:8080/UCUMService2/... 0.143000 None
17267 NEUTLE Neutrophils/Leukocytes 75.4 %25 1 0.75 770-8 http://xml4pharmaserver.com:8080/UCUMService2/... 0.754000 None
17281 EOSLE Eosinophils/Leukocytes 7.7 %25 1 0.08 713-8 http://xml4pharmaserver.com:8080/UCUMService2/... 0.077000 None
17283 LYMLE Lymphocytes/Leukocytes 23.2 %25 1 0.23 736-9 http://xml4pharmaserver.com:8080/UCUMService2/... 0.232000 None
17287 CHOL Cholesterol 273 mg/dL mmol/L 7.07 2093-3 http://xml4pharmaserver.com:8080/UCUMService2/... 7.060393 None
17296 NEUTLE Neutrophils/Leukocytes 50.7 %25 1 0.51 770-8 http://xml4pharmaserver.com:8080/UCUMService2/... 0.507000 None
17302 NEUTLE Neutrophils/Leukocytes 71.3 %25 1 0.71 770-8 http://xml4pharmaserver.com:8080/UCUMService2/... 0.713000 None
17305 LYMLE Lymphocytes/Leukocytes 21.1 %25 1 0.21 736-9 http://xml4pharmaserver.com:8080/UCUMService2/... 0.211000 None
17306 NEUTLE Neutrophils/Leukocytes 71.8 %25 1 0.72 770-8 http://xml4pharmaserver.com:8080/UCUMService2/... 0.718000 None
17312 CRP C Reactive Protein 49.5 mg/L nmol/L 471.40 30522-7 http://xml4pharmaserver.com:8080/UCUMService2/... 471.428570 None
17316 NEUTLE Neutrophils/Leukocytes 64.1 %25 1 0.64 770-8 http://xml4pharmaserver.com:8080/UCUMService2/... 0.641000 None
17319 LYMLE Lymphocytes/Leukocytes 35.4 %25 1 0.35 736-9 http://xml4pharmaserver.com:8080/UCUMService2/... 0.354000 None
17322 LYMLE Lymphocytes/Leukocytes 23.3 %25 1 0.23 736-9 http://xml4pharmaserver.com:8080/UCUMService2/... 0.233000 None
17326 LYMLE Lymphocytes/Leukocytes 34.5 %25 1 0.35 736-9 http://xml4pharmaserver.com:8080/UCUMService2/... 0.345000 None
17331 NEUTLE Neutrophils/Leukocytes 48.1 %25 1 0.48 770-8 http://xml4pharmaserver.com:8080/UCUMService2/... 0.481000 None
17334 LYMLE Lymphocytes/Leukocytes 47.2 %25 1 0.47 736-9 http://xml4pharmaserver.com:8080/UCUMService2/... 0.472000 None
17339 LYMLE Lymphocytes/Leukocytes 22.9 %25 1 0.23 736-9 http://xml4pharmaserver.com:8080/UCUMService2/... 0.229000 None
17344 EOSLE Eosinophils/Leukocytes 6.8 %25 1 0.07 713-8 http://xml4pharmaserver.com:8080/UCUMService2/... 0.068000 None
17349 NEUTLE Neutrophils/Leukocytes 40.9 %25 1 0.41 770-8 http://xml4pharmaserver.com:8080/UCUMService2/... 0.409000 None
17355 NEUTLE Neutrophils/Leukocytes 55.1 %25 1 0.55 770-8 http://xml4pharmaserver.com:8080/UCUMService2/... 0.551000 None
17369 NEUTLE Neutrophils/Leukocytes 38.9 %25 1 0.39 770-8 http://xml4pharmaserver.com:8080/UCUMService2/... 0.389000 None
17376 LYMLE Lymphocytes/Leukocytes 41.6 %25 1 0.42 736-9 http://xml4pharmaserver.com:8080/UCUMService2/... 0.416000 None
17379 RDW Erythrocytes Distribution Width 16.7 %25 1 0.17 788-0 http://xml4pharmaserver.com:8080/UCUMService2/... 0.167000 None

2620 rows × 10 columns

In [12]:
discrepant=findings[(findings['fromucum'].notnull())]
diff=discrepant.copy()
diff['diff']=np.array(discrepant['LBSTRESN'])-np.array(discrepant['fromucum'])
diff.dropna(axis=0, how='any',subset=['diff'], inplace=True)        
dfpiv=diff.pivot(columns='LBTESTCD', values='diff')
difftests=len(dfpiv.columns)
#dfpiv.describe()

Check differences between reported LBSTRESN vs Converted Results

In [13]:
discrepant0=diff.groupby(by=['LBTESTCD','LBSTRESU'],as_index =True)
discrepant0=discrepant0.describe().xs('mean', level=1, axis=1)[['LBSTRESN','diff']]
discrepant1=pd.melt(discrepant0.reset_index(), id_vars=['LBTESTCD','LBSTRESU'],value_vars=['LBSTRESN','diff'])
In [14]:
g = sns.FacetGrid(discrepant1, col='LBSTRESU', row='LBTESTCD', sharey='row', margin_titles=True)
g.map(sns.barplot, 'LBTESTCD', 'value', 'variable', hue_order=['LBSTRESN','diff'], alpha=.8)
/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/seaborn/axisgrid.py:703: UserWarning: Using the barplot function without specifying `order` is likely to produce an incorrect plot.
  warnings.warn(warning)
Out[14]:
<seaborn.axisgrid.FacetGrid at 0x7f4a5c6a3cf8>
In [37]:
discrepant2=discrepant1[discrepant1['LBTESTCD'].isin(['MG','PHOS','TSH'])]
In [38]:
g = sns.FacetGrid(discrepant2, col='LBSTRESU', row='LBTESTCD', sharey='row', margin_titles=True)
g.map(sns.barplot, 'LBTESTCD', 'value', 'variable', hue_order=['LBSTRESN','diff'], alpha=.8)
/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/seaborn/axisgrid.py:703: UserWarning: Using the barplot function without specifying `order` is likely to produce an incorrect plot.
  warnings.warn(warning)
Out[38]:
<seaborn.axisgrid.FacetGrid at 0x7f4a7a09aa90>

Note: Issues were identified in MG, PHOS, and TSH. Other differences are negligible.
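
One way to separate negligible rounding differences from material ones without reading the plots is a relative-difference threshold. The sketch below is only an illustration: the helper names and the 5% default threshold are assumptions chosen for the example, not values used in this evaluation.

```python
def relative_difference(reported, converted):
    """Absolute difference relative to the converted value."""
    if converted == 0:
        return float("inf") if reported != 0 else 0.0
    return abs(reported - converted) / abs(converted)


def is_material(reported, converted, threshold=0.05):
    """True when the difference exceeds the (assumed) relative threshold."""
    return relative_difference(reported, converted) > threshold


# Values taken from the outputs in Section 4-1
print(is_material(85.00, 84.865629))    # CREAT: rounding only
print(is_material(0.80, 1.6))           # MG
print(is_material(0.96870, 0.315889))   # PHOS
```

A suitable threshold depends on the precision each vendor reports, so it would need to be tuned per test before being used for automated checking.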

In [15]:
#pp.ProfileReport(dfpiv)
#px.display(dfpiv)
In [16]:
import random
c_=[]
for i in range(27):
    r = lambda: random.randint(0,255)
    c='#%02X%02X%02X' % (r(),r(),r())
    c_.append(c)
In [17]:
fig = plt.figure( figsize=(20,20))
fig.subplots_adjust(hspace=0.4, wspace=0.4)

idx = np.arange(1, difftests+1)
for i, col, c in zip(idx, dfpiv.columns, c_):
    ax = fig.add_subplot(4, 7, i)
    #dfpiv.loc[:, col].plot.hist(label=col, color=c, range=(diff['diff'].min(), diff['diff'].max()), bins=15)
    dfpiv.loc[:, col].plot.hist(label=col, color=c,  bins=15)
    plt.yticks(np.arange(0, 100, 10))
    plt.suptitle('Distributions of differences between LBSTRESN and UCUM conversion. \nExcluding exact match between LBSTRESN and UCUM conversion', fontsize=14, fontweight='bold')


    plt.legend()

4-1-1 $\text{Mg}^{2+}$

In [20]:
check=sumstat('MG',findings, rlb, dfpiv)
MG: Summary stats of LBSTRESN

         LBSTRESN                                               
            count      mean       std   min  25%   50%  75%  max
LBSTRESU                                                        
mmol/L     5397.0  0.851776  0.070848  0.43  0.8  0.85  0.9  1.2
--------------------------------------------------------------
MG: Summary stats of UCUM Conversion

         fromucum                                              
            count   mean       std  min    25%  50%    75%  max
LBSTRESU                                                       
mmol/L       20.0  1.715  0.326505  1.2  1.475  1.7  1.925  2.4
--------------------------------------------------------------
MG: Summary stats of Differences between LBSTRESN and UCUM Conversion

count    20.000000
mean     -0.857500
std       0.163252
min      -1.200000
25%      -0.962500
50%      -0.850000
75%      -0.737500
max      -0.600000
Name: MG, dtype: float64
In [21]:
check.head()
Out[21]:
LBTESTCD LBTEST LBORRES LBORRESU LBSTRESU LBSTRESN LBLOINC checklist fromucum response
1945 MG Magnesium 1.6 meq/L mmol/L 0.80 54919-6 http://xml4pharmaserver.com:8080/UCUMService2/... 1.6 None
1946 MG Magnesium 1.7 meq/L mmol/L 0.85 54919-6 http://xml4pharmaserver.com:8080/UCUMService2/... 1.7 None
1985 MG Magnesium 1.5 meq/L mmol/L 0.75 54919-6 http://xml4pharmaserver.com:8080/UCUMService2/... 1.5 None
2015 MG Magnesium 1.8 meq/L mmol/L 0.90 54919-6 http://xml4pharmaserver.com:8080/UCUMService2/... 1.8 None
2045 MG Magnesium 1.9 meq/L mmol/L 0.95 54919-6 http://xml4pharmaserver.com:8080/UCUMService2/... 1.9 None

Note: The error originates from the UCUM API. The valence of $\text{Mg}^{2+}$ was not considered.
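
The expected relationship can be checked by hand: a concentration in mEq/L divided by the ion valence gives mmol/L. The sketch below applies this to the first rows shown above; the function name is hypothetical.

```python
def meq_to_mmol(value_meq_per_l, valence):
    """Convert mEq/L to mmol/L by dividing by the ion valence."""
    if valence <= 0:
        raise ValueError("valence must be a positive integer")
    return value_meq_per_l / valence


# Magnesium is divalent, so 1.6 meq/L corresponds to 0.8 mmol/L,
# which matches the reported LBSTRESN of 0.80.
print(meq_to_mmol(1.6, 2))
```

The values returned by the API (1.6, 1.7, ...) equal the inputs, which is consistent with a valence of 1 being applied.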

4-1-2 PHOS

In [22]:
check=sumstat('PHOS', findings,rlb, dfpiv)
PHOS: Summary stats of LBSTRESN

         LBSTRESN                                                       
            count      mean       std      min   25%   50%      75%  max
LBSTRESU                                                                
mmol/L     5453.0  1.093965  0.259846  0.20006  0.98  1.12  1.25931  2.0
--------------------------------------------------------------
PHOS: Summary stats of UCUM Conversion

         fromucum                                                            \
            count     mean       std       min      25%       50%       75%   
LBSTRESU                                                                      
mmol/L       72.0  0.40232  0.117916  0.189534  0.30536  0.400126  0.494893   

                    
               max  
LBSTRESU            
mmol/L    0.652838  
--------------------------------------------------------------
PHOS: Summary stats of Differences between LBSTRESN and UCUM Conversion

count    72.000000
mean      0.831776
std       0.243789
min       0.390466
25%       0.633743
50%       0.828384
75%       1.023329
max       1.347162
Name: PHOS, dtype: float64
In [23]:
check.head()
Out[23]:
      LBTESTCD  LBTEST     LBORRES  LBORRESU  LBSTRESU  LBSTRESN  LBLOINC  checklist                                          fromucum  response
1947  PHOS      Phosphate  3.0      mg/dL     mmol/L    0.96870   35221-1  http://xml4pharmaserver.com:8080/UCUMService2/...  0.315889  None
1986  PHOS      Phosphate  3.8      mg/dL     mmol/L    1.22702   35221-1  http://xml4pharmaserver.com:8080/UCUMService2/...  0.400126  None
2016  PHOS      Phosphate  3.9      mg/dL     mmol/L    1.25931   35221-1  http://xml4pharmaserver.com:8080/UCUMService2/...  0.410656  None
2017  PHOS      Phosphate  4.6      mg/dL     mmol/L    1.48534   35221-1  http://xml4pharmaserver.com:8080/UCUMService2/...  0.484363  None
2046  PHOS      Phosphate  4.2      mg/dL     mmol/L    1.35618   35221-1  http://xml4pharmaserver.com:8080/UCUMService2/...  0.442245  None

The molecular weight of phosphate is 94.97 g/mol, so 3.0 mg/dL expressed in mmol/L is:

$\frac{3.0\ \text{mg}}{\text{dL}} = \frac{30\ \text{mg}}{\text{L}} = \frac{0.03\ \text{g}}{\text{L}} \times \frac{1\ \text{mmol}}{0.09497\ \text{g}} = 0.3158892281773191\ \text{mmol/L}$

In [24]:
(3*10/1000)/(94.97/1000)
Out[24]:
0.3158892281773191

Note: The identified error originates from the lab vendor.
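
The hand calculation above generalises to a small helper. The sketch below is illustrative only: the function names `mg_dl_to_mmol_l` and `implied_molar_weight` are hypothetical and not part of the notebook. The second helper back-calculates the molar weight implied by a reported pair of original and standardised values, which can help locate where two conversions diverge. For the first row shown (3.0 mg/dL reported as 0.96870 mmol/L) the implied value is about 30.97 g/mol, close to the atomic weight of elemental phosphorus, so the discrepancy may come down to which species the molar weight refers to.

```python
def mg_dl_to_mmol_l(value_mg_dl, molar_weight_g_mol):
    """Convert mg/dL to mmol/L for a substance with the given molar weight.

    mg/dL * 10 gives mg/L; dividing mg/L by g/mol gives mmol/L.
    """
    return value_mg_dl * 10.0 / molar_weight_g_mol


def implied_molar_weight(value_mg_dl, value_mmol_l):
    """Molar weight (g/mol) that would turn value_mg_dl into value_mmol_l."""
    return value_mg_dl * 10.0 / value_mmol_l


# Reproduces the UCUM result for record 1947 (molar weight 94.97 g/mol).
ucum_like = mg_dl_to_mmol_l(3.0, 94.97)

# Molar weight implied by the vendor's LBSTRESN for the same record.
implied = implied_molar_weight(3.0, 0.96870)

print(round(ucum_like, 6), round(implied, 2))
```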

4-1-3 TSH

In [25]:
check=sumstat('TSH',findings, rlb, dfpiv)
TSH: Summary stats of LBSTRESN

         LBSTRESN                                                     
            count      mean       std    min   25%   50%    75%    max
LBSTRESU                                                              
IU/mL       203.0  1.524631  0.806163  0.310  0.97  1.39  1.865   5.72
mIU/L      2404.0  1.718240  1.374644  0.005  1.01  1.48  2.120  42.06
--------------------------------------------------------------
TSH: Summary stats of UCUM Conversion

            fromucum                                                      \
               count      mean           std           min           25%   
LBSTRESU                                                                   
%5BIU%5D/mL    139.0  0.000002  9.062473e-07  3.100000e-07  9.850000e-07   

                                           
                  50%       75%       max  
LBSTRESU                                   
%5BIU%5D/mL  0.000001  0.000002  0.000006  
--------------------------------------------------------------
TSH: Summary stats of Differences between LBSTRESN and UCUM Conversion

count    139.000000
mean       1.614315
std        0.906246
min        0.310000
25%        0.984999
50%        1.449999
75%        2.009998
max        5.719994
Name: TSH, dtype: float64
In [26]:
check.head()
Out[26]:
       LBTESTCD  LBTEST       LBORRES  LBORRESU     LBSTRESU     LBSTRESN  LBLOINC  checklist                                          fromucum      response
15154  TSH       Thyrotropin  1.01     m%5BIU%5D/L  %5BIU%5D/mL  1.01      NaN      http://xml4pharmaserver.com:8080/UCUMService2/...  1.010000e-06  None
15155  TSH       Thyrotropin  1.85     m%5BIU%5D/L  %5BIU%5D/mL  1.85      NaN      http://xml4pharmaserver.com:8080/UCUMService2/...  1.850000e-06  None
15165  TSH       Thyrotropin  0.98     m%5BIU%5D/L  %5BIU%5D/mL  0.98      NaN      http://xml4pharmaserver.com:8080/UCUMService2/...  9.800000e-07  None
15166  TSH       Thyrotropin  1.69     m%5BIU%5D/L  %5BIU%5D/mL  1.69      NaN      http://xml4pharmaserver.com:8080/UCUMService2/...  1.690000e-06  None
15168  TSH       Thyrotropin  0.85     m%5BIU%5D/L  %5BIU%5D/mL  0.85      NaN      http://xml4pharmaserver.com:8080/UCUMService2/...  8.500000e-07  None

m[IU]/L to [IU]/mL

$\frac{\text{mIU}}{\text{L}}=\frac{\text{IU}}{1000\text{L}}=\frac{\text{IU}}{10^6\text{mL}}$

In [27]:
1.01/10**6
Out[27]:
1.01e-06

Note: The identified error originates from the lab vendor.
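
The prefix arithmetic above can be written as a small lookup-based conversion. This is a sketch under stated assumptions: the dictionaries `PREFIX` and `VOLUME_IN_L` and the function `convert_iu_concentration` are hypothetical names introduced here, and only the prefixes and volumes that appear in this section are covered.

```python
# Multipliers relative to the unprefixed unit (IU) and to one litre.
PREFIX = {"": 1.0, "m": 1e-3, "u": 1e-6}
VOLUME_IN_L = {"L": 1.0, "dL": 1e-1, "mL": 1e-3}


def convert_iu_concentration(value, src_prefix, src_vol, tgt_prefix, tgt_vol):
    """Convert an IU-per-volume concentration between prefix/volume pairs."""
    # Step 1: express the value in plain IU per litre.
    iu_per_l = value * PREFIX[src_prefix] / VOLUME_IN_L[src_vol]
    # Step 2: re-express it in the target prefix and volume.
    return iu_per_l * VOLUME_IN_L[tgt_vol] / PREFIX[tgt_prefix]


# 1.01 mIU/L expressed in IU/mL is six orders of magnitude smaller.
tsh = convert_iu_concentration(1.01, "m", "L", "", "mL")
print(tsh)
```

A record whose standardised value equals the original value even though the two units differ by a factor of $10^6$, as in the rows above, is therefore a candidate for a mislabelled unit.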

4-2 List of records for which the UCUM conversion could not be performed

In [28]:
ucumfail=findings[(findings['fromucum'].isnull())]
In [29]:
ucumfail
Out[29]:
LBTESTCD LBTEST LBORRES LBORRESU LBSTRESU LBSTRESN LBLOINC checklist fromucum response
7 UREAN Urea Nitrogen 10 mg/dL mmol/L 3.70 3094-0 http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: No MW value for the LOINC code 3094-0 i...
27 UREAN Urea Nitrogen 18 mg/dL mmol/L 6.40 3094-0 http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: No MW value for the LOINC code 3094-0 i...
43 UREAN Urea Nitrogen 16 mg/dL mmol/L 5.60 3094-0 http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: No MW value for the LOINC code 3094-0 i...
68 UREAN Urea Nitrogen 26 mg/dL mmol/L 9.20 3094-0 http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: No MW value for the LOINC code 3094-0 i...
81 UREAN Urea Nitrogen 29 mg/dL mmol/L 10.40 3094-0 http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: No MW value for the LOINC code 3094-0 i...
96 UREAN Urea Nitrogen 16 mg/dL mmol/L 5.70 3094-0 http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: No MW value for the LOINC code 3094-0 i...
108 UREAN Urea Nitrogen 18 mg/dL mmol/L 6.30 3094-0 http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: No MW value for the LOINC code 3094-0 i...
121 UREAN Urea Nitrogen 23 mg/dL mmol/L 8.30 3094-0 http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: No MW value for the LOINC code 3094-0 i...
139 UREAN Urea Nitrogen 20 mg/dL mmol/L 7.00 3094-0 http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: No MW value for the LOINC code 3094-0 i...
151 UREAN Urea Nitrogen 26 mg/dL mmol/L 9.40 3094-0 http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: No MW value for the LOINC code 3094-0 i...
160 UREAN Urea Nitrogen 37 mg/dL mmol/L 13.10 3094-0 http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: No MW value for the LOINC code 3094-0 i...
170 UREAN Urea Nitrogen 12 mg/dL mmol/L 4.40 3094-0 http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: No MW value for the LOINC code 3094-0 i...
181 UREAN Urea Nitrogen 15 mg/dL mmol/L 5.20 3094-0 http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: No MW value for the LOINC code 3094-0 i...
192 UREAN Urea Nitrogen 17 mg/dL mmol/L 6.10 3094-0 http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: No MW value for the LOINC code 3094-0 i...
201 UREAN Urea Nitrogen 22 mg/dL mmol/L 7.90 3094-0 http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: No MW value for the LOINC code 3094-0 i...
212 BILI Bilirubin 0.3 mg/dL umol/L 5.10 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
213 CREAT Creatinine 0.73 mg/dL umol/L 65.00 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
215 GLUC Glucose 109 mg/dL mmol/L 6.00 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
216 URATE Urate 5.5 mg/dL umol/L 327.00 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
217 UREAN Urea Nitrogen 23 mg/dL mmol/L 8.20 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
225 BILI Bilirubin 0.4 mg/dL umol/L 6.80 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
226 GLUC Glucose 97 mg/dL mmol/L 5.40 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
227 URATE Urate 4.9 mg/dL umol/L 292.00 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
228 UREAN Urea Nitrogen 20 mg/dL mmol/L 7.10 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
237 BILI Bilirubin 0.5 mg/dL umol/L 8.60 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
238 CREAT Creatinine 1.26 mg/dL umol/L 111.00 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
239 GLUC Glucose 73 mg/dL mmol/L 4.10 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
240 URATE Urate 3.7 mg/dL umol/L 220.00 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
241 UREAN Urea Nitrogen 21 mg/dL mmol/L 7.50 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
248 BILI Bilirubin 0.9 mg/dL umol/L 15.40 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
... ... ... ... ... ... ... ... ... ... ...
17243 INSULIN Insulin 6.2 u%5BIU%5D/mL pmol/L 43.00 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: unexpected result: Error: Source and Ta...
17245 CREATCLR Creatinine Clearance 113 mL/min/%7B1.73_m2%7D mL/s 1.88 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: number of annotations in source and tar...
17247 CREATCLR Creatinine Clearance 81 mL/min/%7B1.73_m2%7D mL/s 1.35 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: number of annotations in source and tar...
17256 PROLCTN Prolactin 14.7 ng/mL pmol/L 639.13 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
17262 INSULIN Insulin 19.0 u%5BIU%5D/mL pmol/L 132.00 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: unexpected result: Error: Source and Ta...
17271 INSULIN Insulin 3.3 u%5BIU%5D/mL pmol/L 23.00 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: unexpected result: Error: Source and Ta...
17274 PROLCTN Prolactin 21.2 ng/mL pmol/L 921.73 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
17278 PROLCTN Prolactin 13.6 ng/mL pmol/L 591.30 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
17280 INSULIN Insulin 11.7 u%5BIU%5D/mL pmol/L 81.00 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: unexpected result: Error: Source and Ta...
17288 PROLCTN Prolactin 8.6 ng/mL pmol/L 373.91 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
17300 PROLCTN Prolactin 0.5 ng/mL pmol/L 21.74 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
17304 INSULIN Insulin 74.6 u%5BIU%5D/mL pmol/L 518.00 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: unexpected result: Error: Source and Ta...
17311 PROLCTN Prolactin 7.4 ng/mL pmol/L 321.74 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
17318 INSULIN Insulin 9.9 u%5BIU%5D/mL pmol/L 69.00 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: unexpected result: Error: Source and Ta...
17329 PROLCTN Prolactin 16.5 ng/mL pmol/L 717.39 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
17333 INSULIN Insulin 16.8 u%5BIU%5D/mL pmol/L 117.00 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: unexpected result: Error: Source and Ta...
17337 PROLCTN Prolactin 10.3 ng/mL pmol/L 447.82 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
17341 PROLCTN Prolactin 27.5 ng/mL pmol/L 1195.65 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
17343 INSULIN Insulin 17.4 u%5BIU%5D/mL pmol/L 121.00 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: unexpected result: Error: Source and Ta...
17346 TRIG Triglycerides 258 mg/dL mmol/L 2.92 35217-9 http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: No MW value for the LOINC code 35217-9 ...
17350 INSULIN Insulin 15.3 u%5BIU%5D/mL pmol/L 106.00 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: unexpected result: Error: Source and Ta...
17357 INSULIN Insulin 14.1 u%5BIU%5D/mL pmol/L 98.00 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: unexpected result: Error: Source and Ta...
17358 PROLCTN Prolactin 4.9 ng/mL pmol/L 213.04 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
17359 CREATCLR Creatinine Clearance 141 mL/min/%7B1.73_m2%7D mL/s 2.35 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: number of annotations in source and tar...
17360 TRIG Triglycerides 274 mg/dL mmol/L 3.10 35217-9 http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: No MW value for the LOINC code 35217-9 ...
17365 INSULIN Insulin 17.1 u%5BIU%5D/mL pmol/L 119.00 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: unexpected result: Error: Source and Ta...
17366 PROLCTN Prolactin 1.6 ng/mL pmol/L 69.56 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...
17371 CREATCLR Creatinine Clearance 117 mL/min/%7B1.73_m2%7D mL/s 1.95 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: number of annotations in source and tar...
17381 INSULIN Insulin 5.6 u%5BIU%5D/mL pmol/L 39.00 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: unexpected result: Error: Source and Ta...
17383 PROLCTN Prolactin 20.9 ng/mL pmol/L 908.69 NaN http://xml4pharmaserver.com:8080/UCUMService2/... NaN ERROR: invalid double for Molecular Weight val...

7389 rows × 10 columns
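
The `checklist` column holds the REST call issued for each record. Judging from the full URLs printed in the error summary later in this section, each call follows the pattern `.../ucumtransform/<value>/from/<unit>/to/<unit>`, optionally followed by `/LOINC/<code>`. The sketch below shows one way such a URL could be assembled with percent-encoded units. It is an assumption about how the calls were built, not the notebook's own code: the function name `build_ucum_url` is hypothetical, and keeping `/` and `*` unescaped is chosen only to match the URLs shown in this section.

```python
from urllib.parse import quote

BASE = "http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform"


def build_ucum_url(value, src_unit, tgt_unit, loinc=None):
    """Assemble a conversion URL; brackets and braces in units are escaped."""
    # "/" and "*" stay literal so that units such as mg/dL and 10*9/L
    # look the same as in the calls listed in this section.
    src = quote(src_unit, safe="/*")
    tgt = quote(tgt_unit, safe="/*")
    url = f"{BASE}/{value}/from/{src}/to/{tgt}"
    if loinc:
        # The LOINC code lets the service look up a molar weight.
        url += f"/LOINC/{loinc}"
    return url


print(build_ucum_url("1.07", "mg/dL", "umol/L", loinc="35192-4"))
print(build_ucum_url("70.4", "u[IU]/mL", "pmol/L"))
print(build_ucum_url(121, "mL/min/{1.73_m2}", "mL/s"))
```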

In [30]:
errormsg=ucumfail.dropna(subset=['response'])
errormsg=errormsg.groupby(['LBTESTCD','response']).count().iloc[:,0].reset_index()
errormsg.columns=['LBTESTCD','RESPONSE','COUNT']
In [31]:
sns.set_palette("rainbow", 13)
# pivot() takes keyword arguments; DataFrame.plot() creates its own figure,
# so no separate plt.figure() call is needed
ax4 = errormsg.pivot(index='LBTESTCD', columns='RESPONSE', values='COUNT').plot(kind='barh', figsize=(10,15), stacked=True)
plt.legend(title='Error Messages', loc=5, bbox_to_anchor=(1, -0.2))
plt.suptitle('Error Messages from UCUM Conversion', fontsize=14, fontweight='bold')
plt.xlabel("Count")
Out[31]:
Text(0.5,0,'Count')
In [32]:
ucumfail.dropna(subset=['response']).to_csv('errors.csv')

Summary of Error Message Findings

In [44]:
msglist=list(set(ucumfail['response']))
for i in range(len(msglist)):
    
    e_=ucumfail['response']==msglist[i]
    o_=ucumfail[e_][['LBTESTCD','LBORRESU','LBSTRESU','checklist','response']].drop_duplicates(subset=['LBTESTCD','LBORRESU','LBSTRESU'])
    print('Error: '+msglist[i])
    print('Call:  '+o_['checklist'])  
    print()
    print(o_[['LBTESTCD','LBORRESU','LBSTRESU']])
    print()
    print('---------------------------------------------------------------------------------------')      
Error: ERROR: No MW value for the LOINC code 35192-4 is available or the LOINC code is invalid
3311    Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/1.07/from/mg/dL/to/umol/L/LOINC/35192-4
Name: checklist, dtype: object

     LBTESTCD LBORRESU LBSTRESU
3311   BILIND    mg/dL   umol/L

---------------------------------------------------------------------------------------
Error: ERROR: No MW value for the LOINC code 35234-4 is available or the LOINC code is invalid
3108    Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/12/from/mg/dL/to/mmol/L/LOINC/35234-4
3498     Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/6/from/mg/dL/to/mmol/L/LOINC/35234-4
Name: checklist, dtype: object

     LBTESTCD LBORRESU LBSTRESU
3108    UREAN    mg/dL   mmol/L
3498      BUN    mg/dL   mmol/L

---------------------------------------------------------------------------------------
Error: ERROR: No MW value for the LOINC code 35217-9 is available or the LOINC code is invalid
3503    Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/42/from/mg/dL/to/mmol/L/LOINC/35217-9
Name: checklist, dtype: object

     LBTESTCD LBORRESU LBSTRESU
3503     TRIG    mg/dL   mmol/L

---------------------------------------------------------------------------------------
Error: ERROR: number of annotations in source and target is different
15396    Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/121/from/mL/min/%7B1.73_m2%7D/to/mL/s
Name: checklist, dtype: object

       LBTESTCD              LBORRESU LBSTRESU
15396  CREATCLR  mL/min/%7B1.73_m2%7D     mL/s

---------------------------------------------------------------------------------------
Error: ERROR: invalid double for Molecular Weight value = null
212        Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/0.3/from/mg/dL/to/umol/L
213       Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/0.73/from/mg/dL/to/umol/L
215        Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/109/from/mg/dL/to/mmol/L
216        Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/5.5/from/mg/dL/to/umol/L
217         Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/23/from/mg/dL/to/mmol/L
6603      Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/35.99/from/ug/L/to/pmol/L
8609       Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/0.3/from/mg/dL/to/umol/L
8610       Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/9.0/from/mg/dL/to/mmol/L
8611       Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/176/from/mg/dL/to/mmol/L
8621        Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/69/from/mg/dL/to/mmol/L
8624        Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/95/from/mg/dL/to/mmol/L
8629       Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/2.3/from/mg/dL/to/mmol/L
8637       Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/3.5/from/mg/dL/to/mmol/L
8642     Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/26.56/from/ng/mL/to/pmol/L
8648       Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/3.0/from/pg/mL/to/pmol/L
8651       Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/0.6/from/ng/dL/to/pmol/L
8654        Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/30/from/mg/dL/to/mmol/L
9262       Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/0.3/from/mg/dL/to/umol/L
11459     Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/11.4/from/mg/dL/to/mmol/L
Name: checklist, dtype: object

      LBTESTCD LBORRESU LBSTRESU
212       BILI    mg/dL   umol/L
213      CREAT    mg/dL   umol/L
215       GLUC    mg/dL   mmol/L
216      URATE    mg/dL   umol/L
217      UREAN    mg/dL   mmol/L
6603   PROLCTN     ug/L   pmol/L
8609    BILIND    mg/dL   umol/L
8610        CA    mg/dL   mmol/L
8611      CHOL    mg/dL   mmol/L
8621       HDL    mg/dL   mmol/L
8624       LDL    mg/dL   mmol/L
8629        MG    mg/dL   mmol/L
8637      PHOS    mg/dL   mmol/L
8642   PROLCTN    ng/mL   pmol/L
8648      T3FR    pg/mL   pmol/L
8651      T4FR    ng/dL   pmol/L
8654      TRIG    mg/dL   mmol/L
9262    BILDIR    mg/dL   umol/L
11459     UREA    mg/dL   mmol/L

---------------------------------------------------------------------------------------
Error: ERROR: No MW value for the LOINC code 35197-3 is available or the LOINC code is invalid
3500    Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/85/from/mg/dL/to/mmol/L/LOINC/35197-3
Name: checklist, dtype: object

     LBTESTCD LBORRESU LBSTRESU
3500      HDL    mg/dL   mmol/L

---------------------------------------------------------------------------------------
Error: ERROR: No MW value for the LOINC code 3094-0 is available or the LOINC code is invalid
7    Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/10/from/mg/dL/to/mmol/L/LOINC/3094-0
Name: checklist, dtype: object

  LBTESTCD LBORRESU LBSTRESU
7    UREAN    mg/dL   mmol/L

---------------------------------------------------------------------------------------
Error: ERROR: unexpected result: Error: Source and Target unit do not seem to belong to the same property
5652            Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/12.9/from/uU/mL/to/pmol/L
6586              Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/0.03/from/G/L/to/10*9/L
6588              Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/0.08/from/G/L/to/10*9/L
6591              Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/1.60/from/G/L/to/10*9/L
6594              Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/0.22/from/G/L/to/10*9/L
6597              Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/3.73/from/G/L/to/10*9/L
6600               Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/174/from/G/L/to/10*9/L
6606              Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/4.7/from/T/L/to/10*12/L
6609              Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/5.66/from/G/L/to/10*9/L
7026              Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/0.05/from/G/L/to/10*9/L
7828              Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/0.07/from/G/L/to/10*9/L
11395           Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/82/from/%5BIU%5D/L/to/U/L
11400           Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/18/from/%5BIU%5D/L/to/U/L
11403           Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/19/from/%5BIU%5D/L/to/U/L
11416          Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/202/from/%5BIU%5D/L/to/U/L
15398    Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/70.4/from/u%5BIU%5D/mL/to/pmol/L
Name: checklist, dtype: object

      LBTESTCD      LBORRESU LBSTRESU
5652   INSULIN         uU/mL   pmol/L
6586      BASO           G/L   10*9/L
6588       EOS           G/L   10*9/L
6591       LYM           G/L   10*9/L
6594      MONO           G/L   10*9/L
6597      NEUT           G/L   10*9/L
6600      PLAT           G/L   10*9/L
6606       RBC           T/L  10*12/L
6609       WBC           G/L   10*9/L
7026     LYMAT           G/L   10*9/L
7828      MYCY           G/L   10*9/L
11395      ALP    %5BIU%5D/L      U/L
11400      ALT    %5BIU%5D/L      U/L
11403      AST    %5BIU%5D/L      U/L
11416       CK    %5BIU%5D/L      U/L
15398  INSULIN  u%5BIU%5D/mL   pmol/L

---------------------------------------------------------------------------------------
Error: ERROR: No MW value for the LOINC code 18262-6 is available or the LOINC code is invalid
3669    Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/118/from/mg/dL/to/mmol/L/LOINC/18262-6
Name: checklist, dtype: object

     LBTESTCD LBORRESU LBSTRESU
3669      LDL    mg/dL   mmol/L

---------------------------------------------------------------------------------------
Error: ERROR: No MW value for the LOINC code 15153-0 is available or the LOINC code is invalid
1932    Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/0.3/from/mg/dL/to/umol/L/LOINC/15153-0
Name: checklist, dtype: object

     LBTESTCD LBORRESU LBSTRESU
1932   BILIND    mg/dL   umol/L

---------------------------------------------------------------------------------------
Error: ERROR: No MW value for the LOINC code 1968-7 is available or the LOINC code is invalid
1929    Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/0.1/from/mg/dL/to/umol/L/LOINC/1968-7
Name: checklist, dtype: object

     LBTESTCD LBORRESU LBSTRESU
1929   BILDIR    mg/dL   umol/L

---------------------------------------------------------------------------------------
Error: ERROR: No MW value for the LOINC code 13457-7 is available or the LOINC code is invalid
3501    Call:  http://xml4pharmaserver.com:8080/UCUMService2/rest/ucumtransform/103/from/mg/dL/to/mmol/L/LOINC/13457-7
Name: checklist, dtype: object

     LBTESTCD LBORRESU LBSTRESU
3501      LDL    mg/dL   mmol/L

---------------------------------------------------------------------------------------
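
The messages listed above fall into a handful of recurring groups. The sketch below shows one way to bucket them with regular expressions so that they can be tallied by cause. The category labels (`missing_mw_for_loinc`, `no_mw_supplied`, `annotation_mismatch`, `property_mismatch`) and the function name `classify` are hypothetical and introduced only for illustration. The `no_mw_supplied` label reflects the observation that the calls with this error in the output above carry no `/LOINC/` segment.

```python
import re
from collections import Counter

# (label, pattern) pairs, checked in order; labels are illustrative only.
PATTERNS = [
    ("missing_mw_for_loinc", re.compile(r"No MW value for the LOINC code (\S+)")),
    ("no_mw_supplied", re.compile(r"invalid double for Molecular Weight value = null")),
    ("annotation_mismatch", re.compile(r"number of annotations in source and target is different")),
    ("property_mismatch", re.compile(r"do not seem to belong to the same property")),
]


def classify(response):
    """Return the first matching category label, or 'other' if none match."""
    for label, pattern in PATTERNS:
        if pattern.search(response):
            return label
    return "other"


# Messages copied from the summary above.
messages = [
    "ERROR: No MW value for the LOINC code 35192-4 is available or the LOINC code is invalid",
    "ERROR: No MW value for the LOINC code 3094-0 is available or the LOINC code is invalid",
    "ERROR: number of annotations in source and target is different",
    "ERROR: invalid double for Molecular Weight value = null",
    "ERROR: unexpected result: Error: Source and Target unit do not seem to belong to the same property",
]

tally = Counter(classify(m) for m in messages)
print(tally)
```

Applied to the `response` column, the same function could feed a grouped count per test code and category instead of per raw message.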
In [43]:
pd.options.display.max_colwidth =150