SOFTWARE TRICKS: SDTM Mapping Scenarios to Master

One of the most puzzling programming problems involves SDTM mapping: mapping datasets from a non-CDISC structure to CDISC SDTM structure.

If CDISC standards haven’t been implemented from the beginning of the data collection process (an approach that by now should be considered best practice, but isn’t always possible), there’s going to be some rejigging involved when the time comes to map the data to SDTM.

We’re assuming that if you’re mapping and converting data to SDTM, you have a basic knowledge of SDTM and how it works. If that’s not the case, the SDTM Implementation Guide (SDTMIG) is an essential resource for anyone mapping or programming SDTM datasets. It sets out the specifications and metadata for all SDTM domains in detail, and it includes guidance for producing SDTM domain files. Make yourself familiar with the SDTMIG before you start; it’ll make the process considerably smoother.

A typical mapping scenario has five basic steps to work through:

1. Identify the datasets you’re planning to map. (This one is easy.)
2. Identify the SDTM datasets that correspond to those datasets.
3. Gather the metadata of the datasets, and the corresponding SDTM metadata.
4. Map variables in the datasets from step one to their SDTM domain variables.
5. Create custom domains for other datasets that don’t have corresponding SDTM datasets.

Arun Raj Vidhyadharan and Sunil Mohan Jairath’s 2014 PharmaSUG paper identifies nine scenarios that can crop up in the actual mapping process. Master these, and SDTM mapping will become that little bit less problematic.

The direct carry forward

Variables that are SDTM-compliant can be directly carried forward to the SDTM dataset and don’t need to be modified.

The variable rename

Some variables need to be renamed in order to map to the corresponding SDTM variable. An example: if the original variable is GENDER, it should be renamed SEX as per SDTM guidelines.
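
As a minimal sketch in Python (the scenarios themselves are language-agnostic), a rename can be driven by a small lookup so that every other variable is carried forward untouched. The record layout, the SUBJID value and the names RENAME_MAP and rename_variables are assumptions for illustration; only the GENDER to SEX rename comes from the example above.

```python
# Hypothetical source-to-SDTM rename map; only GENDER -> SEX comes from the text.
RENAME_MAP = {"GENDER": "SEX"}

def rename_variables(record, rename_map):
    """Return a copy of record with keys renamed; other keys are carried forward as-is."""
    return {rename_map.get(name, name): value for name, value in record.items()}

raw = {"SUBJID": "001", "GENDER": "F"}  # made-up source record
dm = rename_variables(raw, RENAME_MAP)
print(dm)  # {'SUBJID': '001', 'SEX': 'F'}
```

Keeping the renames in a lookup rather than in the code makes it easier to review them against the mapping specification.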

The variable attribute change

As well as variable names, variable attributes have to be mapped. Attributes such as label, type, length and format must comply with the SDTM attributes.
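
One way to picture this, as a sketch under stated assumptions rather than a reference implementation: hold the target attributes in a small metadata table and coerce each value to match. The TARGET_ATTRS table, its lengths and the apply_attributes helper are hypothetical; the label is kept alongside as metadata and would be attached when the dataset is written out.

```python
# Hypothetical target metadata for two DM variables. In practice the labels,
# types and lengths come from the SDTMIG and the study's mapping specification.
TARGET_ATTRS = {
    "AGE": {"label": "Age", "type": "Num"},
    "COUNTRY": {"label": "Country", "type": "Char", "length": 3},
}

def apply_attributes(record, attrs):
    """Coerce each value to the target type and check character lengths."""
    out = {}
    for name, value in record.items():
        spec = attrs.get(name)
        if spec is None:
            out[name] = value  # no target metadata: leave the value untouched
        elif spec["type"] == "Num":
            out[name] = None if value in (None, "") else float(value)
        else:
            text = "" if value is None else str(value).strip()
            max_len = spec.get("length")
            if max_len is not None and len(text) > max_len:
                raise ValueError(f"{name}={text!r} exceeds length {max_len}")
            out[name] = text
    return out

# Made-up source record where the age was collected as text.
converted = apply_attributes({"AGE": "34", "COUNTRY": " GBR "}, TARGET_ATTRS)
print(converted)  # {'AGE': 34.0, 'COUNTRY': 'GBR'}
```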

The reformat

The value that is represented does not change, but the format in which it is stored does. Example: converting a SAS date to an ISO8601 format character string.
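
The date example can be sketched in a few lines of Python. A SAS date value is a count of days from 1 January 1960, so the conversion is an offset followed by ISO 8601 formatting. The function name and the choice to return an empty string for a missing date are assumptions made for this illustration.

```python
from datetime import date, timedelta

SAS_EPOCH = date(1960, 1, 1)  # SAS date values count days from this date

def sas_date_to_iso8601(sas_days):
    """Convert a numeric SAS date to a YYYY-MM-DD character string."""
    if sas_days is None:
        return ""  # assumption: a missing date becomes an empty string
    return (SAS_EPOCH + timedelta(days=int(sas_days))).isoformat()

print(sas_date_to_iso8601(0))      # 1960-01-01
print(sas_date_to_iso8601(21166))  # 2017-12-13
```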

The combine

In some cases, multiple variables must be combined to form a single SDTM variable.
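
A common case, offered here as an illustration rather than something the paper prescribes, is a date and a time collected in separate fields that need to end up in a single ISO 8601 date/time variable. The function name and the sample values are hypothetical.

```python
def combine_dtc(date_part, time_part):
    """Join an ISO 8601 date and an optional HH:MM time into one string."""
    if not date_part:
        return ""  # without a date there is nothing to combine
    # When the time was not collected, keep the date on its own.
    return f"{date_part}T{time_part}" if time_part else date_part

print(combine_dtc("2017-12-13", "09:30"))  # 2017-12-13T09:30
print(combine_dtc("2017-12-13", ""))       # 2017-12-13
```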

The split

Conversely, a non-SDTM variable might need to be split into two or more SDTM-compliant variables.
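
As a sketch, suppose a vital signs result was collected as free text with the unit attached, such as "72 kg". It could be split into the original result (VSORRES) and its unit (VSORRESU). The input format, with a single space before the unit, and the function name are assumptions; real data usually needs more defensive parsing.

```python
def split_result_and_unit(raw_result):
    """Split text such as '72 kg' into a result value and a unit."""
    # partition() splits at the first space; the unit is empty if none is present.
    value, _, unit = raw_result.strip().partition(" ")
    return {"VSORRES": value, "VSORRESU": unit.strip()}

print(split_result_and_unit("72 kg"))  # {'VSORRES': '72', 'VSORRESU': 'kg'}
print(split_result_and_unit("36.6"))   # {'VSORRES': '36.6', 'VSORRESU': ''}
```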

The derivation

Some SDTM variables are obtained by deriving a conclusion from data in the non-SDTM dataset. For example, using date of birth and study start date to derive a patient’s age, as opposed to manually entering the age upfront.
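
The age example might look like the following in Python. It counts completed years between the date of birth and the study start date; the function name and the sample dates are made up, and the reference date a study actually uses is defined in its own specification.

```python
from datetime import date

def derive_age(birth_date, reference_date):
    """Return age in completed years at the reference date."""
    years = reference_date.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred in the reference year.
    if (reference_date.month, reference_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years

print(derive_age(date(1980, 6, 15), date(2017, 6, 14)))  # 36
print(derive_age(date(1980, 6, 15), date(2017, 6, 15)))  # 37
```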

The variable value map and new codelist application

Some variable values need to be recoded or mapped to match the values of the corresponding SDTM variable. This mapping is recommended for variables with a codelist attached that has non-extensible controlled terminology. It’s also advisable to map all values in the controlled terminology rather than just the values present in the dataset. That covers values that are not in the dataset currently but may arrive in future dataset updates.
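
A lookup table is one simple way to sketch this. The raw spellings on the left are hypothetical, and the submission values on the right are an illustrative set that should be checked against the controlled terminology version in use. The point of the sketch is that the map covers values beyond those currently in the data, and that anything unmapped is reported instead of being passed through silently.

```python
# Illustrative raw-to-submission value map for SEX. Verify the right-hand
# values against the controlled terminology version used by the study.
SEX_MAP = {
    "MALE": "M", "M": "M",
    "FEMALE": "F", "F": "F",
    "UNKNOWN": "U", "U": "U",
    "UNDIFFERENTIATED": "UNDIFFERENTIATED",
}

def map_sex(raw_value):
    """Map a collected value to its codelist term, failing loudly if unmapped."""
    key = raw_value.strip().upper()
    if key not in SEX_MAP:
        raise ValueError(f"No codelist mapping for {raw_value!r}")
    return SEX_MAP[key]

print(map_sex(" male "))  # M
print(map_sex("Female"))  # F
```

Raising an error on an unmapped value is a design choice for this sketch: it surfaces new raw values when the source dataset is refreshed.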

The horizontal-to-vertical data structure transpose

In situations where the structure of the non-CDISC dataset is completely different to its corresponding SDTM dataset, it might be necessary to transform its structure to one that is SDTM-compliant. The Vital Signs dataset is a prime example: when data is collected in wide form, every test and recorded value is stored in a separate variable. As SDTM requires data to be stored in long (vertical) form, with one record per test, the dataset must be transposed so that the tests, values and units sit under three variables. Any variables that cannot be mapped to an SDTM variable go into supplemental qualifiers.
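
The transpose can be sketched as follows, assuming a wide source record with one column per test. The source column names, the units and the helper name are hypothetical; the output uses one row per test with the test code, the result and the unit in three variables.

```python
# Hypothetical wide-form columns mapped to a test code and unit.
WIDE_TO_TEST = {
    "SYSBP": ("SYSBP", "mmHg"),
    "DIABP": ("DIABP", "mmHg"),
    "PULSE": ("PULSE", "beats/min"),
}

def transpose_vitals(wide_row):
    """Turn one wide record into one long record per collected test."""
    long_rows = []
    for column, (testcd, unit) in WIDE_TO_TEST.items():
        value = wide_row.get(column)
        if value in (None, ""):
            continue  # test not collected: no record is created
        long_rows.append({
            "USUBJID": wide_row["USUBJID"],
            "VSTESTCD": testcd,
            "VSORRES": str(value),
            "VSORRESU": unit,
        })
    return long_rows

wide = {"USUBJID": "STUDY1-001", "SYSBP": 120, "DIABP": 80, "PULSE": None}
for row in transpose_vitals(wide):
    print(row)
```

In this made-up record the pulse was not collected, so the result has two rows, one for each blood pressure test.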

Wednesday, 13 December 2017