Sunday, March 17, 2019

Logically Skipped Items in SDTM-QS

The latest versions of the FDA "Study Data Technical Conformance Guide" have caused some confusion in the CDISC standardization world, due to the following paragraph:
"QS Domain (Questionnaires): records for logically skipped items should be included in the submission dataset with the following QSSTAT = "NOT DONE"; QSREASND = "LOGICALLY SKIPPED ITEM"",

indicating that (solely) for the "Questionnaires" domain, items that were skipped during data collection due to application logic MUST be collected anyway and MUST have a record in the QS dataset.

An example: if your questionnaire contains a question about pregnancy, this rule suggests that you should EXPLICITLY record that the question was skipped for male subjects, AND that you should add a record for each of them in the QS dataset for that "skip" with QSSTAT = "NOT DONE" and QSREASND = "LOGICALLY SKIPPED ITEM".
So, if your study had 60% male subjects, your QS dataset might have 60% "NOT DONE" records for that question. 
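
To make that arithmetic concrete, here is a minimal Python sketch of what the rule implies. The function name `qs_rows_per_guide`, the test code `PREG01` and the ten-subject cohort are assumptions made up for illustration; only the QSSTAT and QSREASND values are taken from the guide.

```python
# Hypothetical illustration: build QS-like rows for one pregnancy question,
# adding a "NOT DONE" row for every subject for whom it was logically skipped.

def qs_rows_per_guide(subjects, testcd="PREG01"):
    """subjects: list of (usubjid, sex, answer_or_None) tuples."""
    rows = []
    for usubjid, sex, answer in subjects:
        row = {"USUBJID": usubjid, "QSTESTCD": testcd,
               "QSORRES": "", "QSSTAT": "", "QSREASND": ""}
        if sex == "M":
            # Question skipped by the form logic, yet a record is still required.
            row["QSSTAT"] = "NOT DONE"
            row["QSREASND"] = "LOGICALLY SKIPPED ITEM"
        else:
            row["QSORRES"] = answer
        rows.append(row)
    return rows


# Assumed cohort: 6 male and 4 female subjects (60% male).
cohort = [(f"SUBJ-{i:03d}", "M", None) for i in range(1, 7)] + \
         [(f"SUBJ-{i:03d}", "F", "N") for i in range(7, 11)]

rows = qs_rows_per_guide(cohort)
not_done = [r for r in rows if r["QSSTAT"] == "NOT DONE"]
print(f"{len(not_done)} of {len(rows)} records carry no collected data")
```

In this toy cohort, more than half of the rows for the question exist only to say that nothing was collected.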

In my opinion, this is a very questionable solution, probably due to a lack of good review tools or to outdated review procedures!

Why this is not a good way
First of all, most data collection tools are programmed in such a way that only data that has really been captured is stored. Data that was not captured is not stored. Logical, isn't it? Why should one store data that was not captured when the rules for not capturing it are known anyway?
For example, CDISC's own ODM standard states:

 "IsNull is a flag to signify that an item's value is to be set to null. If the Value attribute is set, IsNull must not be set. If IsNull is set, the Value attribute must not be provided. In the interest of creating non-verbose XML instances, one should not use ItemData elements with IsNull set to "Yes" to indicate uncollected data. The better practice is to transmit only collected data."

As the study design already contains the rules for not capturing certain things, what is the additional value of storing and transporting records containing "nothing" but that they were skipped due to the well-known rules? As we (and especially the FDA) want to keep datasets small, not containing unnecessary ballast, we only store and exchange what has been done, not what has not been done due to known logic.
What I mean is: doesn't it make more sense to submit the "skip rules" rather than generating a huge number of "not done" records?

QS datasets are usually very large. The XPT format mandated by the FDA already wastes large numbers of bytes, and now we are supposed to increase the file size considerably further by adding records for "skipped" items?

A Phuse working group

The statements in the "Study Data Technical Conformance Guide" on "skip items" in QS seem to originate from a former Phuse work group, of which (to my knowledge) only very few results were published. In the only public document I could find, "define.xml" was not even mentioned, although it is the most natural place to add information about the "skip rules".

I got some information that the group considered an additional "Trial Design" SAS-XPT dataset "TQ" (Questionnaire Design) that would contain all the skip rules. Ok: "if you only have a hammer, everything looks like a nail; if you only have XPT, everything looks like a table".
We recently discussed this in the "CDISC Data Exchange Standards Team". I promised that I would write a define.xml that has the "skip rules" documented in it. A few days later, it turned out that Sally Cassells had already done the same some time ago, essentially finding the same (easy) solution.

Describing "skip rules" in the define.xml

Define.xml allows one or more comments to be added to each variable definition. This can easily be used to also add the "skip rules" in a human-readable way. The "human-readable way" is the only way the define.xml is used (through the stylesheet) by most FDA reviewers, so this alone would suffice.
Here is a define.xml snippet:

This is part of the skip rules of the well-known "Disability Rating Scale" questionnaire, for which an annotated CRF also exists.

In the define.xml I created, and which you can download from HERE, all these "skip" rules are described using "def:CommentDef" and referenced from "ValueList" ItemDefs, i.e. definitions of items for test codes.
And here is the view in the browser through the (standard) stylesheet:

As one can see, the items and skip questions are on the "ValueLevel" level, as indeed the metadata for QSORRES depend on the value of QSTESTCD.
So we do not need something like a TQ ("Questionnaire Design") dataset anyway.
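
As a rough sketch of how such comment-based skip rules can also be read by software, the Python below parses a small Define-XML 2.0 style fragment and pairs each item with its comment. The fragment is not taken from my define.xml: the OIDs, the item `PREG01` and the comment text are hypothetical (borrowed from the pregnancy example above), and the fragment is not a complete, schema-valid define.xml.

```python
import xml.etree.ElementTree as ET

ODM_NS = "http://www.cdisc.org/ns/odm/v1.3"
DEF_NS = "http://www.cdisc.org/ns/def/v2.0"

# Hypothetical fragment: OIDs, names and texts are invented for illustration.
DEFINE_FRAGMENT = f"""
<MetaDataVersion xmlns="{ODM_NS}" xmlns:def="{DEF_NS}" OID="MDV.SKETCH" Name="Sketch">
  <ItemDef OID="IT.QS.QSORRES.PREG01" Name="PREG01" DataType="text" Length="3"
           def:CommentOID="COM.SKIP.PREG01">
    <Description>
      <TranslatedText xml:lang="en">Are you pregnant?</TranslatedText>
    </Description>
  </ItemDef>
  <def:CommentDef OID="COM.SKIP.PREG01">
    <Description>
      <TranslatedText xml:lang="en">Skip rule: not asked when SEX = M.</TranslatedText>
    </Description>
  </def:CommentDef>
</MetaDataVersion>
"""


def skip_rules(xml_text):
    """Return {item name: comment text} for every ItemDef that references a comment."""
    root = ET.fromstring(xml_text.strip())
    text_path = f"{{{ODM_NS}}}Description/{{{ODM_NS}}}TranslatedText"

    # Collect all comment definitions by OID.
    comments = {}
    for comment in root.iter(f"{{{DEF_NS}}}CommentDef"):
        node = comment.find(text_path)
        comments[comment.get("OID")] = (node.text or "").strip() if node is not None else ""

    # Resolve the def:CommentOID reference on each ItemDef.
    rules = {}
    for item in root.iter(f"{{{ODM_NS}}}ItemDef"):
        comment_oid = item.get(f"{{{DEF_NS}}}CommentOID")
        if comment_oid in comments:
            rules[item.get("Name")] = comments[comment_oid]
    return rules


for name, rule in skip_rules(DEFINE_FRAGMENT).items():
    print(f"{name}: {rule}")
```

The rule text stays free-form and human-readable, which is all the stylesheet view needs; the lookup merely shows that nothing prevents a tool from collecting it as well.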

But isn't this study design?

Sure it is! Not only skip questions, but also whole workflows are defined at study design time. The "Disability Rating Scale" form essentially does not contain "skip questions"; it is a complete workflow! However, to be able to bring this information into a tabular form, which cannot easily be done for workflows, the FDA (and the Phuse group) decided to call it "skip questions" (if you only have a hammer ...).

What is the CDISC standard for Study Design?
Yes, it is ODM, the Operational Data Model. The ODM standard allows all aspects of a study design, including "skip questions", to be defined in an XML file. Using a stylesheet, this information can easily be made "human consumable" in a much nicer way than just tables. Furthermore, ODM allows each item (question on a form) to be annotated with SDTM information, thus essentially creating a set of "annotated CRFs". More than 10 years ago, Dave Iberson-Hurst from Assero created a stylesheet to generate visual displays of such "ODM annotated CRFs" (no PDF of course). Here is a screenshot:

Or, for the DRS Questionnaire:

Also, the SDM-XML extension (Study Design Model in XML) allows all trial parameters, trial visits, the inclusion/exclusion criteria, workflows and timing, and much more to be described.
So, wouldn't it be more logical to submit ODM-SDM-XML study design files to the FDA rather than a set of "trial design" tables and an artificially generated "annotated CRF" in non-machine-readable PDF format?
We have to recognize, however, that the SDTM "Trial Design" datasets are older, and that features such as workflow and skip questions in ODM were "only" developed 8-10 years ago. But as the FDA only seems to have "hammers", it will probably not want to move from "nails" to "screws", "screwdrivers" or even smarter things anyway.
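
For readers who have not seen skip logic in ODM, here is a minimal Python sketch under stated assumptions. It relies on the ODM 1.3 "ConditionDef" element and the "CollectionExceptionConditionOID" attribute on "ItemRef"; the OIDs, the item, and the expression in the fragment are hypothetical, and the fragment is far from a complete study design file.

```python
import xml.etree.ElementTree as ET

ODM_NS = "http://www.cdisc.org/ns/odm/v1.3"

# Hypothetical ODM 1.3 fragment; OIDs, names and the expression are invented.
ODM_FRAGMENT = f"""
<MetaDataVersion xmlns="{ODM_NS}" OID="MDV.SKETCH" Name="Sketch">
  <ItemGroupDef OID="IG.PREG" Name="Pregnancy" Repeating="No">
    <ItemRef ItemOID="IT.PREG01" Mandatory="No"
             CollectionExceptionConditionOID="COND.MALE"/>
  </ItemGroupDef>
  <ConditionDef OID="COND.MALE" Name="Subject is male">
    <Description>
      <TranslatedText xml:lang="en">Skip when the subject is male.</TranslatedText>
    </Description>
    <FormalExpression Context="Python">SEX == "M"</FormalExpression>
  </ConditionDef>
</MetaDataVersion>
"""


def skip_conditions(xml_text):
    """Return {ItemOID: formal expression} for items that have a skip condition."""
    root = ET.fromstring(xml_text.strip())

    def q(tag):
        # Qualify a tag name with the ODM namespace.
        return f"{{{ODM_NS}}}{tag}"

    # Collect the machine-readable expression of each condition by OID.
    conditions = {}
    for cond in root.iter(q("ConditionDef")):
        expr = cond.find(q("FormalExpression"))
        conditions[cond.get("OID")] = (expr.text or "").strip() if expr is not None else ""

    # Map each item reference to the condition under which it is not collected.
    result = {}
    for ref in root.iter(q("ItemRef")):
        cond_oid = ref.get("CollectionExceptionConditionOID")
        if cond_oid in conditions:
            result[ref.get("ItemOID")] = conditions[cond_oid]
    return result


print(skip_conditions(ODM_FRAGMENT))
```

The point of the sketch is only that the rule lives once in the design metadata, as both a description and a formal expression, instead of being repeated as a "NOT DONE" record for every affected subject.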

A few days ago, it was announced that FDA commissioner Scott Gottlieb will soon leave the FDA, with one of his last acts being to criticize the pharma industry for 'continued reluctance' to rethink clinical trials. But shouldn't the FDA itself start rethinking its way of working with electronic submissions, which is completely based on "tables" (which are not even "relational"), and which also requires sponsors to deliver these in a 30-year-old, completely outdated format?