Cancer
This Excel file has the patient code in column A. Beyond that, we are interested in two columns: the assessment date (C) and the response (F). As you can see, several patients have more than one assessment: 11001 (the first patient) has one in August 23, then another in October 23, etc. We are going to focus on the first assessment for now; I will explain other aspects of the responses later.
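To make "focus on the first one" concrete, here is a minimal sketch of picking the earliest assessment per patient. It assumes the rows have already been read from columns A, C and F into (patient code, date, response) tuples; the rows and the function name first_assessments are made up for illustration and are not taken from the file.

```python
from datetime import date

# Hypothetical rows: (patient code, assessment date, response).
# These values are invented for illustration only.
rows = [
    ("11001", date(2023, 10, 5), "PR"),
    ("11001", date(2023, 8, 14), "SD"),
    ("11002", date(2023, 9, 1), "PD"),
]


def first_assessments(rows):
    """Return {patient: (date, response)}, keeping the earliest date per patient."""
    first = {}
    for patient, assessed_on, response in rows:
        # Keep this row if the patient is new or the date is earlier than the stored one.
        if patient not in first or assessed_on < first[patient][0]:
            first[patient] = (assessed_on, response)
    return first


first = first_assessments(rows)
```

Comparing dates rather than relying on row order means the result does not depend on how the sheet happens to be sorted.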
The response code is the most important outcome we have for now:
CR is complete response (disappearance of all evidence of disease);
PR is partial response (decrease in disease burden of more than 30%, but less than 100%);
SD is stable disease (change in size between -30% and +20%);
PD is treatment failure or disease progression (increase of more than 20% in total size, or appearance of new lesions).
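The thresholds above can be written down as a small function. This is only a sketch of the cut-offs as stated here: it assumes the change is given as a single percentage relative to some reference measurement, it treats -100% as a stand-in for "disappearance of all evidence of disease", and it puts the exact boundaries (-30% and +20%) in SD. The function and parameter names are hypothetical; the file itself already contains the coded response, so this is only to make the definitions explicit.

```python
def classify_response(pct_change, new_lesions=False):
    """Map a percent change in total size to CR, PR, SD or PD.

    pct_change: percent change, e.g. -45.0 for a 45% decrease.
    new_lesions: True if new lesions appeared (PD regardless of size change).
    """
    if new_lesions or pct_change > 20:
        return "PD"  # growth above 20%, or any new lesion
    if pct_change <= -100:
        return "CR"  # assumption: -100% stands in for complete disappearance
    if pct_change < -30:
        return "PR"  # decrease of more than 30% but less than 100%
    return "SD"      # everything from -30% to +20%, boundaries included
```

For example, classify_response(-45.0) gives "PR", and classify_response(-45.0, new_lesions=True) gives "PD".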
Since all of this is really a continuum, we sometimes dichotomize it into “clinical benefit” (CR+PR+SD) versus “no benefit” (PD); after all, if you do nothing, or if what you do does not work, the tumor will grow (PD).
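As a sketch, the dichotomization is just a lookup on the response code. The function name dichotomize is hypothetical, and the handling of codes outside CR/PR/SD/PD is an assumption: they are returned as None so they stay visibly unclassified instead of being silently counted in either group.

```python
CLINICAL_BENEFIT = {"CR", "PR", "SD"}
NO_BENEFIT = {"PD"}


def dichotomize(response):
    """Return 'clinical benefit', 'no benefit', or None for any other code."""
    code = response.strip().upper()  # tolerate stray spaces and lowercase entries
    if code in CLINICAL_BENEFIT:
        return "clinical benefit"
    if code in NO_BENEFIT:
        return "no benefit"
    return None  # assumption: leave unknown codes unclassified
```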
1) What does NE stand for?
Treat it as a separate group for now, so the groups are: clinical benefit, NE, no benefit.
2) Do S_RECORD_id from the evalucion_respuestas df and user from the eb2 prod df match?
3) Do we have control subjects in the cancer data set, or should we try to use the emotions control subjects? Is there time overlap?
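For question 2, one way to check whether the two identifier columns match is to compare them as sets. This is a rough sketch: it assumes both columns have been pulled out as plain lists, and it normalizes every value to a trimmed string because IDs read from Excel often come back as a mix of numbers and text. The helper names and the example IDs are hypothetical.

```python
def normalize_id(value):
    """Compare IDs as trimmed strings so 11001 and ' 11001' are treated as equal."""
    return str(value).strip()


def compare_ids(left_ids, right_ids):
    """Report which IDs appear in both lists and which appear in only one."""
    left = {normalize_id(v) for v in left_ids}
    right = {normalize_id(v) for v in right_ids}
    return {
        "both": sorted(left & right),
        "left_only": sorted(left - right),
        "right_only": sorted(right - left),
    }


# Made-up values standing in for S_RECORD_id (left) and user (right).
report = compare_ids([11001, "11002 "], ["11001", "11003"])
```

If "both" comes back empty, that suggests the two columns use different identifier schemes (or formats) rather than that the patients simply differ.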