Objectives
The UK's Improving Access to Psychological Therapies (IAPT) programme uses the Patient Health Questionnaire Depression Scale (PHQ-9; Kroenke, Spitzer, & Williams, J. Gen. Intern. Med., 16, 606) and Generalized Anxiety Disorder Scale (GAD-7; Spitzer et al., Arch. Intern. Med., 166, 1092) to assess patients' symptoms of depression and anxiety respectively. Data are typically collected via telephone or face-to-face; however, no study has statistically investigated whether the questionnaires' items operate equivalently across these modes of data collection. This study aimed to address this omission.

Methods & Results
Questionnaire data from patients registered with an IAPT service in London (N = 23,672) were examined. Confirmatory factor analyses suggested that unidimensional factor structures adequately matched observed face-to-face and telephone data for the PHQ-9 and GAD-7. Invariance analyses revealed that while the PHQ-9 had equivalent factor loadings and latent means across data collection methods, the GAD-7 had equivalent factor loadings but unequal latent means. In support of the scales' convergent validity, positive associations between scores on the PHQ-9 and GAD-7 emerged.

Conclusions
With the exception of the GAD-7's latent means, the questionnaires' factor loadings and latent means were equivalent. This suggests that clinicians may meaningfully compare PHQ-9 data collected face-to-face and by telephone; however, such comparisons with the GAD-7 should be done with caution.

Practitioner points
The PHQ-9 and GAD-7's factor loadings were equivalent across data collection methods.
Only the PHQ-9's latent means were equivalent across data collection methods.
Clinicians may be confident collecting PHQ-9 data by telephone and face-to-face and, then, comparing such data.
Caution is recommended when determining clinical effectiveness using telephone and face-to-face GAD-7 data.
More psychometric research is warranted.