We plan a longitudinal waitlist-control field study to address the objectives outlined above. For this purpose, we are developing the smartphone app “On a journey to feel a little better. Or BEDDA.” (BEDDA), which can collect the required data. The three main intervention components of the BEDDA app follow the talk-and-tools paradigm [58]. BEDDA contains two tools (Breeze and daily actionable advice, i.e., daily wisdom) and one talk element (a conversational agent). We designed these three elements in line with our three research objectives: (1) Breeze to collect breathing sounds (objectives 1 and 2) and improve symptoms and severity of subclinical depression, (2) daily actionable advice to improve symptoms and severity of subclinical depression, and (3) a socially oriented conversational agent that guides participants through the intervention by presenting the story, setting the goal, and presenting gamification elements to increase adherence and efficacy (objectives 1, 2, and 3).
Conceptual model
Figure 1 outlines the conceptual model developed for the BEDDA study. The model aims to trigger a causal chain through different intervention components. The intervention components aimed at increasing participants' engagement with BEDDA are the black boxes pointing toward “Perceived Characteristics of BEDDA”, “Working Alliance between Participant and CA”, and “Behavioral Intention to Use BEDDA” in Fig. 1. The intervention components that aim to improve the proximal outcomes (mood, agitation, anhedonia) and, in turn, the primary distal outcome (subclinical depression) as well as secondary distal outcomes (subclinical anxiety) are described in the black boxes pointing toward “Proximal Outcomes” in Fig. 1. For an explanation of proximal and distal outcomes of digital interventions, see [13, 59]. In the following sections, we outline in more detail the intervention components designed to influence the behavioral intention to use BEDDA and to improve the proximal and distal outcomes.
Conceptual model. Conceptual design of the intervention. Intervention components with a number (#) are taken from Knittle et al. [70]. For example, #39 corresponds to intervention component number 39, “Credible Source”, in Knittle et al. [70]. The remaining intervention components are derived from Vorganti et al. [98], Kramer et al. [73], and De Vecchi et al. [99]
Behavioral intention to use the BEDDA app
To increase the “Behavioral intention to use and continue to use BEDDA”, we use the components “Perceived Characteristics of BEDDA” and “Working Alliance Between the Participant and CA”. These components are theoretically informed by related work on information systems and technology acceptance research [60,61,62,63], on the working alliance [64, 65] and its link to conversational agents [52, 66,67,68], and on behavior change theory [69, 70]. To further increase the “Behavioral intention to use and continue to use BEDDA”, we will also use reminders, progress reports, goal setting via a daily challenge, and daily monetary incentives.
Smartphone-based biofeedback breathing training Breeze
Our group initially developed the smartphone-based biofeedback breathing training Breeze and further adapted it to address the objectives of this study [48,49,50, 71]. Breeze uses the smartphone's microphone to continuously detect breathing phases in real time (i.e., inhalations, exhalations, and the pauses between them). This detection, in turn, drives a gamified biofeedback-guided breathing training visualization. The gamified biofeedback is illustrated as a sailboat moving down a river, which speeds up when the user performs slow-paced breathing correctly (Fig. 2). This targets experiential outcomes [62] in addition to the instrumental outcomes of psychological wellbeing and heart rate variability (HRV) [72]. The standard configuration of Breeze guides users to breathe at six breaths per minute; this pace can be adjusted for untrained or well-trained individuals within a specific range that safeguards against breathing too fast or too slowly. The gamified breathing training visualization of Breeze has been shown to increase experiential value [50] compared to a standard breathing training visualization. Furthermore, it has been demonstrated that Breeze effectively increases HRV [49]. Breeze will be introduced as one of the intervention components in the story presented by BEDDA.
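To make the biofeedback loop concrete, the following minimal sketch shows how a detected breathing phase could be mapped to the boat's speed. It illustrates the general principle only; the phase names, pacing parameters, and speed constants are our assumptions and not Breeze's actual implementation.

```python
from enum import Enum, auto

class BreathPhase(Enum):
    INHALATION = auto()
    EXHALATION = auto()
    PAUSE = auto()

# Assumed pacing: six breaths per minute corresponds to a 10-second cycle.
TARGET_RATE_BPM = 6
CYCLE_SECONDS = 60 / TARGET_RATE_BPM

def boat_speed(phase: BreathPhase, phase_duration_s: float) -> float:
    """Map the currently detected breathing phase to a relative boat speed.

    Exhalations that approach the guided exhalation length speed the boat
    up; pauses slow it down. All constants are illustrative.
    """
    if phase is BreathPhase.EXHALATION:
        target_exhale_s = 0.6 * CYCLE_SECONDS        # assumed exhale share of the cycle
        closeness = min(phase_duration_s / target_exhale_s, 1.0)
        return 1.0 + closeness                       # up to twice the base speed
    if phase is BreathPhase.INHALATION:
        return 1.0                                   # neutral speed while inhaling
    return 0.5                                       # slow down during pauses
```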
Conversational agent BEDDA
Analogous to the previous [53, 73,74,75,76,77] and ongoing work of the project team, the intervention will involve a text-based conversational agent (CA) called BEDDA. The participant will choose one of four avatars representing BEDDA (Fig. 3). Like other conversational agents, BEDDA aims to imitate a conversation with a human being [78]. BEDDA will provide a general introduction to the study, explain and move the story along, collect responses to self-reported questions, and motivate the participant to continue interacting with the intervention. BEDDA will rely on scripted answers to increase simplicity and minimize the risk of harm. No free-text entries will be used except for variables such as the nickname, demographic information, or feedback on what to improve in a future version of the intervention. The study team developed the logic of BEDDA, and the responses (i.e., conversational turns) are scripted, fully transparent, and traceable. BEDDA will also provide feedback on the self-reported outcomes (self-efficacy, [79]) at the end of the journey.
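Because every conversational turn is scripted, the dialogue can be represented as a simple graph of predefined messages and answer options. The sketch below illustrates one possible data structure for such scripted turns; the turn identifiers and texts are invented for illustration and are not the actual BEDDA script.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Turn:
    """One scripted conversational turn: the agent's message and the
    predefined answer options, each pointing to the next turn."""
    message: str
    options: dict[str, str] = field(default_factory=dict)  # answer label -> next turn id

# Illustrative fragment of a scripted dialogue graph (not the real script).
SCRIPT = {
    "greeting": Turn(
        message="Ahoy! Ready for today's journey?",
        options={"Yes, let's go": "mood_questions", "Tell me more": "explain_story"},
    ),
    "explain_story": Turn(
        message="We collect one key per day until the chest opens.",
        options={"Got it": "mood_questions"},
    ),
    "mood_questions": Turn(
        message="Before we set sail, a few quick questions about how you feel.",
        options={"Start": "breeze_training"},
    ),
}

def next_turn(current_id: str, chosen_option: str) -> str:
    """Resolve the next scripted turn deterministically from the user's choice."""
    return SCRIPT[current_id].options[chosen_option]
```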
Daily wisdom
The daily wisdom consists of short, actionable tips related to the symptom measured that day. The participant can receive this daily wisdom at the end of each daily interaction or ignore it for the moment and review it later at any point in the app. We implemented this option of accessing the daily wisdom at a later point to avoid inducing stress in the participants. The daily wisdom is presented as a non-binding, noncommittal option they may want to try at any time. It is designed to be easily implemented into the daily routine and serves to (1) increase the efficacy of the intervention [55]; (2) target improvements in self-confidence and self-efficacy [56]; (3) contribute to the intervention's intensity, which is linked to improved efficacy [57]; and (4) provide an additional measurement of engagement. The list of different daily wisdom can be found in Additional file 1.
Gamification and storytelling
The conversational agent BEDDA will present the story in which BEDDA and the participant go on a treasure hunt. Since BEDDA's sailboat cannot be powered and steered by BEDDA alone, BEDDA asks the participant to power the boat by providing wind energy. BEDDA goes on to explain that at the end of their journey, once they have collected 30 keys, a magic chest opens and the coins in it can be collected. BEDDA also explains that the magic chest has two further magical properties. First, if the participant manages to collect all keys in less than 40 days, they have the chance to win additional coins (gamification—challenge [70]) or to keep a smartwatch received at the beginning of the study (smartwatch group only). Second, if the participant needs more than 45 days to find all keys, the chest loses its magical power, and all gold coins disappear (gamification—challenge [70]). We included this time constraint because of the timescale of this study and to further motivate the participants.
As explained by BEDDA, 30 keys need to be found along the way (Fig. 4) that unlock a chest at the journey's end (Fig. 5). These keys are hidden in bottles (Fig. 6) floating on the river during the daily trip. Besides the key, each bottle also contains the daily wisdom of the day. Each day, the participants can find only one key. Before and after each trip, BEDDA presents a map (Fig. 7) to the participants illustrating their progress (goal setting—complete a daily task [70]). The progress made is also indicated by an illustration of the number of keys collected and the days passed since the journey started (gamification—progress indication or badges [70]). The participants can additionally interact with Breeze as often as they want on any given day, provided they have already completed the daily trip.
Keys. Graphic illustration of the number of keys collected in the study. The number increases after each interaction with Breeze. Helen Galliker created the images specifically for this study as part of her employment at the Center for Digital Health Interventions. The Center for Digital Health Interventions holds the copyright to all images
Study design
Figure 8 shows the experimental design of the study. The study will run for 60 days, and participants will be randomly allocated (using an algorithm on the recruitment website) to one of the four groups listed in Table 1: intervention with smartwatch (IGsw), intervention without smartwatch (IGnsw), waitlist with smartwatch (WGsw), and waitlist without smartwatch (WGnsw). The intervention part of the study will run for 30–45 days, depending on how many days the participant needs to finish the required 30 once-per-day guided interactions (described below). Participants allocated to the intervention group (IGsw and IGnsw) will start the experiment first. In their first interaction (T1), they will receive the study information, give written informed consent via the app (type of content approved by the ethics committee), complete the initial assessment, receive a tutorial on how to use the app, and complete a breathing training with Breeze for the first time. Participants allocated to the waitlist group (WGsw and WGnsw) will complete the baseline assessment (T0) but will not start using the app for another 30 days. In these first 30 days, participants in the intervention group will interact daily with BEDDA. This interaction includes dialogues with the conversational agent, answering questions before and after Breeze, and conducting a breathing training with Breeze. Each day, they can also choose whether they want to receive the daily wisdom matching the symptom of the day. Additionally, the conversational agent moves the story forward each day, and the app uses gamification elements to illustrate progress. On day 15 of the intervention, participants in the intervention group will respond to the half-time assessment questions, and on day 30, they will respond to the final assessment.
Experimental Design. Experimental design of the study. Note: T0: Assessment at baseline for the control group only. T1: Start intervention with start interaction and assessment. T2: Half-time intervention with half-time interaction and assessment. T3: End of intervention with final interaction and assessment. Intervention: Daily engagement in the intervention consists of interaction with the conversational agent, Breeze, providing assessments, and receiving daily wisdom. Figure created by Gisbert W. Teepe as part of his employment at the Center for Digital Health Interventions
Participants assigned to the waitlist control group (WGsw and WGnsw) will respond to baseline assessments (T0) on days 1 and 15 (corresponding to T1 and T2 in the intervention groups). On day 30 (T3 for the intervention group), the control group will start the intervention part of the study. The control group participants will be informed about the assessments and the start of the study through a notification on their smartphones. Participants in the control group will then complete the same intervention as the intervention group, including the initial, half-time, and final assessments.
Recruitment, inclusion criteria, exclusion criteria
Undergraduate [84], graduate [85], and Ph.D. students [86] are a highly relevant group showing alarming rates of symptoms of subclinical [80,81,82,83] and clinical depression. Our study targets this population by recruiting participants from Swiss universities with no or subclinical symptoms of depression and anxiety. To increase the diversity of our sample, we plan to recruit participants from the general population as well. We will recruit participants via mailing lists of universities, social media, and other communication channels such as flyers and advertisements. Since participants may know each other, spill-over effects could occur.
Participants must be at least 18 years old, not pregnant, not diagnosed with asthma, COPD, or other respiratory conditions, and should be willing to invest approximately five minutes of their time per day for 30 days. We will use the 9-item version of the Patient Health Questionnaire (PHQ-9, [44]) for screening purposes to determine the severity of depression. We will also use the 7-item version of the Generalized Anxiety Disorder Questionnaire (GAD-7, [87]) to assess the severity of anxiety. Participants with scores greater than 15 (more than mild symptoms) in either screening instrument will be excluded and directly referred to a mental health hotline for the general public and to the mental health services of universities. We will also exclude participants responding with “several days (+1)”, “more than half the days (+2)”, or “nearly every day (+3)” to the PHQ-9 question assessing suicidal or self-harm ideation, and we will advise these participants to seek professional help immediately. We will not use minimal cut-off scores to exclude participants because it is difficult to determine at which point a participant shows sufficient symptoms to benefit from the intervention. Additionally, we expect a selection effect due to the advertisement content focusing on improving symptoms and severity of depression (e.g., one of the outlined potential benefits is “reduce stress”).
Furthermore, we will exclude participants with a current episode of a diagnosed mood disorder (Major Depressive Disorder, Bipolar Disorder, Persistent Depressive Disorder, or Disruptive Mood Dysregulation Disorder). The same applies to participants with other diagnosed psychiatric disorders (e.g., Generalized Anxiety Disorder, Schizophrenia, Borderline Personality Disorder). Participants currently receiving psychotherapeutic or psychopharmacological treatment cannot participate in the study.
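As a minimal sketch, the screening rules from the two preceding paragraphs can be expressed as a single eligibility check. The function and parameter names below are ours, not those of the study software, and the rules are simplified to boolean flags and questionnaire totals.

```python
def is_eligible(age: int, pregnant: bool, respiratory_condition: bool,
                diagnosed_disorder: bool, in_treatment: bool,
                phq9_total: int, gad7_total: int,
                phq9_suicidality_item: int) -> bool:
    """Screening sketch following the protocol's exclusion rules.

    Excluded: under 18, pregnant, asthma/COPD/other respiratory conditions,
    a current diagnosed mood or other psychiatric disorder, current
    psychotherapeutic or psychopharmacological treatment, PHQ-9 or GAD-7
    totals above 15, or any non-zero response (>= 1) on the PHQ-9 item
    assessing suicidal or self-harm ideation.
    """
    if age < 18 or pregnant or respiratory_condition:
        return False
    if diagnosed_disorder or in_treatment:
        return False
    if phq9_total > 15 or gad7_total > 15:
        return False   # referred to mental health services instead
    if phq9_suicidality_item >= 1:
        return False   # advised to seek professional help immediately
    return True
```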
Enrollment and allocation
Interested respondents will complete the initial assessment using an online survey to determine whether they can participate in the study. In this survey, we will also ask eligible participants whether they would be able to pick up a smartwatch from an external institute and would be willing to provide the smartwatch data for the duration of the study. Sending smartwatches to participants is not feasible due to requirements from the Cantonal Ethics Commission (Ethics Commission of the Federal State of Zurich, Switzerland) regarding anonymous data collection. Eligible respondents will be enrolled in the study and randomly allocated to either the intervention or the waitlist control group using an algorithm on the website (www.bedda.me). If participants are interested in using a smartwatch, a second random draw decides whether they receive one. Participants not allocated to a smartwatch group will receive instructions on downloading and installing the BEDDA app on their own. Participants assigned to a smartwatch group will receive instructions on how to book an appointment at an external organization to pick up their smartwatch. An individual from the external organization will help participants install the app on their smartphones and enter the smartwatch key in the study app. We will collect no personal data during this process and cannot associate any personal data with the collected data.
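The two-step allocation described above could be scripted roughly as follows. This is an illustrative sketch with equal allocation probabilities, not the actual algorithm running on the recruitment website.

```python
import secrets

def allocate(willing_to_use_smartwatch: bool) -> str:
    """Illustrative two-step random allocation into the four groups.

    Step 1: intervention (IG) vs. waitlist control (WG).
    Step 2: among participants willing to pick up a smartwatch, a second
    random draw decides the smartwatch (sw) vs. no-smartwatch (nsw) arm.
    """
    arm = "IG" if secrets.randbelow(2) == 0 else "WG"
    if willing_to_use_smartwatch and secrets.randbelow(2) == 0:
        return arm + "sw"
    return arm + "nsw"
```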
Blinding participants and staff is difficult in most digital health intervention studies. Participants can easily tell whether they are in the intervention group using a form of digital therapeutic or in a control group (e.g., receiving standard care or a sham intervention such as printed health information). We aim to address this problem by providing installation instructions for all groups after recruitment and only disclosing the intervention's start time via a notification from BEDDA. The smartwatch groups, however, will have a clear indication that they are in one of the smartwatch groups (intervention or waitlist). Still, we will not disclose the allocation to the intervention or waitlist group to staff or to participants in the smartwatch groups.
Daily interaction
Participants will be asked to interact with BEDDA by completing the once-per-day guided interaction (i.e., the daily trip). Each daily interaction consists of different parts using the different elements of the intervention. Figure 9 illustrates such a daily interaction. First, the participants will interact with the chatbot and a treasure map showing the day's journey. Second, the participants will complete a subset of the Multidimensional Mood State Questionnaire (MDMQ) about their mood, agitation, and anhedonia [88, 89]. To reduce the burden on the participants, only one dimension of the MDMQ will be randomly selected each day. Figure 10 illustrates how the symptom of the day is selected (e.g., mood) and in which order the two versions of that symptom are presented (e.g., Mood Version A is presented first, followed by Mood Version B after Breeze). Each dimension consists of positive and negative items and has two versions with five items each. The participant will also indicate where they are conducting the breathing training (e.g., living room, office). Third, we will ask the participants to perform a breathing training with Breeze. There, the app will instruct the participants to say three sentences to start the training: (1) “Lift the anchor.”, (2) “Set the sails.”, and (3) “Let's start today's journey.”. After this, participants will perform the breathing training. While sailing down the river, the participants will collect a bottle floating on the water. In this bottle, two items will be found: the daily wisdom and one of the 30 keys. At the end of the exercise, participants will be instructed to read three sentences: (1) “Haul in the sails.”, (2) “Set the anchor.”, and (3) “I have finished today's journey.”. Fourth, the participants will answer five different questions on the same MDMQ dimension as before the breathing exercise, i.e., about either mood, agitation, or anhedonia. Fifth, the participants will indicate how accurately they perceived the detection of their breathing by responding to a question, adapted from Efendic et al. [90], on a seven-point Likert scale ranging from “very inaccurate” to “very accurate”. Finally, they will be asked if they want to receive the wisdom collected on the way, and the app will show the map of the completed journey and the total number of keys they have accumulated.
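For orientation, the sequence of one daily interaction can be summarized as an ordered list of steps. The step names below are ours and merely mirror the description above.

```python
# Illustrative ordering of one daily interaction (step names are ours).
DAILY_STEPS = (
    "chat_intro_and_map",         # conversational agent and treasure map
    "mdmq_pre",                   # five items of the randomly drawn dimension/version
    "location_question",          # where the breathing training takes place
    "voice_commands_start",       # three scripted start sentences
    "breeze_training",            # biofeedback breathing; bottle with key and wisdom
    "voice_commands_end",         # three scripted end sentences
    "mdmq_post",                  # remaining version of the same dimension
    "detection_accuracy_rating",  # seven-point Likert scale
    "daily_wisdom_offer",         # optional; map and key count are shown
)
```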
Interactions overview. Initial, daily, half-time, and final interaction within the intervention. After T0, the waitlist group will receive no further interaction until the intervention phase starts for this group two weeks later, with the interaction at the start of the intervention (T1). Note: T0: Assessment at baseline for the control group only. T1: Start of intervention with start interaction and assessment. T2: Half-time of intervention with half-time interaction and assessment. T3: End of intervention with final interaction and assessment. Figure created by Gisbert W. Teepe as part of his employment at the Center for Digital Health Interventions
MDMQ symptom and version of the day. Selection process for which symptom is chosen for the day. First (far left, top), the app randomly draws whether Mood, Agitation, or Anhedonia is measured on a given day. Second (second box, top), the app randomly draws whether version A or version B of the symptom drawn in step one is presented first. Third (third box, top), the participant interacts with Breeze. Fourth, the app presents the remaining version of the symptom of the day. The bottom part of the figure illustrates different draws. The first example illustrates that on a given day Mood was randomly chosen as the symptom, and the random draw determined that version A is presented first, followed by version B after interacting with Breeze. Below this, further examples are illustrated
Incentive mechanism
Participants will receive financial compensation for their participation if they meet certain criteria. First, participants need to interact with the app at least once per day on 30 days and complete the daily trip within the intervention. The participants can use Breeze more often on any given day, but 30 different days are needed to complete the intervention and be eligible for compensation. Second, participants need to complete the additional questions at baseline (T0, waitlist control group only), the start of the intervention (T1), half-time (T2), and the final assessment (T3) within the once-per-day guided interaction. Third, participants must complete the intervention in 30–45 days to receive a compensation of 40 CHF. Participants needing more than 45 days will not receive any financial compensation. Fourth, participants completing the intervention in 30–40 days have the chance to win additional monetary compensation in a raffle (ten times 100 CHF, ten times 200 CHF, five times 300 CHF). Participants in the smartwatch group are not eligible for this raffle but can keep the smartwatch if they complete the intervention in 30–40 days. Since we are not collecting any personal information about the participants, enforcing the return of the smartwatch by participants with insufficient data is not feasible. However, we will strongly urge participants in the final interaction to return the smartwatch if they did not provide sufficient data.
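A compact sketch of these compensation rules is shown below; the function and field names are ours, and the raffle draw itself is not modeled.

```python
def compensation(days_to_complete: int, completed_all_assessments: bool,
                 in_smartwatch_group: bool) -> dict:
    """Sketch of the compensation rules described above.

    30-45 days with all assessments completed: 40 CHF base compensation.
    30-40 days: additionally raffle-eligible (non-smartwatch groups) or
    allowed to keep the smartwatch (smartwatch groups).
    More than 45 days: no financial compensation.
    """
    result = {"base_chf": 0, "raffle_eligible": False, "keeps_smartwatch": False}
    if not completed_all_assessments or days_to_complete > 45:
        return result
    result["base_chf"] = 40
    if days_to_complete <= 40:
        if in_smartwatch_group:
            result["keeps_smartwatch"] = True
        else:
            result["raffle_eligible"] = True
    return result
```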
Measurements and assessment times
Table 2 provides an overview of the measurements used at baseline (T0), the intervention’s start (T1), half-time (T2), the end of the intervention (T3), and during each daily interaction (Daily). We collect demographic data (age, gender, type of student, highest education level, occupation, or field of study) through open and multiple-choice questions at the start of the study.
Voice and breathing sounds
We developed speech commands to capture commonly occurring and easily measured speech interactions. The commands are designed to be realistic in a setting outside of a study, meaning their length is similar to that of commands given to a voice assistant, while also fitting the general story and setting of BEDDA. We implemented the voice commands in Breeze. To start and finish Breeze, the participants are visually instructed to say the three simple sentences described above before and after the training. Besides these voice interactions, we also derive the breathing sounds from Breeze. As the participant follows a guided breathing training, each session with Breeze provides a set of inhalation, exhalation, and pause sounds.
Audio features
From the different voice commands, we will extract different features reported in related work [16, 22]. We will extract vocal fold features (i.e., source features, e.g., jitter [%], shimmer [%], tremor [Hz]), vocal tract filter features (e.g., F1 mean [Hz], F2 mean [Hz]), and prosodic features (i.e., melodic features, e.g., F0 mean, F0 variability, intensity [dB]). We will approach the extraction of acoustic breathing features in a more exploratory way, since less related work exists on breathing-based biomarkers of depression. For the analysis of breathing sounds, we plan to examine the spectrograms, mel-frequency cepstral coefficients (MFCCs), and gammatone cepstral coefficients (GTCCs) of the recorded breathing sounds, as used in previous work [91, 92].
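The following sketch illustrates how some of the prosodic and spectral features could be extracted with the open-source librosa library. It is not the study's actual pipeline, and the source features (jitter, shimmer, tremor) would typically require dedicated tools such as Praat or openSMILE rather than librosa.

```python
import librosa
import numpy as np

def extract_basic_features(path: str) -> dict:
    """Illustrative prosodic/spectral feature extraction for one recording."""
    y, sr = librosa.load(path, sr=None)          # keep the native sampling rate

    # Fundamental frequency (F0) via probabilistic YIN.
    f0, voiced_flag, _ = librosa.pyin(
        y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr
    )
    f0_voiced = f0[voiced_flag]

    # Intensity proxy via root-mean-square energy, converted to decibels.
    rms = librosa.feature.rms(y=y)[0]
    intensity_db = librosa.amplitude_to_db(rms, ref=np.max)

    # Mel-frequency cepstral coefficients, e.g., for breathing-sound analysis.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)

    return {
        "f0_mean_hz": float(np.nanmean(f0_voiced)) if f0_voiced.size else float("nan"),
        "f0_sd_hz": float(np.nanstd(f0_voiced)) if f0_voiced.size else float("nan"),
        "intensity_mean_db": float(np.mean(intensity_db)),
        "mfcc_mean": mfcc.mean(axis=1),          # one value per coefficient
    }
```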
Proximal and distal outcomes
Following our conceptual model, we measure proximal and distal outcomes. Proximal outcomes are the changes in symptoms and physiological data (e.g., HRV, physical activity, sleep) due to the daily interactions, while distal outcomes are the changes in symptom severity over the course of the intervention. To investigate efficacy, we use both proximal outcomes (daily reported symptoms) and distal outcomes (changes in symptom severity).
For the daily measurements (i.e., proximal outcomes) before and after interacting with Breeze, we use the Multidimensional Mood State Questionnaire (MDMQ) [88, 89], which has the three dimensions mood, agitation, and anhedonia, each available in two versions (Version A and B). At each daily interaction, we assess one of the dimensions using its two versions (one before and one after interacting with the main component Breeze). For example, on a given day, the randomly chosen dimension is mood. Another draw then determines that the questions from Version A are used to measure mood before Breeze; in turn, the questions from Version B are used to measure mood after Breeze. To ensure an equal distribution of the daily drawn dimensions and versions, we use a block design consisting of the six question sets (MoodA, MoodB, AgitationA, AgitationB, AnhedoniaA, AnhedoniaB) and draw from this set without replacement until each has been drawn once, before starting a new block for the following six days.
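A minimal sketch of this six-day block design is shown below. Each drawn set determines the dimension of the day and which of its two versions is presented before Breeze (the remaining version follows after Breeze); the shuffling logic and names are illustrative.

```python
import random

QUESTION_SETS = ["MoodA", "MoodB", "AgitationA", "AgitationB",
                 "AnhedoniaA", "AnhedoniaB"]

def draw_daily_sets(n_days, rng=None):
    """Draw one question set per day using six-day blocks.

    Within every block of six days, each set is drawn exactly once in
    random order before a new block starts. The drawn set determines the
    dimension of the day and which version is presented before Breeze.
    """
    rng = rng or random.Random()
    schedule = []
    while len(schedule) < n_days:
        block = QUESTION_SETS.copy()
        rng.shuffle(block)
        schedule.extend(block)
    return schedule[:n_days]

# Example: a reproducible 30-day schedule.
print(draw_daily_sets(30, random.Random(42)))
```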
Our primary distal outcome is depression symptom severity, which we measure using the PHQ-9. The PHQ-9 is a subset of the Patient Health Questionnaire [44] and focuses on major depressive disorder. It is used to measure the severity of symptoms of depression in general medical and mental health settings [44]. Our secondary distal outcomes are anxiety (measured using the GAD-7) and stress (measured using the TICS). The Generalized Anxiety Disorder Questionnaire (GAD-7) is a clinically validated instrument to screen for symptom severity of the four most common anxiety disorders (Generalized Anxiety Disorder, Panic Disorder, Social Phobia, and Post-Traumatic Stress Disorder) [87]. The short version of the Trier Inventory for Chronic Stress (TICS) measures all nine domains of the systemic-requirement-resource model of health [93] of an individual. It was developed from the original long version of the Trier Inventory for Chronic Stress [94]. All distal outcomes are measured at T0, T1, T2, and T3 to determine symptom severity and to assess the efficacy of BEDDA.
Other measurements
To measure the therapeutic alliance between BEDDA and the participants, we will use the Working Alliance Inventory (WAI) [64]. We will also ask further qualitative open-answer questions regarding BEDDA and Breeze at T3. Furthermore, we will ask where the participant is conducting the breathing training before starting Breeze and how accurate the breathing detection was on that day after Breeze (daily). During the interaction with Breeze, we also collect additional sensor data from the following sensors if they are present on the device: accelerometer, gyroscope, magnetometer, light sensor, ambient temperature sensor, humidity sensor, pressure sensor, step counter (since app start), and proximity sensor. These sensors provide more information about the surroundings during the breathing training and help determine whether the person is doing the exercise correctly (e.g., via steps taken during the training).
Smartwatches
Participants in the smartwatch groups (intervention and waitlist) will receive a smartwatch (Garmin Vívosmart 4 Smartwatch, Garmin International Inc., 1200 East 151st Street, Olathe, KS 66062, USA) at the start of the study. The measurements recorded by the smartwatch are heart rate and heart rate variability (via inter-beat intervals), oxygen saturation, respiration rate, motion-based activity (3-axis accelerometer), stress level, sleep, skin temperature, and steps.
Estimation of sample size
Sample size calculations for machine learning approaches, such as those required for objectives one and two, are less rigorously described in the literature than sample size estimations for investigations of efficacy. Therefore, we calculated the sample size needed for objective three and used this number of participants to estimate the expected correlation between features and outcomes for objectives one and two.
The third objective of our study is to investigate the efficacy of BEDDA. We operationalized this with three assessment times (at the start of the intervention, after 15 interactions, and after 30 interactions) in two groups (intervention and waitlist control). Assuming a small effect (Cohen's d = 0.225), a power of 0.8, and an alpha level of 0.05, we calculated that we need at least 194 participants to detect an existing effect of BEDDA compared to the control condition. Regarding our first and second objectives, we calculated that when 194 participants provide data, assuming an alpha level of 0.05 and a power of 0.8, a Pearson product-moment correlation coefficient of 0.23 would be necessary for a significant result. Low et al. [22] reported a median number of 123 participants (range 11 to 1688) for studies investigating voice changes in depression. With the estimated 194, we are above this median but within a reasonable range.
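For transparency, the kind of correlation power calculation mentioned above can be approximated with Fisher's z-transformation, as in the sketch below. This is a generic textbook approximation under a two-sided test, not the power analysis actually used for the protocol, so its output may differ somewhat from the reported value of 0.23 depending on the software and design assumptions used.

```python
from math import sqrt, tanh
from scipy.stats import norm

def min_detectable_r(n: int, alpha: float = 0.05, power: float = 0.8) -> float:
    """Approximate the smallest Pearson correlation detectable with the
    given power at a two-sided significance level alpha, using the
    Fisher z-transformation: atanh(r) * sqrt(n - 3) = z_{1-alpha/2} + z_{power}.
    """
    z_needed = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return tanh(z_needed / sqrt(n - 3))
```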
Related work collecting data for the development of DBMs has reported a mean adherence rate of 86.6% [40]. Due to the extended assessment period, we assume a greater attrition rate. However, we also consider specific elements of our study (monetary incentives, storytelling, and gamification) that may lead to increased adherence. Considering these factors, we expect a dropout rate of approximately 20%, slightly greater than the dropout rate reported in related work [40]. Assuming this dropout rate, we aim to initially enroll 220 participants.