Study design and procedure
In this registry-based study, we analysed data from three national registries: the SSIA, which comprises data on employment status and type, sick leave, and sickness benefits; the Swedish National Board of Health and Welfare registry, which contains data on causes of death during the study period and inpatient care; and Statistics Sweden, which contains sociodemographic data of all people registered in Sweden. A unique Swedish personal identification number was used to pool the data. The data files were pseudonymised and contained serial numbers. The code key for the serial numbers was maintained by the registry holders.
Residents of Sweden aged ≥ 18 years who received sickness benefits due to COVID-19 (defined according to the ICD codes, including virus identified [U07.1] or virus not identified [U07.2]) were included in this study. Sick leave was required to have commenced between 1 March and 31 August 2020, with a follow-up period of 4 months. Individuals who required inpatient care or died during the study period were excluded.
Definitions of major concepts
The COVID-19 diagnosis included individuals with and without confirmation of SARS-CoV-2 infection because mass testing had not been implemented in Sweden during the first wave of the pandemic. If SARS-CoV-2 was identified and the virus was confirmed by laboratory tests, the ICD code U07.1 was assigned regardless of the severity of the clinical signs or symptoms. If SARS-CoV-2 was not identified (ICD code U07.2), COVID-19 was diagnosed clinically or epidemiologically in the absence of laboratory tests.
Sick leave was defined as receiving sickness benefits, regardless of the amount. Sickness benefits can be granted by the SSIA to anyone who has been working in Sweden (both employed and self-employed), and during parental leave, active studies, and registered unemployment. The employer provides sick pay for the first 2 weeks of absence from work due to sickness. From the 15th day onwards, the SSIA pays the sickness benefits. For the purpose of this study, if individuals received sick pay for 2 weeks followed by sickness benefits from the SSIA, the sick pay period was included in the total sick leave period. For unemployed individuals, sickness benefits are provided from the start of the sick leave.
The sick leave period due to COVID-19 was counted as the number of days with sickness benefits, and each individual had to have at least one registered COVID-19 diagnosis. Other predefined related diagnoses were merged with COVID-19 sick leave if the gap in non-registration between the sick leave periods was ≤ 2 weeks. The related diagnoses were unspecified virus infections, fever, or a second sick leave registration because of a COVID-19 diagnosis. The gap of 14 days was chosen for several reasons. For example, the physician might not have had time to provide the patient with a sick leave certificate, or the person might have tried to go to work but found that they were unable to work and needed more time for recovery.
If sick pay was received in the first 14 days, it was included in the sick leave period. Two sick leave periods were merged if they were separated by a gap of ≤ 14 days. The sick leave lasted for a minimum of 1 day and a maximum of 122 days (corresponding to a 4-month follow-up period after the initial diagnosis). Moreover, individuals were considered to have long COVID if the length of sick leave was ≥ 84 days.
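The merging and capping rules described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure used in the study (the data were processed in SPSS). The function names, the representation of a period as a (start, end) date pair, the definition of the gap as the number of unregistered days between two periods, and the choice to count the merged period from its first to its last calendar day are all assumptions made for the example.

```python
from datetime import date

MAX_GAP_DAYS = 14      # periods separated by <= 14 days are merged
FOLLOW_UP_DAYS = 122   # 4-month follow-up cap
LONG_COVID_DAYS = 84   # threshold used to define long COVID


def merge_periods(periods):
    """Merge (start, end) date pairs whose gap is <= MAX_GAP_DAYS.

    Assumption: both dates are inclusive, so the gap is the number of
    unregistered days strictly between two periods.
    """
    merged = []
    for start, end in sorted(periods):
        if merged:
            prev_start, prev_end = merged[-1]
            gap = (start - prev_end).days - 1
            if gap <= MAX_GAP_DAYS:
                merged[-1] = (prev_start, max(prev_end, end))
                continue
        merged.append((start, end))
    return merged


def sick_leave_days(periods):
    """Length in days of the first merged period, capped at the follow-up.

    Assumption: the length is the calendar span of the merged period,
    including any bridged gap days.
    """
    first_start, last_end = merge_periods(periods)[0]
    days = (last_end - first_start).days + 1
    return min(days, FOLLOW_UP_DAYS)


def is_long_covid(days):
    """Apply the >= 84-day threshold."""
    return days >= LONG_COVID_DAYS


# Hypothetical example: two registrations separated by 10 unregistered days.
example = [
    (date(2020, 3, 1), date(2020, 3, 14)),
    (date(2020, 3, 25), date(2020, 4, 30)),
]
print(sick_leave_days(example), is_long_covid(sick_leave_days(example)))
```

In this hypothetical example the two registrations are bridged into one period because the gap does not exceed 14 days. A second registration starting more than 14 days after the first would be left as a separate period.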
Sick leave before COVID-19 was defined as being on sick leave for at least one 28-day period between 1 March 2019 and the date of the first COVID-19 sick leave registration. Five groups were created to make the results interpretable: group 1 included diagnostic codes corresponding to mental, behavioural, and neurodevelopmental disorders (ICD F codes); group 2 included diagnostic codes corresponding to musculoskeletal system and connective tissue diseases (ICD M codes); group 3 included diagnostic codes corresponding to respiratory system diseases (ICD J codes); group 4 included all other isolated diagnostic codes (except for ICD F, M, and J codes); and group 5 included individuals who were on sick leave during the year before the COVID-19 diagnosis for multiple diagnosis codes. We also included a reference group (group 0), which comprised individuals without sick leave during the year before their COVID-19 diagnosis. All sick leave diagnoses were included in the analysis regardless of the sick leave duration.
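The grouping can be illustrated with a short Python sketch. It is not the study's code. The function name and the input format (a list of ICD-10 codes from a person's sick leave registrations in the year before COVID-19) are hypothetical, and treating more than one distinct code as "multiple diagnosis codes" (group 5) is an assumption, since the text does not specify how repeated registrations within the same ICD chapter were handled.

```python
# Map the first letter of an ICD-10 code to the study group.
GROUP_BY_CHAPTER_LETTER = {
    "F": 1,  # mental, behavioural, and neurodevelopmental disorders
    "M": 2,  # musculoskeletal system and connective tissue diseases
    "J": 3,  # respiratory system diseases
}


def prior_sick_leave_group(icd_codes):
    """Return the group (0-5) for a person's prior sick leave diagnoses.

    Assumption: more than one distinct code means group 5.
    """
    codes = {code.strip().upper() for code in icd_codes if code.strip()}
    if not codes:
        return 0  # reference group: no prior sick leave
    if len(codes) > 1:
        return 5  # multiple diagnosis codes
    (code,) = codes
    # Any single code outside F, M, and J falls into group 4.
    return GROUP_BY_CHAPTER_LETTER.get(code[0], 4)


# Hypothetical inputs, one list per person.
for person in ([], ["F32"], ["M54"], ["J45"], ["I10"], ["F32", "M54"]):
    print(person, prior_sick_leave_group(person))
```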
Employment status was defined as employed (including temporary parental leave and combined self-employment and employment), self-employed, and unemployed (including students). The income variable was based on the disposable income per person in 2019 and was counted in thousands of Swedish Krona (average exchange rates during 2020 according to the Swedish Riksbank: 1 EUR = 10.5 SEK, 1 GBP = 11.8 SEK). Education level was divided into four strata. The country of birth was categorised into five groups. Marital status was classified as married, single, divorced, and widow/widower. Variables on children living at home and children aged ≤ 18 years were included in the analysis. Detailed information on the variables used in the study, their codes, and their roles are provided in Supplementary Table S1.
The data are presented as means and standard deviations (± SDs), medians and interquartile ranges (IQRs), and numbers with percentages (n, %). Non-parametric statistical tests were performed because many continuous variables had a skewed distribution. The chi-square test, Mann–Whitney U test, and Kruskal–Wallis test were used to compare the included and excluded individuals, and for intergroup comparisons based on COVID-19 diagnostic modalities. Spearman’s rank order correlation test (rs) was used to explore the correlation between possible explanatory variables (Supplementary Table S1). The strength of the correlation was interpreted as weak (< ± 0.39), moderate (± 0.40 to ± 0.69), or strong (≥ ± 0.70).
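The interpretation bands for the correlation coefficient can be expressed as a small helper. This is a hedged sketch: the function name is hypothetical, and placing the boundary between weak and moderate at an absolute value of 0.40 is an assumption, because the stated bands leave values between 0.39 and 0.40 unassigned.

```python
def correlation_strength(rs):
    """Classify a Spearman coefficient by its absolute value."""
    if not -1.0 <= rs <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    magnitude = abs(rs)
    if magnitude >= 0.70:
        return "strong"
    if magnitude >= 0.40:  # assumed boundary between weak and moderate
        return "moderate"
    return "weak"


# Hypothetical coefficients; the sign does not affect the label.
for value in (0.10, -0.45, 0.70, -0.85):
    print(value, correlation_strength(value))
```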
Regression analyses were performed to determine how a previous diagnosis could predict the duration of sick leave during the first 4 months of the COVID-19 diagnosis.
Selection of the study variables
The primary independent variable in this study was a diagnosis necessitating sick leave in the year before the COVID-19 diagnosis. The study population was stratified into five groups according to prior diagnosis, and an additional group without any prior sick leave was used as the reference group. The outcome variable was the duration of sick leave during the 4 months following the COVID-19 diagnosis. Based on variable availability in the dataset and clinical reasoning, eight variables were identified, of which six were selected using a directed acyclic graph (DAG), which facilitates the presentation of a parsimonious model (Supplementary Figure S1). Age, sex, marital status, education level, employment status, and sick leave length ≥ 28 days in the year before COVID-19 diagnosis were identified as important variables according to the DAG.
Choosing the regression model
The outcome was a numerical variable (range, 1–122 days); the mean and variance values were compared to select the appropriate regression model. The mean was lower than the variance (43.1 vs. 657.7), indicating overdispersion of the outcome variable. Furthermore, the outcome variable did not contain ‘0’ counts. Therefore, a negative binomial regression analysis with a log link function was used.
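The comparison of mean and variance can be made concrete with a variance-to-mean ratio, which equals 1 under a Poisson distribution and exceeds 1 under overdispersion. The sketch below is illustrative and is not the study's code: the function name is hypothetical, and the short list of day counts is invented purely to show the call. Only the values 43.1 and 657.7 come from the text.

```python
from statistics import mean, variance


def dispersion_ratio(values):
    """Sample variance divided by the sample mean.

    A ratio near 1 is compatible with a Poisson model; a ratio well
    above 1 points to overdispersion.
    """
    return variance(values) / mean(values)


# Ratio implied by the summary statistics reported in the text.
reported_mean, reported_variance = 43.1, 657.7
print(round(reported_variance / reported_mean, 1))

# Invented day counts, used only to demonstrate the function.
hypothetical_days = [1, 1, 1, 50, 122]
print(dispersion_ratio(hypothetical_days) > 1)
```

With the reported values, the variance is roughly 15 times the mean, which is why a Poisson model would be a poor fit for this outcome.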
Fitting and evaluation of the regression models
Negative binomial regression analyses were performed to identify the predictors of the length of sick leave related to COVID-19. All explanatory variables identified by the DAG were entered into the model. The results were evaluated as follows. At the variable level, we reported regression coefficients and standard errors for the coefficients (β [SE]), p-values, and 95% confidence intervals (CIs). The models were evaluated using the Akaike information criterion (AIC), the omnibus test (p < 0.05 indicates that the fitted model improves on the intercept-only model), and the log-likelihood ratio test. The analyses were performed on all study participants and on subgroups stratified according to COVID-19 diagnoses (with and without SARS-CoV-2 detection).
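Because the model uses a log link, a coefficient β can be read on the multiplicative scale: exp(β) is the ratio of expected sick leave days for a one-unit change in the variable (or for a category relative to its reference). The sketch below shows this conversion together with a Wald-type 95% CI. It is a minimal illustration, not the study's code: the function name is hypothetical, the β and SE values are invented, and the use of a Wald interval is an assumption about how the CIs were obtained.

```python
from math import exp
from statistics import NormalDist


def rate_ratio(beta, se, level=0.95):
    """Return (exp(beta), lower, upper) using a Wald-type interval.

    Assumption: the interval is beta +/- z * se on the log scale,
    exponentiated to the ratio scale.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    return exp(beta), exp(beta - z * se), exp(beta + z * se)


# Invented coefficient and standard error, for illustration only.
ratio, lower, upper = rate_ratio(beta=0.25, se=0.05)
print(f"ratio = {ratio:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

A ratio above 1 with a CI that excludes 1 would indicate a longer expected sick leave than in the reference group, and a ratio below 1 a shorter one.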
Data were processed using SPSS software (version 28.0; IBM Corp., Armonk, NY, USA). The significance level for all statistical tests was set at an alpha of 5%.