Statistical processing

Contact info

Science, Technology and Culture, Business Statistics
David Boysen Jensen
+45 61 50 73 82


Data for these statistics are collected via questionnaires from approx. 3,500 respondents out of a population of approx. 22,000 enterprises. The material is validated while the enterprise is responding, and afterwards by computer-aided and manual validation. Imputation and calibrated weighting are also part of the data treatment.

Source data

The statistics are compiled on the basis of questionnaires collected from approx. 3,500 enterprises drawn as a sample from a population of approx. 22,000 enterprises. The data are collected as one part of a single questionnaire that also covers enterprises' research and development (R&D). The enterprises are sampled according to their number of full-time equivalents and type of activity (NACE). All enterprises with 100 or more full-time equivalents are included in the sample, and the probability of being selected decreases as the number of full-time equivalents decreases. The probability of selection is higher within types of activity that are more R&D-intensive than within activities where R&D is less frequent. The enterprises in the sample are randomly selected.
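The sampling rule described above can be sketched as follows. This is only an illustration: the size classes, the base rates, the doubling for R&D-intensive activities and the use of independent (Poisson) draws are all assumptions, since the text gives only the 100 full-time-equivalent threshold and the direction of the effects. The names `selection_probability` and `draw_sample` are hypothetical.

```python
import random


def selection_probability(fte, rd_intensive):
    """Hypothetical inclusion probability for one enterprise.

    Only the take-all rule for 100+ FTE comes from the text;
    all other rates below are assumed for illustration.
    """
    if fte >= 100:
        return 1.0  # every enterprise with 100 or more FTE is sampled
    # Assumed base rates that fall as the enterprise gets smaller.
    if fte >= 50:
        base = 0.40
    elif fte >= 10:
        base = 0.15
    else:
        base = 0.05
    # Assumed boost for R&D-intensive activity groups, capped at 1.
    return min(1.0, base * 2) if rd_intensive else base


def draw_sample(population, seed=0):
    """Draw each enterprise independently with its own probability."""
    rng = random.Random(seed)
    return [
        e for e in population
        if rng.random() < selection_probability(e["fte"], e["rd_intensive"])
    ]


population = [
    {"id": 1, "fte": 250, "rd_intensive": False},
    {"id": 2, "fte": 60, "rd_intensive": True},
    {"id": 3, "fte": 4, "rd_intensive": False},
]
sample = draw_sample(population, seed=1)
```

In this sketch the enterprise with 250 full-time equivalents is always selected, while the others are selected at random with the assumed probabilities.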

Frequency of data collection

Yearly.

Data collection

The data are collected through an electronic questionnaire at http://www.virk.dk.

Data validation

A comprehensive validation of the data is carried out. In the electronic questionnaire, validation is performed on a range of variables, e.g. on totals. If the total entered by the respondent does not match the calculated total, the respondent is notified and has the opportunity to correct the total or one or more of its components. The same applies if figures in the questionnaire must sum to 100 per cent and do not. If the levels of some of the key data entered by the respondent are much higher or lower than in the previous year, the respondent is notified and has the opportunity to correct them if necessary. This applies, for example, to R&D full-time equivalents and R&D expenses.

After data collection, the data are validated mechanically and to some extent corrected. The ICT programs that check the data for errors also produce lists of likely or de facto errors. The types of errors identified as having the greatest influence on the quality of the statistics are listed together with the identification numbers of the respondents, and this list is checked manually. Finally, outlier tests are carried out for key variables and combinations of these.

A minor part of the data collected is compared to other sources in order to assess whether the response is likely to be correct or should be corrected. This applies, for example, to the number of R&D full-time equivalents, which is compared to the total number of full-time equivalents in the enterprise as recorded in the Central Business Register. The total expenditure for innovation, including expenses for own R&D, is compared to the total turnover of the enterprise, which also comes from the Central Business Register. Public accounts from the enterprises are also used as a supplementary source of information.
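The three checks made while the respondent fills in the questionnaire (totals, percentages summing to 100, and large changes from the previous year) could look roughly like the sketch below. The function names, the rounding tolerance and the factor-of-three threshold are assumptions made for illustration; the text does not state the actual limits used.

```python
def check_total(components, reported_total, tol=0.5):
    """Return True when the reported total matches the sum of its components."""
    return abs(sum(components) - reported_total) <= tol


def check_percentages(shares, tol=0.5):
    """Return True when the shares add up to 100 per cent (within tol)."""
    return abs(sum(shares) - 100.0) <= tol


def check_year_on_year(current, previous, max_ratio=3.0):
    """Return True when the change from last year looks plausible.

    max_ratio is an assumed threshold: values more than three times
    higher or lower than the previous year are flagged.
    """
    if current == 0 and previous == 0:
        return True
    if current == 0 or previous == 0:
        return False  # appearing or vanishing activity is flagged for review
    ratio = current / previous
    return 1.0 / max_ratio <= ratio <= max_ratio


# Example: a response whose components do not add up to the stated total.
warnings = []
if not check_total([120, 80, 40], reported_total=250):
    warnings.append("Total does not match the sum of components")
if not check_year_on_year(current=45, previous=10):
    warnings.append("R&D full-time equivalents differ greatly from last year")
```

A failed check only produces a warning here, mirroring the description above in which the respondent is notified and may correct the figures.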

Data compilation

The final, corrected data material is compared to the original sample. Enterprises with more than 10 full-time employees that have not responded to the questionnaire have their response imputed, either from the data the enterprise reported in the previous year or via cold-deck imputation. Calibrated weighting is also carried out.
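A minimal sketch of these two steps is given below. It assumes a single numeric variable per enterprise, a cold deck keyed by stratum, and post-stratification to known population counts as the simplest form of calibration. None of these details come from the text, which does not specify the cold-deck source, the calibration method or the auxiliary variables. All function and field names are hypothetical.

```python
def impute_nonrespondents(sample, responses, previous_year, cold_deck):
    """Return one value per enterprise, imputing for larger non-respondents."""
    completed = {}
    for ent in sample:
        eid = ent["id"]
        if eid in responses:
            completed[eid] = {"value": responses[eid], "source": "reported"}
        elif ent["employees"] > 10:
            if eid in previous_year:
                # First choice: reuse the enterprise's own value from last year.
                completed[eid] = {"value": previous_year[eid], "source": "previous_year"}
            else:
                # Fallback: cold-deck value from an external set (assumed per stratum).
                completed[eid] = {"value": cold_deck[ent["stratum"]], "source": "cold_deck"}
        # Smaller non-respondents are not imputed in this sketch.
    return completed


def calibrate_weights(design_weights, strata, known_totals):
    """Scale weights so that each stratum sums to its known population count."""
    sums = {}
    for eid, w in design_weights.items():
        sums[strata[eid]] = sums.get(strata[eid], 0.0) + w
    return {
        eid: w * known_totals[strata[eid]] / sums[strata[eid]]
        for eid, w in design_weights.items()
    }


sample = [
    {"id": 1, "employees": 50, "stratum": "a"},
    {"id": 2, "employees": 30, "stratum": "a"},
    {"id": 3, "employees": 20, "stratum": "b"},
    {"id": 4, "employees": 5, "stratum": "b"},
]
completed = impute_nonrespondents(
    sample,
    responses={1: 12.0},
    previous_year={2: 7.0},
    cold_deck={"a": 3.0, "b": 1.5},
)
weights = calibrate_weights(
    {1: 2.0, 2: 2.0, 3: 5.0},
    strata={1: "a", 2: "a", 3: "b"},
    known_totals={"a": 10, "b": 5},
)
```

Recording the source of each value keeps imputed responses distinguishable from reported ones in later quality checks.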

Adjustment

Not relevant for these statistics.
