High performance computing
If you need high computing power for data analysis, modelling or machine learning, High Performance Computing (HPC) will give you access to an analysis platform that can handle complex calculations quickly and efficiently.
HPC is a solution where your project is executed on a supercomputer in an HPC center. Unlike traditional servers, HPC allows you to scale the computing power up or down according to your needs.
Advantages of HPC:
- High-performance: Calculate and analyse large data volumes much faster than on regular servers.
- Adaptability: Scale your capacity according to the project requirements.
- Storage: HPC can store large volumes of data and quickly retrieve them from memory.
Not all projects require HPC. If you are working with small data volumes that can be processed efficiently on a traditional server, a hosted server may be a more cost-efficient solution. If your analyses require extensive data processing, complex simulations or machine learning models, HPC will serve you well. To be able to use HPC for projects with data from Statistics Denmark, the HPC center in question must have entered into an agreement with Statistics Denmark.
Multiple HPC solutions – which one suits your needs?
Statistics Denmark offers three different solutions, which are outlined below. Read more, so you can choose the best solution.
API Solution
This solution is available to anyone who has an approved project in Denmark’s Data Portal (DDP) and an agreement with one of the HPC centers under the solution. You can find an updated list of the HPC centers included further down this page.
With this solution, the user can decide whether data should be stored both on Research Services servers (or a hosted server) and at the HPC center, or only at the HPC center. It is also possible to have some users associated with the project only in DDP, while others are linked to the project both in DDP and at the HPC center. This solution also allows you to connect HPC resources for all or part of the project’s lifetime.
NGC Solution
This solution is available for health-related projects only, as this is a requirement from NGC. The project must also be located on NGC for the entire duration of the project. An advantage of this solution is that you can make use of SSPE (see below).
Shared Secure Processing Environment (SSPE) Solution
This solution is developed for projects that have their own additional data that needs to be linked to the project and where the data either:
- is very large and therefore difficult and costly to send to Research Services for de-identification, or
- for legal reasons, cannot be transferred to Research Services servers.
This solution means that Research Services can process the data directly on NGC’s servers. Read more about this further down the page.
It is now possible to link a project in Denmark’s Data Portal to an HPC center for a short or long period. Below, you can find answers to questions regarding the solution.
Which HPC centers are included?
Statistics Denmark has entered into agreements with the following HPC centers:
- Computerome (DTU) Read more at Computerome
- GenomeDK (AU) Read more at GenomeDK
We expect more centers will be added to the solution, so the list may change over time.
What does it cost to set up?
For the setup, Research Services invoices five hours according to the current hourly rate. In addition, data extraction, consultation, disk usage, processing of additional data etc. are charged according to our normal rates. You can see our current hourly rates here.
The five hours do not cover the HPC center's costs for setting up the project in their environment or their operating costs. If you have questions about the prices of HPC centers, please contact them directly.
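As a rough illustration of the arithmetic above, the following minimal Python sketch computes the setup fee as five hours at a given hourly rate and adds any separately billed items. The function names and the example rate of 1000 DKK are placeholders chosen for illustration only, not actual prices. Costs charged by the HPC center are deliberately left out, since they are not covered by the five hours.

```python
# Minimal sketch of the setup-fee arithmetic described above.
SETUP_HOURS = 5  # stated in the text: five hours are invoiced for the setup


def setup_fee(hourly_rate_dkk):
    """Return the Research Services setup fee: five hours at the given rate."""
    if hourly_rate_dkk < 0:
        raise ValueError("hourly rate must not be negative")
    return SETUP_HOURS * hourly_rate_dkk


def research_services_total(hourly_rate_dkk, other_charges_dkk=()):
    """Add separately billed items (data extraction, disk usage, etc.) to the setup fee.

    HPC center costs are not included here; ask the center about those.
    """
    return setup_fee(hourly_rate_dkk) + sum(other_charges_dkk)


# Placeholder rate, NOT the actual rate; look up the current hourly rate before use.
EXAMPLE_RATE_DKK = 1000
print(setup_fee(EXAMPLE_RATE_DKK))  # 5000
print(research_services_total(EXAMPLE_RATE_DKK, [2500, 750]))  # 8250
```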
How does the API solution work with external data?
External data that needs to be merged with Statistics Denmark’s bank of basic data must first be sent to Research Services for de-identification. The external data will then be transferred to the project environment at the HPC center.
Which projects can use the solution?
All projects that are either on Research Services servers or a hosted server can purchase this solution, provided they have entered into an agreement with the relevant HPC center.
The project must have an approved project proposal before Research Services transfers data to the HPC area. Since this is an add-on service, you can prepare and submit your project proposal to Research Services before the agreement with the HPC center is finalized. The data you are provided with will only be extracted and transferred to the HPC center once the project proposal is approved and Research Services has received a copy of the written agreement between the institution and the HPC center. Contact the HPC center directly if you are interested in this type of agreement.
Note: Projects located at the National Genome Center (NGC) cannot use this solution.
How to get started?
First, you must have an agreement with one or more HPC centers. This agreement is valid for the entire institution. Once the agreement with the HPC center is in place, you can prepare a list of which users on which projects should have access to which HPC center.
Send the list, along with a copy of the agreement with the HPC center, to your contact person at Research Services, who will ensure that access is granted to individual users under the relevant projects.
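The access list described above could be prepared as a simple table with one row per user, project and HPC center. The sketch below is only one possible way to do this in Python. The column names, user IDs and project numbers are invented for illustration, and Research Services may expect a different format, so check with your contact person first.

```python
import csv
import io

# HPC centers named on this page as having an agreement with Statistics Denmark.
KNOWN_CENTERS = {"Computerome", "GenomeDK"}

# Hypothetical entries: user IDs and project numbers are made up for illustration.
ACCESS_REQUESTS = [
    {"user": "user_a", "project": "700001", "hpc_center": "Computerome"},
    {"user": "user_a", "project": "700002", "hpc_center": "GenomeDK"},
    {"user": "user_b", "project": "700001", "hpc_center": "Computerome"},
]


def build_access_list(requests):
    """Return CSV text with one row per (user, project, HPC center)."""
    unknown = {r["hpc_center"] for r in requests} - KNOWN_CENTERS
    if unknown:
        raise ValueError(f"HPC center not on the list: {sorted(unknown)}")
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["user", "project", "hpc_center"],
        lineterminator="\n",
    )
    writer.writeheader()
    # Sort by project first so each project's users appear together.
    for row in sorted(requests, key=lambda r: (r["project"], r["user"], r["hpc_center"])):
        writer.writerow(row)
    return buffer.getvalue()


print(build_access_list(ACCESS_REQUESTS))
```

Grouping the rows by project makes it easy to see at a glance which users need access under each project.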
Do I have access to Statistics Denmark’s bank of basic data on both the research machine and in the HPC environment?
It is up to the individual project whether Statistics Denmark’s bank of basic data should be stored both on the research machine (or a hosted server) and at the HPC center, or only at the HPC center.
This means that some users on a given project can have access to both the research machine and HPC resources, while others only have access to the research machine. Please note that if you choose to have data in both environments, you will still need to pay the normal rate for disk usage on the research machine/operating costs on your hosted server.
If the project should only process data in the HPC environment, it is not necessary to store data on the research machine. In this case, you can inform your contact person at Research Services that the copy of the data on the research machine can be deleted after a copy is transferred to the HPC area.
The programs (scripts) used to extract data will be saved so they can be re-executed later if needed. In this case, a framework agreement is created to cover the hours Research Services uses to re-deliver the data.
Note: It is a requirement that the project has an active server space either on DST’s own servers or a hosted server, as this is used to retrieve files. Additionally, login to the HPC environment must be done via Denmark’s Data Portal.
If you are working with data from Statistics Denmark and you have a project with a health-related purpose, we offer an HPC solution via the Danish National Genome Center (NGC). The HPC solution uses a One-Node Architecture, where calculations are carried out on one server at a time. The HPC center is located outside Statistics Denmark, but the project will be created and controlled by Statistics Denmark in the same way as other projects. You will still manage the project via Denmark’s Data Portal.
To attach a new or existing project to NGC, you must:
- Have created a supplementary agreement to your data processing agreement.
- (Re-)propose your project for approval with Research Services.
- Make an agreement with NGC.
- Be able to engage in dialogue with technical staff about the set-up of server access to NGC.
Payment for use of the HPC center is settled directly with the center. For the use of NGC’s HPC infrastructure, you pay for installation, renting of hardware, operation and support.
Further information about conditions for login with NGC and prices for installation, renting of hardware, support and operation (pdf, in Danish)
If you wish to work with very large volumes of data on a project with a health-related purpose, Statistics Denmark now offers a new project setup through our High Performance Computing (HPC) solution at the Danish National Genome Center (NGC).
The Shared Secure Processing Environment, known as the SSPE solution, allows third-party data providers to make their data available within a closed environment which, in turn, enables users to analyze these data alongside data from Statistics Denmark. This makes it possible to support projects where the volume of the external data exceeds the storage and analysis capacity of Research Services' servers. Furthermore, it ensures that researchers bound by legal agreements not to transfer external data to third parties, including Statistics Denmark, can comply with those obligations.
The SSPE solution is an advancement of the existing NGC setup. It is not meant as a replacement for the current setup, but as an additional offer to projects requiring greater analytical capacity. The same terms and conditions apply to projects wishing to use the SSPE solution as to those using the NGC solution.
Contact and questions
- Questions regarding the HPC center, software in the HPC environment, prices for HPC resources, or setup/maintenance of the HPC environment: Contact the relevant HPC center directly.
- Questions regarding the API and SSPE solutions at Statistics Denmark: Contact your contact person at Research Services or write to forskningsservice@dst.dk.