Data science has emerged as a new research paradigm
"Data science" is about how to process, analyse and extract knowledge from very large quantities of data, "big data". The area is growing with the speed of the data sets themselves and a couple of years ago the IT Faculty made the decision to start a master's programme in data science.
How do we use all the data generated in our society? What are the opportunities and risks from a social perspective? What can the data contribute to research? How can we combine data sets to derive greater value? And how do we avoid misinterpreting the data?
Data science, or large-scale data processing, has emerged as a result of wider access to growing amounts of complex data. These data sets open up new opportunities in diverse fields, from genome mapping to business analysis and the prediction of climate scenarios. Data science is also used in many areas to support decision-making, where patterns identified in existing data sets form the basis for forecasts.
Data science affects every area where large volumes of data are generated
Data science affects all areas where large amounts of data are generated – and almost every area generates data today. Public transport, Internet searches, medical records, access cards, security cameras, intrusion detection, EAN codes, social insurance statistics, GPS systems, call statistics, financial operations, environmental stations, recording equipment, embedded computers in consumer electronics and in our cars, incident reporting, motion detectors… The list is endless.
What is distinctive about data science is that technology largely drives its development. Three things set the pace: first, the amount of data generated; second, the ability to store that data; and finally, the availability of computer programs that make analysis possible.
Complex combination of technology, interdisciplinary work and analysis
Developments in data science place great demands on computer scientists and analysts, since the field lies at the intersection of statistics, artificial intelligence and database management. To get something useful out of the huge amounts of data, the right questions must be asked and very well-defined data sets must be combined in a measurable way. The people involved must also have strong analytical skills to interpret the results and to understand thoroughly which variables influence which others.
Data science requires good knowledge of the field being explored, whether it concerns biological data, web statistics or data generated by the financial market. This calls for interdisciplinary work: the biologist or stockbroker needs insight into the conditions for analysing data sets and, conversely, computer scientists need to understand the conditions and relationships in the field under study.
New opportunities for research – and also new demands for new research
Data science also means new opportunities for research, since research material that was previously inaccessible can now be extracted from the data sets. This applies both to more established research, such as the mapping of DNA, and to new research areas that have arisen because the data sets make them possible.
Another aspect is that the field itself now needs to be researched: how should huge amounts of data be handled?
Text: Catharina Jerkbrant