Big data in practice: sociology, data sciences and journalism
In principle, sociologists should have much to contribute to Big Data analyses of the social world. Yet the methods training of most German-speaking sociologists is inadequate to deal with Big Data. This project investigated how methods, tools and skills drawn from the fields of sociology, data science and data journalism can be combined to enhance the sociological tool kit.
Portrait / project description (completed research project)
The project focused on close analysis of the fields of sociology, data science and data journalism. In three subprojects, we examined the methods, skills, and tools of each field, with specific reference to Big Data. We also reviewed the current state of curricula and training in each area, and how that translates into career prospects. Finally, we indicated areas of overlap that might productively be used to bring science, and especially sociology, forward. The project recombined insights from the sociology of science, the sociology of professions and the emergence of organisational fields. The inquiry into the three fields also combined traditional social science methods with novel computational methods.
The rise of Big Data – data that are large, diverse, often unstructured and concern an array of phenomena – poses new challenges and opportunities for the social sciences in analysing and explaining the full gamut of social interactions. Given their training in linking theoretical concepts with empirical observations, sociologists should be primed for such analyses. But the methods they use are designed for different data and settings (e.g. representative samples, interviews) and are less suited to Big Data than those used in new fields such as data science and data journalism.
The objectives of this project were three-fold:
(1) The first objective of the project was to systematically examine three data analytic fields, i.e. sociology, data science, and data journalism, with respect to the methods, tools and skills currently used.
(2) The second objective was to advance knowledge on the structures and mechanisms involved in maintaining and changing knowledge domains.
(3) Building on the first two objectives, the third objective was to identify which methods, tools, and skills sociology needs in order to analyse Big Data and, conversely, what sociology can contribute to the other data analytic fields.
The project elucidates the educational challenges Big Data poses for sociology. It provides insights into the newly emerging professional fields of data science and data journalism, and charts disciplinary developments at a key moment for sociology. Finally, the project indicates the need to improve data literacy, open data access, and transversality of data sciences. It also raises public awareness of the uses of Big Data.
Research results for the field of sociology show that methods and their teaching are fundamental to what it means to be a sociologist in German-speaking academia. The results also show that greater emphasis on data literacy, visual literacy, and computational thinking in Bachelor-level sociology curricula (and in the social sciences and humanities more generally) would give students a better understanding of data-intensive practices in the sciences and beyond. It turns out that Big Data, whether new data stemming from social media or old data from the archives, is only rarely part of current methods training, either as data or in terms of the methods used to analyse it.
Research on the two other data analytic fields indicates approaches from which methods training in sociology could benefit. Indeed, a vast variety of data sources and data types can be relevant for sociological inquiries. Data need not be official statistics or numerical; they can also consist of new digital or old archival textual data, image data, transactional data, and sensor data from a variety of sources. Data gathering, data processing, and data preparation are necessary skills when working with unstructured digital or digitised data. Open data access is preferable. Computing with data in open-source software requires knowledge of programming packages, analytical possibilities, and computational thinking, as well as of how to build reproducible workflows. Data visualisations serve as important translations between data and insights, and creating them also requires training. Our research on data sciences also shows that a plurality of methods, tools, and skills plays an important role in enabling transversality and innovation. Thus, additional methodological plurality could also benefit sociological methods training. At the same time, data sciences and data journalism can also benefit from core elements of sociology.
In data sciences, the current curricula mostly refer to existing technical-scientific educational traditions. Sociological perspectives can contribute to a better understanding of the social, political, and economic foundations and consequences of data-intensive practices in various fields of society. After all, many types of Big Data result from social interactions and social behaviour. When modelling social data, sociological insights help to build better models and to develop better explanations of the patterns found.
Facing Big Data: Methods and skills needed for a 21st century sociology