BUNKA ("culture" in Japanese) is a software architecture project focusing on a new-generation search engine based on recent research in cognitive science (collective intelligence and cognitive ergonomics) and computational sciences (natural language processing and machine learning). The BUNKA project is a laureate of the CNRS pre-maturation program and of the "AI and Human and Social Sciences" program of the PRAIRIE Institute.
This project is led by the "Computational Cultural Sciences" team of the Institut Jean Nicod at ENS-PSL, and more specifically by Charles de Dampierre (PhD student in computational social sciences and research engineer at the Médialab of Sciences Po), Nicolas Baumard (CNRS research director, specialist in cognitive social sciences), and Andrei Mogoutov (physicist affiliated with the Médialab of Sciences Po, specialist in heterogeneous data analysis).
BUNKA is a direct result of the research of the Computational Cultural Sciences team. The team's work aims to map and quantify the distribution of cultural representations (beliefs, opinions, preferences, etc.), what Dan Sperber has called the "epidemiology of representations", using natural language processing and machine learning tools.
These mappings, first produced for social science research, can in fact be of great help to everyone. Collective intelligence, that is, what others think about an event or a piece of cultural content, is crucial in information retrieval.
The advent of the participatory Web, or Web 2.0 (Twitter, Reddit, Wikipedia, etc.), has dramatically increased the amount of collective intelligence available through the reviews and opinions posted online. However, this collective intelligence is still under-exploited, notably for technical reasons: the volume of data is massive, and the data is extremely heterogeneous.
BUNKA is developing a new way of organizing collective intelligence using recent advances in computational sciences (natural language processing and machine learning). More precisely, we use "embedding" algorithms to project data into an abstract space according to rules (semantic similarity, topological similarity, common dimension, etc.), so that items that are related under the chosen rule end up close to one another.
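To make the idea of projecting data into a space concrete, here is a minimal sketch, not BUNKA's actual pipeline. It assumes a deliberately simple stand-in for a learned embedding: each text becomes a word-count vector, and semantic similarity is approximated by cosine similarity between vectors. The function names (`embed`, `cosine_similarity`, `nearest`) and the example documents are hypothetical and chosen for illustration only.

```python
import math
import re
from collections import Counter


def embed(text):
    """Map a text to a sparse word-count vector (a stand-in for a learned embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest(query, documents):
    """Rank documents by similarity to the query, most similar first."""
    q = embed(query)
    scored = [(cosine_similarity(q, embed(d)), d) for d in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical example documents, invented for illustration.
documents = [
    "a slow and moving film about memory and family",
    "a fast action film with car chases",
    "a cookbook of simple vegetarian recipes",
]
ranking = nearest("film about family memory", documents)
for score, doc in ranking:
    print(f"{score:.2f}  {doc}")
```

A real system would likely replace `embed` with a trained language model so that texts with different wording but similar meaning also land close together; the ranking logic around it would stay much the same.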
The second key idea of BUNKA is that visualization is essential for discoverability. To improve autonomy, diversity, transparency, and serendipity, the user needs to see all the available options and their relationships to each other. In other words, for each query and each piece of content, we need a map of the web. We aggregate the reviews, critiques, and opinions posted on the Web by users, and we create maps and contextual representations that, for each query about a work or a theme, show all the associated cultural contents, as well as the opinions, popularity, and relationships between these contents. With all the options visible on the map, in particular the minority and less visible ones, users can free themselves from the search algorithms and explore for themselves the themes and dimensions that interest them.
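The aggregation step described above can be sketched as follows. This is an illustrative assumption about one possible data model, not BUNKA's implementation: reviews are hypothetical `(user, content, rating)` tuples, each content becomes a node carrying its popularity (number of reviews) and mean rating, and two contents are linked with a weight equal to the number of users who reviewed both.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical reviews: (user, content, rating from 1 to 5).
reviews = [
    ("ana", "Film A", 5),
    ("ana", "Film B", 4),
    ("ben", "Film A", 3),
    ("ben", "Film C", 2),
    ("cleo", "Film A", 4),
    ("cleo", "Film B", 5),
]


def build_map(reviews):
    """Aggregate reviews into nodes (popularity, mean rating) and weighted edges."""
    ratings = defaultdict(list)
    by_user = defaultdict(set)
    for user, content, rating in reviews:
        ratings[content].append(rating)
        by_user[user].add(content)

    # One node per content, with its review count and average opinion.
    nodes = {
        content: {"popularity": len(r), "mean_rating": sum(r) / len(r)}
        for content, r in ratings.items()
    }

    # One edge per pair of contents reviewed by the same user,
    # weighted by how many users the pair has in common.
    edges = defaultdict(int)
    for contents in by_user.values():
        for a, b in combinations(sorted(contents), 2):
            edges[(a, b)] += 1
    return nodes, dict(edges)


nodes, edges = build_map(reviews)
```

Keeping every node in the result, including rarely reviewed ones such as "Film C" here, is what lets a map surface minority options instead of only the top-ranked result.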
Ultimately, BUNKA is an "exploration engine": a tool that does not return a single result, as a search engine does, but a set of information that allows users to explore for themselves, autonomously and transparently.
TO GO FURTHER