Data Science Competence Center
Abstract: The PageRank algorithm is a widely known and well-understood approach for assigning importance scores to nodes in networks. This highly applicable approach relies on the assumption that all outgoing connections of a node have equal importance (i.e. each carries a weight inversely proportional to the node's out-degree). The talk will introduce approaches that aim to determine the strength of the connections between pairs of nodes, given that the PageRank scores (or some other relative importance scores) of the nodes in the graph are assumed to be known in advance. After these models are introduced, both quantitative and qualitative results will be presented on synthetic and real-world networks (e.g. citation and collaboration networks, the English and Hungarian Wikipedia, and networks generated from language usage).
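The equal-importance assumption can be made concrete with a minimal sketch of the forward computation, i.e. PageRank by power iteration on a graph whose edges carry explicit weights. The function name `pagerank`, the damping value 0.85, and the three-node toy graph are illustrative assumptions, not material from the talk; the talk addresses the inverse problem of recovering the weights from known scores, which this sketch does not attempt.

```python
def pagerank(weights, damping=0.85, iterations=100):
    """Power iteration on a weighted directed graph.

    weights[u][v] is the strength of edge u -> v. Each node's outgoing
    weights are normalized to sum to 1, so equal weights reproduce the
    classic 1 / out-degree rule.
    """
    nodes = sorted(set(weights) | {v for nbrs in weights.values() for v in nbrs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Teleportation term, shared equally by all nodes.
        new_rank = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            out = weights.get(u, {})
            total = sum(out.values())
            if total == 0:
                # Dangling node: spread its score uniformly.
                for v in nodes:
                    new_rank[v] += damping * rank[u] / n
            else:
                for v, w in out.items():
                    new_rank[v] += damping * rank[u] * w / total
        rank = new_rank
    return rank


# Toy graph (hypothetical): same topology, different edge strengths.
uniform = {"a": {"b": 1, "c": 1}, "b": {"c": 1}, "c": {"a": 1}}
skewed = {"a": {"b": 9, "c": 1}, "b": {"c": 1}, "c": {"a": 1}}

uniform_scores = pagerank(uniform)
skewed_scores = pagerank(skewed)
```

With the same topology, shifting weight onto the edge a -> b raises the score of b, which is why known scores carry information about connection strengths.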
Abstract: Approaching chemical problems with the tools of network science has proved very fruitful in recent years. The network approach has provided us with new ways of designing drugs, optimizing multistep chemical syntheses, or fighting terrorism. Most of these successes, however, are based on the analysis of available data. One of the big challenges chemists face is whether this data-based knowledge can be used to create instructable molecular networks that can serve as models for early (molecular) evolution or perform complex (synthetic) tasks. The talk will discuss both the data-based achievements and the recent attempts towards creating “molecular ecosystems”.
Abstract: IP-intensive industries accounted for about 33-40% of the gross domestic product in western economies. This intellectual property is mostly embodied in patents and trademarks. The largest share of expenditures in this sector goes to the wages of white-collar workers, which underlines the importance of intellectual capital and knowledge in creating and developing an IP portfolio. We investigated the flow of knowledge, a crucial resource, among organizations by analyzing patent documents from the United States. In our network the organizations are the nodes, and the mobility of researchers among them defines the edges. This graph can be considered the informal innovation network of US organizations, in which firms, universities, and governmental institutions compete for knowledge and recombine their innovation capacity through the mobility of inventors.
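As a rough illustration of how such a mobility network could be assembled, the sketch below turns (inventor, filing year, organization) records into weighted directed edges between organizations. The record layout, the names `records` and `mobility_edges`, and the toy data are all hypothetical; the abstract does not say how inventors are matched across patents or how moves are dated.

```python
from collections import defaultdict

# Hypothetical toy records: (inventor id, filing year, organization).
records = [
    ("inventor_1", 2001, "Org A"),
    ("inventor_1", 2004, "Org B"),
    ("inventor_2", 2002, "Org A"),
    ("inventor_2", 2003, "Org A"),
    ("inventor_2", 2006, "Org C"),
    ("inventor_3", 2005, "Org B"),
]


def mobility_edges(records):
    """Count inventor moves between organizations.

    An edge (src, dst) is counted whenever two consecutive patents of the
    same inventor, ordered by year, name different organizations.
    """
    by_inventor = defaultdict(list)
    for inventor, year, org in records:
        by_inventor[inventor].append((year, org))

    edges = defaultdict(int)
    for history in by_inventor.values():
        history.sort()
        for (_, src), (_, dst) in zip(history, history[1:]):
            if src != dst:
                edges[(src, dst)] += 1
    return dict(edges)


edges = mobility_edges(records)
```

Here the edge weight is simply the number of observed moves; other choices (undirected edges, time windows) would be equally defensible under the abstract's description.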
Abstract: BitTorrent communities are content-sharing systems that use peer-to-peer technology. From a mathematical point of view, these communities can be modeled by graphs with a particular structure. For example, there is a straightforward bipartite representation in which the users and the files are the vertices and the edges represent supply and demand. Even this simple representation gives rise to interesting optimization problems. Moreover, there is a richer graph representation that leads to a flow network. Hence, one can consider applying traditional flow algorithms and ask what their results mean in this particular context.
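One possible reading of the step from the bipartite representation to a flow network is sketched below. It is an assumption, not the construction from the talk: supply edges run from uploading users to files, demand edges from files to downloading users, and hypothetical per-user `upload` and `download` limits become capacities next to an added source and sink. The maximum flow, computed here with a plain Edmonds-Karp implementation, then counts how many file transfers can take place in one round.

```python
from collections import defaultdict, deque


def build_network(has, wants, upload, download):
    """Source -> uploader -> file -> downloader -> sink, with capacities."""
    cap = defaultdict(lambda: defaultdict(int))
    for user, files in has.items():
        cap["source"][("up", user)] = upload[user]
        for f in files:
            cap[("up", user)][("file", f)] = upload[user]  # supply edge
    for user, files in wants.items():
        cap[("down", user)]["sink"] = download[user]
        for f in files:
            cap[("file", f)][("down", user)] = 1  # demand edge: one copy
    return cap


def max_flow(cap, source, sink):
    """Edmonds-Karp; `cap` holds residual capacities and is mutated."""
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck along the path, then push flow through it.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck


# Toy community (hypothetical users, files, and bandwidth limits).
has = {"alice": {"f1", "f2"}, "bob": {"f2"}}
wants = {"carol": {"f1", "f2"}, "dave": {"f2"}, "bob": {"f1"}}
upload = {"alice": 2, "bob": 1}
download = {"carol": 2, "dave": 1, "bob": 1}

transfers = max_flow(build_network(has, wants, upload, download), "source", "sink")
```

In this toy instance four requests are open but the total upload capacity is three, so the upload side is the bottleneck; a user such as bob can appear both as uploader and as downloader because the two roles are separate nodes.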
Abstract: Bipartite networks appear naturally in systems ranging from social to biological. Examples include, among many others, the actors–movies network, the artists–music network, the scientists–research papers cooperation network, networks of sexual contacts, the diseases–genes network, plants–pollinators mutualistic networks, banks–firms money transfer networks, and word co-occurrence networks. Many properties of these networks are typically investigated by constructing and analyzing a projected network on one of the two node sets of the original network. When one constructs a projected network from the nodes of only one set, the heterogeneity of the original network (e.g. its heterogeneous degree distribution) makes it difficult to identify those links whose presence in the projected network cannot be explained by random co-occurrence of their neighbors in the original network. In this talk we present a method based on statistical validation to overcome this problem and show some possible applications to real bipartite systems.
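The idea of validating projected links can be sketched with one common null model, assumed here for illustration and not necessarily the one used in the talk: for two nodes of the projected set, the number of shared neighbors is compared with a hypergeometric distribution, and a link is kept only if its p-value falls below a Bonferroni-corrected threshold. The names `hypergeom_sf` and `validated_projection`, the level `alpha=0.05`, and the toy membership data are hypothetical.

```python
from itertools import combinations
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n items are drawn from N, of which K are marked."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def validated_projection(membership, alpha=0.05):
    """Keep projected links whose co-occurrence is unlikely under chance.

    membership maps each node of the projected set to the set of its
    neighbors on the other side of the bipartite network.
    """
    universe = set().union(*membership.values())
    N = len(universe)
    pairs = list(combinations(sorted(membership), 2))
    threshold = alpha / len(pairs)  # Bonferroni correction over all pairs
    links = {}
    for a, b in pairs:
        shared = len(membership[a] & membership[b])
        if shared == 0:
            continue  # no link in the plain projection either
        p = hypergeom_sf(shared, N, len(membership[a]), len(membership[b]))
        if p < threshold:
            links[(a, b)] = p
    return links


# Toy actors-movies data (hypothetical): movie ids 0..39.
membership = {
    "A": set(range(0, 8)),
    "B": set(range(0, 7)) | {20},
    "C": set(range(8, 30)),
    "D": set(range(22, 40)),
}

links = validated_projection(membership)
shared_cd = len(membership["C"] & membership["D"])
shared_ab = len(membership["A"] & membership["B"])
```

In this toy data C and D share more movies (8) than A and B do (7), yet only the A-B link is validated: C and D each appear in so many movies that an overlap of 8 is unremarkable, which is the effect of degree heterogeneity the abstract describes.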