Collaborative Algorithm Platforms
Michele Sebag, Independent Researcher, France
Connecting Services to the Web of Data
John Domingue, Independent Researcher, United Kingdom
The Provenance of Consumer and Social Media Data
Paul Longley, Independent Researcher, United Kingdom
Collaborative Algorithm Platforms
Michele Sebag
Independent Researcher
France
Brief Bio
With a background in maths (Ecole Normale Supérieure), Michèle Sebag went into industry (Thalès), where she started to learn about computer science, project management, and artificial intelligence. She became interested in AI, became a consulting engineer, and realized that machine learning was the field to be in. She was offered the opportunity to start research on machine learning for applications in numerical engineering at Laboratoire de Mécanique des Solides at Ecole Polytechnique. After her PhD at the crossroads of machine learning (LRI, Université Paris-Sud Orsay), data analysis (Ceremade, Université Paris-10 Dauphine) and numerical engineering (LMS, Ecole Polytechnique), she entered CNRS as a research fellow (CR1) in 1991. In 2001, she took the lead of the Inference and ML group, now ML & Optimization, at LRI, Université Paris-Sud. In 2003, together with Marc Schoenauer, she founded the TAO (ML & Optimization) INRIA project. Her research interests include reinforcement learning, preference learning, information theory for robotics and surrogate optimization.
Abstract
In many domains such as machine learning, constraint satisfaction or stochastic optimization, algorithm portfolios have been designed to handle the diversity of problem instances. In order to get peak performance on any particular problem instance, one must be able to select the algorithm and hyper-parameter setting best suited to this problem instance. The efficient deployment of algorithmic platforms outside research labs thus depends on the appropriate selection of the algorithm and hyper-parameter setting.
The talk will show how the issue of automatic algorithm selection and configuration can be handled by exploiting the portfolio archive, recording which algorithms and settings have been used on problem instances, and the corresponding result. Taking inspiration from the Matchbox system proposed by Stern et al. (2010), this issue is tackled as a collaborative filtering problem: each problem instance gives "marks" to some algorithms, and algorithms with better performance on this problem instance get better marks. Collaborative filtering can thus be exploited to recommend algorithms for a given problem instance. The talk will focus on the cold-start problem (how to deal with a new problem instance), presenting the algorithm recommender system ALORS, with applications in SAT, gradient-free optimization and machine learning.
Some perspectives about the exploitation of the ALORS system in order to propose a typology of problem instances and algorithms will be discussed.
Joint work with Mustafa Misir, Rémi Bardenet and Balazs Kégl.
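As an illustration of the collaborative filtering view described in the abstract above, here is a minimal sketch. It is not the ALORS or Matchbox implementation: those rely on latent-factor models, whereas this sketch swaps in a much simpler neighbourhood-based filter (similarity-weighted averaging of marks). The archive contents, the instance features, and all names (`ARCHIVE`, `FEATURES`, `warm_weights`, `cold_weights`, `recommend`) are hypothetical. For a new problem instance with no recorded marks (the cold-start case), the sketch assumes that descriptive features of the instance are available and uses feature similarity in place of mark similarity.

```python
from math import sqrt

# Hypothetical portfolio archive: ARCHIVE[instance][algorithm] = mark
# (higher is better). A missing entry means the algorithm was never run there.
ARCHIVE = {
    "inst_a": {"alg_x": 5.0, "alg_y": 1.0, "alg_z": 4.0},
    "inst_b": {"alg_x": 4.0, "alg_y": 2.0},
    "inst_c": {"alg_x": 1.0, "alg_y": 5.0, "alg_z": 2.0},
}

# Hypothetical descriptive features per instance, used only for cold start.
FEATURES = {
    "inst_a": [1.0, 0.0],
    "inst_b": [0.9, 0.1],
    "inst_c": [0.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def warm_weights(instance):
    """Weights for a known instance: similarity of marks on shared algorithms."""
    mine = ARCHIVE[instance]
    weights = {}
    for other, theirs in ARCHIVE.items():
        if other == instance:
            continue
        shared = sorted(set(mine) & set(theirs))
        if shared:
            weights[other] = cosine([mine[a] for a in shared],
                                    [theirs[a] for a in shared])
    return weights


def cold_weights(features):
    """Weights for a new instance: similarity of descriptive features."""
    return {inst: cosine(features, f) for inst, f in FEATURES.items()}


def predict(weights, algorithm):
    """Similarity-weighted average of the marks given to `algorithm`."""
    num = den = 0.0
    for inst, w in weights.items():
        if w > 0 and algorithm in ARCHIVE[inst]:
            num += w * ARCHIVE[inst][algorithm]
            den += w
    return num / den if den else None


def recommend(weights, exclude=()):
    """Return the algorithm with the highest predicted mark, or None."""
    algorithms = sorted({a for marks in ARCHIVE.values() for a in marks})
    scored = [(predict(weights, a), a) for a in algorithms if a not in exclude]
    scored = [(s, a) for s, a in scored if s is not None]
    return max(scored)[1] if scored else None


# Warm case: estimate the mark of an algorithm not yet run on inst_b.
print(predict(warm_weights("inst_b"), "alg_z"))
# Cold-start case: a new instance described only by its features.
print(recommend(cold_weights([0.1, 0.9])))
```

Cosine similarity over positive marks is always positive, so every known instance contributes to every prediction; a latent-factor model of the kind the abstract refers to would separate instance types more sharply.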
Connecting Services to the Web of Data
John Domingue
Independent Researcher
United Kingdom
Brief Bio
Prof. John Domingue is the Deputy Director of the Knowledge Media Institute at The Open University and the President of STI International, a semantics-focused networking organization. He has published over 200 refereed articles in the areas of Artificial Intelligence and the Web. His current work focuses on how semantic technology can automate the management, development and use of Web services, and on the relationship between Linked Data, rich media and education. Over the last decade John Domingue has served as the Scientific Director for three large European projects covering semantics, services, the Web and business process management. He currently serves as Chair of the Steering Committee for the ESWC Conference Series. From 2008 to 2012 he served as a member of the Future Internet Assembly Steering Committee, which helped coordinate the activities of over 150 EU projects with a combined budget of over 500M Euros. He is the Project Coordinator for two EU projects: FORGE, which will connect Europe’s main Internet research and experimentation facilities to eLearning and Linked Data technologies, and the European Data Science Academy, which will increase the number of skilled data scientists within European industry. He is the founder and Director of the ESWC Summer School and serves on the editorial boards of the Journal of Web Semantics and the Applied Ontology Journal.
Abstract
Over the last few years we have seen rapid growth in the Web of Data: statistics now indicate that there are around 100 billion semantic statements available on the Web. Governments, especially the US and UK Governments, are producing hundreds of thousands of public datasets in machine-readable form on the Web using Web standards such as RDF(S) and SPARQL. Major Web and Media players such as Google, Facebook, Yahoo!, Microsoft and the BBC are now using this technology. In this talk I will describe work that we have been carrying out in an area we term "Linked Services", which seeks to connect the spheres of Linked Data and Web services to form a new global computing platform. I will illustrate the talk with a number of applications in which we have been involved.
The Provenance of Consumer and Social Media Data
Paul Longley
Independent Researcher
United Kingdom
Brief Bio
Paul LONGLEY, B.Sc., Ph.D., D.Sc., FAcSS, holds a chair in Geographic Information Science at University College London, UK. He has worked as PI or Co-I on research grants totalling over £20 million and has supervised 50 Ph.D. students (most funded by research councils). His publications include nineteen books and over 150 refereed journal articles and contributions to edited collections. His academic and editorial duties include past editorship of Computers, Environment and Urban Systems and Environment and Planning B. Other academic outputs include eleven externally-funded visiting appointments and over 150 conference presentations and external seminars.
Abstract
This presentation reports on the research activities of the Consumer Data Research Centre (CDRC), which is one of the UK’s current ‘Big Data’ investments funded by the Economic and Social Research Council (ESRC). Established in 2014, the CDRC’s mission is to bring sharper focus to the deployment and use of business and social media data, in support of decision-making across a widening spectrum of applications. After describing the three-tier service structure of the CDRC, this presentation sets out the range of applications that are under development, the researcher and user interfaces that have been devised, and the ways in which business data may be evaluated and linked to conventional social survey sources. The presentation then focuses on the issues involved in establishing the provenance of business and social media data, and on the wider implications of Big Data for the practice of social science. It also discusses some practical ways in which the value of new data sources may be reliably assessed.
These ideas are illustrated using an extended case study of the use of Twitter geo-temporal demographics to understand the activity patterns of different ethnic groups in London. These patterns are linked to the geography of residence as depicted using conventional data sources such as the UK Census of Population.