Data Science Challenge on Data Streams

A Track in the IEEE Big Data 2019 Big Data Cup

Organizing team


Albert BIFET


Albert Bifet is a Professor at Telecom ParisTech, Head of the Data, Intelligence and Graphs (DIG) Group, and Honorary Research Associate at the WEKA Machine Learning Group at the University of Waikato. Previously he worked at Huawei Noah's Ark Lab in Hong Kong, Yahoo Labs in Barcelona, the University of Waikato, and UPC BarcelonaTech. He is the co-author of a book on Machine Learning from Data Streams. He is one of the leaders of the MOA and Apache SAMOA software environments for implementing algorithms and running experiments for online learning from evolving data streams. He served as Co-Chair of the Industrial track of IEEE MDM 2016 and ECML PKDD 2015, and as Co-Chair of BigMine (2012-2018) and the ACM SAC Data Streams Track (2012-2019).




Dihia Boulegane is a Ph.D. student at Orange Labs in collaboration with Télécom ParisTech on Machine Learning for IoT Networks monitoring. After graduating as a computer science engineer in 2016, she enrolled as a student in the Master 2 Data & Knowledge at Télécom ParisTech. She was in charge of implementing the streaming engine of the platform and the web application under the supervision of Albert Bifet.




Nedeljko Radulović is a Ph.D. student at Télécom ParisTech in Explainable Artificial Intelligence. In 2017/18 he was enrolled in Master 2 Data & Knowledge at Télécom ParisTech, where he studied Big Data Architectures and Data Science. He was in charge of implementing the online evaluation using Spark Streaming and baselines under the supervision of Albert Bifet.




Georges Hebrail is a senior researcher at the research division of EDF and at the SystemX Technological Research Institute. His domain of expertise covers Information Systems, Business Intelligence, and Data Science. As a researcher at EDF, he has worked on data mining approaches applied to the energy sector, both for customer relationship management and for back office analysis of electric power consumption, notably by developing interactive time series clustering software called “Courboscope”. From 2002 to 2010, he was a professor of computer science at the Telecom ParisTech engineering school, teaching and doing research in the field of information systems and business intelligence, with a focus on time series management, stream processing, and stream data mining. He is currently working on several data science solutions for the different activities of the EDF Group: generation, the electrical distribution network, smart metering, and customer relationship management. He has published more than 50 papers in international journals, conferences, and workshops.




Bernhard Pfahringer received his PhD degree from the University of Technology in Vienna, Austria, in 1995. He is a Professor with the Department of Computer Science at the University of Waikato in New Zealand. His interests span a range of data mining and machine learning sub-fields, with a focus on streaming, randomization, and complex data. According to Google Scholar, in the last 5 years he has been cited 27,520 times. He has 30 journal publications and 10 books/chapters. He is a key member of the WEKA Machine Learning software project, which has been downloaded more than 10 million times.




João Gama is an Associate Professor at the Faculty of Economy, University of Porto. He is a researcher and vice-director of LIAAD, a group belonging to INESC TEC. He received his PhD degree from the University of Porto in 2000. He has worked in several national and European projects on Incremental and Adaptive Learning Systems, Ubiquitous Knowledge Discovery, and Learning from Massive and Structured Data. He served as Co-Program Chair of ECML'2005, DS'2009, ADMA'2009, IDA'2011, and ECML/PKDD'2015. He served as track chair on Data Streams with ACM SAC from 2007 to 2016. He organized a series of Workshops on Knowledge Discovery from Data Streams with ECML/PKDD, and on Knowledge Discovery from Sensor Data with ACM SIGKDD. He is the author of several books on Data Mining (in Portuguese) and of a monograph on Knowledge Discovery from Data Streams. He has authored more than 250 peer-reviewed papers in areas related to machine learning, data mining, and data streams.




Joaquin Vanschoren is an Assistant Professor at the Eindhoven University of Technology. His research focuses on machine learning, meta-learning, and understanding and automating learning. He founded and leads an open science platform for machine learning. He has received several demo and open data awards, has been a tutorial speaker at NeurIPS and ECMLPKDD, and an invited speaker at ECDA, StatComp, AutoML@ICML, CiML@NeurIPS, DEEM@SIGMOD, AutoML@PRICAI, MLOSS@NIPS, and many other occasions. He was general chair at LION 2016, program chair of Discovery Science 2018, and demo chair at ECMLPKDD 2013, and he co-organizes the AutoML and meta-learning workshop series at NeurIPS and ICML. He is also co-editor of the book “Automatic Machine Learning: Methods, Systems, Challenges”.