Data Science Challenge on Data Streams
A Track in the IEEE Big Data 2019 Big Data Cup
- This is a machine learning competition on data streams, which means that the
data will be released as a stream once the competition starts.
Telecommunication companies and network providers are very concerned about the behavior
of their networks to ensure high-quality services. They collect data in real time about
network traffic and need to extract valuable knowledge so they can predict future
behavior for capacity planning or anomaly detection in case of malicious attacks or
malfunctioning devices. It is of great importance to process the data online and make
fast decisions when required. Our industrial partners will provide us with data that
was collected continuously from their networks. The collected data consists of many
recorded parameters related to device location, signal quality, packet loss during
transmission, and many more. The parameters can be categorical, continuous, or discrete.
Our competition proposal addresses a network activity analysis scenario that appears in
various prediction use cases:
Capacity planning and activity prediction: Predicting network metrics, such as
the number of devices and the number of messages that pass through the network,
will allow companies to anticipate future behavior and deploy the necessary
resources when needed, for example by expanding the network with new devices.
Anomaly detection: The evolution of certain metrics, such as signal strength,
noise ratio, and packet loss, may reveal malfunctioning devices in the network.
The goal is to predict the future values of these metrics and spot abnormal
values.
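
As a rough illustration of the "predict, then spot abnormal values" idea on a stream, the sketch below processes one reading at a time with an exponentially weighted moving average and flags readings that fall far from the one-step-ahead prediction. This is not the competition's prescribed method or baseline: the class name `EwmaDetector`, the parameters `alpha`, `threshold`, and `warmup`, and the sample readings are all hypothetical and chosen only for the example.

```python
import math


class EwmaDetector:
    """Online one-step-ahead predictor with a simple anomaly flag.

    Hypothetical helper for illustration; not part of any competition kit.
    """

    def __init__(self, alpha=0.3, threshold=3.0, warmup=5):
        self.alpha = alpha          # smoothing factor in (0, 1]
        self.threshold = threshold  # flag errors beyond this many std devs
        self.warmup = warmup        # readings to see before flagging
        self.mean = None            # running EWMA, used as the prediction
        self.var = 0.0              # running EW variance of prediction errors
        self.count = 0              # number of readings seen so far

    def update(self, value):
        """Return (prediction, is_anomaly) for `value`, then learn from it."""
        if self.mean is None:
            # First reading: nothing to predict from yet.
            self.mean = value
            self.count = 1
            return None, False

        prediction = self.mean
        error = value - prediction
        std = math.sqrt(self.var)
        is_anomaly = (
            self.count >= self.warmup
            and std > 0
            and abs(error) > self.threshold * std
        )
        if not is_anomaly:
            # Only learn from readings that look normal, so that a single
            # spike does not distort later predictions.
            self.mean = prediction + self.alpha * error
            self.var = (1 - self.alpha) * (self.var + self.alpha * error ** 2)
        self.count += 1
        return prediction, is_anomaly


# Made-up signal-strength readings arriving one at a time, with one spike.
readings = [10.0, 10.5, 9.8, 10.2, 10.1, 9.9, 10.3, 30.0, 10.0]
detector = EwmaDetector()
flags = [detector.update(r)[1] for r in readings]
```

Each reading is seen once and only three numbers are kept in memory, which is the property that matters in a streaming setting. A real entry would likely need per-device state and handling for the categorical parameters as well.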
The detailed description of the dataset will be released soon.