Course group - D-BDAT-S9

D2- BIG DATA - S9

ECTS credits

5.0

Course Director(s):

  • JUGANARU-MATHIEU Mihaela
General Description:

    Information systems (IS) are becoming increasingly powerful and responsive in order to support production, decision-making and collaborative working, and to increase customer satisfaction within companies, institutions and society. The functions of an IS are built on processing clearly defined information, generally stored in databases within the IS. Nevertheless, increasing volumes of data are now being generated from various sources: sensors, software programs, activity records and other information considered to be “volatile” and not of immediate use. This data is sometimes stored for a certain period of time but is rarely exploited because of its volume, its diversity of formats, its speed of accumulation and its lack of compatibility with existing data processing tools.
    The term “Big Data” refers to such data, characterised by significant volume, a variety of formats and a high speed of generation (the three Vs), to which must be added the potential value that can be extracted from it (the fourth V). This value lies in the knowledge extracted. Big Data companies also observe that these questions are now being handled by professional engineers, which justifies training as many students as possible, as future generalist engineers, in the basics of Big Data processing and current technologies. It is becoming imperative to train our students in these new methods and techniques so that they can meet a challenge which will become increasingly present in the years to come.

    Links between course units:

    Large volumes of data have to be stored in a consistent manner; it is therefore necessary to be able to work with suitable software and paradigms. The algorithmic and mathematical processing methods are likewise adapted to the context of Big Data. There are four main axes of teaching:

    • Adapted mathematical methods
    • Data organisation
    • Information systems
    • Data mining, algorithms and methods for Big Data

    The units are:

    • Unit 1: Data organisation - Part 1 (S8) and Part 2 (S9)
    • Unit 2: Big Data - Part 1 (S8) and Part 2 (S9)
    • Unit 3: Information Systems for Big Data - Shell (S9)
    • Unit 4: Mathematical methods for large dimension (S9)

    Orientations / Associations with other courses:

    The Big Data specialisation relies essentially on notions acquired in the Computer Science core curriculum course and in the Probability Statistics unit of the Mathematics core curriculum course. The data processed during the teaching sessions does not require any domain-specific knowledge. Certain notions explained in the Data Mining unit, such as classification and clustering, have been or will be presented in the Data Science major, but from a more statistical angle. In addition, the Big Data framework requires these notions, and hence the techniques which implement them, to be generalised: the volume of data is very large and the data is heterogeneous (including text, graphs and dynamic data), which goes beyond the numerical data dealt with in the Data Science major. Students who have followed the Computer Science major or the targeted courses on software programming and computing may be slightly more at ease than others during the practical courses, but not in any significant way. The choice to build this specialisation with little reference to existing majors is motivated by the need to respond to the issue of Big Data management in various fields of industry and research.

    Key words:

    Big Data, Information systems, Data mining, Large dimension, Large volume, Data science