Courses
Algorithms for Data Science (M2 Data Science, U. Paris-Saclay)
Language: English Last version: 2021–2022
Course materials and submission of homeworks/project on eCampus.
Structure:
- Week 1 (10/09/2021) - Intro, Frequent Itemset Mining
- Week 2 (17/09/2021) - Mining Similar Items
- Weeks 3,4 (24/09/2021, 01/10/2021) - Data Stream Algorithms
- Week 5 (08/10/2021) - Project
- Week 6 (15/10/2021) - Advertising on the Web
References:
- J. Leskovec, A. Rajaraman, J. Ullman. “Mining of Massive Datasets”. [site]
Social and Graph Data Management (M2 Data Science, U. Paris-Saclay)
Language: English Last version: 2021–2022
Course materials and submission of homeworks/project on eCampus.
References:
- A.-L. Barabási. “Network Science.” Cambridge University Press [site]
- M. Newman. “Networks: An Introduction.” Oxford University Press
- D. Easley, J. Kleinberg. “Networks, Crowds, and Markets.” Cambridge University Press [site]
Databases (Bases de données) (Polytech App3 Info, U. Paris-Saclay)
Language: French Last version: 2021–2022
Old Courses
Data Science Project (M2 Data Science, U. Paris-Saclay)
Language: English Last version: 2020–2021
Online via [eCampus]
Schedule:
- 08/01/2021: Project Presentation – Collaborative Filtering-Based Systems [pdf]
Datasets:
- GroupLens - MovieLens ratings of movies, also contains tags of movies
Bibliography:
- R. Chen, Q. Hua, Y.-S. Chang, B. Wang, L. Zhang, X. Kong. “A Survey of Collaborative Filtering-Based Recommender Systems: From Traditional Methods to Hybrid Methods Based on Social Networks”. IEEE Access, 2018 [pdf]
- J. Leskovec, A. Rajaraman, J. Ullman. “Mining of Massive Datasets”. (chapters 9, 3, 11) [site]
Web Data Models (M2 Data&Knowledge, U. Paris-Saclay)
Language: English Last version: 2018–2019
Course dates and slides:
- 10/09/2018: XML, JSON – Syntax, Parsing [slides]
- 24/09/2018: XML – Validation, Tree Automata [slides] [Books.xml] [Books.xsd]
- 08/10/2018: XPath – Syntax, Semantics [slides] [exercises]
- 08/10/2018: XPath – Evaluation [slides]
- 22/10/2018: XPath – Equivalence, Containment [slides] [exercises]
- 22/10/2018: JSON – JSON Schema [slides]
Practical labs and project:
- 17/09/2018: XML – Checking, Parsing [text] [SAX example] [DOM example]
- 01/10/2018: XML – Validation, Tree Grammars [text]
- 15/10/2018: Programming Project [text] [tests]
References:
- Makoto Murata, Dongwon Lee, Murali Mani, and Kohsuke Kawaguchi. 2005. “Taxonomy of XML schema languages using formal language theory”. ACM Trans. Internet Technol. 5, 4, 660-704. [paper]
- Georg Gottlob, Christoph Koch, and Reinhard Pichler. 2005. “Efficient algorithms for processing XPath queries”. ACM Trans. Database Syst. 30, 2, 444-491. [paper]
- Todd J. Green, Ashish Gupta, Gerome Miklau, Makoto Onizuka, and Dan Suciu. 2004. “Processing XML streams with deterministic automata and stream indexes”. ACM Trans. Database Syst. 29, 4, 752-788. [paper]
- Michael Benedikt and Christoph Koch. 2009. “XPath leashed”. ACM Comput. Surv. 41, 1, Article 3, 54 pages. [paper]
- Thomas Schwentick. 2004. “XPath query containment”. SIGMOD Rec. 33, 1, 101-109. [paper]
- Gerome Miklau and Dan Suciu. 2004. “Containment and equivalence for a fragment of XPath”. J. ACM 51, 1, 2-45. [paper]
- Felipe Pezoa, Juan L. Reutter, Fernando Suarez, Martín Ugarte, and Domagoj Vrgoc. 2016. “Foundations of JSON Schema”. ACM WWW. [paper]
Useful reading:
- C. Maneth’s course “XML and Databases” [page]
- S. Abiteboul et al. “Web Data Management”. 2011. Cambridge University Press [page]
- H. Comon et al. “Tree Automata Techniques and Applications”. 2007 [page]
- W3Schools tutorials [site]
Previous exams: 2015–2016 [pdf], 2017–2018 [pdf]
Architectures for Massive Data Management (M2 Data&Knowledge, U. Paris-Saclay)
Language: English Last version: 2018–2019
Courses:
Practical labs: