The subject will be taught in Catalan. Students may address the teacher in whichever language they are most comfortable with. Some content, slides and bibliography will be in English.
B2_That students know how to apply their knowledge to their work or vocation in a professional manner and have the skills that are typically demonstrated by developing and defending arguments and solving problems within their area of study
B3_That students have the ability to gather and interpret relevant data (usually within their area of study) in order to make judgments that include reflection on relevant social, scientific or ethical issues
B4_That students can convey information, ideas, problems and solutions to both specialized and non-specialized audiences
B5_That students have developed those learning skills necessary to undertake further studies with a high degree of autonomy
This course introduces the basic methods of Classification (supervised learning) and Clustering (unsupervised learning) in the context of Big Data. Students will follow a case study for each of the learning methods with the help of the teacher. Students will develop a project that will consist of analyzing data using the tools seen during the course. They will also have to explain the information they have been able to extract from the data. The project must be presented orally in class.
1 History of data science. From Business Intelligence to Big Data
2 Data quality and visualization. Reports and dashboards
3 Classification
3.1 GLM
3.2 Trees
3.3 Other methods
PART I
4 Clustering methods
4.1 Distance measures
4.2 K-means
4.3 Hierarchical clustering
4.4 Gaussian Mixture Models
4.5 OPTICS
5 Association Rules
6 Text analysis
7 Recommender Systems and Reinforcement Learning
8 Model evaluation
9 Project
The final grade will be calculated as the weighted average of the different activities:
The subject will be assessed only if the student attends more than 80% of the sessions.
Recovery
The final project component of the grade can be recovered.
Rules for carrying out the activities
For each activity, the teaching staff will communicate the specific rules and conditions that govern it. Individual activities presuppose the student's commitment to carry them out individually. Any activity in which a student breaks this commitment will be graded as failed for every student involved, regardless of their role (whether they provided or received the work). Likewise, group activities presuppose the commitment of the group's members to carry them out within the group. Any activity in which a group breaks this commitment will be graded as failed for every group involved, regardless of its role (whether it provided or received the work). In group activities, the teacher may, based on the information available, assign a different grade to each member of the group.
It is up to the professor to decide whether or not to accept submissions outside the indicated deadlines. In the event that these late submissions are accepted, it is up to the professor to decide whether to apply any penalty and its amount.
Use of Generative Artificial Intelligence
The use of generative artificial intelligence (generative AI) tools must be limited to those aspects that are not fundamental in the context of the subject. They may be used, critically, to resolve doubts about the subject, to improve the writing of deliverable documents, or as an aid in generating auxiliary code that is outside the scope of the subject topics. When they are used to improve the writing, the participation of generative AI must be stated explicitly in the document. When they are used to generate auxiliary code, the code must be labeled as “generated by generative AI”, stating the model used and the prompt supplied, even if the code has subsequently been adapted or modified. Generative AI tools may not be used to generate programming code, not even fragments, when that code is within the scope of the subject topics or is assessable. This prohibition applies even if the code is subsequently adapted or modified. If you have any doubt about whether a particular use of generative AI is legitimate, you must consult the professor of the subject beforehand.
James, G., Witten, D., Hastie, T. and Tibshirani, R. (2017), An Introduction to Statistical Learning: with Applications in R. Springer.