New Data Science Courses on edX

Data science is a topic that comes up week after week on this site. Four new courses added to the edX platform will be of interest to anyone wishing to progress in this discipline. Two of the courses are at an introductory level and the other two address the question of ethics.

Earlier this month in Data Scientists Salary Data, which revealed that high demand for people with data science skills has resulted in above-average pay and good starting salaries, I highlighted a number of training opportunities from online providers, including a Master's in Data Science on edX, which involves a considerable commitment both financially and in terms of time. These new courses, on the other hand, are relatively short and all free to audit.

Disclosure: When you make a purchase after following a link to a course provider from this article, we may earn an affiliate commission.

Understanding the World Through Data is a 9-week course from MIT (Massachusetts Institute of Technology) requiring 3-6 hours per week. It started on October 18th and will run until December 20th. Only students on the verified track ($49) will have access to course materials after the end date.

This is a hands-on introductory course where students examine all the forms in which data exists, learn tools that uncover relationships between data, and harness basic algorithms to understand the world from a new perspective.

It includes four modules, each of which contains videos, short exercises and a final capstone project (although graded assignments are not included if you follow the free audit track).

From the start the course uses Python, but:

You don’t need to have programming knowledge, we will guide you on how to leverage Python to explore and visualize all data.

According to its description, students will learn:

  • Python programming and Colab notebook programming environment
  • Dependent and independent variables
  • Find relationships between data using linear and polynomial regression models
  • Recognize how data is distributed
  • How to observe noise in distributions and when to ignore it
  • Categorize data into groups with classification models
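To make the regression item in that list concrete, here is a minimal sketch of fitting a straight line between an independent and a dependent variable by ordinary least squares. It is not taken from the MIT course materials: the function name `fit_line` and the data are made up for illustration, and it uses only the Python standard library rather than the notebook tooling the course itself relies on.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Spread of x around its mean, and co-variation of x with y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: hours studied (independent) vs. test score (dependent).
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours, scores)
print(f"score is roughly {slope:.1f} * hours + {intercept:.1f}")
```

A polynomial model follows the same idea with extra terms (x squared, x cubed and so on) added as further inputs, which is usually where a numerical library becomes worthwhile.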

Introduction to Data Science with Python is a self-paced course from Harvard University lasting 8 weeks at 3 to 4 hours per week. The verified track, which issues a certificate of completion, costs $199.

The course preview begins:

Data science is an ever-evolving field, using scientific algorithms and methods to analyze complex data sets.

Outlining what to expect, it continues:

Using Python, learners will investigate regression models (linear, multilinear, and polynomial) and classification models (kNN, logistic), using popular libraries such as sklearn, Pandas, matplotlib, and numPy. The course will cover key machine learning concepts such as: choosing the right complexity, preventing overfitting, regularization, evaluating uncertainty, weighing trade-offs, and evaluating the model. Attending this course will build your confidence in using Python, prepare you for more advanced studies in machine learning (ML) and artificial intelligence (AI), and advance your career.

The lesson plan presents the content week by week:

  1. Linear regression
  2. Multiple and polynomial regression
  3. Model selection and cross-validation
  4. Bias, variance and hyperparameters
  5. Classification and logistic regression
  6. Multi-logistic regression and missing data
  7. Bootstrap, confidence intervals and hypothesis testing
  8. Capstone Project (verified track only).
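As an illustration of the week 7 topic, the following is a minimal sketch of a percentile bootstrap confidence interval for a mean. It is an assumption-laden example rather than the Harvard course's own code: the helper `bootstrap_ci`, its parameters and the sample values are all hypothetical, and only the standard library is used instead of the libraries the course names.

```python
import random
from statistics import mean


def bootstrap_ci(sample, n_resamples=2000, confidence=0.95, seed=0):
    """Approximate percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(sample)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        mean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    tail = (1 - confidence) / 2
    low_index = int(tail * n_resamples)
    high_index = int((1 - tail) * n_resamples) - 1
    return means[low_index], means[high_index]


# Hypothetical measurements.
data = [4.1, 5.0, 3.8, 4.6, 5.3, 4.9, 4.2, 5.1, 4.4, 4.7,
        3.9, 5.2, 4.8, 4.3, 4.5, 5.0, 4.0, 4.6, 4.9, 4.4]

low, high = bootstrap_ci(data)
print(f"sample mean {mean(data):.2f}, 95% interval ({low:.2f}, {high:.2f})")
```

The interval comes entirely from resampling the observed data, so no distributional formula is needed, which is the main appeal of the bootstrap.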

The other two new courses come from a new partner of 2U, the company that acquired edX in 2021. This provider now has nine courses on edX among its catalog of more than 80 courses. With 20 years of experience, it was among the first institutions to embrace online teaching and learning and was the first online educational institution to be approved by the American Council on Education.


The two courses, which together form a data science ethics program, are aimed at both practitioners and managers. Each lasts 4 weeks, assuming 4-5 hours per week. They are self-paced and can be audited for free. If you want to do more than audit the content, the certificate for each is $198.

The blurb for Principles of Data Science Ethics explains the context of the program:

Concerns about the detrimental effects of machine learning algorithms and AI models (bias and more) have led to greater attention to the fundamentals of data ethics. News stories appear regularly on credit algorithms that discriminate against women, medical algorithms that discriminate against African Americans, hiring algorithms that base decisions on gender, and more. In most cases, those who developed and deployed these algorithms and data processes had no such intentions and were unaware of the detrimental impact of their work.

At the end of this course, students will be able to:

  • Identify and anticipate types of unintended harm that may result from AI models
  • Explain why interpretability is essential to avoid damage
  • Distinguish inherently interpretable models from black box models
  • Evaluate trade-offs between model performance and interpretability
  • Establish a responsible data science framework for their projects

The follow-on course, Applied Data Science Ethics, provides practical guidance and tools for building better models, specifically covering:

  • Tools for model interpretability
  • Global and local methods for model interpretability
  • Metrics for model fairness
  • Auditing your model for bias and fairness
  • Remedies for biased models

The course offers real-world problems and datasets, a framework that data scientists can use to develop their projects, and an auditing process to follow to review them. Case studies with ethical considerations, as well as Python code, are provided.
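To give a flavor of what a fairness metric can look like, here is a minimal sketch of one commonly used measure, the demographic parity difference: the gap between the highest and lowest rate of positive predictions across groups. This is not the course's code. The function names, group labels and predictions are hypothetical, and which metrics the course actually teaches is not stated in its description.

```python
from collections import defaultdict


def selection_rates(predictions, groups):
    """Share of positive (1) predictions within each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += pred
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_difference(predictions, groups):
    """Gap between the highest and lowest group selection rates."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical model outputs (1 = approved) for two made-up groups.
preds = [1, 0, 1, 1, 0, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = selection_rates(preds, groups)
gap = demographic_parity_difference(preds, groups)
print(rates, gap)
```

A gap of zero would mean both groups receive positive predictions at the same rate. A large gap is a prompt for further auditing rather than proof of unfairness on its own.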