
Program Overview

Advanced Data Science

Advanced Data Science Certificate Program

For students who have completed the Data Science or Predictive Analytics graduate degree program, this certificate provides an in-depth exploration of the industry-based applications of their skill set. The program offers students the opportunity to pursue concentrations in analytics fields that they were unable to study during their graduate programs, including elective offerings in marketing, risk, web, and text analytics. The certificate also gives students the chance to return and study any of the offerings in the Special Topics electives, which represent our faculty’s forays into cutting-edge aspects of the predictive analytics field.



About the Advanced Data Science Certificate Program

Advanced Data Science Certificate Course Schedule

From Big Data management to marketing analytics, the Advanced Data Science Certificate Courses page provides you with detailed information on the program's offerings.

Advanced Data Science Faculty

Instructors in this certificate program are the same world-renowned experts who teach in our Master's in Data Science program. You can find a full listing of them on the Advanced Data Science Faculty page.

Admission for the Advanced Data Science Certificate

Applicants to the Advanced Data Science certificate program must hold a graduate degree in Data Science, Predictive Analytics, or similar field from an accredited U.S. college, university or its foreign equivalent. A list of admission requirements can be found on our Acceptance Criteria for the Data Science Certificate page.

Certificate in Advanced Data Science Tuition

Tuition costs can vary for each of our programs. For the most up-to-date information on financial obligations, please visit our Certificate in Advanced Data Science Tuition page.

Advanced Data Science Registration Information

Our Advanced Data Science Registration Information page outlines important dates and deadlines as well as the process for adding and dropping courses.

Gainful Employment Information for Advanced Data Science

Common questions and answers related to cost, financing and success in this certificate program are found on our Gainful Employment Information for Advanced Data Science page.

Additional Information

Prior to applying to this program, students must have completed MSDS 420 and 422 or possess equivalent knowledge and skills. Please see below for more information:

  • MSDS 420: In this course students explore the fundamental concepts of database management and data preparation. With a focus on applications in large-scale data analytics projects, the course introduces relational database systems, the relational model, the normalization process, and structured query language (SQL). The course discusses topics related to data integration and cleaning, and database programming for extract, transform, and load (ETL) operations. Students learn NoSQL technologies for working with unstructured data and document-oriented information retrieval systems. They learn how to index and score documents for effective and relevant responses to user queries. Students acquire hands-on programming experience for data preparation and data extraction using various data sources and file formats. Recommended: prior programming experience or 430-DL Python for Data Science. Prerequisite: MSDS 402-DL Introduction to Data Science.
  • MSDS 422:  The course introduces machine learning with business applications. It provides a survey of machine learning techniques, including traditional statistical methods, resampling techniques, model selection and regularization, tree-based methods, principal components analysis, cluster analysis, artificial neural networks, and deep learning. Students implement machine learning models with open-source software for data science. They explore data and learn from data, finding underlying patterns useful for data reduction, feature analysis, prediction, and classification. Prerequisites: MSDS 400-DL Math for Data Scientists, MSDS 401-DL Applied Statistics with R, and MSDS 402-DL Introduction to Data Science.


Find out more about Northwestern's Certificates of Advanced Graduate Study

Advanced Data Science Required Courses

To earn a certificate, students must complete any four of the following courses. In some cases, students who have completed equivalent coursework previously may be allowed to replace the required course with another course in the field.

Please note that courses completed in the certificate program cannot be transferred to the corresponding graduate degree.

Big Data Management/Analytics CIS 436-DL

This course reviews concepts behind both centralized and distributed database systems, and relational and not-only-relational database systems. It discusses open-source and commercial solutions, with special attention to large distributed database systems and data warehousing. The course introduces technologies and modeling methods for large-scale, distributed analytics.

Note for MSIS students: It is highly recommended that MSIS students complete CIS 417 and CIS 435 or possess equivalent knowledge and skills prior to taking this course.

Note for MSPA students: Students must complete PREDICT 420 and PREDICT 422 prior to taking this course.

View CIS 436-DL Sections
Regression Analysis MSDS 410-DL

This course develops the foundations of predictive modeling by: introducing the conceptual foundations of regression and multivariate analysis; developing statistical modeling as a process that includes exploratory data analysis, model identification, and model validation; and discussing the difference between the uses of statistical models for statistical inference versus predictive modeling. The high-level topics covered in the course include: exploratory data analysis, statistical graphics, linear regression, automated variable selection, principal components analysis, exploratory factor analysis, and cluster analysis.

This is a required course for the Analytics and Modeling specialization.

Prerequisites: MSDS 400-DL Math for Data Scientists and MSDS 401-DL Applied Statistics with R.


Sections 55, 56, 57 - R

Section 58 - Python

View MSDS 410-DL Sections
Sports Performance Analytics MSDS 456-DL

An introduction to sports performance measurement and analytics, this course reviews roles of athletes at each position in sports selected by the instructor. With a focus on the individual athlete, the course discusses the development and use of accurate assessments and variability due to factors such as body type, climate, and training regimen. The course reviews athletic performance measurements, including jumping ability, running speed, agility, and strength. Students work with player on-field and on-court performance measures. The course utilizes exploratory data analysis, predictive modeling, and presentation graphics, showing real-world implications for athletes, coaches, team managers, and the sports industry.


Prerequisites: MSDS 400-DL Math for Data Scientists and MSDS 401-DL Applied Statistics with R.

View MSDS 456-DL Sections
Sports Management Analytics MSDS 457-DL

This course provides a comprehensive review of financial, statistical, and mathematical models as they relate to sports team administration, marketing, and business management. The course gives students an opportunity to work with data and models relating to sports business tactics and strategy. Students employ modeling methods in studying sports team media, ticket pricing and game-day events, loyalty and sponsorship program development, player and team valuation, and customer relationship management. The course makes extensive use of sports business case studies.


Prerequisite: PREDICT 401-DL Introduction to Statistical Analysis.

View MSDS 457-DL Sections
Marketing Analytics PREDICT 450-DL

This course provides a comprehensive review of predictive analytics as it relates to marketing management and business strategy. The course gives students an opportunity to work with data relating to customer demographics, marketing communications, and purchasing behavior. Students perform data cleansing, aggregation, and analysis, exploring alternative segmentation schemes for targeted marketing. They design tools for reporting research results to management, including information about consumer purchasing behavior and the effectiveness of marketing campaigns. Conjoint analysis and choice studies are introduced as tools for consumer preference measurement, product design, and pricing research. The course also reviews methods for product positioning and brand equity assessment. This is a case-study- and project-based course involving extensive data analysis.

Prerequisite: PREDICT 411-DL Generalized Linear Models.

View PREDICT 450-DL Sections
Risk Analytics PREDICT 451-DL

Building upon probability theory and inferential statistics, this course provides an introduction to risk analytics. Examples from economics and finance show how to incorporate risk within regression and time series models. Monte Carlo simulation is used to demonstrate how variability in data affects uncertainty about model parameters. Additional topics include subjectivity in risk analysis, causal modeling, stochastic optimization, portfolio analysis, and risk model evaluation.

Prerequisite: PREDICT 411-DL Generalized Linear Models

Recommended: PREDICT 413-DL Time Series Analytics and Forecasting

View PREDICT 451-DL Sections
Web and Network Data Science PREDICT 452-DL

A central part of e-commerce and social network applications, Web sites represent an important platform and data source for online marketing and customer relationship management. This course provides a comprehensive review of Web analytics. It shows how to use Web sites and information on the Web to understand Internet user behavior and to guide management decision-making. Topics include measurements of end-user visibility, organizational effectiveness, click analytics, and log file analysis. The course also provides an overview of social network analysis for the Web. This is a case-study- and project-based course with a strong programming component.

Prerequisites: PREDICT 401-DL Introduction to Statistical Analysis and PREDICT 420-DL Database Systems and Data Preparation.

View PREDICT 452-DL Sections
Text Analytics PREDICT 453-DL

This course is focused on incorporating text data from a wide range of sources into the predictive analytics process. Topics covered include extracting key concepts from text, organizing extracted information into meaningful categories, linking concepts together, and creating structured data elements from extracted concepts. Students taking the course will be expected to identify an area of interest and to collect text documents relevant to that area from a variety of sources. This material will be used in the fulfillment of course assignments.


Prerequisites: PREDICT 401-DL Introduction to Statistical Analysis and PREDICT 420-DL Database Systems and Data Preparation.

View PREDICT 453-DL Sections
Advanced Modeling Techniques PREDICT 454-DL

Drawing upon previous coursework in predictive analytics, modeling, and data mining, this course provides a review of statistical and mathematical programming and advanced modeling techniques. It explores computer-intensive methods for parameter and error estimation, model selection, and model validation. The course focuses on techniques and algorithms from the statistical and machine learning disciplines, and it has a strong programming component. Topics that could be covered include: ordinary least squares regression, logistic regression, multinomial logistic regression, classification and regression trees, neural networks, support vector machines, naïve Bayes, principal components analysis, cluster analysis, regularization techniques such as the LASSO, and boosting. The exact set of topics could vary from course to course or instructor to instructor, but the topics covered should be clear to the student from the assigned readings. Each student will complete a series of individual assignments and a team project assignment.

Prerequisite: PREDICT 411-DL Generalized Linear Models

It is strongly recommended that students take PREDICT 413-DL Time Series Analytics and Forecasting, PREDICT 420-DL Database Systems and Data Preparation, and PREDICT 422-DL Practical Machine Learning before taking this course.

View PREDICT 454-DL Sections
Data Visualization PREDICT 455-DL

This course begins with a review of human perception and cognition, drawing upon psychological studies of perceptual accuracy and preferences. The course reviews principles of graphic design, what makes for a good graph, and why some data visualizations effectively present information and others do not. It considers visualization as a component of systems for data science and presents examples of exploratory data analysis, visualizing time, networks, and maps. It reviews methods for static and interactive graphics and introduces tools for building web-browser-based presentations. This is a project-based course with programming assignments.

Prerequisite: PREDICT 401-DL Introduction to Statistical Analysis.

View PREDICT 455-DL Sections
Decision Analytics PREDICT 460-DL

This course covers the fundamental concepts, solution techniques, modeling approaches, and applications of decision analytics, with the purpose of introducing students to the most commonly used applied optimization, simulation, and decision analysis techniques for prescriptive analytics in business. Students will explore topics from linear programming, network optimization, integer linear programming, goal programming, multiple objective optimization, nonlinear programming, metaheuristic algorithms, stochastic simulation, queuing modeling, decision analysis, and Markov decision processes. Students will develop a contextual understanding of decision analytic techniques useful for providing managerial decision support by implementing the covered methods using state-of-the-art analytical modeling software. This is a problem- and project-based course with a strong decision analytic modeling component.

Prerequisite: PREDICT 401-DL Introduction to Statistical Analysis.

There is no available section.
Special Topics PREDICT 490-DL


Deep learning has yielded multiple successful artificial intelligence (AI) applications (search, vision, translation, drug synthesis, etc.), receiving major investments from leading technology companies. It works by combining (typically neural network-based) machine learning methods with multiple representation levels, so that data is transformed through a series of processes yielding useful results. Deep learning has much in common with how the brain's architecture uses multiple processing levels (as in the visual cortex), combined with feedback control mechanisms. Deep learning uses both generative and descriptive methods, where generative models learn joint probability distributions and make Bayes-rule predictions, and descriptive models learn from training examples. Students will learn which deep learning methods to select, based on dataset size, number of known exemplars for training, classification granularity, and other factors. Students will do individual project-based work (typically using Python; optionally R or another language of their choice) incorporating deep learning principles and practices, and will each select their own projects and datasets. Weekly readings, lecture slides, videos, discussions, and example programming assignments will provide insight into how different methods can be used and combined, contributing to project effectiveness. No final exam. Topics will include the Boltzmann machine, Deep Neural Networks, Hidden Markov Models, and other clustering and classification methods.

Prerequisite: PREDICT 410-DL Regression and Multivariate Analysis

View PREDICT 490-DL Sections
Special Topics PREDICT 490-DL

Heuristic Modeling Methods

This course introduces the concept of heuristic models applied to optimization. Heuristic modeling methods are often used in problems that are difficult to solve due to the size of the solution space or that would take a long time to solve exactly. In these cases, a heuristic algorithm may provide an approximate solution within an acceptable range of the exact solution. Genetic algorithms are biologically inspired algorithms that are often used to solve non-linear optimization problems. This course introduces genetic algorithms with applications to data science. Other modeling methods introduced include simulated annealing, simulation, and queuing models. This course is case-based. Students are required to apply heuristic modeling methods to selected case data.

Prerequisite: PREDICT 401-DL Introduction to Statistical Analysis.

View PREDICT 490-DL Sections