Graduate Certificate in Big Data Management and Analytics

About: This certificate program equips students with a set of tools that allows them to achieve international standards in the management area, to successfully manage projects and human resources, and to analyze, evaluate, and improve systems.

Term: 1 to 3 years to graduate

Inquire Today

Today's the day to advance your career with our in-person or distance programs, conveniently located in St. Louis.

Requirements

Graduate Certificate Requirements:

  • Certificate programs require the completion of twelve credit hours (four designated courses) of 3000-, 4000-, 5000-, and 6000-level lecture courses (1000/2000-level courses cannot be included).

Course Information

Required Courses

Description

The key objectives of this course are two-fold: (1) to teach the fundamental concepts of data mining and (2) to provide extensive hands-on experience in applying the concepts to real-world applications. The core topics to be covered in this course include classification, clustering, association analysis, data preprocessing, and outlier/novelty detection. 

Learning Objective

  1. Fundamental Knowledge: Provide students with a basic understanding of data mining principles and techniques.
  2. Practical Skills: Equip students with the ability to apply data mining methods to extract valuable insights from large datasets.
  3. Real-World Applications: Illustrate how data mining is used in various fields and industries for decision-making and problem-solving.

Course Content

  • Data Mining Basics: Introduction to key concepts, tasks, and preprocessing.
  • Techniques: Overview of popular algorithms, hands-on implementation, and evaluation.
  • Applications: Real-world examples and ethical considerations.
  • Tools: Introduction to data mining software and practical usage.
  • Process: Understanding the data mining lifecycle.
  • Data Visualization: Importance and techniques for data visualization.
  • Big Data: Challenges and opportunities in mining large datasets.
  • Projects: Hands-on application of data mining techniques.

Course Evaluation Criteria

  • Project 
  • Midterm Exam 
  • Final Exam

Description

Covers facets of cloud computing and big data management, including the architecture of the cloud computing model with respect to virtualization, multi-tenancy, privacy, security, cloud data management and indexing, scheduling, and cost analysis. It also covers programming models such as Hadoop and MapReduce, crowdsourcing, and data provenance.

Learning Objective

  1. To develop a comprehensive understanding of cloud computing architecture, encompassing virtualization, multi-tenancy, privacy, security, and data management principles.
  2. To acquire proficiency in cloud data management and indexing techniques, enabling efficient data handling within a cloud environment.
  3. To master cost analysis and optimization strategies specific to cloud computing, allowing for effective cost management and resource optimization.
  4. To achieve proficiency in programming models such as Hadoop and MapReduce, facilitating the application of these models for big data processing in cloud environments.
  5. To explore the concepts of crowdsourcing and data provenance, recognizing their significance in cloud-based big data applications.
  6. To apply cloud computing and big data tools effectively in practical scenarios, gaining hands-on experience in solving real-world problems and applications.

Course Content

  • Cloud Computing Architecture and Principles
  • Virtualization, Multi-Tenancy, and Security in the Cloud
  • Cloud Data Management and Efficient Indexing
  • Cost Analysis and Optimization Strategies in Cloud Computing
  • Programming Models: Hadoop and MapReduce
  • Crowdsourcing, Data Provenance, and Their Role in Cloud-Based Big Data

Course Evaluation Criteria

  • HWs
  • Project
  • Final Exam

Choose One

Description

Analysis of large business data sets via statistical summaries, cross-tabulation, correlation, and variance matrices. Techniques in model selection, prediction, and validation utilizing general linear and logistic regression, Bayesian methods, clustering, and visualization. Extensive programming in R is expected. Prerequisites: Calculus, Statistics, and Programming knowledge.

Learning Objective

  1. To develop proficiency in analyzing large business data sets using statistical summaries, cross-tabulation, correlation, and variance matrices.
  2. To gain expertise in selecting appropriate models for business data analysis.
  3. To build and evaluate models for prediction and validation, including general linear and logistic regression, Bayesian methods, and clustering.
  4. To achieve advanced programming proficiency in the R language, with a focus on data manipulation, analysis, and visualization.
  5. To integrate and apply foundational knowledge in Calculus, Statistics, and Programming to solve real-world business data problems.
  6. To develop the ability to effectively communicate data analysis results using visualization techniques.

Course Content

  • Getting Started with Data Science
  • Advertising and Promotion Models
  • Preference and Choice Models
  • Market Basket Models and Analysis 
  • Text Analysis
  • Sentiment Analysis
  • Brand and Price Models

Course Evaluation Criteria

  • HWs
  • Project
  • Final Exam

Description

An introduction to cluster analysis and clustering algorithms rooted in computational intelligence, computer science, and statistics. Clustering in sequential data, massive data, and high-dimensional data. Students will be evaluated by individual or group research projects and research presentations. Prerequisite: At least one graduate course in statistics, data mining, algorithms, computational intelligence, or neural networks, consistent with the student's degree program.

If you enroll in this course, you cannot enroll in ELEC ENG 6830, SYS ENG 6214, COMP SCI 6405, or STAT 6239 for this certificate.

Learning Objective

  1. Understand the fundamental concepts of clustering in data analysis.
  2. Evaluate and compare various clustering algorithms.
  3. Apply clustering techniques to real-world systems engineering problems.
  4. Analyze and interpret clustering results.
  5. Implement clustering algorithms in Python or another programming language.

Course Content

  • Introduction to Clustering
  • Data Preprocessing
  • Distance Metrics
  • Partitional Clustering
  • Hierarchical Clustering
  • Density-Based Clustering
  • Advanced Topics in Clustering
  • Clustering in Systems Engineering

Course Evaluation Criteria

  • Research Assignments and Presentations
  • Final Project

Description

An introduction to cluster analysis and clustering algorithms rooted in computational intelligence, computer science, and statistics. Clustering in sequential data, massive data, and high-dimensional data. Students will be evaluated by individual or group research projects and research presentations. Prerequisite: At least one graduate course in statistics, data mining, algorithms, computational intelligence, or neural networks, consistent with the student's degree program.

If you enroll in this course, you cannot enroll in COMP ENG 6330, SYS ENG 6214, COMP SCI 6405, or STAT 6239 for this certificate.

Learning Objective

  1. Understand the fundamental concepts of clustering in data analysis.
  2. Evaluate and compare various clustering algorithms.
  3. Apply clustering techniques to real-world systems engineering problems.
  4. Analyze and interpret clustering results.
  5. Implement clustering algorithms in Python or another programming language.

Course Content

  • Introduction to Clustering
  • Data Preprocessing
  • Distance Metrics
  • Partitional Clustering
  • Hierarchical Clustering
  • Density-Based Clustering
  • Advanced Topics in Clustering
  • Clustering in Systems Engineering

Course Evaluation Criteria

  • Research Assignments and Presentations
  • Final Project

Description

An introduction to cluster analysis and clustering algorithms rooted in computational intelligence, computer science, and statistics. Clustering in sequential data, massive data, and high-dimensional data. Students will be evaluated by individual or group research projects and research presentations. Prerequisite: At least one graduate course in statistics, data mining, algorithms, computational intelligence, or neural networks, consistent with the student's degree program.

If you enroll in this course, you cannot enroll in COMP ENG 6330, ELEC ENG 6830, COMP SCI 6405, or STAT 6239 for this certificate.

Learning Objective

  1. Understand the fundamental concepts of clustering in data analysis.
  2. Evaluate and compare various clustering algorithms.
  3. Apply clustering techniques to real-world systems engineering problems.
  4. Analyze and interpret clustering results.
  5. Implement clustering algorithms in Python or another programming language.

Course Content

  • Introduction to Clustering
  • Data Preprocessing
  • Distance Metrics
  • Partitional Clustering
  • Hierarchical Clustering
  • Density-Based Clustering
  • Advanced Topics in Clustering
  • Clustering in Systems Engineering

Course Evaluation Criteria

  • Research Assignments and Presentations
  • Final Project

Description

An introduction to cluster analysis and clustering algorithms rooted in computational intelligence, computer science, and statistics. Clustering in sequential data, massive data, and high-dimensional data. Students will be evaluated by individual or group research projects and research presentations.

If you enroll in this course, you cannot enroll in COMP ENG 6330, ELEC ENG 6830, SYS ENG 6214, or STAT 6239 for this certificate.

Learning Objective

  1. To proficiently apply diverse clustering techniques, including those tailored for sequential data, massive datasets, and high-dimensional data, fostering the ability to discern their relevance and effectiveness in various scenarios.
  2. To conduct independent or group research projects, putting theoretical knowledge into practical use, and presenting research findings effectively, thus enhancing the ability to engage in data-driven research in clustering analysis.
  3. To cultivate effective data presentation and communication skills, allowing for clear and concise dissemination of clustering results, ensuring comprehensibility across technical and non-technical audiences.
  4. To critically evaluate clustering outcomes, gaining the capacity to interpret complex clustering results and derive meaningful insights from intricate datasets.

Course Content

  • Fundamentals of Cluster Analysis: Theoretical Underpinnings
  • Clustering Algorithms in Computational Intelligence and Computer Science
  • Cluster Analysis for Sequential Data and Temporal Patterns
  • Handling Massive Data Sets: Scalable Clustering Techniques
  • High-Dimensional Data Clustering and Dimensionality Reduction

Course Evaluation Criteria

  • HWs 
  • Project 
  • Midterm Exam

Description

An introduction to cluster analysis and clustering algorithms rooted in computational intelligence, computer science, and statistics. Clustering in sequential data, massive data, and high-dimensional data. Students will be evaluated by individual or group research projects and research presentations. Prerequisite: At least one graduate course in statistics, data mining, algorithms, computational intelligence, or neural networks, consistent with the student's degree program.

If you enroll in this course, you cannot enroll in COMP ENG 6330, ELEC ENG 6830, SYS ENG 6214, or COMP SCI 6405 for this certificate.

Learning Objective

  1. Cluster Analysis Fundamentals: Provide students with a solid foundation in cluster analysis, covering its theoretical underpinnings and practical applications.
  2. Advanced Clustering Techniques: Introduce students to clustering algorithms rooted in computational intelligence, computer science, and statistics, including their use in handling sequential, massive, and high-dimensional data.
  3. Research and Presentation Skills: Develop students' research and presentation abilities through individual or group research projects, fostering a deeper understanding of cluster analysis concepts and applications.

Course Content

  • Cluster Analysis Overview
  • Foundations of Clustering
  • Clustering Algorithms
  • Clustering High-Dimensional Data
  • Sequential Data Clustering
  • Clustering Massive Data
  • Evaluation Metrics

Course Evaluation Criteria

  • Projects

Description

This course introduces data-oriented techniques for business intelligence. Topics include Business Intelligence architecture, Business Analytics, and Enterprise Reporting. SAP Business Information Warehouse, Business Objects, or similar tools will be used to access and present data, generate reports, and perform analysis.

Learning Objective

  1. Understand the role and significance of Business Intelligence (BI) in the context of ERP systems.
  2. Explain BI concepts, technologies, and their integration with ERP.
  3. Utilize BI tools to extract, transform, and load (ETL) data from ERP systems.
  4. Analyze and visualize data using BI dashboards, reports, and analytics.
  5. Design and implement BI solutions tailored to specific business requirements.
  6. Evaluate the impact of BI on decision-making and organizational performance.

Course Content

  • BI Concepts and Technologies
  • Data Integration with ERP
  • BI Reporting and Dashboards
  • BI Implementation and Project Management

Course Evaluation Criteria

  • Assignments
  • Midterm Exam
  • Final Project

Description

Management of semi-structured data models and XML; query languages such as XQuery; XML indexing; mapping of XML data to other data models and vice versa; and XML views and schema management. Advanced topics include change detection, web mining, and security of XML data.

Learning Objective

  1. Data Modeling and Management: Enable students to understand and apply data modeling techniques and management strategies specifically tailored for web environments, ensuring efficient storage, retrieval, and manipulation of web data.
  2. XML Fundamentals: Equip students with a comprehensive understanding of XML (Extensible Markup Language), its syntax, applications, and related technologies, enabling them to work effectively with structured web data.
  3. Web Data Integration: Teach students how to integrate and transform web data from diverse sources and formats, facilitating the development of dynamic and data-driven web applications.

Course Content

  • Data Modeling for the Web
  • XML Basics
  • Advanced XML Concepts
  • Web Data Integration
  • XML and Web Services
  • XML in Web Development
  • Web Data Security and Privacy
  • Future Trends and Emerging Technologies

Course Evaluation Criteria

  • HWs 
  • Project 
  • Midterm Exam 
  • Final Exam

Description

This course discusses multidatabase systems (MDBSs) and mobile data access systems (MDASs) in depth. It also studies traditional distributed database issues within the framework of MDBSs and MDASs.

Learning Objective

  1. Database Diversity: Familiarize students with the challenges and techniques of managing heterogeneous and mobile databases, crucial in modern computing environments.
  2. Optimized Performance: Teach students how to design, query, and maintain databases for optimal performance in both heterogeneous and mobile contexts.
  3. Security and Scalability: Equip students with the knowledge and skills to address security and scalability concerns specific to heterogeneous and mobile database systems.

Course Content

  • Introduction: Understanding data diversity, middleware, and integration challenges.
  • Mobile Databases: Mobile database architectures, synchronization, and storage.
  • Heterogeneous Integration: Techniques for integrating data from various sources.
  • Performance Optimization: Indexing, query optimization, caching, and replication.
  • Security: Access control, authentication, encryption, and privacy.
  • Scalability: Strategies for horizontal and vertical scaling, load balancing, and replication.
  • Location-Based Services: Managing location data, spatial indexing, and GIS applications.
  • Real-World Applications: Case studies from industries like healthcare and e-commerce.
  • Emerging Trends: Exploring NoSQL databases, edge computing, and blockchain in this context. 

Course Evaluation Criteria

  • HWs 
  • Project 
  • Midterm Exam

Choose One

Description

This course introduces the advanced database concepts of normalization and functional dependencies, transaction models, concurrency and locking, timestamping, serializability, recovery techniques, and query planning and optimization. Students will participate in programming projects.

Learning Objective

  1. Attain a deep understanding of advanced database concepts such as normalization and functional dependencies.
  2. Implement transaction models for maintaining data integrity.
  3. Grasp principles of concurrency control and locking mechanisms.
  4. Apply timestamping to maintain database consistency.
  5. Implement and assess serializable transactions.
  6. Implement recovery techniques for robust database systems.
  7. Understand and apply logging mechanisms for data recovery.
  8. Develop skills in formulating and optimizing complex database queries.
  9. Understand the principles behind query planning and execution.
  10. Gain practical experience in solving database-related challenges.

Course Content

  • Overview of normalization, functional dependencies, and transaction models.
  • In-depth study of concurrency issues and implementation of locking mechanisms.
  • Understanding timestamp-based approaches and implementing serializable transactions.
  • Exploration of recovery techniques and implementation of logging mechanisms.
  • Principles behind query optimization and hands-on exercises in query formulation.
  • Application of concepts through programming projects.
  • Practical experience in addressing real-world database challenges.

Course Evaluation Criteria

  • Project
  • Midterm Exam
  • Final Exam

Description

This course presents the topic of data warehouses and their value to the organization. It takes the student from the database platform to structuring a data warehouse environment. Focus is placed on simplicity and addressing the user community's needs.

Learning Objective

  1. Develop a clear understanding of the concept of data warehousing.
  2. Recognize the strategic value that data warehouses bring to organizations.
  3. Integrate data warehousing principles with existing database platforms.
  4. Explore seamless transitions from traditional databases to data warehouses.
  5. Learn techniques for effectively structuring a data warehouse environment.
  6. Understand the architectural considerations in designing a data warehousing system.
  7. Address the specific needs of the user community in data warehousing.
  8. Design solutions that prioritize user accessibility and simplicity.

Course Content

  • Defining data warehousing and its role in organizational strategy.
  • Exploring the evolution and significance of data warehouses.
  • Integrating data warehousing principles with existing database platforms.
  • Understanding the shift from transactional databases to the data warehousing paradigm.
  • Architectural considerations in structuring an effective data warehouse.
  • Implementation strategies for optimal data organization and storage.
  • Analyzing user community needs and expectations.
  • Designing data warehouses with a focus on user accessibility and simplicity.
  • Surveying tools and technologies essential for data warehousing.
  • Hands-on exploration of popular data warehousing platforms.
  • Ensuring data quality within the data warehouse.
  • Implementing governance mechanisms for maintaining data integrity.
  • Reviewing real-world case studies of successful data warehousing implementations.
  • Extracting best practices for designing and managing data warehouses.

Course Evaluation Criteria

  • HWs
  • Project

Description

Advanced topics of current interest in the field of data mining. This course involves reading seminal and state-of-the-art papers as well as conducting topical research projects including design, implementation, experimentation, analysis, and written and oral reporting components.

Learning Objective

  1. Comprehend advanced data mining concepts and theoretical foundations.
  2. Analyze and critique influential papers, developing critical thinking skills.
  3. Conduct topical research projects, mastering design, implementation, and experimentation.
  4. Apply advanced data mining techniques to real-world problems using actual datasets.
  5. Enhance written and oral communication skills for effective research reporting.
  6. Stay updated on the latest trends, challenges, and opportunities in data mining.

Course Content

  • Exploration of advanced data mining applications.
  • Overview of emerging methodologies and their practical implications.
  • In-depth analysis of seminal papers, dissecting methodologies and theoretical underpinnings.
  • Historical examination of influential works, tracing their impact on current data mining paradigms.
  • Hands-on exploration of cutting-edge data mining techniques.
  • Practical exercises employing advanced algorithms using contemporary tools and frameworks.
  • Rigorous selection and definition of research problems.
  • Formulation of hypotheses, project planning, and incorporation of experimental design principles.
  • Practical implementation of advanced algorithms.
  • Conducting experiments with real-world datasets, emphasizing robust statistical methodologies.
  • Application of advanced statistical methods in data mining.
  • Interpretation of intricate patterns and trends, with an emphasis on feature selection and dimensionality reduction.

Course Evaluation Criteria

  • HWs
  • Project
  • Midterm Exam

Description

Introduction to time series modeling of empirical data observed over time. Topics include stationary processes, autocovariance functions, moving average, autoregressive, ARIMA, and GARCH models, spectral analysis, confidence intervals, forecasting, and forecast error. 

Learning Objective

  1. Develop a strong foundation in time series modeling and analysis for empirical data observed over time.
  2. Master time series models and methods, including stationary processes, ARIMA (Autoregressive Integrated Moving Average) models, GARCH (Generalized Autoregressive Conditional Heteroskedasticity) models, and spectral analysis.
  3. Acquire the skills needed to confidently forecast future values and assess forecast accuracy using time series data.

Course Content

  • Overview of time series data and its characteristics.
  • Autocovariance Functions
  • Moving Average (MA) Models
  • Autoregressive (AR) Models
  • ARIMA Models
  • GARCH Models
  • Spectral Analysis
  • Confidence Intervals
  • Forecasting Techniques
  • Forecast Error Assessment

Course Evaluation Criteria

  • HWs
  • Exams
  • Project