Description
₹75,575 + taxes
Avail up to 80% scholarship
Learn a host of technologies that help you master the
finer nuances of Sourcing & Storage, Streaming & Integration,
Mining & Cleaning, Analytics & Visualisation, with Skill Sigma.



Course Includes
The World of Data Science - An Introduction
Data Science in Practice
RDBMS & SQL
- File Management System
- Disadvantages of a File Management System in a Multi-user Environment
- DBMS Concepts, RDBMS Concepts
- Features of RDBMS
- Communication Language to RDBMS – SQL
- SQL Practicals – DDL, DML, DCL, and TCL Commands
Data Warehousing Introduction and Evolution
- Data Warehouse Concepts
- Characteristics of DWH, Need for a DWH for Business Intelligence
- Difference between OLTP and OLAP
- The Architecture of DWH. Asset Assembly to Asset Exploitation
Unix Operating System
- Operating System Introduction
- Unix Essentials
- Unix Commands & Interface
Data Engineering – The World of Big Data
- What is Big Data?
- Characteristics of Big Data
- Challenges of Big Data
- Main Sources of Big Data
- Big Data Analytics and Applications of Big Data
- Traditional Data Architecture and Modern Data Architecture
- Big Data Use Cases & Industry Examples
Hadoop
- What is Hadoop?
- Why Hadoop?
- Advantages of Hadoop
- History of Hadoop
- Hadoop Key Characteristics
– Reliability
– Scalability
– Flexibility
– Economical
– Robust
- RDBMS vs Hadoop
- Hadoop Architecture and Ecosystem
- When to Use and Not Use Hadoop
YARN
- What is YARN?
- Advantages of Using YARN
- YARN Architecture
- Applications of YARN
- What is MapReduce?
- Understanding the Limitations of MapReduce in Hadoop
Apache Spark & Scala
- Introduction to Spark
- History of Spark
- Components of a Spark Program
- Advantages of Spark
- Spark Architecture
- Spark Use Cases
- Introduction to Scala
- What is SBT (Scala Build Tool)?
- Resilient Distributed Datasets (RDDs) and Their Operations
- RDD Operations – Map, Union, FlatMap, Intersect, Distinct, SortBy, Zip
- More RDD Operations – Sampling, Statistical, Other Operations
- Pair RDD Operations – countByKey, groupByKey, sortByKey, join
- Transformations Supported by Spark, Including Single-RDD and Multi-RDD Transformations
- What is Spark SQL?
- Features of Spark SQL
- Uses of Spark SQL
- Data Frames
- Creating Data Frames
- Demo
R Programming
Introduction to R
- Math, Variables, and Strings
- Vectors and Factors
- Vector operations
Data structures in R
- Arrays & Matrices
- Lists
- Data frames
R programming fundamentals
- Conditions and loops
- Functions in R
- Objects and Classes
- Debugging
Working with data in R
- Reading CSV and Excel Files
- Reading text files
- Writing and saving data objects to file in R
Strings and Dates in R
- String operations in R
- Regular Expressions
- Dates in R
Python Programming for Data Science
- Introduction to Python
- Python History
- Python Applications
- Python Install, Python Path
- Python Example, Execute Python
Datatypes, Declarations and Comments
- Python Variables and Data Types
- Python Keywords
- Python Literals, Python Comments
- Sample Programs for the above
Operators in Python
- Arithmetical Operators
- Relational Operators
- Logical Operators
- Assignment Operators
- Sample programs for the above
Conditional Statements
- Simple IF, If and Else, Nested If
- Sample program using if conditions
Python Loops
- Python for loop, Python while loop
- Python break, Python continue, Python pass
Python Data Structures or Collections
- Lists
- Tuples
- Named Tuple
- Sets (Default set, Frozen Set, Union, Intersect, Minus)
- Dictionaries
- Unordered Dictionary
- Ordered Dictionary
- ChainMap
- Counter
Python String Handling and Functions
- Handling string format with f-string
- capitalize(), center(), count(), endswith(), format(), rjust(), ljust()
- len(), replace(), upper(), lower(), split() with Examples
Number Functions
- abs(), ceil(), floor(), cmp(), exp(), log(), log10()
- min(), max(), pow(), round(), sqrt() with Examples
Date Functions
- Import the datetime Module
- now(), datetime()
- Import the calendar Module
- calendar.month(), calendar.prcal(2019)
- Import the time Module
User-defined Functions in Python
- Required Argument Function
- Keyword Argument Function
- Varying Argument Functions
- Default Argument Functions
- Positional-only Parameter Functions
File Handling in Python
- Python Files I/O
- Open or create files using “r”, “w”, “a” modes
Statistics for Data Science
- Introduction
- Basic Statistics
- Useful Statistics in Analytics & Data Science
- Central Tendency
- Normal Distribution
- Hypothesis Testing
Machine Learning
Machine Learning For Data Science & Analytics
- Machine Learning Languages, Types, and Examples
- Machine Learning vs Statistical Modelling
- Supervised vs Unsupervised Learning
- Supervised Learning Classification
- Unsupervised Learning
Supervised Learning
- Understanding Nearest Neighbour Classification
- The KNN algorithm
- Measuring Similarity with Distance
- Choosing Appropriate K
- Use Case
Classification Using Naïve Bayes
- Basic Concepts of Bayesian Methods
- Probabilistic Learning
Classification using Decision Trees
- The C5.0 Decision Tree Algorithm
- Understanding Classification Rules
- Separate and Conquer
- Rules from Decision Trees
- Advantages & Disadvantages of Decision Trees
Understanding Regression
- Simple Linear Regression
- Ordinary Least Square estimation
- Correlations
- Multiple Linear Regression
Support Vector Machines
- Classification with Hyperplanes
- Using Kernels for non-linear spaces
Unsupervised Learning
Association Rules – Pattern detection
- K-Means Clustering plus Advantages & Disadvantages
- Hierarchical Clustering plus Advantages & Disadvantages
- Measuring the Distances Between Clusters – Single Linkage Clustering
- Measuring the Distances Between Clusters – Algorithms for Hierarchical Clustering
- Density-Based Clustering
Neural Networks
- Black Box Methods
- Training neural networks with backpropagation
- ANN – Artificial Neural Networks
- CNN – Convolutional Neural Networks
- Evaluating Model Performance
- Improving Model Performance
Data Visualization with Tableau
Connecting to Data
Customizing a Data Source
- Filtering Your Data
- Sorting Your Data
- Creating Groups in Your Data
- Creating Hierarchies in Your Data
- Working with Date Fields: Discrete and Continuous Time
- Working with Date Fields: Custom Dates
- Working with Multiple Measures: Dual Axis and Combo Charts
- Working with Multiple Measures: Combined Axis Charts
- Showing Relationships between Numerical Values
- Mapping Data Geographically
- Using Crosstabs: Totals and Aggregation
- Using Crosstabs: Highlight Tables
- Using Crosstabs: Heat Maps
- Using Calculations: Customize Your Data
- Using Calculations: Working with Strings, Dates, and Type Conversion Functions
- Using Calculations: Working with Aggregations
- Using Quick Table Calculations to Analyze Data
- Showing Breakdowns of the Whole
- Highlighting Data with Reference Lines
- Create a Dashboard: Combining Your Views
- Create a Dashboard: Add Actions for Interactivity
- Sharing Your Work
Working with a Data Extract
- Joining Tables
- Blending Multiple Data Sources
- Blending Data without a Common Field
- Using Split and Custom Split
- Advanced Calculations: Aggregating Dimensions
- Controlling Table Calculations
- Showing the Biggest and Smallest Values
- Using Level of Detail Expressions
- Filtering and LOD Expressions
- Using Parameters to Control Data in the View
- Parameters: Swap Measures
- Using Sets to Highlight Data
- Advanced Mapping: Modifying Locations
- Advanced Mapping: Customizing Tableau’s Geocoding
- Advanced Mapping: Using a Background Image
- Viewing Distributions
- Comparing Measures Against a Goal
- Showing Statistics and Forecasting: Use the Analytics Pane and Trend Lines
- Advanced Dashboards: Using Design Techniques and Filter Actions
Telling Stories with Data