Rahul Chatterjee cleared his PMP Certification on his first attempt! Sumana Halder cleared her ITIL V3 Foundation exam with high marks! Monojit was offered a job by Hindalco after completing his Oracle course! Avik was placed with HP as an Oracle DBA!


R Programming

Palium offers a comprehensive, in-depth course on R programming. The course covers all aspects of the subject with hands-on examples and practice.

Mode: Classroom

Duration: 30 hours

Prerequisite: None, though knowledge of statistics helps

Course Outline

  • Getting R and Getting Started
  • Getting and Using R
  • A First R Session
  • Moving around in R
  • Working with Data in R
  • Dealing with Missing Data in R

Programming in R

  • What is Programming?
  • Getting Ready to Program
  • The Requirements for Learning to Program
  • Flow Control
  • Essentials of R Programming
  • Understanding the R Environment
  • Implementation of Program Flow in R
  • A First R Program
  • Example: Finding Pythagorean Triples
  • Using R to Solve Quadratic Equations
  • Why R is Object-Oriented

Writing Reusable Functions

  • Examining an R Function from Base R Code
  • Creating a Function
  • Calculating a Confidence Interval for a Mean
  • Avoiding Loops with Vectorized Operations
  • Vectorizing If-Else Statements Using ifelse()
  • Making More Powerful Functions
  • Any, All, and Which
  • Making Functions More Useful
  • Confidence Intervals Revisited
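
The flavour of these topics can be sketched in a few lines of base R. The function name `ci_mean` and the sample data below are illustrative only, not course material:

```r
# A reusable function that returns a confidence interval for a mean,
# using base R only.
ci_mean <- function(x, conf = 0.95) {
  n <- length(x)
  se <- sd(x) / sqrt(n)                          # standard error of the mean
  margin <- qt(1 - (1 - conf) / 2, n - 1) * se   # t-based margin of error
  c(lower = mean(x) - margin, upper = mean(x) + margin)
}

scores <- c(72, 85, 90, 66, 79, 88)
ci_mean(scores)

# ifelse() vectorizes an if-else decision over a whole vector at once,
# avoiding an explicit loop:
ifelse(scores >= 80, "pass", "review")
```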

Summary Statistics

  • Measuring Central Tendency
  • Measuring Location via Standard Scores
  • Measuring Variability
  • Covariance and Correlation
  • Measuring Symmetry (or Lack Thereof)
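
Most of these measures are single base R calls. A minimal sketch, with illustrative data rather than course material:

```r
x <- c(12, 15, 11, 19, 14, 17, 13)
mean(x); median(x)        # central tendency
sd(x); var(x)             # variability
scale(x)                  # standard (z) scores for measuring location
y <- c(24, 31, 20, 38, 29, 33, 27)
cov(x, y); cor(x, y)      # covariance and correlation
```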

Creating Tables and Graphs

  • Frequency Distributions and Tables
  • Pie Charts and Bar Charts
  • Line Graphs
  • Saving and Using Graphics

Discrete Probability Distributions

  • Discrete Probability Distributions
  • Bernoulli Processes
  • Relating Discrete Probability to Normal Probability

Computing Normal Probabilities

  • Characteristics of the Normal Distribution
  • The Sampling Distribution of Means
  • A One-Sample z Test

Creating Confidence Intervals

  • Confidence Intervals for Means
  • Confidence Intervals for Proportions
  • Understanding the Chi-Square Distribution
  • Confidence Intervals for Variances and Standard Deviations
  • Confidence Intervals for Differences between Means
  • Confidence Intervals Using the stats Package

Performing t Tests

  • A Brief Introduction to Hypothesis Testing
  • Understanding the t Distribution
  • The One-Sample t Test
  • The Paired-Samples t Test
  • Two-Sample t Tests
  • A Note on Effect Size for the t Test
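
To give a feel for the material, a two-sample t test is a single call in R. The data below is made up for illustration:

```r
group_a <- c(5.1, 4.9, 5.6, 5.2, 5.0)
group_b <- c(4.5, 4.7, 4.4, 4.8, 4.6)
result <- t.test(group_a, group_b)  # Welch two-sample t test by default
result$p.value                      # small values suggest a real difference
```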

One-Way Analysis of Variance

  • Understanding the F Distribution
  • Using the F Distribution to Test Variances
  • Compounding Alpha and Post Hoc Comparisons
  • One-Way ANOVA
  • Using the anova Function

Advanced Analysis of Variance

  • Two-Way ANOVA
  • Repeated-Measures ANOVA
  • Mixed-Factorial ANOVA

Correlation and Regression

  • Covariance and Correlation
  • An Example: Predicting the Price of Gasoline
  • Determining Confidence and Prediction Intervals

Multiple Regression

  • The Multiple Regression Equation
  • Multiple Regression Example: Predicting Job Satisfaction
  • Using Matrix Algebra to Solve a Regression Equation
  • Brief Introduction to the General Linear Model
  • More on Multiple Regression

Logistic Regression

  • What is Logistic Regression?
  • Logistic Regression with One Dichotomous Predictor
  • Logistic Regression with One Continuous Predictor
  • Logistic Regression with Multiple Predictors
  • Comparing Logistic and Multiple Regression
  • Alternatives to Logistic Regression

Chi-Square Tests

  • Chi-Square Tests of Goodness of Fit
  • Chi-Square Tests of Independence
  • A Special Case: Two-by-Two Contingency Tables
  • Relating the Standard Normal Distribution to Chi-Square
  • Effect Size for Chi-Square Tests
  • Demonstrating the Relationship of Phi to the Correlation Coefficient

Nonparametric Tests

  • Nonparametric Alternatives to t Tests
  • Nonparametric Alternatives to ANOVA
  • Nonparametric Alternatives to Correlation

Using R for Simulation

  • Defining Statistical Simulation
  • Some Simulations in R

The "New" Statistics: Resampling and Bootstrapping

  • The Pitfalls of Hypothesis Testing
  • The Bootstrap
  • Permutation Tests
  • More on Modern Robust Statistical Methods

Making an R Package

  • The Concept of a Package
  • Some Windows Considerations
  • Establishing the Skeleton of an R Package
  • Editing the R Documentation
  • Building and Checking the Package
  • Installing the Package
  • Making Sure the Package Works Correctly
  • Maintaining Your R Package

The R Commander Package

  • The R Commander Interface
  • Examples of Using R Commander for Data Analysis

About the Faculty

The program is conducted by an experienced professional working with a top-tier IT company, who brings many years of IT exposure and experience.


Big Data Analytics Classroom Training

Palium Skills offers Big Data Analytics classroom training in Kolkata to individuals and corporates.

It is estimated that many new jobs in the IT sector will be in emerging areas such as Business Intelligence, Big Data and SAS.

Currently, there is an acute shortage of trained people to work in this domain.

Early movers will therefore gain a distinct advantage, which is why Palium has launched the Big Data and Hadoop course.

The certification and training course details are as follows:

Big Data & Hadoop Classroom and Online Training




Topics Covered


Day 1 – Introduction to Big Data and Hadoop


Introduction to Big Data

• What is Big Data?
• Data Evolution over the Past Four Decades
• The Nature of Big Data – Volume, Variety & Velocity
• Factors Driving the Use of Big Data
• Challenges Faced by Conventional Data Analytics Systems
• The Need for Big Data Technologies
• Overview of Big Data Technologies – Hadoop

1.5 Hours


Applications of Big Data

• Real-life business problems that can't be answered without data
• Case Study – Increasing Customer Lifetime Value in a Web Retail Business
• Case Study – Increasing Machine Utilization and Asset Life in a Manufacturing Organization
• Case Study – Real-time Credit Card Fraud Detection

1.5 Hours




Hadoop Architecture

• Principles of Distributed Processing
• Overview of Hadoop Architecture
• Hadoop Core Components – HDFS & Map/Reduce

1.5 Hours



HDFS

• HDFS Overview
• Distributed File Storage Concepts
• Importance of Block Size
• Data Fault-tolerance using Replication
• Understanding NameNode Metadata
• Cluster Fault-tolerance using a Secondary NameNode

2.5 Hours

Day 2 – Hadoop Setup and Big Data Processing with Map/Reduce


Setting Up a Hadoop Cluster

• Downloading, Installing & Configuring Hadoop
• Setting up a Pseudo-distributed Cluster
• Setting up a Fully-distributed Cluster
• Working with Hadoop – DFS Shell Commands
• The Hadoop User Interface
• Understanding Hadoop Log Files
• Changing HDFS Block Sizes and Replication Factors
• Adding and Removing Nodes from a Hadoop Cluster

2 Hours


Introduction to Map/Reduce

• What is Map/Reduce
• Map/Reduce examples from real world
• Basic principles of Map/Reduce
• Understanding various Map/Reduce Phases
• Running our first Map/Reduce program using Wordcount example

1 Hour


Working with Map/Reduce

• Map/Reduce Framework
• Roles of Developer and Framework
• Hadoop Data Types – Writable & Comparable Interfaces
• Hadoop Input & Output Formats
• Writing a Map Class
• Writing a Reduce Class
• Creating a Job Configuration Object
• Building a Jar and Running a Map/Reduce Program

1.5 Hours

Map/Reduce in Java

• How Map/Reduce works
• Roles of Developer and Framework
• Hadoop Data Types – Writable & Comparable Interfaces
• Hadoop Input & Output Formats
• Writing a Map Class
• Writing a Reduce Class
• Creating a Job Configuration Object using inner classes
• Building a Jar and running a Map/Reduce program

2.5 Hours

Day 3 – Big Data Processing Simplified with PIG & HIVE


Introduction to Pig

• What is Pig? Why Pig?
• Pig Architecture & Modes of Operations
• Pig Latin Script
• Pig Data Types
• Pig Operators
• Word Count Example & Other Hands-on Exercises
• User Defined Functions and Data Types
• PIG Data Joins

3 Hours


Introduction to Hive

• Hive Introduction
• Differences between Hive and traditional RDBMS
• Hive Architecture – components of Hive
• Hive Data Model – database, partitions, tables, fields, indexes
• Hive Schema and the Metastore
• Input/Output Formats – SerDes
• HQL - Hive Query Language
• HQL – Hands On Exercises
• Project – Building a Twitter Sentiment Analysis Application

4 Hours

Day 4 – Real-Time Analytics with Hadoop


Introduction to NoSQL Databases

• Why HBase?
• HBase Architecture
• Role of Zookeeper
• Overview of Cassandra
• Cassandra Exercise

3 Hours

Data Import/Export with Flume and Sqoop

• Bringing Unstructured Data into Hadoop using Flume
• Flume Exercise
• Bringing Relational Data from SQL into Hadoop using Sqoop
• Sqoop Exercises

2 Hours

Analytics with Spark

• Why Spark?
• Spark Architecture
• Spark Application Development Environment

2 Hours

Duration: 3 months

Download Big Data Analytics Outline

Call us today to know more: +91-90510 92035 / +91-99031 30500


Who are we?

Palium is a training organization that conducts training in Functional, Soft Skills, IT and Project Management areas. We have conducted training for leading corporates such as ITC, JUSCO, CMS and many others.

We also conduct training on Primavera P6, MS Project and other Project Management topics.

For Registration Contact:               

Barindra De: +91-90510 92035 or
Shrabana Pal: +91-84205 94969

Palium Software Services Private Limited

1st Floor, ‘Sheeba Bhavan’,
1/22, Poddar Nagar
(Near South City Mall/Dominos Pizza),
Kolkata, West Bengal 700068
P: +91-842 059 4969 L: +91-33-4001 7947
E: trainings@paliumsoftware.in


Please call +91-90510 92035 for more details.