How to Create and Interpret a Histogram in RStudio


What is a histogram? A histogram is a graphical representation of the distribution of a variable. To build one, we divide the range of the data into groups called "bins" and assign each data point to exactly one bin. We then count the data points in each bin and plot the counts as a bar graph.
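The bin-and-count idea can be sketched by hand in a few lines of R. This is only an illustration of the concept, not what hist() does internally: the sample values and the bin edges (bin_edges) are assumptions made up for the example.

```r
# Hypothetical sample values, chosen only to illustrate binning
values <- c(12, 15, 21, 24, 27, 33, 38, 41)

# Assumed bin edges: (10,20], (20,30], (30,40], (40,50]
bin_edges <- c(10, 20, 30, 40, 50)

# Assign each data point to one bin
bins <- cut(values, breaks = bin_edges)

# Count the data points in each bin
bin_counts <- table(bins)
print(bin_counts)

# Plot the counts as a bar graph
barplot(bin_counts, xlab = "Bin", ylab = "Frequency")
```

In practice hist() performs the binning, counting, and plotting in a single call, as the cases below show.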

Case 1 – Consider the heights of the students in your class

To create the histogram, we will enter this data as a vector in R.

Student:  1    2    3    4    5    6    7    8    9    10
Height:   170  185  162  169  180  142  159  153  154  180

#Create a vector of data values in R 

height <- c(170, 185, 162, 169, 180, 142, 159, 153, 154, 180) 

#Create histogram using hist()

hist(height)

[Figure: histogram of height for Case 1]

As you can see, R has classified the heights of the 10 students into five groups. By default, hist() chooses the bin boundaries automatically, so you do not have to specify them. The distribution shows three people in the 150-160 group and two in the 170-180 group. The middle groups have higher frequencies than the groups at either extreme, so the histogram looks roughly like a bell. A bell shape suggests that the distribution is approximately normal, although 10 data points are too few to be certain.
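To check the groups and frequencies without reading them off the plot, you can ask hist() for the underlying numbers. The sketch below uses the Case 1 data; the variable names h and bin_labels are introduced here for illustration.

```r
height <- c(170, 185, 162, 169, 180, 142, 159, 153, 154, 180)

# plot = FALSE returns the histogram object without drawing it
h <- hist(height, plot = FALSE)

print(h$breaks)  # bin edges chosen by R
print(h$counts)  # number of data points in each bin

# Label each count with its bin range for easier reading
bin_labels <- paste(head(h$breaks, -1), tail(h$breaks, -1), sep = "-")
print(setNames(h$counts, bin_labels))
```

The labelled counts should agree with the description above: three people in the 150-160 group and two in the 170-180 group. If you want different groups, hist() also accepts a breaks argument, for example hist(height, breaks = seq(140, 190, by = 5)).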

Case 2 – Consider the heights of the basketball players in your college (the sport favors tall people)

Player:   1    2    3    4    5    6    7    8    9    10
Height:   170  185  175  169  180  142  181  179  182  184

To create the histogram, we will enter this data as a vector in R.

#Create a vector of data values in R

height <- c(170, 185, 175, 169, 180, 142, 181, 179, 182, 184)

#Create histogram using hist()

hist(height)

[Figure: histogram of height for Case 2]

As the figure above shows, the groups on the right side contain more players than the groups on the left side. Why? Because this is a basketball team, where height is an important selection criterion, so the heights of the players do not follow a bell-shaped normal distribution. When the right side has the higher frequencies and the tail stretches out to the left, the distribution is called left-skewed.
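A quick numerical cross-check for left skew is to compare the mean with the median. This is a rule of thumb rather than a formal test: in a left-skewed sample the few low values usually pull the mean below the median. A minimal sketch with the Case 2 data:

```r
height <- c(170, 185, 175, 169, 180, 142, 181, 179, 182, 184)

height_mean <- mean(height)      # pulled down by the one short player (142)
height_median <- median(height)  # less affected by that extreme value

print(height_mean)
print(height_median)

# TRUE is consistent with (but does not prove) a left-skewed distribution
print(height_mean < height_median)
```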

Case 3 – Consider the heights of the rock climbers in your college (the sport favors short people)

Player:   1    2    3    4    5    6    7    8    9    10
Height:   144  143  142  147  148  142  159  147  162  154

To create the histogram, we will enter this data as a vector in R.

#Create a vector of data values in R

height <- c(144, 143, 142, 147, 148, 142, 159, 147, 162, 154)

#Create histogram using hist()

hist(height)

[Figure: histogram of height for Case 3]

As the figure above shows, more players fall into the groups on the left side than into the groups on the right side. Why? Because, in this example, rock climbing favors shorter people. This distribution is the mirror image of the previous one: the left side has the higher frequencies and the tail stretches out to the right, so we call it right-skewed.
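The three shapes can also be compared with a single number. The sketch below defines a small helper, skewness(), which is hypothetical (base R does not provide one) and computes the moment-based sample skewness. Values near zero suggest a roughly symmetric sample, negative values a left skew, and positive values a right skew. With only 10 points per case, treat the numbers as indicative only.

```r
# Moment-based sample skewness (hypothetical helper, not part of base R)
skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Data from the three cases
class_heights      <- c(170, 185, 162, 169, 180, 142, 159, 153, 154, 180)
basketball_heights <- c(170, 185, 175, 169, 180, 142, 181, 179, 182, 184)
climbing_heights   <- c(144, 143, 142, 147, 148, 142, 159, 147, 162, 154)

skew_by_case <- sapply(
  list(class = class_heights,
       basketball = basketball_heights,
       climbing = climbing_heights),
  skewness
)

# Sign of each value: negative = left-skewed, positive = right-skewed
print(round(skew_by_case, 2))
```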

Conclusion

Now that we have covered the basics of histograms, you can use a histogram to identify the distribution of a dataset and interpret its shape.
