Trend Analysis in R using real world data in 2024


Introduction

Recently I was looking for something interesting in Google Trends and suddenly thought, why not check how `ggplot` is performing? I know it's random, but it just crossed my mind, and I ended up looking at the search traffic for the term ggplot. I found the chart below and was quite astonished by its beautiful pattern.

[Image: Google Trends chart of search interest for the term ggplot]

You can see this chart is both interesting and a great example to demonstrate the power of R in analysis.

So I thought, let's analyze the trend for ggplot using R!

In this article, we will learn how to identify the trend and seasonality of a time series using R, and see what the trend suggests about the future of ggplot searches.

So the first step is to download this data as a CSV file. You can do that by clicking the download button.

After you download the data, open the file and take a look at its contents.

So now we have two columns: the first is the day on which the data was recorded, and the second is the actual data that we will analyze. The first step in any analysis is “Data Cleaning”.

So as a good data scientist, it is important to make sure that when we import this data into R, it does not increase our workload.

I am deleting the first two rows so that the column headings, i.e. “Day” and “ggplot: (Worldwide)”, come in the first row, and I am renaming the second heading to “ggplot”.
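The same cleanup can also be done at import time instead of editing the file by hand. The sketch below is a minimal, self-contained illustration in base R. It writes a small stand-in file (the sample rows and the exact layout of the two extra lines are assumptions, not the real download), skips the first two lines, and renames the second column to ggplot.

```r
# Stand-in for the downloaded file; layout and values are hypothetical
raw_lines <- c(
  "Category: All categories",
  "",
  "Day,ggplot: (Worldwide)",
  "1/1/2024,40",
  "1/2/2024,45",
  "1/3/2024,43"
)
csv_path <- tempfile(fileext = ".csv")
writeLines(raw_lines, csv_path)

# Skip the two extra lines so the headings land in the first row
cleaned <- read.csv(csv_path, skip = 2, check.names = FALSE,
                    stringsAsFactors = FALSE)

# Rename "ggplot: (Worldwide)" to the shorter "ggplot"
names(cleaned) <- c("Day", "ggplot")
str(cleaned)
```

Whichever route you take, it is worth checking the result with str() before going further.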

Now start your R environment and import this data into R. Keep the CSV file in your working directory so that the file name alone is enough to find it. Use the following commands:

library(readr)
dataset <- read_csv("multiTimeline.csv")
View(dataset)
str(dataset)  #to check the structure of data

Now, let's first convert this data into time series data: you know it is a time series, but R doesn't know that yet. (Why?)

When you run the last command, you will see the structure of the data, and the Day column is “chr”, which means character. This indicates that R did not understand that this is a date column. So, we need to explain it to R.

If you look at the data carefully, our date is in Month/Day/Year format.

We will convert this column to a date in R and check the structure again:

# "%m/%d/%Y" tells R the text is Month/Day/Year with a four-digit year
dataset$Day <- as.Date(dataset$Day, format = "%m/%d/%Y")
str(dataset)

Congratulations 😊. The Day column is now of type “Date”, so we can easily treat our data as a time series.
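To see why the format string matters, here is a small sketch with a made-up date value. The format has to match the order in which month, day and year appear in the text. If it does not, as.Date() returns NA rather than raising an error, so a quick check for NA values after conversion is a good habit.

```r
# A made-up value in Month/Day/Year order
day_text <- "1/15/2024"

# Correct format: month, then day, then four-digit year
parsed <- as.Date(day_text, format = "%m/%d/%Y")
class(parsed)    # "Date"
format(parsed)   # "2024-01-15"

# Wrong format (day first): there is no month 15, so the result is NA
mismatched <- as.Date(day_text, format = "%d/%m/%Y")
is.na(mismatched)
```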

Now let's visualize the same series in R using ggplot:

# load the ggplot2 package
library(ggplot2)

# create a line plot of search interest over time
ggplot(dataset, aes(x = Day, y = ggplot)) +
  geom_line() +
  labs(title = "ggplot Searches over Time", x = "Date", y = "Search interest") +
  theme_classic()

Once you run the code above, you will get a line chart of the series over time.

Now why is this time series interesting? Because there are two main components we can see here.

(i) Trend – Trend refers to the long-term movement or direction of a time series. It shows whether the values are increasing or decreasing over time.

(ii) Seasonal Variations – Seasonal variations refer to the pattern that repeats itself after a fixed interval of time, such as daily, weekly, monthly, or yearly.

We could say that the variations are cyclic, but since each cycle has roughly the same length, we will call them seasonal variations rather than cyclic variations.
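To make the two components concrete, here is a small sketch with made-up numbers (the values and the 7-day period are assumptions for illustration, not taken from the Google Trends data). It builds a series as a straight-line trend plus a pattern that repeats at a fixed interval.

```r
# Hypothetical series: 28 days, linear trend plus a pattern repeating every 7 days
t <- 1:28
trend <- 40 + 0.5 * t
seasonal <- rep(c(5, 6, 4, 3, 2, -10, -10), times = 4)
series <- trend + seasonal

# The seasonal part repeats exactly after a fixed interval of 7 days
seasonal_repeats <- all(seasonal[1:21] == seasonal[8:28])

# The trend part moves in one direction only
trend_rises <- all(diff(trend) > 0)

c(seasonal_repeats = seasonal_repeats, trend_rises = trend_rises)
```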

In this article, however, we are focusing on the trend. Seasonal variations are covered separately.

Now to find the trend, we have to think of it as a straight line which either increases or decreases with time. So for our data we will create a new variable which represents time progression.

# Add a numeric time index to the data frame
dataset$Time_Index <- seq_len(nrow(dataset))

This adds a column to the data frame in which the first day is 1, the second day is 2, and so on.

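Mathematically, we write the trend as

$$\text{Trend} = a \times \text{Time\_Index} + b$$

where $a$ is the slope, i.e. the change in search volume per day, and $b$ is the intercept, i.e. the value of the trend line when Time_Index is 0.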

To do this in R, we fit a simple linear regression:

# Fit a linear regression model to the data to obtain the trend component
trend_model <- lm(ggplot ~ Time_Index, data = dataset)
print(trend_model)

After running the code, you will see a result that shows two coefficients, (Intercept) and Time_Index. Here the Intercept is your b and the coefficient of Time_Index is your a for the trend component.
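If you want the two numbers as variables rather than reading them off the printout, coef() returns them as a named vector. The sketch below uses made-up data that lies exactly on a line, so the names toy and toy_model and the values are illustrative only.

```r
# Hypothetical data lying exactly on the line 44.24 + 0.3986 * Time_Index
toy <- data.frame(Time_Index = 1:90)
toy$ggplot <- 44.24 + 0.3986 * toy$Time_Index

toy_model <- lm(ggplot ~ Time_Index, data = toy)

# coef() returns a named vector with "(Intercept)" and "Time_Index"
b <- unname(coef(toy_model)["(Intercept)"])  # intercept
a <- unname(coef(toy_model)["Time_Index"])   # slope
c(a = a, b = b)
```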

Now that we have the values of a and b, we can write our trend component as

$$\text{Trend} = 0.3986 \times \text{Time\_Index} + 44.24$$

So what does this show?

(i) First, the intercept shows that the trend line starts at a search volume score of 44.24 units at Time_Index 0, i.e. just before the first of the past 90 days. On the very first day itself, the trend value is 44.24 + 0.3986 ≈ 44.64 units.

(ii) Second, because the slope is positive, the trend in search volume increases by 0.3986 units per day. This means that each day adds, on average, 0.3986 units of search volume as we move forward in time.
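As a quick worked check of these two readings, the sketch below plugs the reported coefficients into the trend equation. The helper name trend_at is made up for illustration. Note that the intercept is the value at Time_Index 0, so day 1 already sits one slope step above it.

```r
# Coefficients as reported in the regression output above
b <- 44.24    # intercept
a <- 0.3986   # slope per day

# Hypothetical helper: trend value for a given day number
trend_at <- function(time_index) b + a * time_index

trend_at(1)                  # first day: 44.6386
trend_at(90)                 # 90th day:  80.114
trend_at(90) - trend_at(1)   # rise over the window: 89 * 0.3986 = 35.4754
```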

Now let's find out what search volume the trend component suggests for each day. We will add a new column to our dataset for this, using the mathematical equation above.

# Calculating predicted search volume
dataset$trend_component <- trend_model$coefficients[1] + trend_model$coefficients[2] * dataset$Time_Index
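The manual calculation mirrors the equation, which is handy for learning. As a cross-check, R's fitted() should give the same values for a model like this one. The sketch below shows the comparison on a made-up series (an upward line plus a repeating wiggle), so the data and names are assumptions for illustration.

```r
# Hypothetical series: upward trend plus a repeating wiggle
toy <- data.frame(Time_Index = 1:90)
toy$ggplot <- 44 + 0.4 * toy$Time_Index + 5 * sin(2 * pi * toy$Time_Index / 7)

toy_model <- lm(ggplot ~ Time_Index, data = toy)

# Manual calculation, mirroring the trend equation
manual <- coef(toy_model)[1] + coef(toy_model)[2] * toy$Time_Index

# Built-in equivalent: fitted values of the linear model
built_in <- fitted(toy_model)

# TRUE when both routes agree (names are dropped before comparing)
all.equal(unname(manual), unname(built_in))
```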

Now let's plot it using ggplot 😊

# plot the series in black and the fitted trend in red
ggplot(dataset, aes(x = Day, y = ggplot)) +
  geom_line() +
  geom_line(aes(y = trend_component), color = "red") +
  labs(title = "ggplot Searches over Time", x = "Date", y = "Search interest") +
  theme_classic()

Here the red line represents the trend, and it clearly points upward. So, if you are learning ggplot, then go ahead! If this trend holds, interest in it is likely to keep growing.

Now, after doing this, keep the context in mind: we analyzed only the past 90 days, so conditions may change in the future. As of now, though, we can comfortably say that searches for ggplot are increasing.
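If you do want to extend the line a little beyond the observed window, predict() can evaluate the fitted model at future values of Time_Index. The sketch below uses a made-up 90-day series, so the numbers are illustrative only, and a straight-line extrapolation is only as good as the assumption that the trend continues unchanged.

```r
# Hypothetical 90-day series: upward trend plus a repeating wiggle
toy <- data.frame(Time_Index = 1:90)
toy$ggplot <- 44 + 0.4 * toy$Time_Index + 5 * sin(2 * pi * toy$Time_Index / 7)
toy_model <- lm(ggplot ~ Time_Index, data = toy)

# The next 7 days after the 90-day window
future <- data.frame(Time_Index = 91:97)
future$trend_component <- predict(toy_model, newdata = future)
future
```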
