My R learning notes: quick ways to aggregate minutely data into hourly data

Recently I ran into a stone wall when I tried to analyse the time series data from my first-year experiments.

In my experiments, I measured tree sap flow velocity and different environmental controls such as temperature, solar radiation, relative humidity, wind speed and soil moisture. However, these data were at different time steps. The sap flow velocity were at a hourly basis, while the environmental controls were measured at either every 10 minutes or 20 minutes. So I had to aggregate those into the same time sequences.

There are two types of aggregating data. The first is to assemble these data at every hour at each date, in order to make a correlation analysis or multiple linear regression; and the second is to average data which are measured at the same hour, disregarding the date, aiming at analyzing the 24-hour profile of the variables.

Thanks to R, I succeeded in dealing with my four-month data at just ten seconds. Now, I’d like to share the codes with you, giving my soil moisture data as an example.  🙂

  1. Making a 24-hour profile of soil moisture

Here is part of my soil moisture data (unit: VW%) at my Treatment 2 plot. I measured the soil moisture at two depth: Shallow ( at the 20 cm depth), and Deep (at the 40 cm depth).

Fig. 1 Example of initial soil moisture data

(1) Save the excel file as *.csv, and import it into R Studio. In my case, I input my data as data frame named soilt2. Please remember to change the data type of Datetime column as Datetime, and turn Shallow column and Deep colunm as numeric.

Fig. 2 Input soil moisture dataset

(2) Input code:

soilt2$Datetime1 <- as.POSIXlt(soilt2$Datetime)$hour  ### This will add a new colunm named Datetime1 in the soilt2 dataset. Datetime1 displayed below showed that only the hour was extracted from Datetime.

Fig. 3 the added new colunm Datetime1 is showed on the right of the table.

soilt2 <- aggregate(cbind(Shallow, Deep) ~ Datetime1, data=soilt2, FUN=mean) ### This averaged Shallow and Deep data into hourly data,  regardless of the date. In my case, I changed my original dataset. If you don’t want to make change of the original dataset, please give a new name to your new dataset, such as soilt3.

Fig. 4 The changed soilt2 data frame.

Done! Yay!

2. Assemble the data at each hour at each date.

(1) Import the dataset (the same as the first method).

(2) Input the following code

soilt2$Datetime2 <- droplevels(cut(soilt2$Datetime, breaks=”hour”)) ### This added a new colunm named Datetime2 in soilt2, which includes the date and the hour extracted from Datetime column.

Fig. 5 The added new column Datetime2 is showed on the right of the table.

soilt3 <- aggregate(cbind(Shallow, Deep) ~ Datetime2, data=soilt2, FUN=mean) ### This calculate the mean of data in each hour at each date.

Fig. 6 The new soilt3 data frame.

Yay again! Done!

If you want to output these data into excel, you can load package (xlsx), and using the code of write.xlsx(  ).  For example, I use the following code:

write.xlsx(soilc1, “c:/Users/Yi Wang/Desktop/soilc1.xlsx”)

There are definitely other ways to aggregate the data, but the codes that I put on this blog, in my opinion, is the most simple code. If you disagree with me, or you’d like to share your code, please feel free to leave your comments.

Thank you for reading!

May, 04, 2017

 

 

29 thoughts on “My R learning notes: quick ways to aggregate minutely data into hourly data

  1. Hi there,

    Thanks for your code.
    Quick question. Do you know why when I input your line:
    soilt2$Datetime2 <- droplevels(cut(soilt2$Datetime, breaks=”hour”))
    which translates in my code to:
    Site27_summerBB$datetime2 <- droplevels(cut(Site27_summerBB$datetime, breaks=”hour”))

    I get:
    Error: unexpected input in "Site27_summerBB$datetime2 <- droplevels(cut(Site27_summerBB$datetime, breaks=”"

    It appears that it doesn't like "hour"

    Many Thanks,

    Jenni

    1. Oh, totally realized why as soon as I posted it typical! haha

      change from this:
      soilt2$Datetime2 <- droplevels(cut(soilt2$Datetime, breaks=”hour”))

      to this:
      soilt2$Datetime2 <- droplevels(cut(soilt2$Datetime, breaks='hour'))

      Thanks

  2. Hi Yi

    Thanks for sharing. I have some half-yearly electricity demand data and I wish to convert it into time series object. The data is between 2014-2017, therefore, includes the leap year 2016. I am not sure how to convert it to a time series object

    1. Hi Heidi,

      Thanks for reading the post! The gap should not be an issue if u use the codes in this post. Could you please let me know how you would like to convert?

  3. Thank you for this very helpful post!

    When I want to break into hour: data.merged_wake_sleep$Datetime2 <- droplevels(cut(data.merged_wake_sleep$absolute_Date, breaks= 'hour'))

    I get the following error message: Error in cut.default(data.merged_wake_sleep$absolute_Date, breaks = "hour") :
    'x' must be numeric.

    Anyone familiar with this?

    Best,
    Lea

    1. Hi Lea,

      Thanks for reading it! Maybe try this:

      data.merged_wake_sleep$absolute_Date1 <- as.numeric(data.merged_wake_sleep$absolute_Date) # turn your raw data into numeric data data.merged_wake_sleep$Datetime2 <- droplevels(cut(data.merged_wake_sleep$absolute_Date1, breaks= 'hour')) Please let me know if it works! Thank you!

  4. Hello,

    Thank you for this very helpful article!

    I came across the same issue as Lea, and tried to use the fix suggested but received the following error message:
    Error in breaks + 1 : non-numeric argument to binary operator

    When I take a look at my table, the new column I created with the as.numeric from the date/time column was simply numbered 1-12874 (number of values), rather than keeping any date/time format.

    Any ideas as to why this might be?

  5. This is really a extremely beneficial read for me, Have to admit you might be 1 in the most effective bloggers and my seotons I ever saw.Thanks for posting this informative article.

  6. Thanks, A Lot…
    Please don’t stop your Blogging and continue posting such solution..
    It helps a lot.

  7. What a nice article i found , i was looking it for a long time , i been searching it for so long now that i found out really help me.
    We must be more passionate about everything espeCially when it comes in our nature and environment check this out also >>
    카지노사이트

  8. Thank you so much for the article ! Simple, efficient and very clear. After hours spent to look for a solution, your article was a great help for me.

  9. Your writing is perfect and complete. However, I think it will be more wonderful if your post includes additional topics that I am thinking of. I have a lot of posts on my site similar to your topic. Would you like to visit once? keonhacai

  10. This is very sharp and straight to the point. Been struggling with the complicated posts on stack overflow. Thank you for this!

Leave a Reply

Your email address will not be published. Required fields are marked *