Category Archives: ScriptingMemo

Web application for downloading Environment Canada’s weather data in bulk

After realizing that there is no easy way to download ECCC weather data in bulk (the website only lets you download data month by month: https://climate.weather.gc.ca/historical_data/search_historic_data_e.html), I decided to use R and a package called weathercan to download the data in bulk. Since not everyone uses R or always has R installed (e.g., on a public computer), I wrote a small web application using R Shiny. The web application is currently hosted on shinyapps.io under a free account (5 concurrent users and a limit of 25 hours of run time per month), so please close the browser window as soon as you are done using the app. Thank you.

https://nickrongkp.shinyapps.io/WeatherCan/
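
For readers who do have R installed, a bulk download with weathercan might look roughly like the sketch below. It is not the code behind the Shiny app; the helper name download_daily, the output file name, and the station ID in the commented call are placeholders chosen for illustration, and the call needs the weathercan package plus an internet connection.

```r
# Minimal sketch, assuming the weathercan package is installed.
# download_daily() is a hypothetical helper, not part of weathercan.
download_daily <- function(station_id, start, end, out_csv) {
  # weather_dl() fetches the whole date range in one call,
  # instead of one month at a time as on the ECCC website
  daily <- weathercan::weather_dl(station_ids = station_id,
                                  start = start, end = end,
                                  interval = "day")
  write.csv(daily, out_csv, row.names = FALSE)
  invisible(daily)
}

# Example call (not run here; the station ID is only an illustration).
# Look up real IDs with weathercan::stations_search(name = "Kamloops").
# download_daily(51423, "2018-01-01", "2018-12-31", "daily_2018.csv")
```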

 

If you are an experienced R coder who wants to improve upon my Shiny app, feel free to do so. You can find the source code of my ShinyWeatherCan on GitHub. If you are interested in collaboration, please contact me.

Nick

 

Filed under ScriptingMemo

How to use CFA (Consolidated Frequency Analysis) with HYDAT data

1. Installing CFA on a modern computer requires a DOS emulator, because CFA only runs on DOS, which newer operating systems (since Windows XP) no longer include. Watch this YouTube video if you need help:

 

2. Environment Canada (EC) used to publish its hydrologic data by distributing CD-ROMs containing the HYDAT dataset, which CFA can read directly. At some point, however, EC switched to publishing HYDAT online only, as a downloadable database file. That database file therefore has to be converted into ASCII files for CFA (see the next step).

Before converting the files into ASCII, it is worth mentioning EC Data Explorer as a data screening tool: https://ec.gc.ca/rhc-wsc/default.asp?lang=En&n=0A47D72F-1

 

All the stream gauges are plotted on a map, and individual gauges can be selected directly on it.

 

Stream gauge data can be plotted as a time series with just a few clicks. The station information and metadata are also extremely useful for understanding the watershed each gauge monitors. Make sure you check the data symbols (sometimes a value is estimated because of a stream gauge failure, etc.).

Note that the HYDAT data file that comes with the EC Data Explorer installation file could be outdated. It is recommended that you download the newest HYDAT data file and replace the old one in the EC Data Explorer folder.

 

3. As mentioned, the new HYDAT database file cannot be read into CFA without file conversion.

I have used R to read the HYDAT database file and write ASCII files that can be imported into CFA. The script can be found here: https://github.com/nickyrong/UBC_FRST590_scripts/blob/master/FRST590_convert_HYDATmdb_to_CFAascii.R

However, if you don't want to deal with R, I have already converted it for you using the newest HYDAT dataset (Jan 2017): Hydat_v2007Jan_CFA (just download and unzip).

How to use CFA with the ASCII files is shown in this video:

Good luck!

R script to convert HYDAT mdb file into CFA readable ascii files

#######################################################################
#
# ** Convert HYDAT database file into CFA-readable ASCII files **
#
# ** USE AT YOUR OWN RISK, NO WARRANTY **
#
# Developed by Nick(WeiTao) Rong
# Watershed Hydrology Group, UBC Forestry
#
# Last modified: March 24, 2016
#######################################################################
################## **** Read Me!! **** ###################
# This script will output either mean-daily AMS or
# instantaneous peaks. You need to specify the location
# of your HYDAT mdb file and the script will generate a
# sub-directory containing all ascii files with the
# WSC station number as file name with no extension
#
# ** IMPORTANT NOTICE **
# Because this script does not filter out any stations,
# users are highly recommended to screen the dataset
# first with EC DataExplorer software available on
# the internet (Windows OS only)
#
# To use this script, you need to change:
# 1) hydat.input.location
# 2) ascii.output.location
# 3) INST
##########################################################

rm(list=ls()) # good habit to clean the workspace first

########### **** R Packages Required **** ################
library(Hmisc) # mdb.get() read MS Access database

# Note: mdb.get() from {Hmisc} requires the mdbtools package at the OS level
# On Mac, install mdbtools with brew or MacPorts
# On Linux, install mdbtools with apt-get
# On Windows, install mdbtools through Cygwin
########### **** Read in HYDAT dataset **** ##############
# Download the .mdb HYDAT dataset: ftp://ftp.tor.ec.gc.ca/HYDAT/
# Unzip the hydat dataset
# Where is the .mdb file located? Including the file name & extension
# The file reading process can take 2~5 mins

hydat.input.location = "/Users/nickrong/Dropbox/FRST590/Hydat_Jan2016.mdb"
########### **** Output ascii files **** ##############
# Where do you want the ascii file folder to be located?
# Do not forget the "/" at the end...
ascii.output.location = "/Users/nickrong/Dropbox/FRST590/ascii/"

# WANT INSTANTANEOUS PEAKS (TRUE) OR MEAN-DAILY ANNUAL PEAKS (FALSE)?
INST = FALSE

# After modifying the items above, run the entire script in R
# Only Modify Things Below If You Know What You Are Doing!!!

# Read the database and store information in list
hydat.all = mdb.get(hydat.input.location)

# The actual hydat database is huge, extract just the table of info.
hydat.table = mdb.get(hydat.input.location, tables = TRUE)

print("HYDAT mdb file read-in completed")

if (INST == TRUE) {

  # Annual peak flows are stored in hydat.all[[18]] --> "ANNUAL_INST_PEAKS" (instantaneous peaks)
  hydat.Qmax = subset(hydat.all[[18]], DATA.TYPE == 'Q' & PEAK.CODE == 'H')
  hydat.allQ = data.frame(
    STATION.NUMBER = hydat.Qmax$STATION.NUMBER,
    YEAR = hydat.Qmax$YEAR,
    MONTH = formatC(as.numeric(hydat.Qmax$MONTH), width=2, flag="0"),
    FLOW = hydat.Qmax$PEAK,
    # DATA.TYPE has to be the last one so I can remove it easily later
    DATA.TYPE = hydat.Qmax$DATA.TYPE)
} else {

  # Annual peak flows are stored in hydat.all[[19]] --> "ANNUAL_STATISTICS" (mean daily max)
  hydat.Qmax = subset(hydat.all[[19]], DATA.TYPE == 'Q')
  hydat.allQ = data.frame(
    STATION.NUMBER = hydat.Qmax$STATION.NUMBER,
    YEAR = hydat.Qmax$YEAR,
    MONTH = formatC(as.numeric(hydat.Qmax$MAX.MONTH), width=2, flag="0"),
    FLOW = hydat.Qmax$MAX,
    # DATA.TYPE has to be the last one so I can remove it easily later
    DATA.TYPE = hydat.Qmax$DATA.TYPE)
}

# Station information is in hydat.all[[26]] --> "STATIONS"
hydat.allSTATION = data.frame(
STATION.NUMBER = hydat.all[[26]]$STATION.NUMBER,
STATION.NAME = hydat.all[[26]]$STATION.NAME,
PROVINCE = hydat.all[[26]]$PROV.TERR.STATE.LOC,
AREA = hydat.all[[26]]$DRAINAGE.AREA.GROSS
)

# Create the output folder if not exist already
dir.create(file.path(ascii.output.location), showWarnings = FALSE)
setwd(file.path(ascii.output.location))

# Loop to generate one ASCII file for each station...
for (loop1 in seq_along(hydat.allSTATION$STATION.NUMBER)){

  # Which station are we working on?
  station = as.character(hydat.allSTATION$STATION.NUMBER[loop1])

  # Subset the annual peaks of just this station (hydat.allQ only holds flow, DATA.TYPE == Q)
  station.Q = subset(hydat.allQ, STATION.NUMBER == station)
  input.Q = station.Q[,1:4] # remove DATA.TYPE
  input.Q = input.Q[order(input.Q$YEAR),] # sort by years

  # Skip stations without records
  if (length(input.Q$YEAR) != 0) {
    # Open a file with the station number as the file name; no extension
    CFAinput <- file(paste0(ascii.output.location, station), "w")

    # Writing header info
    cat(paste0(hydat.allSTATION$STATION.NUMBER[loop1],"\n"), file = CFAinput)
    cat(paste0(hydat.allSTATION$STATION.NAME[loop1],"\n"), file = CFAinput)
    cat(paste0(length(input.Q$YEAR)," ",
               hydat.allSTATION$AREA[loop1],"\n"), file = CFAinput)

    # Appending the flow data to the file (no col/row names)
    write.table(input.Q, file = CFAinput, append = TRUE,
                col.names = FALSE, row.names = FALSE, quote = FALSE)

    # Finish the file writing
    close(CFAinput)
  }

  completion = (loop1/length(hydat.allSTATION$STATION.NUMBER))*100

  # Print a progress message roughly every 5% of stations
  if(completion %% 5 < 0.01) {
    print(paste0(round(completion, digits = 0), "% of stations output completed"))
  }

} # End of loop for exporting each WSC station into a separate ascii file

#################### EOF ####################
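
To try the output step without the HYDAT database, the small sketch below writes one file with the same layout as the script above: station number, station name, then the record count and drainage area on one line, followed by one row per annual peak. The layout is taken from the script itself, not from any CFA documentation. The helper write_cfa_file, the station number, the station name, and all flow values are made up for illustration.

```r
# Toy annual peaks for one hypothetical station (values are invented)
peaks <- data.frame(
  STATION.NUMBER = "00XX000",
  YEAR  = c(1991, 1990, 1992),
  MONTH = c("06", "05", "06"),
  FLOW  = c(120.5, 98.2, 143),
  stringsAsFactors = FALSE
)

# Hypothetical helper mirroring the file-writing part of the loop above
write_cfa_file <- function(station, name, area, peaks, out_dir) {
  peaks <- peaks[order(peaks$YEAR), ]          # sort by year
  out_path <- file.path(out_dir, station)      # file name = station number
  con <- file(out_path, "w")
  on.exit(close(con))                          # close even if writing fails
  cat(station, "\n", sep = "", file = con)     # header line 1
  cat(name, "\n", sep = "", file = con)        # header line 2
  cat(nrow(peaks), " ", area, "\n", sep = "", file = con)  # header line 3
  write.table(peaks, file = con, col.names = FALSE,
              row.names = FALSE, quote = FALSE)
  invisible(out_path)
}

out_file <- write_cfa_file("00XX000", "HYPOTHETICAL CREEK NEAR NOWHERE",
                           52.3, peaks, tempdir())
cfa_lines <- readLines(out_file)
print(cfa_lines)
```

Opening the resulting file in a text editor is a quick way to check the layout before importing a full batch into CFA.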

Simple function in R to calculate water year

###########################################################################
# Simple Water Year Function
###########################################################################
wtr_yr <- function(dates, start_month = 10) {
  # Convert possible character vector into date
  d1 = as.Date(dates)
  # Year offset
  offset = ifelse(as.integer(format(d1, "%m")) < start_month, 0, 1)
  # Water year
  adj.year = as.integer(format(d1, "%Y")) + offset
  # Return the water year
  return(adj.year)
}
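
A quick usage sketch follows. The function is repeated so the snippet runs on its own, and the dates are arbitrary examples. With the default start_month = 10, a date on or after October 1 is assigned to the following calendar year's water year.

```r
wtr_yr <- function(dates, start_month = 10) {
  d1 <- as.Date(dates)
  offset <- ifelse(as.integer(format(d1, "%m")) < start_month, 0, 1)
  as.integer(format(d1, "%Y")) + offset
}

# Default: water year starts on October 1
wy_oct <- wtr_yr(c("2015-09-30", "2015-10-01", "2016-01-15"))
print(wy_oct)   # 2015 2016 2016

# Water year starting on April 1 instead
wy_apr <- wtr_yr(c("2015-03-31", "2015-04-01"), start_month = 4)
print(wy_apr)   # 2015 2016
```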

Handling date and date-time in R

Be careful with the time zone

> example = as.POSIXct("1957-09-30 16:00:00")

> example

[1] "1957-09-30 16:00:00 PST"

> as.Date(example)

[1] "1957-10-01"

> as.Date(example, tz = Sys.timezone())

[1] "1957-09-30"

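The problem is that as.POSIXct() uses the system time zone by default, while as.Date() applied to a date-time uses UTC by default. Here, 16:00 PST is already 00:00 UTC on the next day, so the date shifts forward by one.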
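
One way to avoid the surprise is to state the time zone explicitly on both conversions, as in the sketch below. It uses "Etc/GMT+8", a fixed UTC-8 offset, as a stand-in for PST so that the result does not depend on the system time zone; that choice is an assumption made for the example.

```r
# Fixed UTC-8 offset used as a stand-in for PST (assumption for this example)
local_tz <- "Etc/GMT+8"

x <- as.POSIXct("1957-09-30 16:00:00", tz = local_tz)

# Converting in UTC rolls over to the next calendar day
date_utc <- as.Date(x, tz = "UTC")

# Converting in the time zone of the data keeps the local calendar day
date_local <- as.Date(x, tz = local_tz)

print(date_utc)
print(date_local)
```

Passing tz = attr(x, "tzone") instead of a hard-coded name is another option when the date-time object already carries its time zone.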
