{"id":147,"date":"2014-02-21T15:30:43","date_gmt":"2014-02-21T22:30:43","guid":{"rendered":"https:\/\/blogs.ubc.ca\/coetoolbox\/?page_id=147"},"modified":"2016-10-21T13:37:06","modified_gmt":"2016-10-21T20:37:06","slug":"aggregate-queries","status":"publish","type":"page","link":"https:\/\/blogs.ubc.ca\/coetoolbox\/databases\/aggregate-queries\/","title":{"rendered":"Aggregation Queries in SQL"},"content":{"rendered":"<p>In SQL you can generate statistical summaries of the records in a query using aggregation functions and clauses. In many cases, you will need to do some processing and transformations to your data before summarizing it.<\/p>\n<h2>Aggregation Functions<\/h2>\n<p>These functions operate on the values of a column (x) and return a value as shown in this table:<\/p>\n<table border=\"1\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td>Aggregation function<\/td>\n<td>Returns<\/td>\n<\/tr>\n<tr>\n<td>MIN(x)<\/td>\n<td>minimum value in column x<\/td>\n<\/tr>\n<tr>\n<td>MAX(x)<\/td>\n<td>maximum value in column x<\/td>\n<\/tr>\n<tr>\n<td>SUM(x)<\/td>\n<td>sum of values in column x<\/td>\n<\/tr>\n<tr>\n<td>AVG(x)<\/td>\n<td>average value in column x<\/td>\n<\/tr>\n<tr>\n<td>STDEV(x)<\/td>\n<td>standard deviation of values in column x<\/td>\n<\/tr>\n<tr>\n<td>COUNT(x)<\/td>\n<td>the number of values in column x<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The following queries will return a single value as result of the aggregation function:<\/p>\n<p style=\"padding-left: 30px;\">SELECT <span style=\"color: #0000ff;\">AVG<\/span> (FinalGrade)<br \/>\nFROM Registrations<br \/>\nWHERE CourseID = \u201cBAMS 580\u201d<\/p>\n<p style=\"padding-left: 30px;\">SELECT <span style=\"color: #0000ff;\">COUNT<\/span> (StudentID)<br \/>\nFROM Students<br \/>\nWHERE Program = \u201cMMOR\u201d<\/p>\n<p><strong>COUNT() Function<\/strong><\/p>\n<p>Depending on what information you are looking for, the Count function can be used in three different ways:<\/p>\n<table border=\"1\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td>COUNT(x)<\/td>\n<td>Returns the number of non-null values in column x.<\/td>\n<\/tr>\n<tr>\n<td>COUNT(*)<\/td>\n<td>Returns the number of rows in the queried table.<\/td>\n<\/tr>\n<tr>\n<td>COUNT(DISTINCT x)<\/td>\n<td>Returns the number of distinct values in column x.<br \/>\nNote: this function is not available in MS Access<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>This query will return the same value as the previous example (assuming the students table has no entries with null ID):<\/p>\n<p style=\"padding-left: 30px;\">SELECT <span style=\"color: #0000ff;\">COUNT (*)<\/span><br \/>\nFROM Students<br \/>\nWHERE Program = \u201cMMOR\u201d<\/p>\n<h2>Aggregation Clauses<\/h2>\n<p>The <span style=\"color: #0000ff;\">GROUP BY<\/span> clause divides rows into groups and apply aggregation operations to each group. For example, this query will return the average grade for each course and year:<\/p>\n<p style=\"padding-left: 30px;\">SELECT CourseID, Year, AVG (FinalGrade)<br \/>\nFROM Registrations<br \/>\nGROUP BY CourseID, Year<\/p>\n<p>The <span style=\"color: #0000ff;\">HAVING<\/span> clause specifies a condition for groups displayed. This query will display the number of students only for courses after 2010.<\/p>\n<p style=\"padding-left: 30px;\">SELECT CourseID, Year, COUNT (StudentID)<br \/>\nFROM Registrations<br \/>\nGROUP BY CourseID, Year<br \/>\nHAVING Year &gt; 2010<\/p>\n<p>Aggregation functions can be included in the having clause:<\/p>\n<p style=\"padding-left: 30px;\">SELECT CourseID, Year, COUNT (StudentID)<br \/>\nFROM Registrations<br \/>\nGROUP BY CourseID, Year<br \/>\nHAVING\u00a0COUNT (StudentID) &gt; 10<\/p>\n<h2>Functions for data transformation<\/h2>\n<p>When summarizing in SQL you might need to do some conversions to the data you want to display. The <a href=\"http:\/\/technet.microsoft.com\/en-us\/library\/hh213574.aspx\" target=\"_blank\">IIF function<\/a> returns one of two values depending on a logical condition.<\/p>\n<p style=\"padding-left: 30px;\">SELECT StudentName, <span style=\"color: #0000ff;\">IIF<\/span>( Program=&#8217;MMOR&#8217;, 1, 0) as <span style=\"color: #993300;\">MMORStudent<\/span><br \/>\nFROM Students<\/p>\n<p>When summarizing it can be useful to transform categorical variables into dummy variables that can then be aggregated using math functions:<\/p>\n<p style=\"padding-left: 30px;\">SELECT <span style=\"color: #993300;\">AVG<\/span>( <span style=\"color: #0000ff;\">IIF<\/span>( FinalGrade&lt;50, 1, 0)) as <span style=\"color: #993300;\">Total_Students_Failed_BAMS580<\/span><br \/>\nFROM Registrations<br \/>\nWHERE CourseID=&#8217;BAMS580&#8242;<\/p>\n<p><a href=\"http:\/\/www.sql-server-helper.com\/tips\/tip-of-the-day.aspx?tkey=23d9b553-a66b-420e-8a4a-f254f97f947d&amp;tid=84&amp;tkw=example-uses-of-the-iif-logical-function\" target=\"_blank\">This webpage<\/a> shows other examples of conversions you can do. Keep in mind that IIF function works only for Microsoft databases since 2012. A more portable function across SQL platforms is <a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/ms181765%28v=sql.110%29.aspx\" target=\"_blank\">CASE expression<\/a> (except for MS Access).<\/p>\n<p>Another way to summarize data in SQL is the <a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/office\/ff192901%28v=office.14%29.aspx\" target=\"_blank\">TRANSFORM statement<\/a>, which allows to display values from specified fields or expressions as column headings.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In SQL you can generate statistical summaries of the records in a query using aggregation functions and clauses. In many cases, you will need to do some processing and transformations to your data before summarizing it. Aggregation Functions These functions operate on the values of a column (x) and return a value as shown in [&hellip;]<\/p>\n","protected":false},"author":22982,"featured_media":0,"parent":74,"menu_order":7,"comment_status":"open","ping_status":"closed","template":"page-templates\/full-width.php","meta":{"footnotes":""},"class_list":["post-147","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/blogs.ubc.ca\/coetoolbox\/wp-json\/wp\/v2\/pages\/147","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.ubc.ca\/coetoolbox\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/blogs.ubc.ca\/coetoolbox\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.ubc.ca\/coetoolbox\/wp-json\/wp\/v2\/users\/22982"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.ubc.ca\/coetoolbox\/wp-json\/wp\/v2\/comments?post=147"}],"version-history":[{"count":30,"href":"https:\/\/blogs.ubc.ca\/coetoolbox\/wp-json\/wp\/v2\/pages\/147\/revisions"}],"predecessor-version":[{"id":1036,"href":"https:\/\/blogs.ubc.ca\/coetoolbox\/wp-json\/wp\/v2\/pages\/147\/revisions\/1036"}],"up":[{"embeddable":true,"href":"https:\/\/blogs.ubc.ca\/coetoolbox\/wp-json\/wp\/v2\/pages\/74"}],"wp:attachment":[{"href":"https:\/\/blogs.ubc.ca\/coetoolbox\/wp-json\/wp\/v2\/media?parent=147"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}