Aggregates in PostgreSQL  Hot PDF Print E-mail
Tag it:
Delicious
Furl it!
Digg
NewsVine
Reddit
YahooMyWeb
Technorati
Articles Reviews PostgreSQL
Written by Bogdan V   
Thursday, 28 September 2006

{mos_sb_discuss:49} 

Aggregates can be combined with a WHERE clause to produce more complex results. For example, the query SELECT AVG(age) FROM friend WHERE age >= 21 computes the average age of people age 21 or older. This prevents Dick Gleason from being included in the average computation because he is younger than 21. The column label defaults to the name of the aggregate.



Lists five aggregates. COUNT operates on entire rows; the other four operate on specific columns.

Aggregates can be combined with a WHERE clause to produce more complex results. For example, the query

SELECT AVG(age) FROM friend WHERE age >= 21 computes the average age of

people age 21 or older. This prevents Dick Gleason from being included in the average computation because he is younger than 21. The column label defaults to the name of the aggregate.

NULL values are not processed by most aggregates, such as MAX(), SUM(), and AVG(); they are simply ignored. However, if a column contains only NULL values, the result is NULL, not zero. C OUNT (*) is different in this respect. It does count NULL values because it looks at entire rows:

Aggregate COUNT (*) SUM (colname) MAX (colname) MIN (colname) AVG (colname)

Function count of rows total

maximum minimum average

test=> SELECT * FROM friend ORDER BY firstname;

firstname               |                lastname                 |              city                | state | age

----------------------------------- + ------------------------------------------- + ---------------------------------- + ------------- +        

Dean

| Yeager

| Plymouth

| MA          |      24

Dick

| Gleason

| Ocean City

| NJ           |      19

Ned

| Millstone

| Cedar Creek

| MD          |      27

Sandy

| Gleason

| Ocean City

| NJ           |      25

Sandy

| Weber

| Boston

| MA          |      33

Victor

| Tabor

| Williamsport

| PA          |      22

(6 rows)

test=> SELECT COUNT(*) FROM friend; count

test=> SELECT SUM(age) FROM friend;

sum -----150

(1 row)

test=> SELECT MAX(age) FROM friend;

max 

33

(1 row)

test=> SELECT MIN(age) FROM friend;

min 

19

(1 row)

test=> SELECT AVG(age) FROM friend;

avg 

25

(1 row)

using the asterisk( * ). It does not examine individual columns like the other aggregates. To find the COUNT of all non - NULL values in a certain column, use COUNT (colname). To find the number of distinct values in a column, use COUNT(DISTINCT colname).

First, a single row containing a NULL column is used to show aggregates returning NULL results. Two versions of COUNT on a NULL column are shown. Notice that COUNT never returns a NULL value. Then, a single non - NULL row is inserted, and the results shown. Notice the AVG() of 3 and NULL is 3, not 1.5, illustrating the NULL value is not considered in the average computation. Psql 's da command lists all of the aggregates supported by P OSTGRE SQL.

 Using GROUP BY

Simple aggregates return one row as a result. It is often desirable, however, to apply an aggregate to groups of rows. In queries using aggregates with GROUP BY, the aggregate is applied to rows grouped by another column in the table. For example, SELECT COUNT (*) FROM friend returns the total number of rows in the table. With GROUP BY, the table is split up into groups by state, and COUNT (*) is applied to each group in turn.

The second query shows the minimum, maximum, and average ages of the people in each state. It also shows an ORDER BY operation carried out on the aggregate column. Because the column is the fourth one in the result, you can identify it by the number 4. Using ORDER BY avg would have worked as well.

You can GROUP BY more than one column, as shown in Figure 5.4. G ROUP BY collects all NULL values into a single group.

Using HAVING

One more aggregate capability is often overlooked—the HAVING clause. H AVING allows a user to perform conditional tests on aggregate values. It is often employed in conjunction with GROUP BY. With HAVING, you can include or exclude groups based on the aggregate value for that group. For example, suppose you want to know all states in which you have more than one friend. H AVING allows you to test the count column. Aggregates cannot be used in a WHERE clause; they are valid only inside HAVING.

When a query has several clauses, such as FROM, WHERE, and GROUP BY, it is best to place each clause on a separate

test=> CREATE TABLE aggtest (col INTEGER); CREATE

test=> INSERT INTO aggtest VALUES (NULL); INSERT 19759 1

test=> SELECT SUM(col) FROM aggtest; sum

(1 row)

test=> SELECT MAX(col) FROM aggtest; max‑

(1 row)

test=> SELECT COUNT(*) FROM aggtest;

count 

1 (1 row)

test=> SELECT COUNT(col) FROM aggtest; count

0 (1 row)

test=> INSERT INTO aggtest VALUES (3); INSERT 19760 1

test=> SELECT AVG(col) FROM aggtest;

avg

3
(1 row)

test=> SELECT COUNT(*) FROM aggtest;

count 

2 (1 row)

test=> SELECT COUNT(col) FROM aggtest; count

1 (1 row)

test = > SELECT state, COUNT(*)

test-> FROM friend

test-> GROUP BY state;

state | count
                 +                

MA            |              2

MD            |              1

NJ             |              2

PA             |              1 (4 rows)

test=> SELECT state, MIN(age), MAX(age), AVG(age)

test-> FROM friend

test-> GROUP BY state test-> ORDER BY 4 DESC; state | min | max | avg
----------- + ------- + ------- + -------

MA          |       24 |          33 |

28

MD         |       27 |          27 |

27

NJ           |       19 |         25 |

22

PA           |       22 |          22 |

22

(4 rows)

Aggregate with G ROUP B Y

test=> SELECT city, state, COUNT(*) test-> FROM friend

test-> GROUP BY state, city

test-> ORDER BY 1, 2;

city                      | state | count

--------------------------- + --------- + ---------

Boston                           | MA             |              1

Cedar Creek                 | MD            |             1

Ocean City                    | NJ             |              2

Plymouth                      | MA             |              1

Williamsport                 | PA             |              1

(5 rows)

G ROUP BY with two columns

test=> SELECT state, COUNT(*)

test-> FROM friend

test-> GROUP BY state

test-> HAVING COUNT(*) > 1

test-> ORDER BY state;

state | count
                 +                

MA            |              2

NJ             |              2 (2 rows)

HAVING line. This convention makes queries easier to understand. Clear queries also use appropriate capitalization.

In a test database, mistakes do not create a problem. In a live production database, however, one incorrect query can cause great difficulty. It takes five seconds to issue an erroneous query, and sometimes five days to recover from it. Double-check your queries before executing them. This consideration is especially important for UPDATE, DELETE, and INSERT queries, because they modify the database.

Also, before performing an UPDATE or DELETE, do a SELECT or SELECT COUNT (*) with the same WHERE clause. Make sure the SELECT result is reasonable before doing the UPDATE or DELETE.


User reviews

There are no user reviews for this item.

Add new review




Powered by jReviews

Last Updated ( Wednesday, 27 June 2007 )
 
< Prev