SQLite Tutorial  Hot PDF Print E-mail
Tag it:
Delicious
Furl it!
Digg
NewsVine
Reddit
YahooMyWeb
Technorati
Articles Reviews Structured Query Language
Written by Mike Chirico   
Monday, 27 November 2006
Article Index
SQLite Tutorial  Hot
Logging All Inserts, Updates, and Deletes
UTC and Localtime
The ATTACH Command
The Power of the Sign Function
Creating a Permanent Sign Function
Spreadsheet Format to Normalized Data
C and C API
A C Program -- Building a Class to Do the Work
Defining SQLite User Functions
Aggregate Functions
Reading Images (Blob data)

The ATTACH Command: Build a Virtual Table that Spans Multiple Tables on Separate Databases

This is a very powerful concept. As you have seen, sqlite3 works with a local database file. Within this local database, multiple tables can be created. This section will examine a technique to combine multiple tables with the same field layout that exist in separate database files into a single virtual table. On this single virtual table, you will see how selects can be performed. There is no overhead in copying or moving data. No data gets copied or moved, period. This is the ideal situation when working with very large tables. Suppose the computers on your network record port scans from snort to a local sqlite3 file. Provided you have access to the individual database files, via NFS mount or samba mount, you could virtually combine the tables from all your computers into one virtual table to perform database queries in an effort to identify global patterns of attack against your network.

This example will be done with the examdatabase, since we still have the scripts that were used for the exam table. We can easily create a new database "examdatabase2", along with a new exam table, by executing the following script from the bash shell:

$ sqlite3 examdatabase2 < examScript
$ sqlite3 examdatabase2 < examLOG
$ sqlite3 examdatabase2 "insert into exam (ln,fn,exam,score) values ('Carter','Sue',1,89);
insert into exam (ln,fn,exam,score) values ('Carter','Sue',2,100);"

$ sqlite3 examdatabase2 "select * from exam"
1|Sue|Carter|1|89|2004-10-02 16:04:12
2|Sue|Carter|2|100|2004-10-02 16:04:12

To combine the two database files, use the attach command. The alias for examdatabase will be e1, and the alias for examdatabase2 will be e2. The shorter names will come in handy when the tables are joined with the union clause (a standard SQL command).

After the "attach" database command is performed, the ".database" command can be used to show the location of the individual database files. The location follows the alias. See the example below.

$ sqlite3
SQLite version 3.0.8
Enter ".help" for instructions
sqlite> attach database 'examdatabase' as e1;
sqlite> attach database 'examdatabase2' as e2;
sqlite> .database
seq name file
--- --------------- ----------------------------------------------------------
0 main
2 e1 /work/cpearls/src/posted_on_sf/sqlite_examples/sqlite_exam
3 e2 /work/cpearls/src/posted_on_sf/sqlite_examples/sqlite_exam
sqlite>

To select all data from both tables, perform the union of two select statements as demonstrated below. Note that by adding 'e1' and 'e2' to the respective selects, it is possible to identify which database the returned records are coming from.

sqlite> select 'e1',* from e1.exam union select 'e2',* from e2.exam;

e1|1|Bob|Anderson|1|75|2004-10-02 15:25:00
e1|2|Bob|Anderson|2|82|2004-10-02 15:33:16
e2|1|Sue|Carter|1|89|2004-10-02 16:04:12
e2|2|Sue|Carter|2|100|2004-10-02 16:04:12

To summarize: A query was performed on two tables that resided in separate databases. This union created the virtual table. The select syntax is as follows: SELECT FROM

. For the table option, we have used the complete string "(select 'e1' as db,* from e1.exam union select 'e2' as db,* from e2.exam)", which is our virtual table.


Here is a query example performed on this virtual table. Suppose you wanted the maximum score by exam across databases.

sqlite> select exam,max(score) from
(select 'e1' as db,* from e1.exam union select 'e2' as db,* from e2.exam)
group by exam;

1|89
2|100

No problem. You got the maximum score for each exam, but who does it below to? Find the ln and fn, but be careful; if you add "ln" and "fn" to the first part of the select, you will get the wrong answer.

sqlite> select exam,max(score),ln,fn from
(select 'e1' as db,* from e1.exam union select 'e2' as db,* from e2.exam)
group by exam;

** THIS IS INCORRECT; it should be Carter|Sue. **

1|89|Anderson|Bob
2|100|Anderson|Bob

"Anderson", "Bob" happens to be the name that dropped down in this select statement. It is not the correct answer. If, by chance, you got the correct answer by doing this query, it is because you entered the names in a different order. If that is the case, perform the query below, which takes the min(score) and gets an error on one of these examples.

Here, the min(score) is queried. By chance, because of the order in which data was entered into this table, the correct answer is displayed.

sqlite> select exam,min(score),ln,fn from
(select 'e1' as db,* from e1.exam union select 'e2' as db,* from e2.exam)
group by exam;

** correct answer -- just chance **

1|75|Anderson|Bob
2|82|Anderson|Bob

Clearly, there needs to be a better way of finding out who got the maximum and minimum scores for each exam. Here is the correct SQL statement which will always give the correct answer:

sqlite> select db,ln,fn,exam,score from
(select 'e1' as db,* from e1.exam union select 'e2' as db,* from e2.exam)
where
(
score=(
select max(score) from
(select 'e1' as db,* from e1.exam union select 'e2' as db,* from e2.exam)
where exam=1
)
and exam = 1
)
OR
(
score=(
select max(score) from
(select 'e1' as db,* from e1.exam union select 'e2' as db,* from e2.exam)
where exam=2
)
and exam = 2
) ;

e2|Carter|Sue|1|89
e2|Carter|Sue|2|100


Or it can be done as two independent select statements as follows:

sqlite> select db,ln,fn,exam,score from
(select 'e1' as db,* from e1.exam union select 'e2' as db,* from e2.exam)
where exam=1 order by score desc limit 1;

e2|Carter|Sue|1|89

sqlite> select db,ln,fn,exam,score from
(select 'e1' as db,* from e1.exam union select 'e2' as db,* from e2.exam)
where exam=2 order by score desc limit 1;

e2|Carter|Sue|2|100

A Pivot Table

What if you wanted a pivot table in which the scores are listed across the top as exam1,exam2,..examN for each person? For example:

fn ln exam1 exam2
--- -------- ----- -----
Bob Anderson 75 82
Sue Carter 89 100

Also, is there a way to display the deltas between exams, to have a fifth column that would show 7 points (82-75) or the delta between exam1 and exam2 and similar data for Sue Carter?

Such power select statements can be done with the sign function. And unlike the case statement, the sign function can be placed in the GROUP BY and HAVING expressions of a SELECT statement. For example, taking a look at the general syntax of the SELECT statement, the sign function can be used anywhere you see an expression or expression-list.

SELECT [ALL | DISTINCT] result [FROM table-list]
[WHERE expr]
[GROUP BY expr-list]
[HAVING expr]
[compound-op select]*
[ORDER BY sort-expr-list]
[LIMIT integer [( OFFSET | , ) integer]]

The sign function does not exist in sqlite, but that is not a problem, since we can easily create it.

As a side note, you may wonder why you should create the sign function. Instead, why not create an IF or IIF function? The main reason is that the IF statement is not standard on all databases, and, on some databases where it is standard (MySQL), it was created incorrectly. Yes, if you are a MySQL user, take a look at the following LONGWINDED TIP 1 for an example of MySQL's incorrect IF statement and how the sign function solves this problem.



Last Updated ( Sunday, 06 January 2008 )
 
< Prev   Next >