|
Miscellaneous Topics
Indexes
Indexes allow a DBMS to access data quicker (please note: this feature
is nonstandard/not available on all systems). The system creates this internal
data structure (the index) which causes selection of rows, when the selection is
based on indexed columns, to occur faster. This index tells the DBMS where a
certain row is in the table given an indexed-column value, much like a book
index tells you what page a given word appears. Let's create an index for the
OwnerID in the AntiqueOwners column:
CREATE INDEX OID_IDX ON ANTIQUEOWNERS (OWNERID);
Now on the names:
CREATE INDEX NAME_IDX ON ANTIQUEOWNERS (OWNERLASTNAME, OWNERFIRSTNAME);
To get rid of an index, drop it:
DROP INDEX OID_IDX;
By the way, you can also "drop" a table, as well (careful!--that
means that your table is deleted). In the second example, the index is kept on
the two columns, aggregated together--strange behavior might occur in this
situation...check the manual before performing such an operation.
Some DBMS's do not enforce primary keys; in other words, the uniqueness of a
column is not enforced automatically. What that means is, if it was
tried to insert another row into the AntiqueOwners table with an OwnerID of 02,
some systems will allow this to to happen, even though we do not, as that column is
supposed to be unique to that table (every row value is supposed to be
different). One way to get around that is to create a unique index on the column
that we want to be a primary key, to force the system to enforce prohibition of
duplicates:
CREATE UNIQUE INDEX OID_IDX ON ANTIQUEOWNERS (OWNERID);
GROUP BY & HAVING
One special use of GROUP BY is to associate an aggregate function (especially
COUNT; counting the number of rows in each group) with groups of rows. First,
assume that the Antiques table has the Price column, and each row has a value
for that column. We want to see the price of the most expensive item bought by
each owner. We have to tell SQL to group each owner's purchases, and tell
us the maximum purchase price:
SELECT BUYERID, MAX(PRICE)
FROM ANTIQUES
GROUP BY BUYERID;
Now, say we only want to see the maximum purchase price if the purchase is
over $1000, so we use the HAVING clause:
SELECT BUYERID, MAX(PRICE)
FROM ANTIQUES
GROUP BY BUYERID
HAVING PRICE 1000;
More Subqueries
Another common usage of subqueries involves the use of operators to allow a
Where condition to include the Select output of a subquery. First, list the
buyers who purchased an expensive item (the Price of the item is $100 greater
than the average price of all items purchased):
SELECT BUYERID
FROM ANTIQUES
WHERE PRICE >
(SELECT AVG(PRICE) + 100
FROM ANTIQUES);
The subquery calculates the average Price, plus $100, and using that figure, an
OwnerID is printed for every item costing over that figure. One could use
DISTINCT BUYERID, to eliminate duplicates.
List the Last Names of those in the AntiqueOwners table, ONLY if they have
bought an item:
SELECT OWNERLASTNAME
FROM ANTIQUEOWNERS
WHERE OWNERID IN
(SELECT DISTINCT BUYERID
FROM ANTIQUES);
The subquery returns a list of buyers, and the Last Name is printed for an
Antique Owner if and only if the Owner's ID appears in the subquery list
(sometimes called a candidate list). Note: on some DBMS's, equals
can be used instead of IN, but for clarity's sake, since a set is returned from
the subquery, IN is the better choice.
For an Update example, we know that the gentleman who bought the bookcase has
the wrong First Name in the database...it should be John:
UPDATE ANTIQUEOWNERS
SET OWNERFIRSTNAME = 'John'
WHERE OWNERID =
(SELECT BUYERID
FROM ANTIQUES
WHERE ITEM = 'Bookcase');
First, the subquery finds the BuyerID for the person(s) who bought the Bookcase,
then the outer query updates his First Name.
Remember this rule about subqueries: when you have a subquery as part
of a WHERE condition, the Select clause in the subquery must have columns that
match in number and type to those in the Where clause of the outer query. In
other words, if you have "WHERE ColumnName = (SELECT...);",
the Select must have only one column in it, to match the ColumnName in the outer
Where clause, and they must match in type (both being integers, both
being character strings, etc.).
EXISTS & ALL
EXISTS uses a subquery as a condition, where the condition is True if the
subquery returns any rows, and False if the subquery does not return any rows;
this is a nonintuitive feature with few unique uses. However, if a prospective
customer wanted to see the list of Owners only if the shop dealt in Chairs, try:
SELECT OWNERFIRSTNAME, OWNERLASTNAME
FROM ANTIQUEOWNERS
WHERE EXISTS
(SELECT *
FROM ANTIQUES
WHERE ITEM = 'Chair');
If there are any Chairs in the Antiques column, the subquery would return a row
or rows, making the EXISTS clause true, causing SQL to list the Antique Owners.
If there had been no Chairs, no rows would have been returned by the outside
query.
ALL is another unusual feature, as ALL queries can usually be done with
different, and possibly simpler methods; let's take a look at an example query:
SELECT BUYERID, ITEM
FROM ANTIQUES
WHERE PRICE = ALL
(SELECT PRICE
FROM ANTIQUES);
This will return the largest priced item (or more than one item if there is a
tie), and its buyer. The subquery returns a list of all Prices in the Antiques
table, and the outer query goes through each row of the Antiques table, and if
its Price is greater than or equal to every (or ALL) Prices in the list, it is
listed, giving the highest priced Item. The reason "=" must be used is
that the highest priced item will be equal to the highest price on the list,
because this Item is in the Price list.
|