From the “Pointless SQL Queries” file…

Somehow, the question came up on alt.usage.english of the distribution of author surnames (in people’s personal libraries) over the letters of the alphabet. Since I have a library database, I came up with the followng crazy query to answer this question:

SELECT first_letter, total, authors, (100.0 * "count" * auth_count / total)::NUMERIC(4,1) AS "%"
  FROM (SELECT first_letter, "count", COUNT(author.name) AS auth_count, string_agg(author.name, '; ') AS "authors" 
        FROM (SELECT author_num, UPPER(SUBSTR(author.name, 1, 1)) AS first_letter, COUNT(*) as "count", RANK() OVER (PARTITION BY UPPER(SUBSTR(author.name, 1, 1)) ORDER BY COUNT(*) DESC) AS "rank"
              FROM book_author JOIN author ON (author.number = book_author.author_num)
              GROUP BY author_num, author.name) AS foo
          JOIN author ON (author_num = author.number)
        WHERE "rank" = 1 AND "count" > 1
        GROUP BY first_letter, "count") AS bar
    RIGHT OUTER JOIN (SELECT UPPER(SUBSTR(author.name, 1, 1)) AS first_letter, COUNT(*) AS "total"
                      FROM book_author JOIN author ON (author.number = book_author.author_num)
                      GROUP BY first_letter) AS baz
      USING (first_letter)
 ORDER BY first_letter

See if you can figure out what that query (with doubly-nested subselects) is doing, and if there’s a more efficient way to write it. Here are the results:

first_letter total authors %
A 87 Asimov, Isaac 21.8
B 144 Bryson, Bill 9.7
C 88 Carey, Jacqueline; Czerneda, Julie E. 11.4
D 94 Duane, Diane 28.7
E 41 Editors of Cook’s Illustrated 24.4
F 103 Fenner, Cathy; Fenner, Arnie 44.7
G 87 Gould, Stephen Jay 24.1
H 99 Heinlein, Robert A. 13.1
I 1    
J 27 Jakubowski, Maxim; Jones, Diana Wynne 14.8
K 58 Kahn, David 10.3
L 158 Lackey, Mercedes 37.3
M 134 May, Julian 11.9
N 42 Norton, Andre 9.5
O 19 O’Donohoe, Nick 15.8
P 62 Pratchett, Terry 25.8
Q 1    
R 66 Raymo, Chet; Rombauer, Irma S. 12.1
S 159 Sacks, Oliver 5.0
T 55 Tolkien, J.R.R. 14.5
U 6 Unebasami, Kari; Unmacht, Robert 66.7
V 22 Vallejo, Doris 13.6
W 86 Walton, Jo 10.5
Y 8    
Z 6    
This entry was posted in Books, Computing and tagged . Bookmark the permalink.