CSE 462 Project #1 (due 11:59 pm 10/15/2012)
You will be using the DBLP relational schema that will be discussed in the recitation. Submit your solutions as a single .zip file, containing THREE files: a text file with all SQL queries, a text file with all query results, and a jar file. Use submit_cse462.
Problem 1 (64pts): SQL Queries
Write the following queries in SQL2, defining appropriate views if necessary:
Query 1.1: List all the authors who have a paper that is cited by at least 100 other papers.
Query 1.2: How many entities have fewer than four authors?
Query 1.3: How many authors published at least 100 papers?
Query 1.4: Among the titles written by 3 or more authors, list the top 10 titles by the number of other titles that cite them, in descending order of that count.
Query 1.5: List all the authors who have the longest career span (i.e., the difference between the year of the author's latest article and the year of the author's earliest article).
Query 1.6: List the titles of publications that cite the maximum number of other papers.
Query 1.7: List all authors who coauthored with at least 500 other authors.
Query 1.8: List all authors who published a paper that cites at least 2 other papers of their own.
Run the queries against the DBLP database, and report the results.
Problem 2 (36pts): Top K
Introduction. Given a list of students and each student's grades on N tests, we say that student A's grade is higher than student B's grade if and only if the average grade of A is greater than the average grade of B.
Given an input K, we want to output the top K students who have the highest grades.
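The ordering and the top-K selection described above can be illustrated with a small in-memory sketch. This is only one possible approach under stated assumptions, not the required solution: the class name TopKStudents, the helper methods average and topK, and the map of names to grade arrays are all hypothetical stand-ins (in the project itself the data would come from the database). The sketch keeps a min-heap of at most K students, so the student with the lowest average among the current candidates is discarded whenever the heap grows past K.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper class; the name and method signatures are assumptions.
final class TopKStudents {

    // Pairs a student name with that student's average grade.
    private static final class Scored {
        final String name;
        final double avg;

        Scored(String name, double avg) {
            this.name = name;
            this.avg = avg;
        }
    }

    // Average of one student's test grades; an empty array is treated as 0.
    static double average(double[] grades) {
        if (grades.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (double g : grades) {
            sum += g;
        }
        return sum / grades.length;
    }

    // Returns the names of up to k students with the highest averages,
    // best first. Ties are broken arbitrarily.
    static List<String> topK(Map<String, double[]> gradesByStudent, int k) {
        List<String> result = new ArrayList<>();
        if (k <= 0) {
            return result;
        }
        // Min-heap ordered by average: the head is the weakest candidate.
        PriorityQueue<Scored> heap =
            new PriorityQueue<Scored>((a, b) -> Double.compare(a.avg, b.avg));
        for (Map.Entry<String, double[]> e : gradesByStudent.entrySet()) {
            heap.add(new Scored(e.getKey(), average(e.getValue())));
            if (heap.size() > k) {
                heap.poll(); // drop the current lowest average
            }
        }
        // The heap yields the lowest average first, so insert at the front.
        while (!heap.isEmpty()) {
            result.add(0, heap.poll().name);
        }
        return result;
    }
}
```

If there are fewer than K students, the heap never exceeds its limit and every student is returned. Sorting all students by average and taking the first K would work equally well; the heap simply avoids holding more than K candidates at a time.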
Requirements. Your assignment is to write a class in Java that computes the top K students who have the highest grades for a given K.
In order to help you test your implementation, the relation StudentGrade will be created for you. The schema of this relation is as follows:
StudentGrade(studentName,test1,test2,…,test10) where studentName is the complete name of the student; test1,…,test10 correspond to the grades of test 1, test 2,…,test 10.
The code for connecting to Oracle using JDBC will be discussed in the recitation to help you get started.
Observations. Obviously if StudentGrade contains fewer than K tuples, all students are in the top K students who have the highest grades. If two students have the same average grade, you may choose between them arbitrarily (breaking ties