class: center, middle # Introduction to MySQL<br/>Session III ## Oct 16, 2014<br/>Coalition for Queens #### Dan Goldin<br/>dan@dangoldin.com --- # Agenda 1. Review GROUP BY/JOINs 2. Subqueries in an IN 3. Subselect JOINs 4. Exercises --- # GROUP BY - Allows us to aggregate and group our data * SELECT [fields], [aggregate functions] FROM [table] GROUP BY [fields]; --- # GROUP BY - We need to make sure that what we're SELECTing matches what we're GROUPing BY ```sql SELECT week, SUM(points) AS total_points, SUM(passing_tds) as total_passing_tds FROM stats GROUP BY week; ``` --- # GROUP BY <div style="float:left; margin-right: 60px;"> <h4>Raw data</h4> <table class="table"><thead><tr><th>week</th><th>player_id</th><th>points</th></tr></thead><tbody><tr><td>5</td><td>2</td><td>16.81</td></tr><tr><td>5</td><td>1</td><td>13.51</td></tr><tr><td>5</td><td>3</td><td>13.54</td></tr><tr><td>5</td><td>4</td><td>12.81</td></tr><tr><td>6</td><td>2</td><td>20.72</td></tr><tr><td>6</td><td>1</td><td>14.2</td></tr><tr><td>6</td><td>4</td><td>14.55</td></tr><tr><td>6</td><td>3</td><td>13.85</td></tr><tr><td>7</td><td>2</td><td>16.96</td></tr><tr><td>7</td><td>3</td><td>15.52</td></tr><tr><td>7</td><td>4</td><td>15.46</td></tr><tr><td>7</td><td>1</td><td>14.02</td></tr><tr><td>8</td><td>2</td><td>20.94</td></tr><tr><td>8</td><td>1</td><td>14.43</td></tr><tr><td>8</td><td>3</td><td>13.25</td></tr><tr><td>8</td><td>4</td><td>12.66</td></tr></tbody></table> </div> <div style="float:left; margin-right: 60px;"> <h4>GROUP BY week</h4> <table class="table"><thead><tr><th>week</th><th>total_points</th></tr></thead><tbody><tr><td>5</td><td>56.67</td></tr><tr><td>6</td><td>63.32</td></tr><tr><td>7</td><td>61.96</td></tr><tr><td>8</td><td>61.28</td></tr></tbody></table> </div> <div style="float:left;"> <h4>GROUP BY player_id</h4> <table class="table"><thead><tr><th>player_id</th><th>total_points</th></tr></thead><tbody><tr><td>1</td><td>56.16</td></tr><tr><td>2</td><td>75.43</td></tr><tr><td>3</td><td>56.16</td></tr><tr><td>4</td><td>55.48</td></tr></tbody></table> </div> --- # Exercises - Describe the columns and data returned by the following query: ```sql SELECT name AS player, count(*) AS count FROM players GROUP BY name; ``` --- # Exercises - Describe the columns and data returned by the following query: ```sql SELECT week, count(*) AS num_games FROM schedule WHERE home_id IN (1,2,3,4,5,6) GROUP BY week; ``` --- # Exercises - Describe the columns and data returned by the following query: ```sql SELECT week, sum(rushing_tds) AS total_rushing_tds, max(rushing_tds) AS max_rushing_tds, sum(receiving_tds) AS total_receiving_tds, max(receiving_tds) AS max_receiving_tds FROM stats WHERE points > 0 GROUP BY week; ``` --- # Exercises - Now write queries to do the following: * Retrieve the number of total projected points by week for all weeks past 11. * Retrieve the number of players in our dataset for each team (team_id). --- # JOIN - Allows us to merge data from multiple tables (Think VLOOKUP in Excel) * SELECT [fields] FROM [table1] JOIN [table2] ON [join condition]; --- # JOIN - Without a join condition, we'll get the combination of everything. - Question: If we join the following two tables, how many rows will the resulting table have? <div style="float:left; margin-right: 60px;"> <h4>positions</h4> <table class="table"><thead><tr><th>id</th><th>name</th></tr></thead><tbody><tr><td>1</td><td>QB</td></tr><tr><td>2</td><td>RB</td></tr><tr><td>3</td><td>TE</td></tr><tr><td>4</td><td>WR</td></tr></tbody></table> </div> <div style="float:left; margin-right: 60px;"> <h4>players</h4> <table class="table"><thead><tr><th>id</th><th>name</th><th>position_id</th><th>team_id</th></tr></thead><tbody><tr><td>1</td><td>Carson Palmer</td><td>1</td><td>1</td></tr><tr><td>2</td><td>Matt Ryan</td><td>1</td><td>2</td></tr><tr><td>3</td><td>Joe Flacco</td><td>1</td><td>3</td></tr><tr><td>4</td><td>EJ Manuel</td><td>1</td><td>4</td></tr><tr><td>5</td><td>Cam Newton</td><td>1</td><td>5</td></tr></tbody></table> </div> --- # JOIN - To make it easier to write complicated queries, we can also alias table names. * Gives tables a shorter name to reference throughout a query * Allows us to include the same table multiple times --- # JOIN ```sql SELECT week, name, name FROM schedule JOIN teams ON schedule.home_id = teams.id JOIN teams ON schedule.away_id = teams.id; ``` ```sql SELECT s.week, t1.name AS home_team, t2.name AS away_team FROM schedule AS s JOIN teams AS t1 ON s.home_id = t1.id JOIN teams AS t2 ON s.away_id = t2.id; ``` --- # Exercises - Describe the columns and data returned by the following queries: ```sql SELECT p.name AS player_name, t.name AS team_name FROM players p JOIN teams t ON p.team_id = t.id ``` --- # Exercises - Describe the columns and data returned by the following queries: ```sql SELECT p.name AS player_name, q.name AS position FROM players p JOIN positions q ON p.position_id = q.id WHERE q.name in ('WR', 'TE'); ``` --- # Exercises - Describe the columns and data returned by the following queries: ```sql SELECT t.name AS team_name, count(*) as num_players FROM teams t JOIN players p ON t.id = p.team_id GROUP BY t.name ORDER BY num_players DESC; ``` --- # Exercises - Describe the columns and data returned by the following queries: ```sql SELECT t.name, sum(s.points) FROM stats s JOIN players p on p.id = s.player_id JOIN teams t on p.team_id = t.id GROUP BY t.name; ``` --- # Exercises - Now write queries to do the following: * Retrieve the name of every home team playing in week 9. * Retrieve the name of every away team playing in week 6. --- # Subqueries - Use a query within another query * SELECT [fields] FROM [table] WHERE [field] IN ([other query]); * SELECT [fields] FROM [table] JOIN ([other query]) ON [join condition]; --- # Exercises - Which is the inner query? What does it do? - Describe the columns and data returned by the following queries: ```sql SELECT name FROM players WHERE team_id IN ( SELECT id FROM teams WHERE name in ('NYJ', 'NYG') ); ``` --- # Exercises - Which is the inner query? What does it do? - Describe the columns and data returned by the following queries: ```sql SELECT sum(points) FROM stats WHERE week IN ( SELECT week FROM schedule GROUP BY week HAVING count(*) < 16 ); ``` --- # Exercises - Which is the inner query? What does it do? - Describe the columns and data returned by the following queries: ```sql SELECT p.name, s.total_points FROM players AS p JOIN ( SELECT player_id, SUM(points) as total_points FROM stats GROUP BY player_id HAVING total_points > 100 ) AS s on p.id = s.player_id; ``` --- # Exercises - Which is the inner query? What does it do? - Describe the columns and data returned by the following queries: ```sql SELECT t.name as team_name FROM teams AS t JOIN ( SELECT team_id, count(1) as num_players FROM players GROUP BY team_id HAVING num_players > 10 ) AS p on p.team_id = t.id; ``` --- Exercises - Write queries (any way you want) to do the following: * Find the total projected points scored by each position. * Find the total projected points scored by each position by week. * Find the total projected points scored by each position by week for weeks > 10. --- # Different join types <img src="http://i.stack.imgur.com/1UKp7.png" alt="Join types" style="height:400px"> From: [Visual Representation of Joins](http://www.codeproject.com/Articles/33052/Visual-Representation-of-SQL-Joins)