Homework 1-3

Adapted from Brown University CSCI0931. Used with Permission.

Reminders

Setup

To start off, copy this spreadsheet into your google drive and rename it FirstLast_StudentID_HW1-3.

Task 1: Making a Good Spreadsheet

  • Look at the Bad Sheet tab. It shows the September prices for two items (Bread and Rice) and how many of each item we bought, and then the cost of the items, which is computed as D3*B3 + D4*B4, i.e., the number of breads times the price of bread, plus the number of rices times the price of rice. That total is shown in cell B6.

    Down below, we've got both September prices (copied from those above) and October prices, and a (correct) formula for the increase in price from September to October, which takes the total October cost, D3*C9 + D4*C10 and subtracts the total September cost, D3*B9 + D4*B10. Notice that we're using the costs from rows 9 and 10.

    Change the price of bread in cell B3 from $2.35 to $2.50. You'll see the September cost go up. But the cell labelled "Increase" (D9) won't change at all. Look at the formulas and explain why in cell E12.

  • Now look at the tab called Good Sheet. On this one, things are a little clearer: the only places the person looking at this sheet should change anything are highlighted in yellow (with a note at the top explaining this). The September prices in rows 9 and 10 are not copied from rows 3 and 4, but rather are defined by formulas that refer to rows 3 and 4: when the price of bread in B3 is changed, the price in B9 automatically updates. Note, too, that all dollar amounts have been formatted to display nicely. (That was true of the "Bad Sheet" tab as well.)

    The cells in this sheet can be divided into four classes:

    1. Labels: things that explain the contents of an adjacent cell or table-row or table-column.
    2. Parameters: the things marked in yellow, which might be changed by a viewer of the table.
    3. Computed values: All the numbers that aren't marked in yellow. They're values that are derived from the data by formulas.
    4. Data: There is no "data" in this example table except for the parameters. But given another set of data, for example, a table of legislative council votes, the "parameters" might be "which session" and "which vote", and the data would then be the votes of all the legislators: it's information that you don't have to type in, but which can be found (on the wab, in a file, ...) by using the parameters.

      The distinction between "data" and "parameters" is a fine one. If we use the example again of legislator votes, you might want to manually change which bill you're looking at (that's a parameter), but you probably would not manually change James Tien's vote to a 'No' on some bill (that is data).
    That categorization should apply, whenever possible, to every spreadsheet you produce in this class.
  • Now look at the tab called BadBetterGood. On this one, the first two columns import legislator names as we did in class. The key formula, in cell B1, is
    =importxml("http://www.legco.gov.hk/yr14-15/chinese/counmtg/voting/cm_vote_20141126.xml","//vote[1]/individual-votes/member/@name-en") You can see that the year and the date of the legislative session are in the url in two places -- once for the year, and another time for the date of the session. Now suppose you want to change the session that you are looking at -- for example, to the meeting on October 10-11, 2013. That would mean that you have to change both the year (13-14) and the date (20131009). If you forget one, then you will end up making a mistake in the URL. That's very error-prone and, thus, not a good design of a spreadsheet!

    Another thing to note. In this case, the senator names are "data", but there don't seem to be any parameters. Or, maybe, we can say that there are parameters -- it is embedded inside the importXML formula: the URL and the XPath.

    Now compare this with Columns D and E. Once again the data you can change is highlighted in yellow. There are several well-named entries (like "Which Year"), and even the item to extract has been separated out into its own field.

    There's a problem, though: the list of legislator names is missing.

    If you look at the "Target URL" entry (cell E8), I've written down the thing you should be trying to produce in the "URL" entry (cell E9). However, the formula that I've used for the URL doesn't produce the correct URL, as you can see. The same applies for "Target XPath" (cell E10) and "XPath" (cell E11).

    Modify the formula in E9 and E11) (you'll need several more & and several more strings!) to make it exactly reproduce the "Target URL" and the "Target XPath". When you do, the Query results should appear in E13:E82.

    Change the correct parameters to load up the legislators from the meeting on October 17, 2012. Does the right thing happen? (make sure you're loading in the English names and check for Joseph Lee's name -- somebody must have gotten a promotion!)

    The formula for the query is =importXML(E7,E5). That makes sense: E7 contains the URL, E5 contains the XPath. But we can do better.

  • Look at columns G and H. You'll need to fix the URL formula in H7, just as you did in columns D and E, but once you do, the results will show up, and will change when you change the "year" or "date". But look at the formula in H8:
    =importXML(url,xpath)
    That code works because I selected cell H7, went to Data > Named Ranges and named it "url"; I did the same thing with cell H5, naming it "xpath". After doing that, I'm allowed to refer to those cells by name rather than by letter-number coordinates. It makes the formula a good deal easier to read.

    In Cell I11, construct the XPath that will load up the votes from the legislators next to their names. Give the cell a reasonable name (e.g. "xpath_vote"). In Cell I13, type in the IMPORTXML command that will load up the votes for the legislators in Cells I13:I82.

  • Task 2

    Finally, we want to check in and make sure you are ok with this class.

    1. Go to the tab Feedback. In cell A2, please rate how you feel the class is going so far.
    2. Also, tell us how long it took you to finish this homework in cell A5.

    Handin

    Share the following files with PolyUCOMP1D04@gmail.com

  • FirstLast_StudentID_HW1-3
  • In the above file name, "FirstLast" should be replaced with your first and last name. If it is not, we will take off points. Make sure every task has been completed.