Lab 01 assignment (20 pts)#
UW Geospatial Data Analysis
CEE467/CEWA567
David Shean, Eric Gagliano, Quinn Brencher
Introduction#
Lab 1 is focused on understanding the potential of geospatial data, navigating the command line, getting comfortable with using Python in Jupyter notebooks, and using git and GitHub to manage your work.
In the first section of this lab, we will see examples of geospatial data and preview some of the applications we will get to later in the quarter. In the second section, we’ll get more familiar with navigating the command line. In the third and fourth sections, we’ll practice the basics of using Python in Jupyter notebooks. Finally, we will practice using git and GitHub to submit your work.
Instructions#
Please go through all cells in the notebook sequentially, making sure to complete all instructions
Some answers are left in as a guide–you still need to fill in the code to produce the output on your own!
Answers should be produced using code unless otherwise noted by “Written response”
Follow submission instructions closely, making sure you save and submit your notebook with all your outputs preserved
Part 1: Geospatial data gallery (4 pts)#
This part of the lab should require no code. Use the Geospatial data examples notebook to answer the following questions. No need for super lengthy written answers!
Vector data#
1a) Written response: Choose one of the vector dataset examples. Please answer the following.#
Which dataset did you choose, and how might it be used in practice?
How is the data structured–generally, what do the rows and columns represent and how many are there of each?
What do you think the geometry column is for, and how could it be useful to us?
STUDENT WRITTEN RESPONSE HERE
Raster data#
1b) Written response: Choose one of the raster dataset examples. Please answer the following.#
Which dataset did you choose, and how might it be used in practice?
How is the data structured–generally, what do the rows and columns represent and how many are there of each?
How do you think geometry information is stored?
STUDENT WRITTEN RESPONSE HERE
nDarray data#
1c) Written response: Choose one of the nDarray dataset examples. Please answer the following.#
Which dataset did you choose, and how might it be used in practice?
How is the data structured–generally, what do the dimensions represent and how many are there of each?
How do you think geometry information is stored?
Could this data be represented as a raster? What do you think is the relationship between raster data and nDarray data?
STUDENT WRITTEN RESPONSE HERE
1d) Written response: Consider how vector, raster, and nDarray data can complement each other in geospatial analysis. For some topic of interest, imagine a project that would benefit from using all three of these types of geospatial data. Briefly describe what each imaginary dataset would represent and how they would work together to help answer your research question.#
STUDENT WRITTEN RESPONSE HERE
Part 3: Python time! A play on words (8 pts)#
3a) Define a variable to store the path to the words file#
Can be absolute or relative path (try both!): https://www.geeksforgeeks.org/absolute-relative-pathnames-unix/
Note: Can use %pwd (print working directory, similar to the pwd shell command) to get the current directory path.
When defining paths in IPython, use /home/jovyan instead of the ~ shortcut for your home directory
The path should be a string, enclosed in single quotes: '/path/to/some/file.txt'
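As a minimal sketch of the difference between absolute and relative paths, the snippet below uses a hypothetical file location (data/example.txt under /home/jovyan, not the actual words file) and assumes a Unix-style file system.

```python
import os

# Hypothetical location, for illustration only (not the real words file)
abs_path = '/home/jovyan/data/example.txt'

# The same file expressed relative to a starting directory
rel_path = os.path.relpath(abs_path, start='/home/jovyan')

print(rel_path)
print(os.path.isabs(abs_path))   # absolute paths start at the root '/'
print(os.path.isabs(rel_path))   # relative paths depend on the current directory

# os.path.expanduser replaces a leading '~' with your home directory
print(os.path.expanduser('~'))
```

A relative path only resolves correctly from the directory it was written for, so check %pwd before relying on one.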
# STUDENT CODE HERE
3b) Use Python to read this file and populate a list of strings containing all words#
Use the basic Python open function here, even if you know how to do this with other modules
Note: you will need to handle the newline string '\n' at the end of each word
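To see why the newline needs handling, here is a small self-contained sketch. It writes a three-line stand-in file (hypothetical contents and file name) to a temporary directory and reads it back with open, so it does not touch the real words file.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical stand-in file with one word per line
    demo_path = os.path.join(tmp_dir, 'demo_words.txt')
    with open(demo_path, 'w') as f:
        f.write('alpha\nbeta\ngamma\n')

    # readlines() keeps the trailing '\n' on every line
    with open(demo_path) as f:
        raw_lines = f.readlines()

# Strip the trailing newline from each line
demo_words = [line.rstrip('\n') for line in raw_lines]

print(raw_lines)
print(demo_words)
```

Comparing the two printed lists shows the '\n' that would otherwise inflate every character count by one.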
# STUDENT CODE HERE
3c) How many words are there in the list? How many characters are in the first word of the list?#
# STUDENT CODE HERE
235886
# STUDENT CODE HERE
3d) What is the total number of characters for all words in the list?#
Can use list comprehension here to loop through all words
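A list comprehension builds a new list by applying an expression to every item of an existing one. The sketch below uses a made-up three-word list rather than the full words list.

```python
# Made-up list, for illustration only
demo_words = ['alpha', 'beta', 'gamma']

# One length per word
lengths = [len(word) for word in demo_words]

# Sum the per-word lengths to get the total character count
total_chars = sum(lengths)

print(len(demo_words))
print(lengths)
print(total_chars)
```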
# STUDENT CODE HERE
2257223
3e) What is the longest word? And how many characters are in the longest word?#
# STUDENT CODE HERE
# STUDENT CODE HERE
24
3f) Print the first 3 words, print the last 3 words#
Use relative list indices for slicing: https://stackoverflow.com/questions/509211/understanding-slice-notation
Note that the output is still a list object
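The slice patterns mentioned above can be tried on a short made-up list first. Negative indices count from the end, and an optional third value sets the step.

```python
# Made-up list, for illustration only
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

first_two = letters[:2]      # items at index 0 and 1
last_two = letters[-2:]      # the final two items
every_third = letters[::3]   # index 0, 3, 6

print(first_two)
print(last_two)
print(every_third)
print(type(first_two))       # a slice of a list is still a list
```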
# STUDENT CODE HERE
['A', 'a', 'aa']
['zythum', 'Zyzomys', 'Zyzzogeton']
3g) Define a function that will concatenate an input list of strings. Run your function four separate times below (Use indexing on the words list, don’t copy/paste strings from the list)#
Your function should return a single string (with no spaces)
This function should accept an input list with arbitrary length as an argument
So return inlist[0]+inlist[1]+inlist[2] won’t work
Example input: ['Geospatial', 'Data', 'Analysis']
Example output: 'GeospatialDataAnalysis'
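To make the problem with hard-coded indices concrete, the sketch below defines a deliberately limited function (the name concat_first_three is hypothetical) and shows how it behaves on lists of other lengths. It is a counterexample, not a solution.

```python
def concat_first_three(inlist):
    # Hard-coded indices: only correct when the list has exactly three items
    return inlist[0] + inlist[1] + inlist[2]

# Works for a three-item list
three_items = concat_first_three(['Geospatial', 'Data', 'Analysis'])
print(three_items)

# A shorter list raises an IndexError
try:
    concat_first_three(['Geospatial', 'Data'])
    short_list_failed = False
except IndexError as err:
    short_list_failed = True
    print('IndexError:', err)

# A longer list silently drops everything after the third item
four_items = concat_first_three(['a', 'b', 'c', 'd'])
print(four_items)
```

Your function should avoid both failure modes by looping over (or otherwise consuming) every item in the input list.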
# STUDENT CODE HERE
Pass in a list of the first 3 words
# STUDENT CODE HERE
'Aaaa'
Pass in a list of the first 5 words
# STUDENT CODE HERE
Aaaaaalaalii
Pass in a list of the last 3 words
# STUDENT CODE HERE
zythumZyzomysZyzzogeton
Pass in a list of the 1st, 3rd, 5th, and 7th word (don’t pass the words in separately!)
# STUDENT CODE HERE
AaaaaliiAani
3h) Does your list contain the nickname for the UW mascot? If so, what is the numerical index for that word?#
If you don’t know our mascot, ask a neighbor! Be careful about case
This should be a simple boolean statement
Double check your index by printing the word at that index
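Membership tests and index lookups are both case-sensitive. The sketch below demonstrates this on a made-up list, not the words list.

```python
# Made-up list, for illustration only
demo_words = ['Alpha', 'alpha', 'beta']

has_lower = 'alpha' in demo_words   # exact match exists
has_upper = 'ALPHA' in demo_words   # no exact match, so False
print(has_lower, has_upper)

# index() returns the position of the first exact match
idx = demo_words.index('alpha')
print(idx, demo_words[idx])
```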
# STUDENT CODE HERE
True
# STUDENT CODE HERE
58209
dubs
Part 4: Letter counter (6 pts)#
4a) How many words begin with each letter of the alphabet (case-insensitive)?#
Hint: Python has a built-in string of lowercase letters stored as string.ascii_lowercase (in the string module, so you need to import it first!). Also, all string objects have methods that can change the case: https://docs.python.org/2.5/lib/string-methods.html
Hint: One possible approach could use nested loops:
Loop through each letter
Initialize some count variable or empty list
Loop through each word in the list of words
Check to see if the word starts with the letter (careful about case!)
If it does, increment your counter or append the word to your list
Print out the letter and the total count of words that met your criterion
Another possible approach could use a dictionary:
Create a new dictionary with a key for each lowercase letter
Initialize a counter for each value in the dictionary
Loop through words and increment the appropriate counter
If you want, try to implement both - which one is faster?
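One way to compare the speed of two implementations is the timeit module from the standard library. The sketch below times two stand-in functions (hypothetical names, summing a list of numbers) rather than the letter-counting approaches themselves. Absolute timings will vary by machine.

```python
import timeit

values = list(range(1000))

def total_with_loop():
    # Explicit Python loop
    total = 0
    for v in values:
        total += v
    return total

def total_with_builtin():
    # Same result using the built-in sum
    return sum(values)

# Run each function 200 times and report the total elapsed seconds
loop_time = timeit.timeit(total_with_loop, number=200)
builtin_time = timeit.timeit(total_with_builtin, number=200)

print(f'loop: {loop_time:.4f} s, built-in: {builtin_time:.4f} s')
```

In a notebook, the %timeit magic offers a shorter way to run the same kind of comparison on a single line of code.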
# STUDENT CODE HERE
# STUDENT CODE HERE
4b) What is the most common first letter? Use string formatting to print your answer#
While it is possible to just look at the output counts above, try to do this with code
If the above results are stored in a dictionary or lists, this should only require 1-2 lines of code - no need for additional loops
Output should be something like: “The most common first letter in words is ‘a’ with 17096 occurrences”
Note that ‘a’ is not the correct answer - only 25 other possibilities to consider!
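If counts are stored in a dictionary, the built-in max function can find the key with the largest value without an explicit loop. The sketch below uses a made-up dictionary (demo_counts is hypothetical) and an f-string for formatting.

```python
# Made-up counts, for illustration only
demo_counts = {'x': 3, 'y': 7, 'z': 5}

# key=demo_counts.get makes max compare the values instead of the keys
top_key = max(demo_counts, key=demo_counts.get)
top_count = demo_counts[top_key]

message = f"The most common key is '{top_key}' with {top_count} occurrences"
print(message)
```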
# STUDENT CODE HERE
"The most common first letter is 's' with 25162 occurrences"
Challenge question: Create a plot of letter counts (GS: Required / UG: +1 pts)#
We haven’t talked about matplotlib or other plotting libraries yet, but if you already feel pretty comfortable plotting, create a visualization of your output counts. A bar plot (AKA histogram when counts are involved) might be a good choice
# STUDENT CODE HERE
Submit your work#
Save this notebook with all code and output (Make sure when you save the notebook, all cells show their outputs).
Use the terminal to stage, commit, and push your notebook to your GitHub repository. It should look something like this…
git add 01_lab.ipynb
git commit -m "Completed Lab 01 exercises"
git push
Verify that your notebook appears in your GitHub repository. Double check to make sure all the outputs are visible!