CS246 Homework 5b: Unix Utilities
Due: Wednesday, March 21
Things to note:
- This is an individual assignment
What to hand in:
- Paper printout of answers/explanations and results (for the sake of trees, print only the top 50 or so results for question 3)
- Electronic text file consisting of Unix commands (the ones you used to obtain the results), copied to /home/dxu/submissions/cs246
The magic of utilities and pipes [50pts]
Based on the baby names data files, find the appropriate combinations of Unix utility programs to answer the following questions. Only Unix commands, pipes, and redirections are allowed. No shell scripts or C programs. For each question, record the commands you used to get the answer.
- Find the top 25 female names of 2010
- Find the bottom 25 male names of 2010
- Find all female names used from 1900-2010, sorted alphabetically, with no repeats
- How many were there in question 3?
- Find all unique male variations on the name "John"
- Find all unique female variations on the name "John"
- Find all unique variations on the name "John", male and female. Can you do this with one command line (i.e. just one return)? Can you do this without storing intermediate results?
- Find all female/male names that start with the first letter of your name and end with the last letter of your name (1900-2010).
- How many babies have the same name as you (pick a name to match if yours isn't in the files) in the last decade (2000-2009)?
- Find the total number of female babies contributing to these statistics files (1900-2010) (you might want to look into the Unix utility gawk)
- Find the total number of unique male names (1900-2010).
- Find the top 25 male names used from 1900-2010. Can this be done easily with Unix utility programs alone? Why or why not?
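
As a rough illustration of what "combinations of Unix utility programs" means, here is a minimal sketch that chains a few utilities with pipes. It runs on a small made-up sample file, not on the real data. The "name,sex,count" layout and the sample records are assumptions for illustration only; the actual baby names files may be laid out differently, so check them before adapting anything.

```shell
# Hypothetical sample data: one "name,sex,count" record per line.
# This layout is an assumption; the real data files may differ.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Mary,F,120
Anna,F,95
John,M,150
Emma,F,130
Johnny,M,40
EOF

# Keep the female records, sort on the third field (numeric, descending),
# keep the first two lines, and print only the name field.
top_female=$(grep ',F,' "$sample" | sort -t, -k3,3nr | head -n 2 | cut -d, -f1)
echo "$top_female"

# Count the records whose name begins with "John".
john_count=$(grep -c '^John' "$sample")
echo "$john_count"

# Remove the temporary sample file.
rm -f "$sample"
```

Each stage reads the previous stage's output, so no intermediate files are needed. The command substitutions are there only to hold the results in this sketch; the same pipelines work when typed directly at the prompt.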