Due: Wednesday, February 29, 2012 6pm
What to hand in:
rank Male_name Male_number Female_name Female_number
where,
rank           The ranking of the names on this line
Male_name      A male name of this rank
Male_number    Number of males with this name
Female_name    A female name of this rank
Female_number  Number of females with this name
This is the format of the database files obtained from the U.S. Social Security Administration, each listing the top 1000 registered baby names. Each line begins with the rank, followed by the male name at that rank, the number of males with that name, the female name at that rank, and the number of females with that name. Here is an example file containing data from the year 2002:
1 Jacob 30122 Emily 24262
2 Michael 28119 Madison 21546
3 Joshua 25859 Hannah 18559
4 Matthew 24831 Emma 16324
5 Ethan 21949 Alexis 15411
6 Joseph 21766 Ashley 15217
7 Andrew 21696 Abigail 15155
8 Christopher 21676 Sarah 14564
9 Daniel 21186 Samantha 14540
10 Nicholas 21148 Olivia 14481
...
996 Edgardo 158 Jazmyne 222
997 Garett 158 Libby 222
998 Gerard 158 Nyasia 222
999 Ryley 158 Kari 221
1000 Braulio 157 Keeley 221
As you can see from the above, in 2002 there were 30,122 male babies named Jacob and 24,262 female babies named Emily, making them the most popular male and female names that year. Similarly, going down the list, we see that there were 221 newborn females named Kari, making it the 999th most popular female name.
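The line format above maps directly onto a single sscanf call. The following is a minimal sketch, not a required design: the struct name_line type, the parse_name_line function, and the MAX_NAME_LEN limit of 32 are all assumptions made for illustration.

```c
#include <stdio.h>

#define MAX_NAME_LEN 32   /* assumed upper bound on name length, including '\0' */

/* One line of a database file: a rank plus the male and female entries. */
struct name_line {
    int  rank;
    char male_name[MAX_NAME_LEN];
    int  male_number;
    char female_name[MAX_NAME_LEN];
    int  female_number;
};

/* Parse one line such as "1 Jacob 30122 Emily 24262".
 * Returns 1 if all five fields were read, 0 otherwise. */
int parse_name_line(const char *text, struct name_line *out)
{
    /* %31s stops each name at MAX_NAME_LEN - 1 characters. */
    int fields = sscanf(text, "%d %31s %d %31s %d",
                        &out->rank,
                        out->male_name, &out->male_number,
                        out->female_name, &out->female_number);
    return fields == 5;
}
```

Checking the return value lets the caller skip lines that do not fit the format, such as a blank line at the end of a file.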
Your program should output the number of times the following names appear in each database file, as well as in total: Mary (female), Mercedes (female), Precious (female), George (male and female), Jacob (male), and YOUR_NAME. There are three statistics you need to obtain for each name: the rank, the number of times used, and the percentage.
Run the program on all the files in the Names directory, which can be found in /home/dxu/handouts/cs246, and fill out the following table. An example entry is filled in: it shows that, of the names given to babies in 1900, Mary ranked first among female names, with 24,455 babies, or 5.6205% of the total female babies.
Name            | 1900            | 1910          | ...           | Total
                | rank number %   | rank number % | rank number % |
Mary            | 1 24455 5.6205  |               |               |
Mercedes        |                 |               |               |
Precious        |                 |               |               |
George (male)   |                 |               |               |
George (female) |                 |               |               |
Jacob           |                 |               |               |
_YOU_           |                 |               |               |
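The percentage column can be computed from two counts. The sketch below assumes that the denominator is the sum of the Male_number (or Female_number) values over every line of one file; the handout does not spell out how the total is defined, so check that assumption before relying on it. The function name name_percentage and its parameter names are hypothetical.

```c
/* Percentage of babies of one sex who were given a particular name.
 * name_count: number of babies with the name.
 * sex_total:  total babies of that sex (assumed: the sum over one file).
 * Returns 0.0 when the total is not positive, to avoid dividing by zero. */
double name_percentage(long name_count, long sex_total)
{
    if (sex_total <= 0)
        return 0.0;
    return 100.0 * (double)name_count / (double)sex_total;
}
```

Printing the result with a format such as "%.4f" gives four decimal places, matching the example entry in the table.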
You will have to "hard code" the names you are searching for in your program, as well as the database file names. Later, we will see how you can make the program independent of this.
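One way to keep the hard-coded values in a single place is a pair of constant arrays sized with #define. This is only a sketch: the file names shown are hypothetical placeholders (use the actual names found in the Names directory), NUM_FILES is set to 2 purely for illustration, and the struct target type is an assumption.

```c
#define NUM_NAMES 7
#define NUM_FILES 2   /* illustration only; set to the real number of files */

enum sex { MALE, FEMALE };

/* A name to search for, together with the column it should be looked up in. */
struct target {
    const char *name;
    enum sex    sex;
};

static const struct target targets[NUM_NAMES] = {
    { "Mary",      FEMALE },
    { "Mercedes",  FEMALE },
    { "Precious",  FEMALE },
    { "George",    MALE   },
    { "George",    FEMALE },
    { "Jacob",     MALE   },
    { "YOUR_NAME", MALE   }   /* placeholder: substitute your own name and sex */
};

/* Hypothetical file names; replace with the real ones. */
static const char *const data_files[NUM_FILES] = {
    "names1900.txt",
    "names1910.txt"
};
```

Looping over targets and data_files keeps the search logic free of string literals, so changing the list later touches only these two arrays.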
Calculating total rank is non-trivial. Think through your data structure and algorithm needs before you start. Remember to use #define to help make your code readable.
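As one possible starting point, the sketch below assumes that the total rank of a name is its position when all names of the same sex are ordered by their counts summed over every file. That definition, the fixed-capacity table, the linear search, and all of the names used (struct totals_table, add_count, total_rank, MAX_NAMES) are assumptions for illustration, not the required approach. Keep one table per sex.

```c
#include <string.h>

#define MAX_NAME_LEN 32     /* assumed upper bound on name length */
#define MAX_NAMES    8192   /* assumed capacity for distinct names */

struct name_total {
    char name[MAX_NAME_LEN];
    long total;             /* count summed over all files seen so far */
};

struct totals_table {
    struct name_total entries[MAX_NAMES];
    int count;              /* number of entries in use */
};

/* Add count to the running total for name, creating the entry if needed.
 * Returns 0 on success, -1 if the table is full. */
int add_count(struct totals_table *t, const char *name, long count)
{
    int i;
    for (i = 0; i < t->count; i++) {
        if (strcmp(t->entries[i].name, name) == 0) {
            t->entries[i].total += count;
            return 0;
        }
    }
    if (t->count >= MAX_NAMES)
        return -1;
    strncpy(t->entries[t->count].name, name, MAX_NAME_LEN - 1);
    t->entries[t->count].name[MAX_NAME_LEN - 1] = '\0';
    t->entries[t->count].total = count;
    t->count++;
    return 0;
}

/* Rank of name = 1 + number of names with a strictly larger total.
 * Returns 0 if name is not in the table. */
int total_rank(const struct totals_table *t, const char *name)
{
    long mine = -1;
    int i, rank = 1;
    for (i = 0; i < t->count; i++)
        if (strcmp(t->entries[i].name, name) == 0)
            mine = t->entries[i].total;
    if (mine < 0)
        return 0;
    for (i = 0; i < t->count; i++)
        if (t->entries[i].total > mine)
            rank++;
    return rank;
}
```

Call add_count once per line of every file, then call total_rank for each name of interest. With this counting rule, names with equal totals share a rank. The table is large, so declare it static or global rather than as a local variable.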
Please pay attention to your program design and organization, as you will be graded on modularity and organization in addition to functionality.