Using Python pandas in a Jupyter notebook, I am working with the following CSV files and can't figure out how to get the correct result:
sets.csv with the following columns:
set_num	name	year	theme_id	num_parts
And this file, themes.csv, with the following columns:
id	name	parent_id
Create a function called theme_by_year that takes as input a year (as an integer) and shows the theme ids and theme names (listed in order by theme id) that were in sets that year.
The column names must be id and name_themes (to differentiate between the name of a theme and the name of a set) in that order.
The index should be reset and go from 0 to n-1.
Each theme should only be listed once even if it appeared in more than one set from that year -- duplicate themes should be based on theme id and not name since there are some themes with the same name but with a different id.
Hint: It will help to think about merging the appropriate DataFrames to get this answer.
In [153]:
### The code ###
def theme_by_year(yr):
    new = sets[sets['year']==yr]
    new = list(set(list(sets['theme_id'])))
    name = []
    new.sort()
    for i in new:
        st = str(themes[themes['id']==i]['name'])
        st = st[1:]
        st= st.strip("\nName: name, dtype: object")
        name.append(st)
    df = pd.DataFrame(list(zip(new,name)),index=list(range(len(new))),columns=['id','name_themes'])
    return df
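Two things in the function above look likely to cause trouble. First, the second assignment to new reads theme_id from the full sets DataFrame rather than from the rows already filtered by year, so the year filter is thrown away. Second, str.strip() with a string argument removes any of those characters from both ends of the text, not the literal phrase, so it can eat into the theme name itself. Below is a minimal sketch of the second point. The printed string is typed out by hand to resemble what str() gives for a one-row object Series, and the small themes frame is a hypothetical stand-in for themes.csv.

```python
import pandas as pd

# Text of the kind str() produces for a one-row Series of object dtype
# (typed out by hand so the demo does not depend on the pandas version).
printed = "0    Technic\nName: name, dtype: object"

# strip() treats its argument as a set of characters to remove from both ends.
# "c" is in that set, so the last letter of "Technic" goes along with the footer.
clipped = printed[1:].strip("\nName: name, dtype: object")
print(repr(clipped))  # 'Techni'

# Hypothetical stand-in for themes.csv, for illustration only.
themes = pd.DataFrame({"id": [1, 3], "name": ["Technic", "Competition"]})

# Reading the value itself avoids parsing the printed form at all.
direct = themes.loc[themes["id"] == 1, "name"].iloc[0]
print(repr(direct))  # 'Technic'
```

Selecting the value with .loc and .iloc[0] returns the stored string, so no cleanup of the printed representation is needed.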
Code Check: Call the theme_by_year() function on the year with the lowest total number of sets, which was found to be 1960. Your output should look as follows:
id	name_themes
0	371	Supplemental
1	497	Books
2	513	Classic
### Here is the result I get ###
Q14 = theme_by_year(1960)
Q14
Out[154]:
id	name_themes
0	1	Techni
1	3	Competiti
2	4	Expert Builder
3	16	RoboRiders
4	17	Speed Slammers
...	...	...
431	715	39 Marvel
432	716	40 Modulex
433	717	41 Speed Racer
434	718	42 Series 22 Minifigures
435	719	43 BrickLink Designer Progr
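The result above lists every theme rather than only those used in 1960, and several names are clipped. One possible way to follow the merge hint is sketched below. It is only a sketch under stated assumptions: the tiny sets and themes frames are made-up stand-ins for the real CSV files (only the three theme ids and names from the expected output come from the exercise), and the function reads them as globals the way the original code does.

```python
import pandas as pd

# Made-up sample rows standing in for sets.csv and themes.csv.
# Only theme ids 371, 497 and 513 and their names come from the expected output.
sets = pd.DataFrame({
    "set_num": ["a-1", "b-1", "c-1", "d-1", "e-1"],
    "name": ["Set A", "Set B", "Set C", "Set D", "Set E"],
    "year": [1960, 1960, 1960, 1960, 1961],
    "theme_id": [513, 371, 497, 371, 1],
    "num_parts": [10, 20, 30, 40, 50],
})
themes = pd.DataFrame({
    "id": [1, 371, 497, 513],
    "name": ["Technic", "Supplemental", "Books", "Classic"],
    "parent_id": [None, None, None, None],  # not needed for this task
})


def theme_by_year(yr):
    """Return the distinct themes (id, name_themes) used by sets from year yr."""
    # Keep only that year's theme ids, one row per id (dedupe on id, not name).
    in_year = sets.loc[sets["year"] == yr, ["theme_id"]].drop_duplicates()
    # Attach the theme name by matching theme_id in sets to id in themes.
    merged = in_year.merge(
        themes[["id", "name"]], left_on="theme_id", right_on="id", how="inner"
    )
    # Required column names and order, sorted by id, index 0..n-1.
    return (
        merged[["id", "name"]]
        .rename(columns={"name": "name_themes"})
        .sort_values("id")
        .reset_index(drop=True)
    )


result = theme_by_year(1960)
print(result)
```

Dropping duplicates on theme_id before the merge keeps themes that share a name but have different ids as separate rows, which is what the task asks for.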