{ "cells": [
{ "cell_type": "markdown", "metadata": {}, "source": [ "# FDD Project - Arthur Brandao & Maxence Bacquet\n", "\n", "The goal of the project is to retrieve the list of anime watched by a user on Anilist.co in order to make recommendations for them.\n", "\n", "## Imports\n", "\n", "Import the classes needed for the recommendation. The AnilistApi and AnilistQuery classes were written for this project to make it easy to query the API. In addition, since the Anilist API uses GraphQL, a GraphQLClient class was also written for the project and is used by the AnilistApi class." ] },
{ "cell_type": "code", "execution_count": 1, "metadata": { "_cell_guid": "b1076dfc-b9ad-4769-8c92-a6c4dae69d19", "_uuid": "8f2839f25d086af736a60e9eeb907d3b93b6e0e5" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "numpy: 1.16.4\n", "pandas: 0.25.2\n" ] } ], "source": [
"import numpy as np\n",
"import pandas as pd\n",
"from anilist_api import AnilistApi\n",
"from anilist_api import AnilistQuery\n",
"from sklearn.model_selection import train_test_split\n",
"from sklearn.ensemble import RandomForestClassifier\n",
"# Print the library versions\n",
"print('numpy: {}'.format(np.__version__))\n",
"print('pandas: {}'.format(pd.__version__))" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Definition of the variables used throughout the project (global variables and object instances)." ] },
{ "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [
"# Name of the user for whom the recommendations are made\n",
"USER = 'Loquicom'\n",
"# Minimum accuracy of the model during the test phase (in %) /!\\ A value that is too high may be impossible to reach\n",
"ACCURACYMIN = 80\n",
"# Maximum number of iterations before concluding that the model cannot reach the requested accuracy (to avoid an infinite loop)\n",
"ITERATIONMAX = 25 # -1 <=> no limit\n",
"# Number of requests made for the search\n",
"NBITERATION = 4\n",
"# Number of anime retrieved per request (max 50)\n",
"NBANIME = 50\n",
"# The Anilist API client\n",
"anilist = AnilistApi()\n",
"# The model used to learn the features that matter for the recommendation\n",
"model = RandomForestClassifier(n_estimators=1000, random_state=42, max_features=10)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Utility functions\n", "\n", " - **valid_data**: Checks the model's predictions against the expected values and measures their accuracy\n", " - **identical_features**: Makes the target dataframe have the same features as the source dataframe\n", " - **make_query**: Builds a basic query for the Anilist API" ] },
{ "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [
"def valid_data(predict, value):\n",
"    # A non-zero difference marks a wrong prediction\n",
"    result = pd.Series(value - predict)\n",
"    error = result[result != 0].count()\n",
"    accuracy = 100 - ((error / result.size) * 100)\n",
"    return result.size, error, accuracy\n",
"\n",
"def identical_features(source, target, keep=None):\n",
"    # Accept either a single column name or a list of names\n",
"    if keep is None:\n",
"        keep = []\n",
"    elif isinstance(keep, str):\n",
"        keep = [keep]\n",
"    # Add the missing features\n",
"    for feature in source.columns:\n",
"        if feature not in target.columns:\n",
"            target[feature] = 0\n",
"    # Drop the extra features\n",
"    drop = []\n",
"    for feature in target.columns:\n",
"        if feature not in source.columns and feature not in keep:\n",
"            drop.append(feature)\n",
"    return target.drop(drop, axis=1)\n",
"\n",
"def make_query(score=-1, popularity=-1, epMin=-1, epMax=-1, durationMin=-1, durationMax=-1, source=None, formatType=None):\n",
"    # A value of -1 (or an empty list) means the filter is not applied\n",
"    query = AnilistQuery()\n",
"    if score != -1:\n",
"        query.scoreGreaterThan(score)\n",
"    if popularity != -1:\n",
"        query.popularityGreaterThan(popularity)\n",
"    if epMin != -1:\n",
"        query.episodeBetween(epMin, epMax)\n",
"    if durationMin != -1:\n",
"        query.durationBetween(durationMin, durationMax)\n",
"    if formatType:\n",
"        query.formatIn(formatType)\n",
"    if source:\n",
"        query.sourceIn(source)\n",
"    return query" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Data retrieval\n", "\n", "The data is retrieved from the Anilist API. It corresponds to the list of anime completed by the user. For each anime it includes, among other things, the score given by the user, the average score on the site, the number of episodes and their duration, the popularity, tags, genres, the format and the source." ] },
{ "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n",
"|  | id | title | score | popularity | format | episode | duration | source | userScore | tag1 | tag2 | tag3 | genre1 | genre2 | genre3 | genre4 | genre5 |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| 0 | 97636 | Akiba's Trip: The Animation | 62 | 12475 | TV | 13 | 24 | VIDEO_GAME | 80.0 | Super Power | Otaku Culture | Demons | Action | Ecchi | Supernatural | Comedy | Fantasy |\n",
"| 1 | 20602 | Amagi Brilliant Park | 74 | 42303 | TV | 13 | 24 | LIGHT_NOVEL | 70.0 | Ensemble Cast | Magic | Male Protagonist | Comedy | Romance | Fantasy | None | None |\n",
"| 2 | 21077 | Amagi Brilliant Park: Nonbirishiteiru Hima ga ... | 70 | 8614 | OVA | 1 | 24 | LIGHT_NOVEL | 70.0 | Ensemble Cast | Male Protagonist | Work | Comedy | Fantasy | None | None | None |\n",
"| 3 | 6547 | Angel Beats! | 79 | 88258 | TV | 13 | 24 | ORIGINAL | 75.0 | Afterlife | Tragedy | School | Action | Comedy | Drama | Supernatural | None |\n",
"| 4 | 20755 | Ansatsu Kyoushitsu | 79 | 72532 | TV | 22 | 23 | MANGA | 90.0 | Assassins | School | Shounen | Action | Comedy | Supernatural | None | None |\n",
"| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |\n",
"| 97 | 104252 | Maou-sama, Retry! | 59 | 11116 | TV | 12 | 24 | LIGHT_NOVEL | 70.0 | Isekai | Anti-Hero | Male Protagonist | Action | Adventure | Fantasy | None | None |\n",
"| 98 | 107704 | Kawaki wo Ameku | 79 | 1676 | MUSIC | 1 | 4 | OTHER | 0.0 | Musical | Female Protagonist | Primarily Female Cast | Music | None | None | None | None |\n",
"| 99 | 99425 | Promare | 83 | 9545 | MOVIE | 1 | 115 | ORIGINAL | 95.0 | Firefighters | Robots | CGI | Action | Mecha | Comedy | Sci-Fi | None |\n",
"| 100 | 107226 | Dumbbell Nan Kilo Moteru? | 74 | 20176 | TV | 12 | 24 | MANGA | 80.0 | Fitness | Educational | Athletics | Comedy | Ecchi | Sports | Slice of Life | None |\n",
"| 101 | 112381 | raison d'etre | 64 | 208 | MUSIC | 1 | 4 | ORIGINAL | 0.0 | None | None | None | Music | Action | None | None | None |\n",
"102 rows × 17 columns
\n", "\n",
"|  | score | popularity | episode | duration | tag_Super_Power | tag_Ensemble_Cast | tag_Afterlife | tag_Assassins | tag_Cute_Girls_Doing_Cute_Things | tag_School | ... | format_OVA | format_SPECIAL | format_TV | source_LIGHT_NOVEL | source_MANGA | source_ORIGINAL | source_OTHER | source_VIDEO_GAME | source_VISUAL_NOVEL | like |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| 0 | 62 | 12475 | 13 | 24 | 1 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |\n",
"| 1 | 74 | 42303 | 13 | 24 | 0 | 1 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"| 2 | 70 | 8614 | 1 | 24 | 0 | 1 | 0 | 0 | 0 | 0 | ... | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"| 3 | 79 | 88258 | 13 | 24 | 0 | 0 | 1 | 0 | 0 | 1 | ... | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |\n",
"| 4 | 79 | 72532 | 22 | 23 | 0 | 0 | 0 | 1 | 0 | 1 | ... | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |\n",
"| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |\n",
"| 97 | 59 | 11116 | 12 | 24 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"| 98 | 79 | 1676 | 1 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |\n",
"| 99 | 83 | 9545 | 1 | 115 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 |\n",
"| 100 | 74 | 20176 | 12 | 24 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |\n",
"| 101 | 64 | 208 | 1 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |\n",
"102 rows × 124 columns
\n", "\n",
"|  | feature | score |\n",
"|---|---|---|\n",
"| 0 | score | 0.12 |\n",
"| 1 | popularity | 0.11 |\n",
"| 2 | episode | 0.05 |\n",
"| 3 | duration | 0.04 |\n",
"| 18 | tag_Magic | 0.02 |\n",
"| 22 | tag_Video_Games | 0.02 |\n",
"| 36 | tag_Female_Protagonist | 0.02 |\n",
"| 49 | tag_Male_Protagonist | 0.02 |\n",
"| 92 | genre_Action | 0.02 |\n",
"| 93 | genre_Comedy | 0.02 |\n",
"| 94 | genre_Slice_of_Life | 0.02 |\n",
"| 96 | genre_Supernatural | 0.02 |\n",
"| 99 | genre_Adventure | 0.02 |\n",
"| 107 | genre_Psychological | 0.02 |\n",
"| 117 | source_LIGHT_NOVEL | 0.02 |\n",
"| 118 | source_MANGA | 0.02 |\n", "
\n",
"|  | id | title | score | popularity | episode | duration | tag_Super_Power | tag_Ensemble_Cast | tag_Afterlife | tag_Assassins | ... | format_ONA | format_OVA | format_SPECIAL | format_TV | source_LIGHT_NOVEL | source_MANGA | source_ORIGINAL | source_OTHER | source_VIDEO_GAME | source_VISUAL_NOVEL |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| 0 | 245 | Great Teacher Onizuka | 84 | 31604 | 43.0 | 25 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |\n",
"| 1 | 97922 | Inuyashiki | 73 | 31631 | 11.0 | 23 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |\n",
"| 2 | 10162 | Usagi Drop | 83 | 31883 | 11.0 | 22 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |\n",
"| 3 | 17549 | Non Non Biyori | 78 | 31976 | 12.0 | 24 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |\n",
"| 4 | 14967 | Boku wa Tomodachi ga Sukunai Next | 72 | 31992 | 12.0 | 24 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |\n",
"| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |\n",
"| 45 | 14741 | Chuunibyou demo Koi ga Shitai! | 76 | 67649 | 12.0 | 24 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |\n",
"| 46 | 20920 | Dungeon ni Deai wo Motomeru no wa Machigatteir... | 74 | 68372 | 13.0 | 24 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |\n",
"| 47 | 6746 | Durarara!! | 80 | 68550 | 24.0 | 24 | 0 | 1 | 0 | 0 | ... | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |\n",
"| 48 | 10087 | Fate/Zero | 83 | 70510 | 13.0 | 26 | 0 | 1 | 0 | 0 | ... | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |\n",
"| 49 | 20623 | Kiseijuu: Sei no Kakuritsu | 82 | 71072 | 24.0 | 24 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |\n",
"200 rows × 125 columns
\n", "\n",
"|  | id | title |\n",
"|---|---|---|\n",
"| 0 | 20652 | Durarara!!x2 Shou |\n",
"| 1 | 8425 | Gosick |\n",
"| 2 | 20593 | Hanamonogatari |\n",
"| 3 | 7674 | Bakuman. |\n",
"| 4 | 99726 | Net-juu no Susume |\n",
"| ... | ... | ... |\n",
"| 97 | 20832 | Overlord |\n",
"| 98 | 14813 | Yahari Ore no Seishun Love Comedy wa Machigatt... |\n",
"| 99 | 20920 | Dungeon ni Deai wo Motomeru no wa Machigatteir... |\n",
"| 100 | 6746 | Durarara!! |\n",
"| 101 | 10087 | Fate/Zero |\n",
"82 rows × 2 columns
\n", "