diff --git a/Coursera FTPHIJ59MYOC.pdf b/Coursera FTPHIJ59MYOC.pdf new file mode 100644 index 0000000..ef81096 Binary files /dev/null and b/Coursera FTPHIJ59MYOC.pdf differ diff --git a/PY0101EN-5 1_Intro_API.ipynb b/PY0101EN-5 1_Intro_API.ipynb new file mode 100644 index 0000000..73df8b6 --- /dev/null +++ b/PY0101EN-5 1_Intro_API.ipynb @@ -0,0 +1,698 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
\n", + " \"cognitiveclass.ai\n", + "
\n", + "\n", + "# Hands-on Lab: Introduction to API\n", + "\n", + "Estimated time needed: **15** minutes\n", + "\n", + "## Objectives\n", + "\n", + "After completing this lab you will be able to:\n", + "\n", + "* Create and use APIs in Python\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Introduction\n", + "\n", + "An API lets two pieces of software talk to each other. Just like a function, you don't have to know how the API works, only its inputs and outputs. An essential type of API is a REST API that allows you to access resources via the internet. In this lab, we will review the Pandas Library in the context of an API, we will also review a basic REST API.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Table of Contents\n", + "\n", + "
\n", + "
  • Pandas is an API
  • \n", + "
  • REST APIs
  • \n", + "
  • Quiz
  • \n", + "\n", + "
    \n", + "\n", + "
    \n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Pandas is an API\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Pandas is actually a set of software components, much of which is not even written in Python.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "import pandas as pd\n", + "import matplotlib.pyplot as plt" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You create a dictionary; this is just data.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "dict_={'a':[11,21,31],'b':[12,22,32]}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "When you create a Pandas object with the dataframe constructor, in API lingo this is an \"instance\". The data in the dictionary is passed along to the pandas API. You then use the dataframe to communicate with the API.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/plain": [ + "pandas.core.frame.DataFrame" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "df=pd.DataFrame(dict_)\n", + "type(df)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\"logistic\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "When you call the method `head`, the dataframe communicates with the API, which displays the first few rows of the dataframe.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/html": [ + "
    \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
    |   | a  | b  |
    | 0 | 11 | 12 |
    | 1 | 21 | 22 |
    | 2 | 31 | 32 |
    \n", + "
    " + ], + "text/plain": [ + " a b\n", + "0 11 12\n", + "1 21 22\n", + "2 31 32" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "df.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "When you call the method `mean`, the API will calculate the mean and return the value.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/plain": [ + "a 21.0\n", + "b 22.0\n", + "dtype: float64" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "df.mean()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## REST APIs\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "

    REST APIs work by sending a request, which is communicated via an HTTP message. The HTTP message usually contains a JSON file with instructions for the operation we would like the service or resource to perform. In a similar manner, the API returns a response via an HTTP message; this response is usually contained within a JSON file.
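
The request and response described above can be sketched with the Python standard library alone. This is a minimal illustration, not how any particular service works: the URL `https://api.example.com/games`, the payload fields, and the response text are all made up, and nothing is actually sent over the network.

```python
import json
from urllib.request import Request

# Hypothetical endpoint and payload, used only for illustration.
url = "https://api.example.com/games"
payload = {"operation": "find_games", "team": "GSW"}

# The request: JSON instructions carried in the body of an HTTP message.
# Building a Request object does not send anything.
request = Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# The response also arrives as JSON text in an HTTP message body.
# A hand-written string stands in for a real response here.
response_body = '{"team": "GSW", "games_found": 2}'
response = json.loads(response_body)

print(request.get_method(), request.full_url)
print(json.loads(request.data))
print(response["games_found"])
```

In practice a library such as `requests`, or a wrapper like the NBA API used in this lab, builds and sends these messages for you.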

    \n", + "

    In this lab, we will use the NBA API to determine how well the Golden State Warriors performed against the Toronto Raptors. We will use the API to determine the number of points the Golden State Warriors won or lost by for each game. So if the value is three, the Golden State Warriors won by three points. Similarly, if the Golden State Warriors lost by two points, the result will be negative two. The API will handle a lot of the details, such as endpoints and authentication.
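
The sign convention described above can be sketched as a one-line calculation. The function name `plus_minus` and the scores below are made up for illustration; in the lab, the values come directly from the API rather than being computed by hand.

```python
def plus_minus(warriors_points, opponent_points):
    """Return the Warriors' margin: positive for a win, negative for a loss."""
    return warriors_points - opponent_points


print(plus_minus(110, 107))  # Warriors won by three points
print(plus_minus(102, 104))  # Warriors lost by two points
```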

    \n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "It's quite simple to use the nba api to make a request for a specific team. We don't require a JSON, all we require is an id. This information is stored locally in the API. We import the module `teams`.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": { + "tags": [] + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Collecting nba_api\n", + " Downloading nba_api-1.1.13-py3-none-any.whl (255 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m255.2/255.2 kB\u001b[0m \u001b[31m217.4 kB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0ma \u001b[36m0:00:01\u001b[0m\n", + "\u001b[?25hRequirement already satisfied: numpy in /home/jupyterlab/conda/envs/python/lib/python3.7/site-packages (from nba_api) (1.21.6)\n", + "Requirement already satisfied: requests in /home/jupyterlab/conda/envs/python/lib/python3.7/site-packages (from nba_api) (2.29.0)\n", + "Requirement already satisfied: charset-normalizer<4,>=2 in /home/jupyterlab/conda/envs/python/lib/python3.7/site-packages (from requests->nba_api) (3.1.0)\n", + "Requirement already satisfied: idna<4,>=2.5 in /home/jupyterlab/conda/envs/python/lib/python3.7/site-packages (from requests->nba_api) (3.4)\n", + "Requirement already satisfied: urllib3<1.27,>=1.21.1 in /home/jupyterlab/conda/envs/python/lib/python3.7/site-packages (from requests->nba_api) (1.26.15)\n", + "Requirement already satisfied: certifi>=2017.4.17 in /home/jupyterlab/conda/envs/python/lib/python3.7/site-packages (from requests->nba_api) (2023.5.7)\n", + "Installing collected packages: nba_api\n", + "Successfully installed nba_api-1.1.13\n" + ] + } + ], + "source": [ + "!pip install nba_api" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "from nba_api.stats.static import teams\n", + "import matplotlib.pyplot 
as plt" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The `one_dict()` function takes a list of dictionaries (each representing one team's details) and combines them into a single dictionary where each key maps to a list of all corresponding values.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "def one_dict(list_dict):\n", + "    keys=list_dict[0].keys()\n", + "    out_dict={key:[] for key in keys}\n", + "    for dict_ in list_dict:\n", + "        for key, value in dict_.items():\n", + "            out_dict[key].append(value)\n", + "    return out_dict" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The function `get_teams()` returns a list of dictionaries.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "nba_teams = teams.get_teams()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The dictionary key `id` has a unique identifier for each team as a value. Let's look at the first three elements of the list:\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "nba_teams[0:3]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To make things easier, we can convert the list of dictionaries to a table. First, we use the function `one_dict` to create a dictionary. 
We use the common keys for each team as the keys; each value is a list whose elements correspond to the values for each team.\n", + "We then convert the dictionary to a dataframe; each row contains the information for a different team.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "dict_nba_team=one_dict(nba_teams)\n", + "df_teams=pd.DataFrame(dict_nba_team)\n", + "df_teams.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We will use the team's nickname to find the unique id. We can see the row that contains the Warriors by using the column `nickname` as follows:\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "df_warriors=df_teams[df_teams['nickname']=='Warriors']\n", + "df_warriors" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can use the following line of code to access the `id` column of the DataFrame and extract the Warriors' id:\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "id_warriors=df_warriors[['id']].values[0][0]\n", + "# we now have an integer that can be used to request the Warriors' information\n", + "id_warriors" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "`LeagueGameFinder` will make an API call; it is in the `leaguegamefinder` module under `stats.endpoints`.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from nba_api.stats.endpoints import leaguegamefinder" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The parameter `team_id_nullable` is the unique ID for the Warriors. 
Under the hood, the NBA API is making an HTTP request.\\\n", + "The requested information is transmitted via an HTTP response, which is assigned to the object `gamefinder`.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# https://stats.nba.com does not allow API calls from cloud IPs, and Skills Network Labs uses a cloud IP,\n", + "# so the following code is commented out. You can run it in JupyterLab on your own computer.\n", + "# gamefinder = leaguegamefinder.LeagueGameFinder(team_id_nullable=id_warriors)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can see the JSON file by running the following line of code.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# https://stats.nba.com does not allow API calls from cloud IPs, and Skills Network Labs uses a cloud IP,\n", + "# so the following code is commented out. You can run it in JupyterLab on your own computer.\n", + "# gamefinder.get_json()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The game finder object has a method `get_data_frames()` that returns a list of dataframes. If we view the first dataframe, we can see it contains information about all the games the Warriors played. The `PLUS_MINUS` column contains information on the score: if the value is negative, the Warriors lost by that many points; if the value is positive, the Warriors won by that many points. The column `MATCHUP` has the team the Warriors were playing; GSW stands for Golden State Warriors and TOR means Toronto Raptors. 
`vs.` signifies it was a home game and the `@` symbol means an away game.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# https://stats.nba.com does not allow API calls from cloud IPs, and Skills Network Labs uses a cloud IP,\n", + "# so the following code is commented out. You can run it in JupyterLab on your own computer.\n", + "# games = gamefinder.get_data_frames()[0]\n", + "# games.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Alternatively, you can download a saved copy of the dataframe from the API call for Golden State and follow the rest of the lab as in the video.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import requests\n", + "\n", + "url = \"https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/PY0101EN/Chapter%205/Labs/Golden_State.pkl\"\n", + "\n", + "def download(url, filename):\n", + "    # Save the file only if the request succeeded\n", + "    response = requests.get(url)\n", + "    if response.status_code == 200:\n", + "        with open(filename, \"wb\") as f:\n", + "            f.write(response.content)\n", + "\n", + "download(url, \"Golden_State.pkl\")\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "file_name = \"Golden_State.pkl\"\n", + "games = pd.read_pickle(file_name)\n", + "games.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can create two dataframes: one for the games in which the Warriors faced the Raptors at home, and a second for away games.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "games_home=games[games['MATCHUP']=='GSW vs. 
TOR']\n", + "games_away=games[games['MATCHUP']=='GSW @ TOR']" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can calculate the mean for the column PLUS_MINUS for the dataframes games_home and games_away:\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "games_home['PLUS_MINUS'].mean()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "games_away['PLUS_MINUS'].mean()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can plot out the PLUS MINUS column for the dataframes games_home and games_away.\n", + "We see the warriors played better at home.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "fig, ax = plt.subplots()\n", + "\n", + "games_away.plot(x='GAME_DATE',y='PLUS_MINUS', ax=ax)\n", + "games_home.plot(x='GAME_DATE',y='PLUS_MINUS', ax=ax)\n", + "ax.legend([\"away\", \"home\"])\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Quiz\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Calculate the mean for the column PTS for the dataframes games_home and games_away:\n" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": { + "tags": [] + }, + "outputs": [ + { + "ename": "NameError", + "evalue": "name 'games_home' is not defined", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m/tmp/ipykernel_68/926849432.py\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;31m# Write your code below and press Shift+Enter to 
execute\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mgames_home\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m'PTS'\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmean\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 3\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0mgames_away\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m'PTS'\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmean\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mNameError\u001b[0m: name 'games_home' is not defined" + ] + } + ], + "source": [ + "# Write your code below and press Shift+Enter to execute\n", + "games_home['PTS'].mean()\n", + "\n", + "games_away['PTS'].mean()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
    Click here for the solution\n", + "\n", + "```python\n", + "games_home['PTS'].mean()\n", + "\n", + "games_away['PTS'].mean()\n", + "\n", + "```\n", + "\n", + "
    \n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Authors:\n", + "\n", + "[Joseph Santarcangelo](https://www.linkedin.com/in/joseph-s-50398b136/)\n", + "\n", + "Joseph Santarcangelo has a PhD in Electrical Engineering, his research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. Joseph has been working for IBM since he completed his PhD.\n", + "\n", + "\n", + "
    \n", + "\n", + "##

    © IBM Corporation 2023. All rights reserved.

    \n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python", + "language": "python", + "name": "conda-env-python-py" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.12" + }, + "prev_pub_hash": "271610dc516897640c6672cd11a0aebaa773f29fb3a2af1abca0595363dcaaba" + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/PY0101EN-5 2_API_2 v2 (1).ipynb b/PY0101EN-5 2_API_2 v2 (1).ipynb new file mode 100644 index 0000000..0fbd78f --- /dev/null +++ b/PY0101EN-5 2_API_2 v2 (1).ipynb @@ -0,0 +1,2248 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
    \n", + " \"cognitiveclass.ai\n", + "
    \n", + "\n", + "# Hands-on Lab: API Examples\n", + "## Random User and Fruityvice API Examples\n", + "\n", + "\n", + "Estimated time needed: **30** minutes\n", + "\n", + "## Objectives\n", + "\n", + "After completing this lab you will be able to:\n", + "\n", + "* Load and use RandomUser API, using `RandomUser()` Python library\n", + "* Load and use Fruityvice API, using `requests` Python library\n", + "* Load and use Open-Joke-API, using `requests` Python library\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The purpose of this notebook is to provide more examples of how to use simple APIs. As you have already learned from previous videos and notebooks, API stands for Application Programming Interface and is a software intermediary that allows two applications to talk to each other. \n", + "\n", + "The advantages of using APIs:\n", + " * **Automation**. Less human effort is required and workflows can be easily updated to become faster and more \n", + " productive.\n", + " * **Efficiency**. You can use the capabilities of an already developed API rather than trying to \n", + " independently implement some functionality from scratch.\n", + " \n", + "The disadvantage of using APIs:\n", + " * **Security**. If the API is poorly integrated, it will be vulnerable to attacks, resulting in data breaches or losses with financial or reputational implications.\n", + "\n", + "One of the applications we will use in this notebook is Random User Generator. RandomUser is an open-source, free API providing developers with randomly generated users to be used as placeholders for testing purposes. This makes the tool similar to Lorem Ipsum, but it provides placeholders for people instead of text. The API can return multiple results, as well as specify generated user details such as gender, email, image, username, address, title, first and last name, and more. 
More information can be found in the [RandomUser documentation](https://randomuser.me/documentation#intro).\n", + "\n", + "Another example of a simple API we will use in this notebook is the Fruityvice application. The Fruityvice API is a web service that provides data for all kinds of fruit! You can use Fruityvice to find out interesting information about fruit and educate yourself. The web service is completely free to use and contribute to.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Example 1: RandomUser API\n", + "Below are the Get Methods we can call to generate user parameters. For more information on the parameters, please visit this [documentation](https://randomuser.me/documentation) page.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## **Get Methods**\n", + "\n", + "- get_cell()\n", + "- get_city()\n", + "- get_dob()\n", + "- get_email()\n", + "- get_first_name()\n", + "- get_full_name()\n", + "- get_gender()\n", + "- get_id()\n", + "- get_id_number()\n", + "- get_id_type()\n", + "- get_info()\n", + "- get_last_name()\n", + "- get_login_md5()\n", + "- get_login_salt()\n", + "- get_login_sha1()\n", + "- get_login_sha256()\n", + "- get_nat()\n", + "- get_password()\n", + "- get_phone()\n", + "- get_picture()\n", + "- get_postcode()\n", + "- get_registered()\n", + "- get_state()\n", + "- get_street()\n", + "- get_username()\n", + "- get_zipcode()\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To start using the API, you can install the `randomuser` library by running the `pip install` command.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Requirement already satisfied: randomuser in /opt/conda/lib/python3.12/site-packages (1.6)\n", + "Requirement already satisfied: pandas in /opt/conda/lib/python3.12/site-packages (3.0.1)\n", + "Requirement already satisfied: numpy>=1.26.0 in 
/opt/conda/lib/python3.12/site-packages (from pandas) (2.4.3)\n", + "Requirement already satisfied: python-dateutil>=2.8.2 in /opt/conda/lib/python3.12/site-packages (from pandas) (2.9.0.post0)\n", + "Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.12/site-packages (from python-dateutil>=2.8.2->pandas) (1.17.0)\n" + ] + } + ], + "source": [ + "!pip install randomuser\n", + "!pip install pandas" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Then, we will load the necessary libraries.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [], + "source": [ + "from randomuser import RandomUser\n", + "import pandas as pd" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "First, we will create a random user object, r.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [], + "source": [ + "r = RandomUser()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Then, using `generate_users()` function, we get a list of random 10 users.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": {}, + "outputs": [], + "source": [ + "some_list = r.generate_users(10)" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[,\n", + " ,\n", + " ,\n", + " ,\n", + " ,\n", + " ,\n", + " ,\n", + " ,\n", + " ,\n", + " ]" + ] + }, + "execution_count": 20, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "some_list" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The **\"Get Methods\"** functions mentioned at the beginning of this notebook, can generate the required parameters to construct a dataset. 
For example, to get full name, we call `get_full_name()` function.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": {}, + "outputs": [], + "source": [ + "name = r.get_full_name()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's say we only need 10 users with full names and their email addresses. We can write a \"for-loop\" to print these 10 users.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Toivo Erkkila toivo.erkkila@example.com\n", + "Chama Hoedjes chama.hoedjes@example.com\n", + "Kuzey Taşçı kuzey.tasci@example.com\n", + "Naomi Singh naomi.singh@example.com\n", + "Sheila Welch sheila.welch@example.com\n", + "Phoebe Davies phoebe.davies@example.com\n", + "Luz Ortiz luz.ortiz@example.com\n", + "Hemitério Barros hemiterio.barros@example.com\n", + "Natacha Dubois natacha.dubois@example.com\n", + "Violet Bailey violet.bailey@example.com\n" + ] + } + ], + "source": [ + "for user in some_list:\n", + " print (user.get_full_name(),\" \",user.get_email())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Exercise 1\n", + "In this Exercise, generate photos of the random 10 users.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "https://randomuser.me/api/portraits/men/64.jpg\n", + "https://randomuser.me/api/portraits/women/77.jpg\n", + "https://randomuser.me/api/portraits/men/35.jpg\n", + "https://randomuser.me/api/portraits/women/63.jpg\n", + "https://randomuser.me/api/portraits/women/78.jpg\n", + "https://randomuser.me/api/portraits/women/86.jpg\n", + "https://randomuser.me/api/portraits/women/52.jpg\n", + "https://randomuser.me/api/portraits/men/95.jpg\n", + "https://randomuser.me/api/portraits/women/82.jpg\n", + 
"https://randomuser.me/api/portraits/women/19.jpg\n" + ] + } + ], + "source": [ + "## Write your code here\n", + "for user in some_list:\n", + " print (user.get_picture())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
    Click here for the solution\n", + "\n", + "```python\n", + "for user in some_list:\n", + " print (user.get_picture())\n", + "```\n", + "\n", + "
    \n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To generate a table with information about the users, we can write a function containing all desirable parameters. For example, name, gender, city, etc. The parameters will depend on the requirements of the test to be performed. We call the Get Methods, listed at the beginning of this notebook. Then, we return pandas dataframe with the users.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": {}, + "outputs": [], + "source": [ + "def get_users():\n", + " users =[]\n", + " \n", + " for user in RandomUser.generate_users(10):\n", + " users.append({\"Name\":user.get_full_name(),\"Gender\":user.get_gender(),\"City\":user.get_city(),\"State\":user.get_state(),\"Email\":user.get_email(), \"DOB\":user.get_dob(),\"Picture\":user.get_picture()})\n", + " \n", + " return pd.DataFrame(users) " + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
    \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
    |   | Name | Gender | City | State | Email | DOB | Picture |
    | 0 | Irène Renard | female | Morcote | Graubünden | irene.renard@example.com | 1968-07-05T03:08:20.465Z | https://randomuser.me/api/portraits/women/36.jpg |
    | 1 | Ida Schnoor | female | Kranichfeld | Saarland | ida.schnoor@example.com | 1969-09-22T11:23:25.734Z | https://randomuser.me/api/portraits/women/51.jpg |
    | 2 | Siebrigje Verhoog | female | Terborg | Flevoland | siebrigje.verhoog@example.com | 1986-06-12T19:23:24.048Z | https://randomuser.me/api/portraits/women/61.jpg |
    | 3 | Anolido Nunes | male | Resende | Distrito Federal | anolido.nunes@example.com | 1946-02-08T04:26:32.531Z | https://randomuser.me/api/portraits/men/68.jpg |
    | 4 | Maël Mathieu | male | Saint-Pierre | Loire-Atlantique | mael.mathieu@example.com | 1983-09-04T11:24:22.303Z | https://randomuser.me/api/portraits/men/57.jpg |
    | 5 | Christina Saur | female | Freising | Hessen | christina.saur@example.com | 1951-07-03T13:24:53.323Z | https://randomuser.me/api/portraits/women/28.jpg |
    | 6 | Siete Spoelstra | male | Leeuwarderadeel | Utrecht | siete.spoelstra@example.com | 1948-04-08T03:02:35.102Z | https://randomuser.me/api/portraits/men/74.jpg |
    | 7 | Ron Spencer | male | Roscommon | Fingal | ron.spencer@example.com | 1958-10-06T19:58:05.229Z | https://randomuser.me/api/portraits/men/58.jpg |
    | 8 | Nathan Garrett | male | Sunnyvale | Wyoming | nathan.garrett@example.com | 1995-05-09T02:32:53.349Z | https://randomuser.me/api/portraits/men/31.jpg |
    | 9 | Claire George | female | Leixlip | Wexford | claire.george@example.com | 1965-08-04T08:19:42.828Z | https://randomuser.me/api/portraits/women/83.jpg |
    \n", + "
    " + ], + "text/plain": [ + " Name Gender City State \\\n", + "0 Irène Renard female Morcote Graubünden \n", + "1 Ida Schnoor female Kranichfeld Saarland \n", + "2 Siebrigje Verhoog female Terborg Flevoland \n", + "3 Anolido Nunes male Resende Distrito Federal \n", + "4 Maël Mathieu male Saint-Pierre Loire-Atlantique \n", + "5 Christina Saur female Freising Hessen \n", + "6 Siete Spoelstra male Leeuwarderadeel Utrecht \n", + "7 Ron Spencer male Roscommon Fingal \n", + "8 Nathan Garrett male Sunnyvale Wyoming \n", + "9 Claire George female Leixlip Wexford \n", + "\n", + " Email DOB \\\n", + "0 irene.renard@example.com 1968-07-05T03:08:20.465Z \n", + "1 ida.schnoor@example.com 1969-09-22T11:23:25.734Z \n", + "2 siebrigje.verhoog@example.com 1986-06-12T19:23:24.048Z \n", + "3 anolido.nunes@example.com 1946-02-08T04:26:32.531Z \n", + "4 mael.mathieu@example.com 1983-09-04T11:24:22.303Z \n", + "5 christina.saur@example.com 1951-07-03T13:24:53.323Z \n", + "6 siete.spoelstra@example.com 1948-04-08T03:02:35.102Z \n", + "7 ron.spencer@example.com 1958-10-06T19:58:05.229Z \n", + "8 nathan.garrett@example.com 1995-05-09T02:32:53.349Z \n", + "9 claire.george@example.com 1965-08-04T08:19:42.828Z \n", + "\n", + " Picture \n", + "0 https://randomuser.me/api/portraits/women/36.jpg \n", + "1 https://randomuser.me/api/portraits/women/51.jpg \n", + "2 https://randomuser.me/api/portraits/women/61.jpg \n", + "3 https://randomuser.me/api/portraits/men/68.jpg \n", + "4 https://randomuser.me/api/portraits/men/57.jpg \n", + "5 https://randomuser.me/api/portraits/women/28.jpg \n", + "6 https://randomuser.me/api/portraits/men/74.jpg \n", + "7 https://randomuser.me/api/portraits/men/58.jpg \n", + "8 https://randomuser.me/api/portraits/men/31.jpg \n", + "9 https://randomuser.me/api/portraits/women/83.jpg " + ] + }, + "execution_count": 25, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "get_users()" + ] + }, + { + "cell_type": "code", + "execution_count": null, 
+ "metadata": {}, + "outputs": [], + "source": [ + "df1 = pd.DataFrame(get_users())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now we have a *pandas* dataframe that a tester can use for any testing purpose.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Example 2: Fruityvice API\n", + "\n", + "Another, more common way to use APIs is through the `requests` library. The next lab, Requests and HTTP, will contain more information about requests.\n", + "\n", + "We will start by importing all required libraries.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": {}, + "outputs": [], + "source": [ + "import requests\n", + "import json" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We will obtain the [fruityvice](https://www.fruityvice.com) API data using the `requests.get(\"url\")` function. The data is in JSON format.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": {}, + "outputs": [], + "source": [ + "data = requests.get(\"https://web.archive.org/web/20240929211114/https://fruityvice.com/api/fruit/all\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We will retrieve the results using the `json.loads()` function.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": {}, + "outputs": [], + "source": [ + "results = json.loads(data.text)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We will convert our JSON data into a *pandas* dataframe.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
    " + ], + "text/plain": [ + " name id family order genus \\\n", + "0 Persimmon 52 Ebenaceae Rosales Diospyros \n", + "1 Strawberry 3 Rosaceae Rosales Fragaria \n", + "2 Banana 1 Musaceae Zingiberales Musa \n", + "3 Tomato 5 Solanaceae Solanales Solanum \n", + "4 Pear 4 Rosaceae Rosales Pyrus \n", + "5 Durian 60 Malvaceae Malvales Durio \n", + "6 Blackberry 64 Rosaceae Rosales Rubus \n", + "7 Lingonberry 65 Ericaceae Ericales Vaccinium \n", + "8 Kiwi 66 Actinidiaceae Struthioniformes Apteryx \n", + "9 Lychee 67 Sapindaceae Sapindales Litchi \n", + "10 Pineapple 10 Bromeliaceae Poales Ananas \n", + "11 Fig 68 Moraceae Rosales Ficus \n", + "12 Gooseberry 69 Grossulariaceae Saxifragales Ribes \n", + "13 Passionfruit 70 Passifloraceae Malpighiales Passiflora \n", + "14 Plum 71 Rosaceae Rosales Prunus \n", + "15 Orange 2 Rutaceae Sapindales Citrus \n", + "16 GreenApple 72 Rosaceae Rosales Malus \n", + "17 Raspberry 23 Rosaceae Rosales Rubus \n", + "18 Watermelon 25 Cucurbitaceae Cucurbitales Citrullus \n", + "19 Lemon 26 Rutaceae Sapindales Citrus \n", + "20 Mango 27 Anacardiaceae Sapindales Mangifera \n", + "21 Blueberry 33 Rosaceae Rosales Fragaria \n", + "22 Apple 6 Rosaceae Rosales Malus \n", + "23 Guava 37 Myrtaceae Myrtales Psidium \n", + "24 Apricot 35 Rosaceae Rosales Prunus \n", + "25 Melon 41 Cucurbitaceae Cucurbitaceae Cucumis \n", + "26 Tangerine 77 Rutaceae Sapindales Citrus \n", + "27 Pitahaya 78 Cactaceae Caryophyllales Cactaceae \n", + "28 Lime 44 Rutaceae Sapindales Citrus \n", + "29 Pomegranate 79 Lythraceae Myrtales Punica \n", + "30 Dragonfruit 80 Cactaceae Caryophyllales Selenicereus \n", + "31 Grape 81 Vitaceae Vitales Vitis \n", + "32 Morus 82 Moraceae Rosales Morus \n", + "33 Feijoa 76 Myrtaceae Myrtoideae Sellowiana \n", + "34 Avocado 84 Lauraceae Laurales Persea \n", + "35 Kiwifruit 85 Actinidiaceae Ericales Actinidia \n", + "36 Cranberry 87 Ericaceae Ericales Vaccinium \n", + "37 Cherry 9 Rosaceae Rosales Prunus \n", + "38 Peach 86 Rosaceae 
Rosales Prunus \n", + "39 Jackfruit 94 Moraceae Rosales Artocarpus \n", + "40 Horned Melon 95 Cucurbitaceae Cucurbitales Cucumis \n", + "41 Hazelnut 96 Betulaceae Fagales Corylus \n", + "42 Pomelo 98 Rutaceae Sapindales Citrus \n", + "43 Mangosteen 99 Clusiaceae Malpighiales Garcinia \n", + "44 Pumpkin 100 Cucurbitaceae Cucurbitales Cucurbita \n", + "45 Japanese Persimmon 101 Ebenaceae Ericales Diospyros \n", + "46 Papaya 42 Caricaceae Brassicales Carica \n", + "47 Annona 103 Annonaceae Rosales Annonas \n", + "48 Ceylon Gooseberry 104 Salicaceae Malpighiales Dovyalis \n", + "\n", + " nutritions \n", + "0 {'calories': 81, 'fat': 0.0, 'sugar': 18.0, 'c... \n", + "1 {'calories': 29, 'fat': 0.4, 'sugar': 5.4, 'ca... \n", + "2 {'calories': 96, 'fat': 0.2, 'sugar': 17.2, 'c... \n", + "3 {'calories': 74, 'fat': 0.2, 'sugar': 2.6, 'ca... \n", + "4 {'calories': 57, 'fat': 0.1, 'sugar': 10.0, 'c... \n", + "5 {'calories': 147, 'fat': 5.3, 'sugar': 6.75, '... \n", + "6 {'calories': 40, 'fat': 0.4, 'sugar': 4.5, 'ca... \n", + "7 {'calories': 50, 'fat': 0.34, 'sugar': 5.74, '... \n", + "8 {'calories': 61, 'fat': 0.5, 'sugar': 9.0, 'ca... \n", + "9 {'calories': 66, 'fat': 0.44, 'sugar': 15.0, '... \n", + "10 {'calories': 50, 'fat': 0.12, 'sugar': 9.85, '... \n", + "11 {'calories': 74, 'fat': 0.3, 'sugar': 16.0, 'c... \n", + "12 {'calories': 44, 'fat': 0.6, 'sugar': 0.0, 'ca... \n", + "13 {'calories': 97, 'fat': 0.7, 'sugar': 11.2, 'c... \n", + "14 {'calories': 46, 'fat': 0.28, 'sugar': 9.92, '... \n", + "15 {'calories': 43, 'fat': 0.2, 'sugar': 8.2, 'ca... \n", + "16 {'calories': 21, 'fat': 0.1, 'sugar': 6.4, 'ca... \n", + "17 {'calories': 53, 'fat': 0.7, 'sugar': 4.4, 'ca... \n", + "18 {'calories': 30, 'fat': 0.2, 'sugar': 6.0, 'ca... \n", + "19 {'calories': 29, 'fat': 0.3, 'sugar': 2.5, 'ca... \n", + "20 {'calories': 60, 'fat': 0.38, 'sugar': 13.7, '... \n", + "21 {'calories': 29, 'fat': 0.4, 'sugar': 5.4, 'ca... \n", + "22 {'calories': 52, 'fat': 0.4, 'sugar': 10.3, 'c... 
\n", + "23 {'calories': 68, 'fat': 1.0, 'sugar': 9.0, 'ca... \n", + "24 {'calories': 15, 'fat': 0.1, 'sugar': 3.2, 'ca... \n", + "25 {'calories': 34, 'fat': 0.0, 'sugar': 8.0, 'ca... \n", + "26 {'calories': 45, 'fat': 0.4, 'sugar': 9.1, 'ca... \n", + "27 {'calories': 36, 'fat': 0.4, 'sugar': 3.0, 'ca... \n", + "28 {'calories': 25, 'fat': 0.1, 'sugar': 1.7, 'ca... \n", + "29 {'calories': 83, 'fat': 1.2, 'sugar': 13.7, 'c... \n", + "30 {'calories': 60, 'fat': 1.5, 'sugar': 8.0, 'ca... \n", + "31 {'calories': 69, 'fat': 0.16, 'sugar': 16.0, '... \n", + "32 {'calories': 43, 'fat': 0.39, 'sugar': 8.1, 'c... \n", + "33 {'calories': 44, 'fat': 0.4, 'sugar': 3.0, 'ca... \n", + "34 {'calories': 160, 'fat': 14.66, 'sugar': 0.66,... \n", + "35 {'calories': 61, 'fat': 0.5, 'sugar': 8.9, 'ca... \n", + "36 {'calories': 46, 'fat': 0.1, 'sugar': 4.0, 'ca... \n", + "37 {'calories': 50, 'fat': 0.3, 'sugar': 8.0, 'ca... \n", + "38 {'calories': 39, 'fat': 0.25, 'sugar': 8.4, 'c... \n", + "39 {'calories': 95, 'fat': 0.0, 'sugar': 19.1, 'c... \n", + "40 {'calories': 44, 'fat': 1.26, 'sugar': 0.5, 'c... \n", + "41 {'calories': 628, 'fat': 61.0, 'sugar': 4.3, '... \n", + "42 {'calories': 37, 'fat': 0.0, 'sugar': 8.5, 'ca... \n", + "43 {'calories': 73, 'fat': 0.58, 'sugar': 16.11, ... \n", + "44 {'calories': 25, 'fat': 0.3, 'sugar': 3.3, 'ca... \n", + "45 {'calories': 70, 'fat': 0.2, 'sugar': 13.0, 'c... \n", + "46 {'calories': 39, 'fat': 0.3, 'sugar': 4.4, 'ca... \n", + "47 {'calories': 92, 'fat': 0.29, 'sugar': 3.4, 'c... \n", + "48 {'calories': 47, 'fat': 0.3, 'sugar': 8.1, 'ca... " + ] + }, + "execution_count": 29, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "pd.DataFrame(results)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The result is in a nested json format. 
The 'nutritions' column contains multiple subcolumns, so the data needs to be 'flattened', or normalized.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": {}, + "outputs": [], + "source": [ + "df2 = pd.json_normalize(results)" + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
    " + ], + "text/plain": [ + " name id family order genus \\\n", + "0 Persimmon 52 Ebenaceae Rosales Diospyros \n", + "1 Strawberry 3 Rosaceae Rosales Fragaria \n", + "2 Banana 1 Musaceae Zingiberales Musa \n", + "3 Tomato 5 Solanaceae Solanales Solanum \n", + "4 Pear 4 Rosaceae Rosales Pyrus \n", + "5 Durian 60 Malvaceae Malvales Durio \n", + "6 Blackberry 64 Rosaceae Rosales Rubus \n", + "7 Lingonberry 65 Ericaceae Ericales Vaccinium \n", + "8 Kiwi 66 Actinidiaceae Struthioniformes Apteryx \n", + "9 Lychee 67 Sapindaceae Sapindales Litchi \n", + "10 Pineapple 10 Bromeliaceae Poales Ananas \n", + "11 Fig 68 Moraceae Rosales Ficus \n", + "12 Gooseberry 69 Grossulariaceae Saxifragales Ribes \n", + "13 Passionfruit 70 Passifloraceae Malpighiales Passiflora \n", + "14 Plum 71 Rosaceae Rosales Prunus \n", + "15 Orange 2 Rutaceae Sapindales Citrus \n", + "16 GreenApple 72 Rosaceae Rosales Malus \n", + "17 Raspberry 23 Rosaceae Rosales Rubus \n", + "18 Watermelon 25 Cucurbitaceae Cucurbitales Citrullus \n", + "19 Lemon 26 Rutaceae Sapindales Citrus \n", + "20 Mango 27 Anacardiaceae Sapindales Mangifera \n", + "21 Blueberry 33 Rosaceae Rosales Fragaria \n", + "22 Apple 6 Rosaceae Rosales Malus \n", + "23 Guava 37 Myrtaceae Myrtales Psidium \n", + "24 Apricot 35 Rosaceae Rosales Prunus \n", + "25 Melon 41 Cucurbitaceae Cucurbitaceae Cucumis \n", + "26 Tangerine 77 Rutaceae Sapindales Citrus \n", + "27 Pitahaya 78 Cactaceae Caryophyllales Cactaceae \n", + "28 Lime 44 Rutaceae Sapindales Citrus \n", + "29 Pomegranate 79 Lythraceae Myrtales Punica \n", + "30 Dragonfruit 80 Cactaceae Caryophyllales Selenicereus \n", + "31 Grape 81 Vitaceae Vitales Vitis \n", + "32 Morus 82 Moraceae Rosales Morus \n", + "33 Feijoa 76 Myrtaceae Myrtoideae Sellowiana \n", + "34 Avocado 84 Lauraceae Laurales Persea \n", + "35 Kiwifruit 85 Actinidiaceae Ericales Actinidia \n", + "36 Cranberry 87 Ericaceae Ericales Vaccinium \n", + "37 Cherry 9 Rosaceae Rosales Prunus \n", + "38 Peach 86 Rosaceae 
Rosales Prunus \n", + "39 Jackfruit 94 Moraceae Rosales Artocarpus \n", + "40 Horned Melon 95 Cucurbitaceae Cucurbitales Cucumis \n", + "41 Hazelnut 96 Betulaceae Fagales Corylus \n", + "42 Pomelo 98 Rutaceae Sapindales Citrus \n", + "43 Mangosteen 99 Clusiaceae Malpighiales Garcinia \n", + "44 Pumpkin 100 Cucurbitaceae Cucurbitales Cucurbita \n", + "45 Japanese Persimmon 101 Ebenaceae Ericales Diospyros \n", + "46 Papaya 42 Caricaceae Brassicales Carica \n", + "47 Annona 103 Annonaceae Rosales Annonas \n", + "48 Ceylon Gooseberry 104 Salicaceae Malpighiales Dovyalis \n", + "\n", + " nutritions.calories nutritions.fat nutritions.sugar \\\n", + "0 81 0.00 18.00 \n", + "1 29 0.40 5.40 \n", + "2 96 0.20 17.20 \n", + "3 74 0.20 2.60 \n", + "4 57 0.10 10.00 \n", + "5 147 5.30 6.75 \n", + "6 40 0.40 4.50 \n", + "7 50 0.34 5.74 \n", + "8 61 0.50 9.00 \n", + "9 66 0.44 15.00 \n", + "10 50 0.12 9.85 \n", + "11 74 0.30 16.00 \n", + "12 44 0.60 0.00 \n", + "13 97 0.70 11.20 \n", + "14 46 0.28 9.92 \n", + "15 43 0.20 8.20 \n", + "16 21 0.10 6.40 \n", + "17 53 0.70 4.40 \n", + "18 30 0.20 6.00 \n", + "19 29 0.30 2.50 \n", + "20 60 0.38 13.70 \n", + "21 29 0.40 5.40 \n", + "22 52 0.40 10.30 \n", + "23 68 1.00 9.00 \n", + "24 15 0.10 3.20 \n", + "25 34 0.00 8.00 \n", + "26 45 0.40 9.10 \n", + "27 36 0.40 3.00 \n", + "28 25 0.10 1.70 \n", + "29 83 1.20 13.70 \n", + "30 60 1.50 8.00 \n", + "31 69 0.16 16.00 \n", + "32 43 0.39 8.10 \n", + "33 44 0.40 3.00 \n", + "34 160 14.66 0.66 \n", + "35 61 0.50 8.90 \n", + "36 46 0.10 4.00 \n", + "37 50 0.30 8.00 \n", + "38 39 0.25 8.40 \n", + "39 95 0.00 19.10 \n", + "40 44 1.26 0.50 \n", + "41 628 61.00 4.30 \n", + "42 37 0.00 8.50 \n", + "43 73 0.58 16.11 \n", + "44 25 0.30 3.30 \n", + "45 70 0.20 13.00 \n", + "46 39 0.30 4.40 \n", + "47 92 0.29 3.40 \n", + "48 47 0.30 8.10 \n", + "\n", + " nutritions.carbohydrates nutritions.protein \n", + "0 18.00 0.00 \n", + "1 5.50 0.80 \n", + "2 22.00 1.00 \n", + "3 3.90 0.90 \n", + "4 15.00 0.40 \n", + 
"5 27.10 1.50 \n", + "6 9.00 1.30 \n", + "7 11.30 0.75 \n", + "8 15.00 1.10 \n", + "9 17.00 0.80 \n", + "10 13.12 0.54 \n", + "11 19.00 0.80 \n", + "12 10.00 0.90 \n", + "13 22.40 2.20 \n", + "14 11.40 0.70 \n", + "15 8.30 1.00 \n", + "16 3.10 0.40 \n", + "17 12.00 1.20 \n", + "18 8.00 0.60 \n", + "19 9.00 1.10 \n", + "20 15.00 0.82 \n", + "21 5.50 0.00 \n", + "22 11.40 0.30 \n", + "23 14.00 2.60 \n", + "24 3.90 0.50 \n", + "25 8.00 0.00 \n", + "26 8.30 0.00 \n", + "27 7.00 1.00 \n", + "28 8.40 0.30 \n", + "29 18.70 1.70 \n", + "30 9.00 9.00 \n", + "31 18.10 0.72 \n", + "32 9.80 1.44 \n", + "33 8.00 0.60 \n", + "34 8.53 2.00 \n", + "35 14.60 1.14 \n", + "36 12.20 0.40 \n", + "37 12.00 1.00 \n", + "38 9.50 0.90 \n", + "39 23.20 1.72 \n", + "40 7.56 1.78 \n", + "41 17.00 15.00 \n", + "42 9.67 0.82 \n", + "43 17.91 0.41 \n", + "44 4.60 1.10 \n", + "45 19.00 0.60 \n", + "46 5.80 0.50 \n", + "47 19.10 1.50 \n", + "48 9.60 1.20 " + ] + }, + "execution_count": 31, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "df2" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's see if we can extract some information from this dataframe. 
For example, we might need to know the family and genus of a cherry.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "('Rosaceae', 'Prunus')" + ] + }, + "execution_count": 32, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "cherry = df2.loc[df2[\"name\"] == 'Cherry']\n", + "cherry.iloc[0]['family'], cherry.iloc[0]['genus']" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Exercise 2\n", + "In this exercise, find out how many calories a banana contains.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Write your code here\n", + "cal_banana = df2.loc[df2[\"name\"] == 'Banana']\n", + "cal_banana.iloc[0]['nutritions.calories']" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
    Click here for the solution\n", + "\n", + "```python\n", + "cal_banana = df2.loc[df2[\"name\"] == 'Banana']\n", + "cal_banana.iloc[0]['nutritions.calories']\n", + "```\n", + "\n", + "
    \n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Exercise 3\n", + "\n", + "This [page](https://mixedanalytics.com/blog/list-actually-free-open-no-auth-needed-apis/) contains a list of free public APIs for you to practice with. Let's work through the following example.\n", + "\n", + "#### Official Joke API\n", + "This API returns random jokes from a database. The following URL can be used to retrieve 10 random jokes.\n", + "\n", + "https://official-joke-api.appspot.com/jokes/ten\n", + "\n", + "1. Using the `requests.get(\"url\")` function, load the data from the URL.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Write your code here\n", + "data2 = requests.get(\"https://official-joke-api.appspot.com/jokes/ten\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
    Click here for the solution\n", + "\n", + "```python\n", + "data2 = requests.get(\"https://official-joke-api.appspot.com/jokes/ten\")\n", + "```\n", + "\n", + "
    \n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "2. Retrieve the results using the `json.loads()` function.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Write your code here\n", + "results2 = json.loads(data2.text)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
    Click here for the solution\n", + "\n", + "```python\n", + "results2 = json.loads(data2.text)\n", + "```\n", + "\n", + "
    \n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "3. Convert the JSON data into a *pandas* dataframe. Drop the `type` and `id` columns.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Write your code here\n", + "df3 = pd.DataFrame(results2)\n", + "df3.drop(columns=[\"type\",\"id\"],inplace=True)\n", + "df3" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
    Click here for the solution\n", + "\n", + "```python\n", + "df3 = pd.DataFrame(results2)\n", + "df3.drop(columns=[\"type\",\"id\"],inplace=True)\n", + "df3\n", + "```\n", + "\n", + "
    \n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Congratulations! - You have completed the lab\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Author\n", + "Svitlana Kramar\n", + "\n", + "Svitlana is a master’s degree Data Science and Analytics student at University of Calgary, who enjoys travelling, learning new languages and cultures and loves spreading her passion for Data Science.\n", + "\n", + "## Additional Contributor\n", + "Abhishek Gagneja\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Copyright © 2023 IBM Corporation. All rights reserved.\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.8" + }, + "prev_pub_hash": "04bae9f5d988e5963bddc9fe88d29fb9d09098ac6fa470c436aa2dac078e9ee1" + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/PY0101EN-5 2_API_2 v2 (2).ipynb b/PY0101EN-5 2_API_2 v2 (2).ipynb new file mode 100644 index 0000000..68f98c2 --- /dev/null +++ b/PY0101EN-5 2_API_2 v2 (2).ipynb @@ -0,0 +1,2370 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
    \n", + "\n", + "# Hands-on Lab: API Examples\n", + "## Random User and Fruityvice API Examples\n", + "\n", + "\n", + "Estimated time needed: **30** minutes\n", + "\n", + "## Objectives\n", + "\n", + "After completing this lab you will be able to:\n", + "\n", + "* Load and use the RandomUser API, using the `randomuser` Python library\n", + "* Load and use the Fruityvice API, using the `requests` Python library\n", + "* Load and use the Open-Joke-API, using the `requests` Python library\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The purpose of this notebook is to provide more examples of how to use simple APIs. As you have already learned from previous videos and notebooks, API stands for Application Programming Interface, a software intermediary that allows two applications to talk to each other. \n", + "\n", + "The advantages of using APIs:\n", + " * **Automation**. Less human effort is required and workflows can be easily updated to become faster and more \n", + " productive.\n", + " * **Efficiency**. Using the capabilities of an already developed API is quicker than trying to \n", + " independently implement the same functionality from scratch.\n", + " \n", + "The disadvantage of using APIs:\n", + " * **Security**. A poorly integrated API is vulnerable to attacks, which can result in data breaches or losses with financial or reputational implications.\n", + "\n", + "One of the applications we will use in this notebook is Random User Generator. RandomUser is an open-source, free API providing developers with randomly generated users to be used as placeholders for testing purposes. This makes the tool similar to Lorem Ipsum, but it provides placeholders for people instead of text. The API can return multiple results, and you can specify generated user details such as gender, email, image, username, address, title, first and last name, and more. 
More information can be found in the [RandomUser](https://randomuser.me/documentation#intro) documentation.\n", + "\n", + "Another example of a simple API we will use in this notebook is the Fruityvice application. Fruityvice is a web service that provides data for all kinds of fruit. You can use Fruityvice to find out interesting information about fruit and educate yourself. The web service is completely free to use and contribute to.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Example 1: RandomUser API\n", + "Below are the Get Methods for the parameters that we can generate. For more information on the parameters, please visit this [documentation](https://randomuser.me/documentation) page.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## **Get Methods**\n", + "\n", + "- get_cell()\n", + "- get_city()\n", + "- get_dob()\n", + "- get_email()\n", + "- get_first_name()\n", + "- get_full_name()\n", + "- get_gender()\n", + "- get_id()\n", + "- get_id_number()\n", + "- get_id_type()\n", + "- get_info()\n", + "- get_last_name()\n", + "- get_login_md5()\n", + "- get_login_salt()\n", + "- get_login_sha1()\n", + "- get_login_sha256()\n", + "- get_nat()\n", + "- get_password()\n", + "- get_phone()\n", + "- get_picture()\n", + "- get_postcode()\n", + "- get_registered()\n", + "- get_state()\n", + "- get_street()\n", + "- get_username()\n", + "- get_zipcode()\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To start using the API, install the `randomuser` library by running the `pip install` command.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Requirement already satisfied: randomuser in /opt/conda/lib/python3.12/site-packages (1.6)\n", + "Requirement already satisfied: pandas in /opt/conda/lib/python3.12/site-packages (3.0.1)\n", + "Requirement already satisfied: numpy>=1.26.0 in 
/opt/conda/lib/python3.12/site-packages (from pandas) (2.4.3)\n", + "Requirement already satisfied: python-dateutil>=2.8.2 in /opt/conda/lib/python3.12/site-packages (from pandas) (2.9.0.post0)\n", + "Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.12/site-packages (from python-dateutil>=2.8.2->pandas) (1.17.0)\n" + ] + } + ], + "source": [ + "!pip install randomuser\n", + "!pip install pandas" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, we will load the necessary libraries.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [], + "source": [ + "from randomuser import RandomUser\n", + "import pandas as pd" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "First, we will create a random user object, `r`.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [], + "source": [ + "r = RandomUser()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Then, using the `generate_users()` function, we get a list of 10 random users.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": {}, + "outputs": [], + "source": [ + "some_list = r.generate_users(10)" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[,\n", + " ,\n", + " ,\n", + " ,\n", + " ,\n", + " ,\n", + " ,\n", + " ,\n", + " ,\n", + " ]" + ] + }, + "execution_count": 20, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "some_list" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The **\"Get Methods\"** functions mentioned at the beginning of this notebook can generate the required parameters to construct a dataset. 
For example, to get the full name, we call the `get_full_name()` function.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": {}, + "outputs": [], + "source": [ + "name = r.get_full_name()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's say we only need 10 users with their full names and email addresses. We can write a \"for-loop\" to print these 10 users.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Toivo Erkkila toivo.erkkila@example.com\n", + "Chama Hoedjes chama.hoedjes@example.com\n", + "Kuzey Taşçı kuzey.tasci@example.com\n", + "Naomi Singh naomi.singh@example.com\n", + "Sheila Welch sheila.welch@example.com\n", + "Phoebe Davies phoebe.davies@example.com\n", + "Luz Ortiz luz.ortiz@example.com\n", + "Hemitério Barros hemiterio.barros@example.com\n", + "Natacha Dubois natacha.dubois@example.com\n", + "Violet Bailey violet.bailey@example.com\n" + ] + } + ], + "source": [ + "for user in some_list:\n", + "    print(user.get_full_name(), \" \", user.get_email())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Exercise 1\n", + "In this exercise, print the photo URLs of the 10 random users.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "https://randomuser.me/api/portraits/men/64.jpg\n", + "https://randomuser.me/api/portraits/women/77.jpg\n", + "https://randomuser.me/api/portraits/men/35.jpg\n", + "https://randomuser.me/api/portraits/women/63.jpg\n", + "https://randomuser.me/api/portraits/women/78.jpg\n", + "https://randomuser.me/api/portraits/women/86.jpg\n", + "https://randomuser.me/api/portraits/women/52.jpg\n", + "https://randomuser.me/api/portraits/men/95.jpg\n", + "https://randomuser.me/api/portraits/women/82.jpg\n", + 
"https://randomuser.me/api/portraits/women/19.jpg\n" + ] + } + ], + "source": [ + "## Write your code here\n", + "for user in some_list:\n", + " print (user.get_picture())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
    Click here for the solution\n", + "\n", + "```python\n", + "for user in some_list:\n", + " print (user.get_picture())\n", + "```\n", + "\n", + "
    \n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To generate a table with information about the users, we can write a function that collects all the desired parameters, for example name, gender, city, etc. The parameters will depend on the requirements of the test to be performed. We call the Get Methods listed at the beginning of this notebook, then return a pandas dataframe with the users.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": {}, + "outputs": [], + "source": [ + "def get_users():\n", + "    users = []\n", + "    \n", + "    for user in RandomUser.generate_users(10):\n", + "        users.append({\"Name\":user.get_full_name(),\"Gender\":user.get_gender(),\"City\":user.get_city(),\"State\":user.get_state(),\"Email\":user.get_email(), \"DOB\":user.get_dob(),\"Picture\":user.get_picture()})\n", + "    \n", + "    return pd.DataFrame(users)" + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
    " + ], + "text/plain": [ + " Name Gender City State \\\n", + "0 Irène Renard female Morcote Graubünden \n", + "1 Ida Schnoor female Kranichfeld Saarland \n", + "2 Siebrigje Verhoog female Terborg Flevoland \n", + "3 Anolido Nunes male Resende Distrito Federal \n", + "4 Maël Mathieu male Saint-Pierre Loire-Atlantique \n", + "5 Christina Saur female Freising Hessen \n", + "6 Siete Spoelstra male Leeuwarderadeel Utrecht \n", + "7 Ron Spencer male Roscommon Fingal \n", + "8 Nathan Garrett male Sunnyvale Wyoming \n", + "9 Claire George female Leixlip Wexford \n", + "\n", + " Email DOB \\\n", + "0 irene.renard@example.com 1968-07-05T03:08:20.465Z \n", + "1 ida.schnoor@example.com 1969-09-22T11:23:25.734Z \n", + "2 siebrigje.verhoog@example.com 1986-06-12T19:23:24.048Z \n", + "3 anolido.nunes@example.com 1946-02-08T04:26:32.531Z \n", + "4 mael.mathieu@example.com 1983-09-04T11:24:22.303Z \n", + "5 christina.saur@example.com 1951-07-03T13:24:53.323Z \n", + "6 siete.spoelstra@example.com 1948-04-08T03:02:35.102Z \n", + "7 ron.spencer@example.com 1958-10-06T19:58:05.229Z \n", + "8 nathan.garrett@example.com 1995-05-09T02:32:53.349Z \n", + "9 claire.george@example.com 1965-08-04T08:19:42.828Z \n", + "\n", + " Picture \n", + "0 https://randomuser.me/api/portraits/women/36.jpg \n", + "1 https://randomuser.me/api/portraits/women/51.jpg \n", + "2 https://randomuser.me/api/portraits/women/61.jpg \n", + "3 https://randomuser.me/api/portraits/men/68.jpg \n", + "4 https://randomuser.me/api/portraits/men/57.jpg \n", + "5 https://randomuser.me/api/portraits/women/28.jpg \n", + "6 https://randomuser.me/api/portraits/men/74.jpg \n", + "7 https://randomuser.me/api/portraits/men/58.jpg \n", + "8 https://randomuser.me/api/portraits/men/31.jpg \n", + "9 https://randomuser.me/api/portraits/women/83.jpg " + ] + }, + "execution_count": 25, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "get_users()" + ] + }, + { + "cell_type": "code", + "execution_count": null, 
+ "metadata": {}, + "outputs": [], + "source": [ + "df1 = get_users()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now we have a *pandas* dataframe that can be used for any testing purposes that the tester might have.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Example 2: Fruityvice API\n", + "\n", + "Another, more common way to use APIs is through the `requests` library. The next lab, Requests and HTTP, will contain more information about requests.\n", + "\n", + "We will start by importing all required libraries.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": {}, + "outputs": [], + "source": [ + "import requests\n", + "import json" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We will obtain the [fruityvice](https://www.fruityvice.com) API data using the `requests.get(\"url\")` function. The data is in JSON format.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": {}, + "outputs": [], + "source": [ + "data = requests.get(\"https://web.archive.org/web/20240929211114/https://fruityvice.com/api/fruit/all\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We will retrieve the results using the `json.loads()` function.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": {}, + "outputs": [], + "source": [ + "results = json.loads(data.text)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We will convert our JSON data into a *pandas* dataframe. \n" + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
    " + ], + "text/plain": [ + " name id family order genus \\\n", + "0 Persimmon 52 Ebenaceae Rosales Diospyros \n", + "1 Strawberry 3 Rosaceae Rosales Fragaria \n", + "2 Banana 1 Musaceae Zingiberales Musa \n", + "3 Tomato 5 Solanaceae Solanales Solanum \n", + "4 Pear 4 Rosaceae Rosales Pyrus \n", + "5 Durian 60 Malvaceae Malvales Durio \n", + "6 Blackberry 64 Rosaceae Rosales Rubus \n", + "7 Lingonberry 65 Ericaceae Ericales Vaccinium \n", + "8 Kiwi 66 Actinidiaceae Struthioniformes Apteryx \n", + "9 Lychee 67 Sapindaceae Sapindales Litchi \n", + "10 Pineapple 10 Bromeliaceae Poales Ananas \n", + "11 Fig 68 Moraceae Rosales Ficus \n", + "12 Gooseberry 69 Grossulariaceae Saxifragales Ribes \n", + "13 Passionfruit 70 Passifloraceae Malpighiales Passiflora \n", + "14 Plum 71 Rosaceae Rosales Prunus \n", + "15 Orange 2 Rutaceae Sapindales Citrus \n", + "16 GreenApple 72 Rosaceae Rosales Malus \n", + "17 Raspberry 23 Rosaceae Rosales Rubus \n", + "18 Watermelon 25 Cucurbitaceae Cucurbitales Citrullus \n", + "19 Lemon 26 Rutaceae Sapindales Citrus \n", + "20 Mango 27 Anacardiaceae Sapindales Mangifera \n", + "21 Blueberry 33 Rosaceae Rosales Fragaria \n", + "22 Apple 6 Rosaceae Rosales Malus \n", + "23 Guava 37 Myrtaceae Myrtales Psidium \n", + "24 Apricot 35 Rosaceae Rosales Prunus \n", + "25 Melon 41 Cucurbitaceae Cucurbitaceae Cucumis \n", + "26 Tangerine 77 Rutaceae Sapindales Citrus \n", + "27 Pitahaya 78 Cactaceae Caryophyllales Cactaceae \n", + "28 Lime 44 Rutaceae Sapindales Citrus \n", + "29 Pomegranate 79 Lythraceae Myrtales Punica \n", + "30 Dragonfruit 80 Cactaceae Caryophyllales Selenicereus \n", + "31 Grape 81 Vitaceae Vitales Vitis \n", + "32 Morus 82 Moraceae Rosales Morus \n", + "33 Feijoa 76 Myrtaceae Myrtoideae Sellowiana \n", + "34 Avocado 84 Lauraceae Laurales Persea \n", + "35 Kiwifruit 85 Actinidiaceae Ericales Actinidia \n", + "36 Cranberry 87 Ericaceae Ericales Vaccinium \n", + "37 Cherry 9 Rosaceae Rosales Prunus \n", + "38 Peach 86 Rosaceae 
Rosales Prunus \n", + "39 Jackfruit 94 Moraceae Rosales Artocarpus \n", + "40 Horned Melon 95 Cucurbitaceae Cucurbitales Cucumis \n", + "41 Hazelnut 96 Betulaceae Fagales Corylus \n", + "42 Pomelo 98 Rutaceae Sapindales Citrus \n", + "43 Mangosteen 99 Clusiaceae Malpighiales Garcinia \n", + "44 Pumpkin 100 Cucurbitaceae Cucurbitales Cucurbita \n", + "45 Japanese Persimmon 101 Ebenaceae Ericales Diospyros \n", + "46 Papaya 42 Caricaceae Brassicales Carica \n", + "47 Annona 103 Annonaceae Rosales Annonas \n", + "48 Ceylon Gooseberry 104 Salicaceae Malpighiales Dovyalis \n", + "\n", + " nutritions \n", + "0 {'calories': 81, 'fat': 0.0, 'sugar': 18.0, 'c... \n", + "1 {'calories': 29, 'fat': 0.4, 'sugar': 5.4, 'ca... \n", + "2 {'calories': 96, 'fat': 0.2, 'sugar': 17.2, 'c... \n", + "3 {'calories': 74, 'fat': 0.2, 'sugar': 2.6, 'ca... \n", + "4 {'calories': 57, 'fat': 0.1, 'sugar': 10.0, 'c... \n", + "5 {'calories': 147, 'fat': 5.3, 'sugar': 6.75, '... \n", + "6 {'calories': 40, 'fat': 0.4, 'sugar': 4.5, 'ca... \n", + "7 {'calories': 50, 'fat': 0.34, 'sugar': 5.74, '... \n", + "8 {'calories': 61, 'fat': 0.5, 'sugar': 9.0, 'ca... \n", + "9 {'calories': 66, 'fat': 0.44, 'sugar': 15.0, '... \n", + "10 {'calories': 50, 'fat': 0.12, 'sugar': 9.85, '... \n", + "11 {'calories': 74, 'fat': 0.3, 'sugar': 16.0, 'c... \n", + "12 {'calories': 44, 'fat': 0.6, 'sugar': 0.0, 'ca... \n", + "13 {'calories': 97, 'fat': 0.7, 'sugar': 11.2, 'c... \n", + "14 {'calories': 46, 'fat': 0.28, 'sugar': 9.92, '... \n", + "15 {'calories': 43, 'fat': 0.2, 'sugar': 8.2, 'ca... \n", + "16 {'calories': 21, 'fat': 0.1, 'sugar': 6.4, 'ca... \n", + "17 {'calories': 53, 'fat': 0.7, 'sugar': 4.4, 'ca... \n", + "18 {'calories': 30, 'fat': 0.2, 'sugar': 6.0, 'ca... \n", + "19 {'calories': 29, 'fat': 0.3, 'sugar': 2.5, 'ca... \n", + "20 {'calories': 60, 'fat': 0.38, 'sugar': 13.7, '... \n", + "21 {'calories': 29, 'fat': 0.4, 'sugar': 5.4, 'ca... \n", + "22 {'calories': 52, 'fat': 0.4, 'sugar': 10.3, 'c... 
\n", + "23 {'calories': 68, 'fat': 1.0, 'sugar': 9.0, 'ca... \n", + "24 {'calories': 15, 'fat': 0.1, 'sugar': 3.2, 'ca... \n", + "25 {'calories': 34, 'fat': 0.0, 'sugar': 8.0, 'ca... \n", + "26 {'calories': 45, 'fat': 0.4, 'sugar': 9.1, 'ca... \n", + "27 {'calories': 36, 'fat': 0.4, 'sugar': 3.0, 'ca... \n", + "28 {'calories': 25, 'fat': 0.1, 'sugar': 1.7, 'ca... \n", + "29 {'calories': 83, 'fat': 1.2, 'sugar': 13.7, 'c... \n", + "30 {'calories': 60, 'fat': 1.5, 'sugar': 8.0, 'ca... \n", + "31 {'calories': 69, 'fat': 0.16, 'sugar': 16.0, '... \n", + "32 {'calories': 43, 'fat': 0.39, 'sugar': 8.1, 'c... \n", + "33 {'calories': 44, 'fat': 0.4, 'sugar': 3.0, 'ca... \n", + "34 {'calories': 160, 'fat': 14.66, 'sugar': 0.66,... \n", + "35 {'calories': 61, 'fat': 0.5, 'sugar': 8.9, 'ca... \n", + "36 {'calories': 46, 'fat': 0.1, 'sugar': 4.0, 'ca... \n", + "37 {'calories': 50, 'fat': 0.3, 'sugar': 8.0, 'ca... \n", + "38 {'calories': 39, 'fat': 0.25, 'sugar': 8.4, 'c... \n", + "39 {'calories': 95, 'fat': 0.0, 'sugar': 19.1, 'c... \n", + "40 {'calories': 44, 'fat': 1.26, 'sugar': 0.5, 'c... \n", + "41 {'calories': 628, 'fat': 61.0, 'sugar': 4.3, '... \n", + "42 {'calories': 37, 'fat': 0.0, 'sugar': 8.5, 'ca... \n", + "43 {'calories': 73, 'fat': 0.58, 'sugar': 16.11, ... \n", + "44 {'calories': 25, 'fat': 0.3, 'sugar': 3.3, 'ca... \n", + "45 {'calories': 70, 'fat': 0.2, 'sugar': 13.0, 'c... \n", + "46 {'calories': 39, 'fat': 0.3, 'sugar': 4.4, 'ca... \n", + "47 {'calories': 92, 'fat': 0.29, 'sugar': 3.4, 'c... \n", + "48 {'calories': 47, 'fat': 0.3, 'sugar': 8.1, 'ca... " + ] + }, + "execution_count": 29, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "pd.DataFrame(results)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The result is in a nested json format. 
The 'nutritions' column contains multiple subcolumns, so the data needs to be 'flattened', or normalized.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": {}, + "outputs": [], + "source": [ + "df2 = pd.json_normalize(results)" + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
    " + ], + "text/plain": [ + " name id family order genus \\\n", + "0 Persimmon 52 Ebenaceae Rosales Diospyros \n", + "1 Strawberry 3 Rosaceae Rosales Fragaria \n", + "2 Banana 1 Musaceae Zingiberales Musa \n", + "3 Tomato 5 Solanaceae Solanales Solanum \n", + "4 Pear 4 Rosaceae Rosales Pyrus \n", + "5 Durian 60 Malvaceae Malvales Durio \n", + "6 Blackberry 64 Rosaceae Rosales Rubus \n", + "7 Lingonberry 65 Ericaceae Ericales Vaccinium \n", + "8 Kiwi 66 Actinidiaceae Struthioniformes Apteryx \n", + "9 Lychee 67 Sapindaceae Sapindales Litchi \n", + "10 Pineapple 10 Bromeliaceae Poales Ananas \n", + "11 Fig 68 Moraceae Rosales Ficus \n", + "12 Gooseberry 69 Grossulariaceae Saxifragales Ribes \n", + "13 Passionfruit 70 Passifloraceae Malpighiales Passiflora \n", + "14 Plum 71 Rosaceae Rosales Prunus \n", + "15 Orange 2 Rutaceae Sapindales Citrus \n", + "16 GreenApple 72 Rosaceae Rosales Malus \n", + "17 Raspberry 23 Rosaceae Rosales Rubus \n", + "18 Watermelon 25 Cucurbitaceae Cucurbitales Citrullus \n", + "19 Lemon 26 Rutaceae Sapindales Citrus \n", + "20 Mango 27 Anacardiaceae Sapindales Mangifera \n", + "21 Blueberry 33 Rosaceae Rosales Fragaria \n", + "22 Apple 6 Rosaceae Rosales Malus \n", + "23 Guava 37 Myrtaceae Myrtales Psidium \n", + "24 Apricot 35 Rosaceae Rosales Prunus \n", + "25 Melon 41 Cucurbitaceae Cucurbitaceae Cucumis \n", + "26 Tangerine 77 Rutaceae Sapindales Citrus \n", + "27 Pitahaya 78 Cactaceae Caryophyllales Cactaceae \n", + "28 Lime 44 Rutaceae Sapindales Citrus \n", + "29 Pomegranate 79 Lythraceae Myrtales Punica \n", + "30 Dragonfruit 80 Cactaceae Caryophyllales Selenicereus \n", + "31 Grape 81 Vitaceae Vitales Vitis \n", + "32 Morus 82 Moraceae Rosales Morus \n", + "33 Feijoa 76 Myrtaceae Myrtoideae Sellowiana \n", + "34 Avocado 84 Lauraceae Laurales Persea \n", + "35 Kiwifruit 85 Actinidiaceae Ericales Actinidia \n", + "36 Cranberry 87 Ericaceae Ericales Vaccinium \n", + "37 Cherry 9 Rosaceae Rosales Prunus \n", + "38 Peach 86 Rosaceae 
Rosales Prunus \n", + "39 Jackfruit 94 Moraceae Rosales Artocarpus \n", + "40 Horned Melon 95 Cucurbitaceae Cucurbitales Cucumis \n", + "41 Hazelnut 96 Betulaceae Fagales Corylus \n", + "42 Pomelo 98 Rutaceae Sapindales Citrus \n", + "43 Mangosteen 99 Clusiaceae Malpighiales Garcinia \n", + "44 Pumpkin 100 Cucurbitaceae Cucurbitales Cucurbita \n", + "45 Japanese Persimmon 101 Ebenaceae Ericales Diospyros \n", + "46 Papaya 42 Caricaceae Brassicales Carica \n", + "47 Annona 103 Annonaceae Rosales Annonas \n", + "48 Ceylon Gooseberry 104 Salicaceae Malpighiales Dovyalis \n", + "\n", + " nutritions.calories nutritions.fat nutritions.sugar \\\n", + "0 81 0.00 18.00 \n", + "1 29 0.40 5.40 \n", + "2 96 0.20 17.20 \n", + "3 74 0.20 2.60 \n", + "4 57 0.10 10.00 \n", + "5 147 5.30 6.75 \n", + "6 40 0.40 4.50 \n", + "7 50 0.34 5.74 \n", + "8 61 0.50 9.00 \n", + "9 66 0.44 15.00 \n", + "10 50 0.12 9.85 \n", + "11 74 0.30 16.00 \n", + "12 44 0.60 0.00 \n", + "13 97 0.70 11.20 \n", + "14 46 0.28 9.92 \n", + "15 43 0.20 8.20 \n", + "16 21 0.10 6.40 \n", + "17 53 0.70 4.40 \n", + "18 30 0.20 6.00 \n", + "19 29 0.30 2.50 \n", + "20 60 0.38 13.70 \n", + "21 29 0.40 5.40 \n", + "22 52 0.40 10.30 \n", + "23 68 1.00 9.00 \n", + "24 15 0.10 3.20 \n", + "25 34 0.00 8.00 \n", + "26 45 0.40 9.10 \n", + "27 36 0.40 3.00 \n", + "28 25 0.10 1.70 \n", + "29 83 1.20 13.70 \n", + "30 60 1.50 8.00 \n", + "31 69 0.16 16.00 \n", + "32 43 0.39 8.10 \n", + "33 44 0.40 3.00 \n", + "34 160 14.66 0.66 \n", + "35 61 0.50 8.90 \n", + "36 46 0.10 4.00 \n", + "37 50 0.30 8.00 \n", + "38 39 0.25 8.40 \n", + "39 95 0.00 19.10 \n", + "40 44 1.26 0.50 \n", + "41 628 61.00 4.30 \n", + "42 37 0.00 8.50 \n", + "43 73 0.58 16.11 \n", + "44 25 0.30 3.30 \n", + "45 70 0.20 13.00 \n", + "46 39 0.30 4.40 \n", + "47 92 0.29 3.40 \n", + "48 47 0.30 8.10 \n", + "\n", + " nutritions.carbohydrates nutritions.protein \n", + "0 18.00 0.00 \n", + "1 5.50 0.80 \n", + "2 22.00 1.00 \n", + "3 3.90 0.90 \n", + "4 15.00 0.40 \n", + 
"5 27.10 1.50 \n", + "6 9.00 1.30 \n", + "7 11.30 0.75 \n", + "8 15.00 1.10 \n", + "9 17.00 0.80 \n", + "10 13.12 0.54 \n", + "11 19.00 0.80 \n", + "12 10.00 0.90 \n", + "13 22.40 2.20 \n", + "14 11.40 0.70 \n", + "15 8.30 1.00 \n", + "16 3.10 0.40 \n", + "17 12.00 1.20 \n", + "18 8.00 0.60 \n", + "19 9.00 1.10 \n", + "20 15.00 0.82 \n", + "21 5.50 0.00 \n", + "22 11.40 0.30 \n", + "23 14.00 2.60 \n", + "24 3.90 0.50 \n", + "25 8.00 0.00 \n", + "26 8.30 0.00 \n", + "27 7.00 1.00 \n", + "28 8.40 0.30 \n", + "29 18.70 1.70 \n", + "30 9.00 9.00 \n", + "31 18.10 0.72 \n", + "32 9.80 1.44 \n", + "33 8.00 0.60 \n", + "34 8.53 2.00 \n", + "35 14.60 1.14 \n", + "36 12.20 0.40 \n", + "37 12.00 1.00 \n", + "38 9.50 0.90 \n", + "39 23.20 1.72 \n", + "40 7.56 1.78 \n", + "41 17.00 15.00 \n", + "42 9.67 0.82 \n", + "43 17.91 0.41 \n", + "44 4.60 1.10 \n", + "45 19.00 0.60 \n", + "46 5.80 0.50 \n", + "47 19.10 1.50 \n", + "48 9.60 1.20 " + ] + }, + "execution_count": 31, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "df2" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's see if we can extract some information from this dataframe. 
For example, we might need to know the family and genus of a cherry.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "('Rosaceae', 'Prunus')" + ] + }, + "execution_count": 32, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "cherry = df2.loc[df2[\"name\"] == 'Cherry']\n", + "cherry.iloc[0]['family'], cherry.iloc[0]['genus']" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Exercise 2\n", + "In this exercise, find out how many calories a banana contains.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 33, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "np.int64(96)" + ] + }, + "execution_count": 33, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Write your code here\n", + "cal_banana = df2.loc[df2[\"name\"] == 'Banana']\n", + "cal_banana.iloc[0]['nutritions.calories']" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
    Click here for the solution\n", + "\n", + "```python\n", + "cal_banana = df2.loc[df2[\"name\"] == 'Banana']\n", + "cal_banana.iloc[0]['nutritions.calories']\n", + "```\n", + "\n", + "
\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Exercise 3\n", + "\n", + "This [page](https://mixedanalytics.com/blog/list-actually-free-open-no-auth-needed-apis/) contains a list of free public APIs that you can practice with. Let us work through the following example.\n", + "\n", + "#### Official Joke API \n", + "This API returns random jokes from a database. The following URL can be used to retrieve 10 random jokes.\n", + "\n", + "https://official-joke-api.appspot.com/jokes/ten\n", + "\n", + "1. Using the `requests.get(\"url\")` function, load the data from the URL.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 35, + "metadata": {}, + "outputs": [], + "source": [ + "# Write your code here\n", + "data2 = requests.get(\"https://official-joke-api.appspot.com/jokes/ten\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
    Click here for the solution\n", + "\n", + "```python\n", + "data2 = requests.get(\"https://official-joke-api.appspot.com/jokes/ten\")\n", + "```\n", + "\n", + "
\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "2. Retrieve the results using the `json.loads()` function.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 36, + "metadata": {}, + "outputs": [], + "source": [ + "# Write your code here\n", + "results2 = json.loads(data2.text)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
    Click here for the solution\n", + "\n", + "```python\n", + "results2 = json.loads(data2.text)\n", + "```\n", + "\n", + "
\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "3. Convert the JSON data into a *pandas* DataFrame. Drop the `type` and `id` columns.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
    " + ], + "text/plain": [ + " setup \\\n", + "0 Why are graveyards so noisy? \n", + "1 Why do bananas have to put on sunscreen before... \n", + "2 What do vegetarian zombies eat? \n", + "3 What did the digital clock say to the grandfat... \n", + "4 Dad, can you put my shoes on? \n", + "5 What did one ocean say to the other ocean? \n", + "6 What is the most used language in programming? \n", + "7 What do you give a sick lemon? \n", + "8 While I was sleeping my friends decided to wri... \n", + "9 Knock knock. \\n Who's there? \\n Opportunity. \n", + "\n", + " punchline \n", + "0 Because of all the coffin. \n", + "1 Because they might peel! \n", + "2 Grrrrrainnnnnssss. \n", + "3 Look, no hands! \n", + "4 I don't think they'll fit me. \n", + "5 Nothing, they just waved. \n", + "6 Profanity. \n", + "7 Lemonaid. \n", + "8 You should have seen the expression on my face... \n", + "9 That is impossible. Opportunity doesn’t come k... " + ] + }, + "execution_count": 37, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Write your code here\n", + "df3 = pd.DataFrame(results2)\n", + "df3.drop(columns=[\"type\",\"id\"],inplace=True)\n", + "df3" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
    Click here for the solution\n", + "\n", + "```python\n", + "df3 = pd.DataFrame(results2)\n", + "df3.drop(columns=[\"type\",\"id\"],inplace=True)\n", + "df3\n", + "```\n", + "\n", + "
    \n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Congratulations! - You have completed the lab\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Author\n", + "Svitlana Kramar\n", + "\n", + "Svitlana is a master’s degree Data Science and Analytics student at University of Calgary, who enjoys travelling, learning new languages and cultures and loves spreading her passion for Data Science.\n", + "\n", + "## Additional Contributor\n", + "Abhishek Gagneja\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Copyright © 2023 IBM Corporation. All rights reserved.\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.8" + }, + "prev_pub_hash": "04bae9f5d988e5963bddc9fe88d29fb9d09098ac6fa470c436aa2dac078e9ee1" + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/PY0101EN-5 2_API_2 v2.ipynb b/PY0101EN-5 2_API_2 v2.ipynb new file mode 100644 index 0000000..e68bb20 --- /dev/null +++ b/PY0101EN-5 2_API_2 v2.ipynb @@ -0,0 +1,835 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
    \n", + " \"cognitiveclass.ai\n", + "
\n", + "\n", + "# Hands-on Lab: API Examples\n", + "## Random User and Fruityvice API Examples\n", + "\n", + "\n", + "Estimated time needed: **30** minutes\n", + "\n", + "## Objectives\n", + "\n", + "After completing this lab you will be able to:\n", + "\n", + "* Load and use the RandomUser API, using the `RandomUser` class from the `randomuser` Python library\n", + "* Load and use the Fruityvice API, using the `requests` Python library\n", + "* Load and use the Open-Joke-API, using the `requests` Python library\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The purpose of this notebook is to provide more examples of how to use simple APIs. As you have already learned from previous videos and notebooks, API stands for Application Programming Interface and is a software intermediary that allows two applications to talk to each other. \n", + "\n", + "The advantages of using APIs:\n", + " * **Automation**. Less human effort is required, and workflows can be easily updated to become faster and more \n", + " productive.\n", + " * **Efficiency**. It lets you use the capabilities of an existing API rather than \n", + " implementing the same functionality from scratch.\n", + " \n", + "A disadvantage of using APIs:\n", + " * **Security**. If an API is poorly integrated, it can be vulnerable to attacks, resulting in data breaches or losses with financial or reputational implications.\n", + "\n", + "One of the applications we will use in this notebook is Random User Generator. RandomUser is an open-source, free API providing developers with randomly generated users to be used as placeholders for testing purposes. This makes the tool similar to Lorem Ipsum, but it provides placeholders for people instead of text. The API can return multiple results and lets you specify generated user details such as gender, email, image, username, address, title, first and last name, and more. 
More information can be found in the [RandomUser](https://randomuser.me/documentation#intro) documentation.\n", + "\n", + "Another simple API we will use in this notebook is Fruityvice. Fruityvice is a web service that provides data for all kinds of fruit! You can use Fruityvice to find out interesting information about fruit and educate yourself. The web service is completely free to use and contribute to.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Example 1: RandomUser API\n", + "Below are the Get Methods for the parameters that we can generate. For more information on the parameters, please visit this [documentation](https://randomuser.me/documentation) page.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## **Get Methods**\n", + "\n", + "- get_cell()\n", + "- get_city()\n", + "- get_dob()\n", + "- get_email()\n", + "- get_first_name()\n", + "- get_full_name()\n", + "- get_gender()\n", + "- get_id()\n", + "- get_id_number()\n", + "- get_id_type()\n", + "- get_info()\n", + "- get_last_name()\n", + "- get_login_md5()\n", + "- get_login_salt()\n", + "- get_login_sha1()\n", + "- get_login_sha256()\n", + "- get_nat()\n", + "- get_password()\n", + "- get_phone()\n", + "- get_picture()\n", + "- get_postcode()\n", + "- get_registered()\n", + "- get_state()\n", + "- get_street()\n", + "- get_username()\n", + "- get_zipcode()\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To start using the API, you can install the `randomuser` library by running the `pip install` command.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Collecting randomuser\n", + " Downloading randomuser-1.6.tar.gz (5.0 kB)\n", + " Preparing metadata (setup.py) ... 
\u001b[?25ldone\n", + "\u001b[?25hBuilding wheels for collected packages: randomuser\n", + " Building wheel for randomuser (setup.py) ... \u001b[?25done\n", + "\u001b[?25h Created wheel for randomuser: filename=randomuser-1.6-py3-none-any.whl size=5104 sha256=7d0f593299b8b462ac49fa0e67609a8b3cc610aebc3b7747e1f349b2e3f510ef\n", + " Stored in directory: /home/jupyterlab/.cache/pip/wheels/be/62/c8/71e1b48f4758ea5b78af7595d87178f628cde315a3326610ee\n", + "Successfully built randomuser\n", + "Installing collected packages: randomuser\n", + "Successfully installed randomuser-1.6\n", + "Collecting pandas\n", + " Downloading pandas-3.0.1-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.metadata (79 kB)\n", + "Collecting numpy>=1.26.0 (from pandas)\n", + " Downloading numpy-2.4.3-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (6.6 kB)\n", + "Requirement already satisfied: python-dateutil>=2.8.2 in /opt/conda/lib/python3.12/site-packages (from pandas) (2.9.0.post0)\n", + "Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.12/site-packages (from python-dateutil>=2.8.2->pandas) (1.17.0)\n", + "Downloading pandas-3.0.1-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (10.9 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m10.9/10.9 MB\u001b[0m \u001b[31m96.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "Downloading numpy-2.4.3-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (16.6 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m16.6/16.6 MB\u001b[0m \u001b[31m140.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "Installing collected packages: numpy, pandas\n", + "Successfully installed numpy-2.4.3 pandas-3.0.1\n" + ] + } + ], + "source": [ + "!pip install randomuser\n", + "!pip install pandas" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Then, we will load the necessary libraries.\n" + ] 
+ }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "from randomuser import RandomUser\n", + "import pandas as pd" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "First, we will create a random user object, `r`.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "r = RandomUser()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Then, using the `generate_users()` function, we get a list of 10 random users.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "some_list = r.generate_users(10)" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[,\n", + " ,\n", + " ,\n", + " ,\n", + " ,\n", + " ,\n", + " ,\n", + " ,\n", + " ,\n", + " ]" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "some_list" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The **\"Get Methods\"** listed at the beginning of this notebook return the parameters needed to construct a dataset. For example, to get the full name, we call the `get_full_name()` function.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [], + "source": [ + "name = r.get_full_name()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's say we only need 10 users with full names and their email addresses. 
We can write a `for` loop to print these 10 users.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "مهدیس حسینی mhdys.hsyny@example.com\n", + "Lilje Woll lilje.woll@example.com\n", + "Guy Jensen guy.jensen@example.com\n", + "Harri Gries harri.gries@example.com\n", + "Kristina Grimnes kristina.grimnes@example.com\n", + "Jacob Rice jacob.rice@example.com\n", + "Isaac Thomas isaac.thomas@example.com\n", + "Olivia Niemi olivia.niemi@example.com\n", + "Charles Wong charles.wong@example.com\n", + "Rachel Arnesen rachel.arnesen@example.com\n" + ] + } + ], + "source": [ + "for user in some_list:\n", + "    print(user.get_full_name(), \" \", user.get_email())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Exercise 1\n", + "In this exercise, print the picture URLs of the 10 random users.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "https://randomuser.me/api/portraits/women/58.jpg\n", + "https://randomuser.me/api/portraits/women/42.jpg\n", + "https://randomuser.me/api/portraits/men/56.jpg\n", + "https://randomuser.me/api/portraits/men/5.jpg\n", + "https://randomuser.me/api/portraits/women/22.jpg\n", + "https://randomuser.me/api/portraits/men/65.jpg\n", + "https://randomuser.me/api/portraits/men/55.jpg\n", + "https://randomuser.me/api/portraits/women/20.jpg\n", + "https://randomuser.me/api/portraits/men/79.jpg\n", + "https://randomuser.me/api/portraits/women/85.jpg\n" + ] + } + ], + "source": [ + "## Write your code here\n", + "for user in some_list:\n", + "    print(user.get_picture())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
Click here for the solution\n", + "\n", + "```python\n", + "for user in some_list:\n", + "    print(user.get_picture())\n", + "```\n", + "\n", + "
\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To generate a table with information about the users, we can write a function that collects the desired parameters, for example name, gender, and city. The parameters will depend on the requirements of the test to be performed. Inside the function, we call the Get Methods listed at the beginning of this notebook and return a pandas DataFrame with the users.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [], + "source": [ + "def get_users():\n", + "    users = []\n", + "\n", + "    for user in RandomUser.generate_users(10):\n", + "        users.append({\n", + "            \"Name\": user.get_full_name(), \"Gender\": user.get_gender(),\n", + "            \"City\": user.get_city(), \"State\": user.get_state(),\n", + "            \"Email\": user.get_email(), \"DOB\": user.get_dob(),\n", + "            \"Picture\": user.get_picture(),\n", + "        })\n", + "\n", + "    return pd.DataFrame(users)" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
    \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
    NameGenderCityStateEmailDOBPicture
    0Floyd RogersmaleDundeeCounty Londonderryfloyd.rogers@example.com1959-02-08T13:15:30.999Zhttps://randomuser.me/api/portraits/men/79.jpg
    1Marie KristensenfemaleLintrupMidtjyllandmarie.kristensen@example.com1955-08-11T03:30:19.966Zhttps://randomuser.me/api/portraits/women/71.jpg
    2Clifton KnightmaleOrangeWestern Australiaclifton.knight@example.com1967-01-11T15:24:56.289Zhttps://randomuser.me/api/portraits/men/83.jpg
    3Alison CarrollfemaleMountmellickCarlowalison.carroll@example.com1997-07-23T02:50:30.973Zhttps://randomuser.me/api/portraits/women/23.jpg
    4Kadir DoğanmaleAğrıBursakadir.dogan@example.com1994-08-11T04:25:43.693Zhttps://randomuser.me/api/portraits/men/40.jpg
    5Fatih ÇağıranmaleErzurumÇorumfatih.cagiran@example.com1950-05-17T03:51:43.648Zhttps://randomuser.me/api/portraits/men/67.jpg
    6Leroy PowellmaleBendigoNorthern Territoryleroy.powell@example.com1947-08-07T21:50:01.248Zhttps://randomuser.me/api/portraits/men/43.jpg
    7Alexander JohansenmaleVipperødSyddanmarkalexander.johansen@example.com1951-11-28T16:07:53.979Zhttps://randomuser.me/api/portraits/men/61.jpg
    8Vicky SchmidtfemalePortsmouthFifevicky.schmidt@example.com1974-10-07T03:07:52.920Zhttps://randomuser.me/api/portraits/women/27.jpg
    9Bill KingmaleNewarkIllinoisbill.king@example.com1962-07-20T01:16:35.497Zhttps://randomuser.me/api/portraits/men/59.jpg
    \n", + "
    " + ], + "text/plain": [ + " Name Gender City State \\\n", + "0 Floyd Rogers male Dundee County Londonderry \n", + "1 Marie Kristensen female Lintrup Midtjylland \n", + "2 Clifton Knight male Orange Western Australia \n", + "3 Alison Carroll female Mountmellick Carlow \n", + "4 Kadir Doğan male Ağrı Bursa \n", + "5 Fatih Çağıran male Erzurum Çorum \n", + "6 Leroy Powell male Bendigo Northern Territory \n", + "7 Alexander Johansen male Vipperød Syddanmark \n", + "8 Vicky Schmidt female Portsmouth Fife \n", + "9 Bill King male Newark Illinois \n", + "\n", + " Email DOB \\\n", + "0 floyd.rogers@example.com 1959-02-08T13:15:30.999Z \n", + "1 marie.kristensen@example.com 1955-08-11T03:30:19.966Z \n", + "2 clifton.knight@example.com 1967-01-11T15:24:56.289Z \n", + "3 alison.carroll@example.com 1997-07-23T02:50:30.973Z \n", + "4 kadir.dogan@example.com 1994-08-11T04:25:43.693Z \n", + "5 fatih.cagiran@example.com 1950-05-17T03:51:43.648Z \n", + "6 leroy.powell@example.com 1947-08-07T21:50:01.248Z \n", + "7 alexander.johansen@example.com 1951-11-28T16:07:53.979Z \n", + "8 vicky.schmidt@example.com 1974-10-07T03:07:52.920Z \n", + "9 bill.king@example.com 1962-07-20T01:16:35.497Z \n", + "\n", + " Picture \n", + "0 https://randomuser.me/api/portraits/men/79.jpg \n", + "1 https://randomuser.me/api/portraits/women/71.jpg \n", + "2 https://randomuser.me/api/portraits/men/83.jpg \n", + "3 https://randomuser.me/api/portraits/women/23.jpg \n", + "4 https://randomuser.me/api/portraits/men/40.jpg \n", + "5 https://randomuser.me/api/portraits/men/67.jpg \n", + "6 https://randomuser.me/api/portraits/men/43.jpg \n", + "7 https://randomuser.me/api/portraits/men/61.jpg \n", + "8 https://randomuser.me/api/portraits/women/27.jpg \n", + "9 https://randomuser.me/api/portraits/men/59.jpg " + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "get_users()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": 
{}, + "outputs": [], + "source": [ + "df1 = get_users()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now we have a *pandas* dataframe that can be used for any testing purposes that the tester might have.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Example 2: Fruityvice API\n", + "\n", + "Another, more common way to use APIs is through the `requests` library. The next lab, Requests and HTTP, will contain more information about requests.\n", + "\n", + "We will start by importing all required libraries.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [], + "source": [ + "import requests\n", + "import json" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We will obtain the [fruityvice](https://www.fruityvice.com) API data using the `requests.get(\"url\")` function. The data is in JSON format.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "data = requests.get(\"https://web.archive.org/web/20240929211114/https://fruityvice.com/api/fruit/all\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We will retrieve the results using the `json.loads()` function.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "results = json.loads(data.text)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We will convert our JSON data into a *pandas* data frame. \n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "pd.DataFrame(results)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The result is in a nested json format. 
The 'nutritions' column contains multiple subcolumns, so the data needs to be 'flattened' or normalized.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "df2 = pd.json_normalize(results)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "df2" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's see if we can extract some information from this dataframe. Suppose we need to know the family and genus of a cherry.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "cherry = df2.loc[df2[\"name\"] == 'Cherry']\n", + "(cherry.iloc[0]['family']) , (cherry.iloc[0]['genus'])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Exercise 2\n", + "In this exercise, find out how many calories a banana contains.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Write your code here\n", + "cal_banana = df2.loc[df2[\"name\"] == 'Banana']\n", + "cal_banana.iloc[0]['nutritions.calories']" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
    Click here for the solution\n", + "\n", + "```python\n", + "cal_banana = df2.loc[df2[\"name\"] == 'Banana']\n", + "cal_banana.iloc[0]['nutritions.calories']\n", + "```\n", + "\n", + "
    \n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Exercise 3\n", + "\n", + "This [page](https://mixedanalytics.com/blog/list-actually-free-open-no-auth-needed-apis/) contains a list of free public APIs for you to practice. Let us deal with the following example.\n", + "\n", + "#### Official Joke API \n", + "This API returns random jokes from a database. The following URL can be used to retrieve 10 random jokes.\n", + "\n", + "https://official-joke-api.appspot.com/jokes/ten\n", + "\n", + "1. Using `requests.get(\"url\")` function, load the data from the URL.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Write your code here\n", + "data2 = requests.get(\"https://official-joke-api.appspot.com/jokes/ten\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
    Click here for the solution\n", + "\n", + "```python\n", + "data2 = requests.get(\"https://official-joke-api.appspot.com/jokes/ten\")\n", + "```\n", + "\n", + "
    \n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "2. Retrieve results using `json.loads()` function.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Write your code here\n", + "results2 = json.loads(data2.text)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
    Click here for the solution\n", + "\n", + "```python\n", + "results2 = json.loads(data2.text)\n", + "```\n", + "\n", + "
    \n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "3. Convert json data into *pandas* data frame. Drop the type and id columns.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Write your code here\n", + "df3 = pd.DataFrame(results2)\n", + "df3.drop(columns=[\"type\",\"id\"],inplace=True)\n", + "df3" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
    Click here for the solution\n", + "\n", + "```python\n", + "df3 = pd.DataFrame(results2)\n", + "df3.drop(columns=[\"type\",\"id\"],inplace=True)\n", + "df3\n", + "```\n", + "\n", + "
    \n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Congratulations! - You have completed the lab\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Author\n", + "Svitlana Kramar\n", + "\n", + "Svitlana is a master’s degree Data Science and Analytics student at University of Calgary, who enjoys travelling, learning new languages and cultures and loves spreading her passion for Data Science.\n", + "\n", + "## Additional Contributor\n", + "Abhishek Gagneja\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Copyright © 2023 IBM Corporation. All rights reserved.\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.8" + }, + "prev_pub_hash": "04bae9f5d988e5963bddc9fe88d29fb9d09098ac6fa470c436aa2dac078e9ee1" + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/PY0101EN-5 4_WorkingWithDifferent.ipynb b/PY0101EN-5 4_WorkingWithDifferent.ipynb new file mode 100644 index 0000000..0d77b3b --- /dev/null +++ b/PY0101EN-5 4_WorkingWithDifferent.ipynb @@ -0,0 +1,707 @@ +{ + "metadata": { + "kernelspec": { + "name": "python", + "display_name": "Python (Pyodide)", + "language": "python" + }, + "language_info": { + "codemirror_mode": { + "name": "python", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8" + }, + "prev_pub_hash": "a04e80168ae105af276ac73f3ecaa3d13151e085121e7c55d350cfbce8c34bf7" + }, + "nbformat_minor": 4, + "nbformat": 4, + "cells": [ + { + "cell_type": "markdown", + "source": "
    \n \"cognitiveclass.ai\n
    \n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "# Hands-on Lab: Working with different file formats\n\nEstimated time: **40 mins**\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "# Table of Contents\n\n\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "# Data Engineering\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "**Data engineering** is one of the most critical and foundational skills in any data scientist’s toolkit.\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "# Data Engineering Process\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "There are several steps in the data engineering process.\n\n1. **Extract** - Data extraction is getting data from multiple sources, for example extracting data from a website using web scraping, or gathering information from data stored in different formats (JSON, CSV, XLSX, etc.).\n\n2. **Transform** - Transforming the data means removing the data that we don't need for further analysis and converting the rest so that data from all sources is in the same format.\n\n3. **Load** - Loading the data into a data warehouse. A data warehouse essentially contains large volumes of data that are accessed to gather insights.\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "# Working with different file formats\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "In the real world, people rarely get neat tabular data. Thus, any data scientist (or data engineer) needs to be aware of different file formats, the common challenges in handling them, and the most efficient ways to handle this data in practice. We have reviewed some of this content in other modules.\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "#### File Format\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "

    A file format is a standard way in which information is encoded for storage in a file. First, the file format specifies whether the file is a binary or ASCII file. Second, it shows how the information is organized. For example, the comma-separated values (CSV) file format stores tabular data in plain text.\n\nTo identify a file format, you can usually look at the file extension to get an idea. For example, a file saved with name \"Data\" in \"CSV\" format will appear as **Data.csv**. By noticing the **.csv** extension, we can clearly identify that it is a **CSV** file and the data is stored in a tabular format.

    \n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "There are various formats for a dataset: .csv, .json, .xlsx, etc. The dataset can be stored in different places, on your local machine or sometimes online.\n\n**In this section, you will learn how to load a dataset into our Jupyter Notebook.**\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "Now, we will look at some file formats and how to read them in Python:\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "# Comma-separated values (CSV) file format\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "The **Comma-separated values** file format is a type of spreadsheet file format.\n\nIn a spreadsheet file format, data is stored in cells, and the cells are organized in rows and columns. Columns in the spreadsheet file can have different types. For example, a column can be of string type, date type, or integer type.\n\nEach line in a CSV file represents an observation, commonly called a record. Each record may contain one or more fields, which are separated by commas.\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "## Reading data from CSV in Python\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "The **Pandas** library is a useful tool that enables us to read various datasets into a Pandas data frame.\n\nLet us look at how to read a CSV file with the Pandas library.\n\nWe use the **pandas.read_csv()** function to read the csv file. In the parentheses, we pass the file path, enclosed in quotation marks, as an argument, so that pandas will read the file into a data frame from that address. 
The file path can be either a URL or your local file address.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "import piplite\nawait piplite.install(['seaborn', 'lxml', 'openpyxl'])\n\nimport pandas as pd", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "source": "from pyodide.http import pyfetch\n\nfilename = \"https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%205/data/addresses.csv\"\n\nasync def download(url, filename):\n response = await pyfetch(url)\n if response.status == 200:\n with open(filename, \"wb\") as f:\n f.write(await response.bytes())\n\nawait download(filename, \"addresses.csv\")\n\ndf = pd.read_csv(\"addresses.csv\", header=None)", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "source": "df", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "#### Adding column names to the DataFrame\n\nBecause the file was read with `header=None`, the columns are numbered rather than named. We can assign column names to an existing DataFrame using its **columns** attribute.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "df.columns =['First Name', 'Last Name', 'Location ', 'City','State','Area Code']", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "source": "df", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "#### Selecting a single column\n\nTo select the first column 'First Name', you can pass the column name as a string to the indexing operator.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "df[\"First Name\"]", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "#### Selecting multiple columns\n\nTo select multiple columns, you can pass a list of column names to the indexing operator.\n", + "metadata": {} 
+ }, + { + "cell_type": "code", + "source": "df = df[['First Name', 'Last Name', 'Location ', 'City','State','Area Code']]\ndf", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "#### Selecting rows using .iloc and .loc\n\nNow, let's see how to use .loc for selecting rows from our DataFrame.\n\n**.loc : .loc is a label-based selection method, which means that we pass the label (name) of the row or column that we want to select.**\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# To select the first row\ndf.loc[0]", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "source": "# To select the 0th,1st and 2nd row of \"First Name\" column only\ndf.loc[[0,1,2], \"First Name\" ]", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "Now, let's see how to use .iloc for selecting rows from our DataFrame.\n\n**.iloc : .iloc is an integer-position-based selection method, which means that we pass the integer position of the row or column that we want to select.**\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# To select the 0th,1st and 2nd row of \"First Name\" column only\ndf.iloc[[0,1,2], 0]", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "For more information please read the [documentation](https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkPY0101ENSkillsNetwork19487395-2021-01-01).\n\nLet's perform some basic transformation in pandas.\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "### Transform Function in Pandas\n\nThe pandas `DataFrame.transform()` method returns a DataFrame of the same length as the original, with values transformed by applying 
the function specified in its parameter.\n\nLet's see how the transform function works.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "#import library\nimport pandas as pd\nimport numpy as np", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "source": "#creating a dataframe\ndf=pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]), columns=['a', 'b', 'c'])\ndf", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "Let’s say we want to add 10 to each element in a dataframe:\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "#applying the transform function\ndf = df.transform(func = lambda x : x + 10)\ndf", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "Now we will use the DataFrame.transform() function to find the square root of each element of the dataframe.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "result = df.transform(func = ['sqrt'])", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "source": "result", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "For more information about the **transform()** function please read the [documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.transform.html?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkPY0101ENSkillsNetwork19487395-2021-01-01).\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "# JSON file Format\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "**JSON (JavaScript Object Notation)** is a lightweight data-interchange format. 
It is easy for humans to read and write.\n\nJSON is built on two structures:\n\n1. A collection of name/value pairs. In various languages, this is realized as an object, record, struct, dictionary, hash table, keyed list, or associative array.\n\n2. An ordered list of values. In most languages, this is realized as an array, vector, list, or sequence.\n\nJSON is a language-independent data format. It was derived from JavaScript, but many modern programming languages include code to generate and parse JSON-format data. It is a very common data format with a diverse range of applications.\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "A JSON object is written as key-value pairs inside { }, with the keys given as quoted strings. It is similar to a dictionary in Python.\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "Python supports JSON through a built-in package called **json**. To use this feature, we import the json package in a Python script.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "import json", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "# Writing JSON to a File\n\nThis is usually called **serialization**. It is the process of converting an object into a special format which is suitable for transmitting over the network or storing in a file or database.\n\nTo write data to a file, the json library in Python uses the **dump()** or **dumps()** function to convert Python objects into their JSON representation. 
This makes it easy to write data to files.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "import json\nperson = {\n 'first_name' : 'Mark',\n 'last_name' : 'abc',\n 'age' : 27,\n 'address': {\n \"streetAddress\": \"21 2nd Street\",\n \"city\": \"New York\",\n \"state\": \"NY\",\n \"postalCode\": \"10021-3100\"\n }\n}", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "#### Serialization using the dump() function\n\nThe **json.dump()** function writes a Python object to a file as JSON.\n\nSyntax: json.dump(dict, file_pointer)\n\nParameters:\n\n1. **dictionary** – name of the dictionary which should be converted to a JSON object.\n2. **file pointer** – pointer of the file opened in write or append mode.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "with open('person.json', 'w') as f: # writing JSON object\n json.dump(person, f)", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "#### Serialization using the dumps() function\n\nThe **json.dumps()** function converts a dictionary to a JSON-formatted string.\n\nIn this example it takes two parameters:\n\n1. **dictionary** – name of the dictionary which should be converted to a JSON string.\n2. **indent** – the number of spaces used for indentation.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Serializing json \njson_object = json.dumps(person, indent = 4) \n \n# Writing to sample.json \nwith open(\"sample.json\", \"w\") as outfile: \n outfile.write(json_object) ", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "source": "print(json_object)", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "Our Python objects are now serialized to the file. 
To deserialize it back into a Python object, we use the load() function.\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "# Reading JSON from a File\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "This process is usually called **Deserialization** - it is the reverse of serialization. It converts the special format returned by the serialization back into a usable object.\n\n### Using json.load()\n\nThe JSON package has a json.load() function that loads the JSON content from a JSON file into a dictionary.\n\nIt takes one parameter:\n\n**File pointer** : A file pointer that points to a JSON file.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "import json \n \n# Opening JSON file \nwith open('sample.json', 'r') as openfile: \n \n # Reading from json file \n json_object = json.load(openfile) \n \nprint(json_object) \nprint(type(json_object)) ", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "# XLSX file format\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "**XLSX** is a Microsoft Excel Open XML file format. It is another type of spreadsheet file format.\n\nIn XLSX, data is organized in cells arranged in rows and columns in a sheet.\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "## Reading the data from XLSX file\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "Let's load the data from XLSX file and define the sheet name. 
For loading the data you can use the Pandas library in python.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "import pandas as pd", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "source": "# Not needed unless you're running locally\n# import urllib.request\n# urllib.request.urlretrieve(\"https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%205/data/file_example_XLSX_10.xlsx\", \"sample.xlsx\")\n\nfilename = \"https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%205/data/file_example_XLSX_10.xlsx\"\n\nasync def download(url, filename):\n response = await pyfetch(url)\n if response.status == 200:\n with open(filename, \"wb\") as f:\n f.write(await response.bytes())\n\nawait download(filename, \"file_example_XLSX_10.xlsx\")\n\ndf = pd.read_excel(\"file_example_XLSX_10.xlsx\")", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "source": "df", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "# XML file format\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "**XML is also known as Extensible Markup Language**. As the name suggests, it is a markup language. It has certain rules for encoding data. XML file format is a human-readable and machine-readable file format.\n\nWe will take a look at how we can use other modules to read data from an XML file, and load it into a Pandas DataFrame.\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "### Writing with xml.etree.ElementTree\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "The **xml.etree.ElementTree** module comes built-in with Python. It provides functionality for parsing and creating XML documents. 
**ElementTree** represents the XML document as a tree. We can move across the document using nodes, which are the elements and sub-elements of the XML file.\n\nFor more information please read the [xml.etree.ElementTree](https://docs.python.org/3/library/xml.etree.elementtree.html?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkPY0101ENSkillsNetwork19487395-2021-01-01) documentation.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "import xml.etree.ElementTree as ET\n\n# create the file structure\nemployee = ET.Element('employee')\ndetails = ET.SubElement(employee, 'details')\nfirst = ET.SubElement(details, 'firstname')\nsecond = ET.SubElement(details, 'lastname')\nthird = ET.SubElement(details, 'age')\nfirst.text = 'Shiv'\nsecond.text = 'Mishra'\nthird.text = '23'\n\n# create a new XML file with the results\nmydata1 = ET.ElementTree(employee)\n# myfile = open(\"items2.xml\", \"wb\")\n# myfile.write(mydata)\nwith open(\"new_sample.xml\", \"wb\") as files:\n mydata1.write(files)", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "### Reading with xml.etree.ElementTree\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "Let's have a look at one way to read XML data and put it in a Pandas DataFrame. 
You can open the XML file in a text editor such as Notepad on your local machine.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Not needed unless running locally\n# !wget https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%205/data/Sample-employee-XML-file.xml\n\nimport xml.etree.ElementTree as etree\n\nfilename = \"https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%205/data/Sample-employee-XML-file.xml\"\n\nasync def download(url, filename):\n    response = await pyfetch(url)\n    if response.status == 200:\n        with open(filename, \"wb\") as f:\n            f.write(await response.bytes())\n\nawait download(filename, \"Sample-employee-XML-file.xml\")", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "First, parse the XML file and create a list of columns for the data frame. Then extract the useful information from the XML file and add it to a pandas data frame.\n\nHere is sample code that you can use:\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Parse the XML file\ntree = etree.parse(\"Sample-employee-XML-file.xml\")\n\n# Get the root of the XML tree\nroot = tree.getroot()\n\n# Define the columns for the DataFrame\ncolumns = [\"firstname\", \"lastname\", \"title\", \"division\", \"building\", \"room\"]\n\n# Initialize an empty DataFrame\ndatatframe = pd.DataFrame(columns=columns)\n\n# Iterate through each node in the XML root\nfor node in root:\n    # Extract text from each element\n    firstname = node.find(\"firstname\").text\n    lastname = node.find(\"lastname\").text\n    title = node.find(\"title\").text\n    division = node.find(\"division\").text\n    building = node.find(\"building\").text\n    room = node.find(\"room\").text\n\n    # Create a DataFrame for the current row\n    row_df = pd.DataFrame([[firstname, lastname, title, division, building, room]], columns=columns)\n\n    # Concatenate with the existing DataFrame\n    datatframe = pd.concat([datatframe, row_df], ignore_index=True)\n\n", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "source": "datatframe", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "### Reading xml file using pandas.read_xml function\n\nWe can also read the downloaded XML file using the read_xml function in the pandas library, which returns a DataFrame object.\n\nFor more information, read the pandas.read_xml documentation.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# In xpath, we specify the set of XML nodes to load into the dataframe; in this case, the details nodes under employees.\ndf = pd.read_xml(\"Sample-employee-XML-file.xml\", xpath=\"/employees/details\")", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "### Save Data\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "Correspondingly, Pandas enables us to save the dataset to CSV by using the **dataframe.to_csv()** method; pass the file path and name, in quotation marks, inside the parentheses.\n\nFor example, to save the dataframe `datatframe` as **employee.csv** to your local machine, you may use the syntax below:\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "datatframe.to_csv(\"employee.csv\", index=False)", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "We can also read and save other file formats, using functions similar to **`pd.read_csv()`** and **`df.to_csv()`**. The functions are listed in the following table:\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "

    Read/Save Other Data Formats

    \n\n| Data Format | Read | Save |\n| ------------ | :---------------: | --------------: |\n| csv | `pd.read_csv()` | `df.to_csv()` |\n| json | `pd.read_json()` | `df.to_json()` |\n| excel | `pd.read_excel()` | `df.to_excel()` |\n| hdf | `pd.read_hdf()` | `df.to_hdf()` |\n| sql | `pd.read_sql()` | `df.to_sql()` |\n| ... | ... | ... |\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "Let's move ahead and perform some **Data Analysis**.\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "# Binary File Format\n\n\"Binary\" files are any files where the format isn't made up of readable characters. They contain formatting information that only certain applications or processors can understand. While humans can read text files, binary files must be run on the appropriate software or processor before humans can read them.\n\nBinary files can range from image files like JPEGs or GIFs, audio files like MP3s or binary document formats like Word or PDF.\n\nLet's see how to read an **Image** file.\n\n## Reading the Image file\n\nPython supports very powerful tools when it comes to image processing. 
Let's see how to process the images using the **PIL** library.\n\n**PIL** is the Python Imaging Library, which provides the Python interpreter with image editing capabilities.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# importing PIL \nfrom PIL import Image \n\n# Uncomment if running locally\n# import urllib.request\n# urllib.request.urlretrieve(\"https://hips.hearstapps.com/hmg-prod.s3.amazonaws.com/images/dog-puppy-on-garden-royalty-free-image-1586966191.jpg\", \"dog.jpg\")\n\nfilename = \"https://hips.hearstapps.com/hmg-prod.s3.amazonaws.com/images/dog-puppy-on-garden-royalty-free-image-1586966191.jpg\"\n\nasync def download(url, filename):\n    response = await pyfetch(url)\n    if response.status == 200:\n        with open(filename, \"wb\") as f:\n            f.write(await response.bytes())\n\nawait download(filename, \"./dog.jpg\")", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "source": "# Read image \nimg = Image.open('./dog.jpg','r') \n \n# Output Images \nimg.show()", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "# Data Analysis\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "In this section, you will learn how to approach data acquisition in various ways and obtain necessary insights from a dataset. By the end of this lab, you will successfully load the data into Jupyter Notebook and gain some fundamental insights via the Pandas Library.\n\nIn our case, the **Diabetes Dataset** is an online source and it is in CSV (comma-separated values) format. Let's use this dataset as an example to practice data reading.\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "## About this Dataset\n\n**Context:** This dataset is originally from the **National Institute of Diabetes and Digestive and Kidney Diseases**. 
The objective of the dataset is to diagnostically predict whether or not a patient has diabetes, based on certain diagnostic measurements included in the dataset. Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 years of age of Pima Indian heritage.\n\n**Content:** The dataset consists of several medical predictor variables and one target variable, Outcome. Predictor variables include the number of pregnancies the patient has had, their BMI, insulin level, age, and so on.\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "We have 768 rows and 9 columns. The first 8 columns represent the features and the last column represents the target/label.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Import pandas library\nimport pandas as pd", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "source": "filename = \"https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%205/data/diabetes.csv\"\n\nasync def download(url, filename):\n    response = await pyfetch(url)\n    if response.status == 200:\n        with open(filename, \"wb\") as f:\n            f.write(await response.bytes())\n\nawait download(filename, \"diabetes.csv\")\ndf = pd.read_csv(\"diabetes.csv\")", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "After reading the dataset, we can use the **dataframe.head(n)** method to check the top n rows of the dataframe, where n is an integer. 
In contrast to **dataframe.head(n)**, **dataframe.tail(n)** will show you the bottom n rows of the dataframe.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# show the first 5 rows using dataframe.head() method\nprint(\"The first 5 rows of the dataframe\") \ndf.head(5)", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "To view the dimensions of the dataframe, we use the **`.shape`** attribute.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "df.shape", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "# Statistical Overview of dataset\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "df.info()", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "This method prints information about a DataFrame including the index dtype and columns, non-null values and memory usage.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "df.describe()", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "Pandas **describe()** is used to view some basic statistical details like percentile, mean, standard deviation, etc. of a data frame or a series of numeric values. When this method is applied to a series of strings, it returns a different output.\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "### Identify and handle missing values\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "We use the built-in methods of pandas to identify these missing values. 
There are two methods to detect missing data:\n\n**.isnull()**\n\n**.notnull()**\n\nThe output is a dataframe of boolean values indicating whether each value in the original dataframe is in fact missing data.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "missing_data = df.isnull()\nmissing_data.head(5)", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "\"True\" means the value is missing, while \"False\" means the value is present.\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "

    Count missing values in each column

    \n

    \nUsing a for loop in Python, we can quickly figure out the number of missing values in each column. As mentioned above, \"True\" represents a missing value and \"False\" means the value is present in the dataset. In the body of the for loop, the method \".value_counts()\" counts how many times \"True\" and \"False\" each occur in the column.\n
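As a more compact alternative to the for loop described above, the missing values per column can be counted in one chained call. This is a minimal sketch on a small made-up DataFrame (the name `demo` and its values are illustrative assumptions, not the diabetes data), so it runs on its own:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with two missing entries (not the diabetes dataset)
demo = pd.DataFrame({'Glucose': [148.0, np.nan, 183.0],
                     'BMI': [33.6, 26.6, np.nan]})

# isnull() marks missing cells as True; sum() adds up the True values per column
missing_per_column = demo.isnull().sum()
print(missing_per_column)
```

Applied to the lab's dataframe, the same pattern would be `df.isnull().sum()`.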

    \n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "for column in missing_data.columns.values.tolist():\n    print(column)\n    print(missing_data[column].value_counts())\n    print(\"\")", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "As you can see above, there are no missing values in the dataset.\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "

    Correct data format

    \n\n

    Check that all data is in the correct format (int, float, text or other).

    \n\nIn Pandas, we use\n\n

    .dtypes to check the data type

    \n

    .astype() to change the data type
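A minimal sketch of both calls on a made-up DataFrame (the name `demo` and its `Age` column are illustrative assumptions). Note that `dtypes` is an attribute, so it takes no parentheses, while `astype()` returns a converted copy that has to be assigned back:

```python
import pandas as pd

# Hypothetical column stored as text instead of numbers
demo = pd.DataFrame({'Age': ['50', '31', '32']})
print(demo.dtypes)   # the Age column holds strings at this point

# astype() returns a new object; assign it back to keep the change
demo['Age'] = demo['Age'].astype('int64')
print(demo.dtypes)   # the Age column is now int64
```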

    \n\nNumerical variables should have type **'float'** or **'int'**.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "df.dtypes", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "As we can see above, all columns have the correct data type.\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "# Visualization\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "**Visualization** is one of the best ways to get insights from the dataset. **Seaborn** and **Matplotlib** are two of Python's most powerful visualization libraries.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# import libraries\nimport matplotlib.pyplot as plt\nimport seaborn as sns", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "source": "labels= 'Not Diabetic','Diabetic'\nplt.pie(df['Outcome'].value_counts(),labels=labels,autopct='%0.02f%%')\nplt.legend()\nplt.show()", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "As you can see above, 65.10% of the females are not diabetic and 34.90% are diabetic.\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "# Thank you for completing this Notebook\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "\n", + "metadata": {} + } + ] +} \ No newline at end of file diff --git a/PY0101EN-5-2-Numpy2D.ipynb b/PY0101EN-5-2-Numpy2D.ipynb new file mode 100644 index 0000000..d8647f5 --- /dev/null +++ b/PY0101EN-5-2-Numpy2D.ipynb @@ -0,0 +1,481 @@ +{ + "metadata": { + "kernelspec": { + "display_name": "Python", + "language": "python", + "name": "conda-env-python-py" + }, + "language_info": { + "name": "" + } + }, + "nbformat_minor": 4, + "nbformat": 4, + "cells": [ + { + "cell_type": "markdown", + "source": "

    \n \n \"Skills\n \n

    \n\n\n# 2D Numpy in Python\n\n\nEstimated time needed: **30** minutes\n \n\n## Objectives\n\nAfter completing this lab you will be able to:\n\n* Operate comfortably with `numpy`\n* Perform complex operations with `numpy`\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "

    Table of Contents

    \n\n\n
    \n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "## Create a 2D Numpy Array\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Import the libraries\n\nimport numpy as np", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "Consider the list a, which contains three nested lists **each of equal size**. \n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Create a list\n\na = [[11, 12, 13], [21, 22, 23], [31, 32, 33]]\na", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "We can cast the list to a Numpy Array as follows:\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Convert list to Numpy Array\n# Every element is the same type\n\nA = np.array(a)\nA", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "We can use the attribute ndim to obtain the number of axes or dimensions, referred to as the rank. \n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Show the numpy array dimensions\n\nA.ndim", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "Attribute shape returns a tuple corresponding to the size or number of each dimension.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Show the numpy array shape\n\nA.shape", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "The total number of elements in the array is given by the attribute size.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Show the numpy array size\n\nA.size", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "
    \n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "## Accessing different elements of a Numpy Array\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "We can use rectangular brackets to access the different elements of the array. The correspondence between the rectangular brackets and the list and the rectangular representation is shown in the following figure for a 3x3 array: \n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "We can access the 2nd-row, 3rd column as shown in the following figure:\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": " We simply use the square brackets and the indices corresponding to the element we would like:\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Access the element on the second row and third column\n\nA[1, 2]", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": " We can also use the following notation to obtain the elements: \n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Access the element on the second row and third column\n\nA[1][2]", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": " Consider the elements shown in the following figure \n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "We can access the element as follows: \n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Access the element on the first row and first column\n\nA[0][0]", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "We can also use slicing in numpy arrays. Consider the following figure. 
We would like to obtain the first two columns in the first row\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": " This can be done with the following syntax: \n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Access the element on the first row and first and second columns\n\nA[0][0:2]", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "Similarly, we can obtain the first two rows of the 3rd column as follows:\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Access the element on the first and second rows and third column\n\nA[0:2, 2]", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "Corresponding to the following figure: \n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "
    \n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "## Basic Operations\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "We can also add arrays. The process is identical to matrix addition. Matrix addition of X and Y is shown in the following figure:\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "The numpy arrays are given by X and Y:\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Create a numpy array X\n\nX = np.array([[1, 0], [0, 1]]) \nX", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "source": "# Create a numpy array Y\n\nY = np.array([[2, 1], [1, 2]]) \nY", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": " We can add the numpy arrays as follows.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Add X and Y\n\nZ = X + Y\nZ", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "Multiplying a numpy array by a scalar is identical to multiplying a matrix by a scalar. 
If we multiply the matrix Y by the scalar 2, we simply multiply every element in the matrix by 2, as shown in the figure.\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "We can perform the same operation in numpy as follows:\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Create a numpy array Y\n\nY = np.array([[2, 1], [1, 2]]) \nY", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "source": "# Multiply Y with 2\n\nZ = 2 * Y\nZ", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "Multiplication of two arrays corresponds to an element-wise product or Hadamard product. Consider matrices X and Y. The Hadamard product corresponds to multiplying each of the elements in the same position, i.e. multiplying elements contained in the same color boxes together. The result is a new matrix that is the same size as matrix Y or X, as shown in the following figure.\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "We can perform the element-wise product of the arrays X and Y as follows:\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Create a numpy array Y\n\nY = np.array([[2, 1], [1, 2]]) \nY", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "source": "# Create a numpy array X\n\nX = np.array([[1, 0], [0, 1]]) \nX", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "source": "# Multiply X with Y\n\nZ = X * Y\nZ", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "We can also perform matrix multiplication with the numpy arrays A and B as follows:\n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": 
"First, we define matrix A and B:\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Create a matrix A\n\nA = np.array([[0, 1, 1], [1, 0, 1]])\nA", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "source": "# Create a matrix B\n\nB = np.array([[1, 1], [1, 1], [-1, 1]])\nB", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "We use the numpy function dot to multiply the arrays together.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Calculate the dot product\n\nZ = np.dot(A,B)\nZ", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "source": "# Calculate the sine of Z\n\nnp.sin(Z)", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "We use the numpy attribute T to calculate the transposed matrix\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Create a matrix C\n\nC = np.array([[1,1],[2,2],[3,3]])\nC", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "source": "# Get the transposed of C\n\nC.T", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "

    Quiz on 2D Numpy Array

    \n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "Consider the following list a, convert it to Numpy Array. \n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Write your code below and press Shift+Enter to execute\n\na = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "
    Click here for the solution\n\n```python\nA = np.array(a)\nA\n```\n\n
    \n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "Calculate the numpy array size.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Write your code below and press Shift+Enter to execute\n", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "
    Click here for the solution\n\n```python\nA.size\n```\n\n
    \n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "Access the element on the first row and first and second columns.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Write your code below and press Shift+Enter to execute\n", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "
    Click here for the solution\n\n```python\nA[0][0:2]\n```\n\n
    \n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "Perform matrix multiplication with the numpy arrays A and B.\n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "# Write your code below and press Shift+Enter to execute\n\nB = np.array([[0, 1], [1, 0], [1, 1], [-1, 0]])", + "metadata": {}, + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "source": "
    Click here for the solution\n\n```python\nX = np.dot(A,B)\nX\n```\n\n
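Matrix multiplication only works when the number of columns of the first array equals the number of rows of the second. As a sketch using the quiz arrays, the shapes can be checked before multiplying; the `@` operator is used here as an equivalent of `np.dot` for 2D arrays:

```python
import numpy as np

A = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])   # shape (3, 4)
B = np.array([[0, 1], [1, 0], [1, 1], [-1, 0]])               # shape (4, 2)

# Inner dimensions must match: A has 4 columns, B has 4 rows
assert A.shape[1] == B.shape[0]

X = A @ B          # same result as np.dot(A, B) for 2D arrays
print(X.shape)     # (3, 2)
```

The result has as many rows as A and as many columns as B.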
    \n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "
    \n

    The last exercise!

    \n

    Congratulations, you have completed your first lesson and hands-on lab in Python. \n


    \n", + "metadata": {} + }, + { + "cell_type": "markdown", + "source": "## Author\n\nJoseph Santarcangelo\n\n\n## Other contributors\n\nMavis Zhou\n\n\n## Change Log\n\n\n| Date (YYYY-MM-DD) | Version | Changed By | Change Description |\n|---|---|---|---|\n| 2023-11-02 | 2.2 | Abhishek Gagneja | Updated instructions |\n| 2022-01-10 | 2.1 | Malika | Removed the readme for GitShare|\n| 2021-01-05 | 2.2 | Malika | Updated the solution for dot multiplication |\n| 2020-09-09 | 2.1 | Malika | Updated the screenshot for first two rows of the 3rd column |\n| 2020-08-26 | 2.0 | Lavanya | Moved lab to course repo in GitLab |\n\n\n\n
    \n\n##

    © IBM Corporation 2023. All rights reserved.

    \n", + "metadata": {} + }, + { + "cell_type": "code", + "source": "", + "metadata": {}, + "outputs": [], + "execution_count": null + } + ] +} \ No newline at end of file diff --git a/Python Cheat Sheet - The Basics Coursera.pdf b/Python Cheat Sheet - The Basics Coursera.pdf new file mode 100644 index 0000000..0c80f28 Binary files /dev/null and b/Python Cheat Sheet - The Basics Coursera.pdf differ diff --git a/Reading_ Beginner's Guide to NumPy _ Coursera.pdf b/Reading_ Beginner's Guide to NumPy _ Coursera.pdf new file mode 100644 index 0000000..1850992 Binary files /dev/null and b/Reading_ Beginner's Guide to NumPy _ Coursera.pdf differ diff --git a/Reading_ Web Scraping - A Key Tool in Data Science _ Coursera.pdf b/Reading_ Web Scraping - A Key Tool in Data Science _ Coursera.pdf new file mode 100644 index 0000000..ab741f5 Binary files /dev/null and b/Reading_ Web Scraping - A Key Tool in Data Science _ Coursera.pdf differ diff --git a/download.pdf b/download.pdf new file mode 100644 index 0000000..4951aa3 Binary files /dev/null and b/download.pdf differ