From 9dc713790bbea1ec0f31d814aae6a05b328629a4 Mon Sep 17 00:00:00 2001 From: evill013 Date: Fri, 23 Aug 2019 23:39:20 -0400 Subject: [PATCH 01/38] First line changed --- module-1/lab-resolving-git-conflicts/your-code/about-me.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/module-1/lab-resolving-git-conflicts/your-code/about-me.md b/module-1/lab-resolving-git-conflicts/your-code/about-me.md index 30a999d50..4b9a1db91 100644 --- a/module-1/lab-resolving-git-conflicts/your-code/about-me.md +++ b/module-1/lab-resolving-git-conflicts/your-code/about-me.md @@ -1,4 +1,4 @@ -Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quisque viverra laoreet lorem et dapibus. Integer auctor dignissim egestas. Ut id purus neque. Pellentesque imperdiet lacus in libero laoreet, at tempus felis tristique. Cras fermentum erat a dui vulputate gravida. Nulla aliquet nisi interdum nulla pretium, ac vestibulum diam congue. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Phasellus lacus risus, sodales vitae viverra quis, maximus ac ipsum. Sed consequat viverra mattis. Curabitur iaculis varius mollis. +Line of text changed. consectetur adipiscing elit. Quisque viverra laoreet lorem et dapibus. Integer auctor dignissim egestas. Ut id purus neque. Pellentesque imperdiet lacus in libero laoreet, at tempus felis tristique. Cras fermentum erat a dui vulputate gravida. Nulla aliquet nisi interdum nulla pretium, ac vestibulum diam congue. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Phasellus lacus risus, sodales vitae viverra quis, maximus ac ipsum. Sed consequat viverra mattis. Curabitur iaculis varius mollis. Ut porttitor iaculis tellus bibendum euismod. Morbi porta, ante nec tempus porta, felis mi faucibus lacus, sed tristique purus nunc sed est. Aenean pulvinar urna ut lacus interdum aliquam. Pellentesque sit amet magna accumsan, sagittis metus a, volutpat velit. 
Mauris vitae ex vehicula, posuere nisi sed, sagittis nunc. Ut scelerisque, mi non tristique tristique, mi enim luctus nunc, eu mattis sem quam auctor nunc. Donec lobortis tellus eget blandit ultricies. Vivamus euismod metus eget leo blandit, at malesuada magna efficitur. Praesent sodales faucibus mi, ullamcorper ultrices orci. Vivamus maximus malesuada massa, nec placerat leo feugiat vel. Nam vitae eleifend enim. Nullam interdum ipsum velit, vitae faucibus lectus blandit euismod. From ca319d08712da727ae393004bb46fac2125d3fc8 Mon Sep 17 00:00:00 2001 From: evill013 Date: Fri, 23 Aug 2019 23:47:35 -0400 Subject: [PATCH 02/38] My info added --- .../lab-resolving-git-conflicts/your-code/about-me.md | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/module-1/lab-resolving-git-conflicts/your-code/about-me.md b/module-1/lab-resolving-git-conflicts/your-code/about-me.md index 4b9a1db91..5e51c0667 100644 --- a/module-1/lab-resolving-git-conflicts/your-code/about-me.md +++ b/module-1/lab-resolving-git-conflicts/your-code/about-me.md @@ -1,7 +1,6 @@ -Line of text changed. consectetur adipiscing elit. Quisque viverra laoreet lorem et dapibus. Integer auctor dignissim egestas. Ut id purus neque. Pellentesque imperdiet lacus in libero laoreet, at tempus felis tristique. Cras fermentum erat a dui vulputate gravida. Nulla aliquet nisi interdum nulla pretium, ac vestibulum diam congue. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Phasellus lacus risus, sodales vitae viverra quis, maximus ac ipsum. Sed consequat viverra mattis. Curabitur iaculis varius mollis. +This is a short intro of myself: +I am from Venezuela, I graduated on 2016 form FIU with an IT bachelor´s degree. Currently, I have a system application analyst job at HBO Latinamerica on the Applications Support Team. -Ut porttitor iaculis tellus bibendum euismod. 
Morbi porta, ante nec tempus porta, felis mi faucibus lacus, sed tristique purus nunc sed est. Aenean pulvinar urna ut lacus interdum aliquam. Pellentesque sit amet magna accumsan, sagittis metus a, volutpat velit. Mauris vitae ex vehicula, posuere nisi sed, sagittis nunc. Ut scelerisque, mi non tristique tristique, mi enim luctus nunc, eu mattis sem quam auctor nunc. Donec lobortis tellus eget blandit ultricies. Vivamus euismod metus eget leo blandit, at malesuada magna efficitur. Praesent sodales faucibus mi, ullamcorper ultrices orci. Vivamus maximus malesuada massa, nec placerat leo feugiat vel. Nam vitae eleifend enim. Nullam interdum ipsum velit, vitae faucibus lectus blandit euismod. +I decided to join Ironhack to obtain more skills that will allow me to move to a different area on the tech industry. -Suspendisse ut malesuada ex. Nulla ultricies nisl et nisi rhoncus sollicitudin. Vestibulum maximus iaculis ligula, nec commodo nunc ullamcorper nec. Duis quis condimentum sapien. Cras vestibulum interdum felis eu auctor. Quisque semper, magna at dapibus faucibus, felis risus semper ligula, id aliquam lectus ligula vel nisi. In hac habitasse platea dictumst. Donec arcu sapien, suscipit ac dictum et, imperdiet id tortor. Maecenas ornare sodales interdum. Mauris dictum felis eu eros vestibulum cursus. Phasellus accumsan, turpis ut malesuada sollicitudin, augue leo venenatis ante, vel convallis tellus diam sit amet lacus. Aenean eu mauris eros. Praesent ante lacus, gravida sit amet tellus nec, laoreet ultrices lacus. Integer commodo semper vestibulum. Fusce felis massa, consectetur facilisis rutrum nec, pulvinar et nisi. - -Morbi fermentum ultricies tortor, vehicula ultrices eros elementum a. Duis ornare aliquam facilisis. Proin aliquam tincidunt odio vitae dignissim. Sed malesuada lacinia massa, nec blandit urna auctor elementum. Duis auctor non tortor in consequat. Mauris id vestibulum risus. In eget erat sed lacus efficitur viverra sed eu est. 
Aliquam interdum consequat molestie. Aliquam metus nisi, blandit non semper ut, blandit vel leo. Cras dictum turpis erat, sed iaculis ligula facilisis dapibus. Aliquam posuere dignissim fermentum. Praesent at neque sit amet lectus ornare iaculis. Curabitur id urna quis lorem varius ultrices eu sit amet sapien. Curabitur maximus volutpat suscipit. Proin imperdiet elementum lacus a eleifend. Sed tempor lacus posuere diam vehicula iaculis. +By the end of the course I would like get a job on the healthcare industry as a Data Analyst. My main goal is to participate in a project that will provide a mean for a company to find gaps and area of growth opportunity for the compan From 7f8c4ed9d355dfcb4f550c5596ea794c9864d206 Mon Sep 17 00:00:00 2001 From: evill013 <53126587+evill013@users.noreply.github.com> Date: Fri, 23 Aug 2019 23:57:37 -0400 Subject: [PATCH 03/38] Update about-me.md --- module-1/lab-resolving-git-conflicts/your-code/about-me.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/module-1/lab-resolving-git-conflicts/your-code/about-me.md b/module-1/lab-resolving-git-conflicts/your-code/about-me.md index 5e51c0667..cf64e245f 100644 --- a/module-1/lab-resolving-git-conflicts/your-code/about-me.md +++ b/module-1/lab-resolving-git-conflicts/your-code/about-me.md @@ -3,4 +3,4 @@ I am from Venezuela, I graduated on 2016 form FIU with an IT bachelor´s degree. I decided to join Ironhack to obtain more skills that will allow me to move to a different area on the tech industry. -By the end of the course I would like get a job on the healthcare industry as a Data Analyst. My main goal is to participate in a project that will provide a mean for a company to find gaps and area of growth opportunity for the compan +By the end of the course I would like get a job on the healthcare industry as a Data Analyst. My main goal is to participate in a project that will provide means for the company to take better desicions based on data analysis. 
From 1974a934aeb72e9a3db6cde04920d44f03a995c8 Mon Sep 17 00:00:00 2001 From: evill013 <53126587+evill013@users.noreply.github.com> Date: Sun, 25 Aug 2019 17:14:17 -0400 Subject: [PATCH 04/38] Work_In_Progress --- .../your-code/main.ipynb | 478 ++++++++++++++++-- 1 file changed, 442 insertions(+), 36 deletions(-) diff --git a/module-1/lab-list-comprehensions/your-code/main.ipynb b/module-1/lab-list-comprehensions/your-code/main.ipynb index c5931c41f..02922ca2c 100644 --- a/module-1/lab-list-comprehensions/your-code/main.ipynb +++ b/module-1/lab-list-comprehensions/your-code/main.ipynb @@ -11,7 +11,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 38, "metadata": {}, "outputs": [], "source": [ @@ -29,10 +29,21 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 39, "metadata": {}, - "outputs": [], - "source": [] + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50]\n" + ] + } + ], + "source": [ + "lst= [x for x in range(1,51)]\n", + "print(lst)" + ] }, { "cell_type": "markdown", @@ -43,10 +54,21 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 40, "metadata": {}, - "outputs": [], - "source": [] + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200]\n" + ] + } + ], + "source": [ + "even_lst=[x for x in 
range(2,201,2)]\n", + "print (even_lst)" + ] }, { "cell_type": "markdown", @@ -57,7 +79,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 41, "metadata": {}, "outputs": [], "source": [ @@ -70,15 +92,71 @@ " [0.71725408, 0.87702738, 0.31244595, 0.76615487],\n", " [0.20754036, 0.57871812, 0.07214068, 0.40356048],\n", " [0.12149553, 0.53222417, 0.9976855 , 0.12536346],\n", - " [0.80930099, 0.50962849, 0.94555126, 0.33364763]])" + " [0.80930099, 0.50962849, 0.94555126, 0.33364763]])\n" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 42, "metadata": {}, - "outputs": [], - "source": [] + "outputs": [ + { + "data": { + "text/plain": [ + "[0.84062117,\n", + " 0.48006452,\n", + " 0.7876326,\n", + " 0.77109654,\n", + " 0.44409793,\n", + " 0.09014516,\n", + " 0.81835917,\n", + " 0.87645456,\n", + " 0.7066597,\n", + " 0.09610873,\n", + " 0.41247947,\n", + " 0.57433389,\n", + " 0.29960807,\n", + " 0.42315023,\n", + " 0.34452557,\n", + " 0.4751035,\n", + " 0.17003563,\n", + " 0.46843998,\n", + " 0.92796258,\n", + " 0.69814654,\n", + " 0.41290051,\n", + " 0.19561071,\n", + " 0.16284783,\n", + " 0.97016248,\n", + " 0.71725408,\n", + " 0.87702738,\n", + " 0.31244595,\n", + " 0.76615487,\n", + " 0.20754036,\n", + " 0.57871812,\n", + " 0.07214068,\n", + " 0.40356048,\n", + " 0.12149553,\n", + " 0.53222417,\n", + " 0.9976855,\n", + " 0.12536346,\n", + " 0.80930099,\n", + " 0.50962849,\n", + " 0.94555126,\n", + " 0.33364763,\n", + " 0.33364763]" + ] + }, + "execution_count": 42, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nump_lst=[val for sublist in a for val in sublist]\n", + "nump_lst.append(val)\n", + "\n", + "nump_lst" + ] }, { "cell_type": "markdown", @@ -89,10 +167,45 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 43, "metadata": {}, - "outputs": [], - "source": [] + "outputs": [ + { + "data": { + "text/plain": [ + "[0.84062117,\n", + " 0.7876326,\n", + " 
0.77109654,\n", + " 0.81835917,\n", + " 0.87645456,\n", + " 0.7066597,\n", + " 0.57433389,\n", + " 0.92796258,\n", + " 0.69814654,\n", + " 0.97016248,\n", + " 0.71725408,\n", + " 0.87702738,\n", + " 0.76615487,\n", + " 0.57871812,\n", + " 0.53222417,\n", + " 0.9976855,\n", + " 0.80930099,\n", + " 0.50962849,\n", + " 0.94555126,\n", + " 0.33364763]" + ] + }, + "execution_count": 43, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nump_list=[val for sublist in a for val in sublist if val >= 0.5]\n", + "nump_list.append(val)\n", + "\n", + "nump_list" + ] }, { "cell_type": "markdown", @@ -103,9 +216,17 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 44, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[0.55867166, 0.06210792, 0.08147297, 0.82579068, 0.91512478, 0.06833034, 0.05440634, 0.65857693, 0.30296619, 0.06769833, 0.96031863, 0.51293743, 0.09143215, 0.71893382, 0.45850679, 0.58256464, 0.59005654, 0.56266457, 0.71600294, 0.87392666, 0.11434044, 0.8694668, 0.65669313, 0.10708681, 0.07529684, 0.46470767, 0.47984544, 0.65368638, 0.14901286, 0.23760688, 0.33364763]\n" + ] + } + ], "source": [ "b = np.array([[[0.55867166, 0.06210792, 0.08147297],\n", " [0.82579068, 0.91512478, 0.06833034]],\n", @@ -120,16 +241,15 @@ " [0.8694668 , 0.65669313, 0.10708681]],\n", "\n", " [[0.07529684, 0.46470767, 0.47984544],\n", - " [0.65368638, 0.14901286, 0.23760688]]])" + " [0.65368638, 0.14901286, 0.23760688]]])\n", + "\n", + "new_nplst=[val for lst in b for sublist in lst for val in sublist]\n", + "\n", + "new_nplst.append(val)\n", + "\n", + "print(new_nplst)" ] }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, { "cell_type": "markdown", "metadata": {}, @@ -139,10 +259,24 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 118, "metadata": {}, - "outputs": [], - "source": [] 
+ "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[0.08147297, 0.06833034, 0.30296619, 0.45850679, 0.11434044, 0.10708681, 0.47984544, 0.23760688, 0.33364763]\n" + ] + } + ], + "source": [ + "new_numplst=[sublist[-1] for lst in b for sublist in lst if sublist[-1] <= 0.5]\n", + "\n", + "new_numplst.append(sublist[-1])\n", + "\n", + "print(new_numplst)" + ] }, { "cell_type": "markdown", @@ -153,10 +287,219 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] + "execution_count": 147, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "['sample_file_0.csv', 'sample_file_1.csv', 'sample_file_2.csv', 'sample_file_3.csv', 'sample_file_4.csv', 'sample_file_5.csv', 'sample_file_6.csv', 'sample_file_7.csv', 'sample_file_8.csv', 'sample_file_9.csv', 'test.csv', 'sample_file_9.txt']\n" + ] + }, + { + "data": { + "text/html": [ + "
" + ], + "text/plain": [ + " 0 1 2 3 4 5 6 \\\n", + "0 0.215190 0.155352 0.160848 0.807736 0.363587 0.899832 0.146754 \n", + "1 0.895544 0.955196 0.089925 0.827555 0.089071 0.642883 0.996052 \n", + "2 0.413752 0.693052 0.789796 0.929164 0.536191 0.439769 0.773474 \n", + "3 0.728001 0.348156 0.935787 0.851163 0.444573 0.715080 0.988408 \n", + "4 0.134942 0.875931 0.273505 0.207588 0.080696 0.717396 0.033930 \n", + "\n", + " 7 8 9 10 11 12 13 \\\n", + "0 0.094802 0.705133 0.882762 0.773320 0.687745 0.016789 0.340725 \n", + "1 0.879020 0.421837 0.412141 0.858513 0.217091 0.176157 0.551236 \n", + "2 0.982074 0.876955 0.633154 0.279005 0.483317 0.908288 0.756172 \n", + "3 0.210332 0.732133 0.892383 0.216893 0.367595 0.846208 0.240111 \n", + "4 0.646837 0.888722 0.922742 0.176593 0.861333 0.389451 0.695244 \n", + "\n", + " 14 15 16 17 18 19 \n", + "0 0.984182 0.985461 0.412044 0.867894 0.113432 0.349845 \n", + "1 0.834378 0.419535 0.041431 0.602258 0.984628 0.516899 \n", + "2 0.462130 0.289892 0.145233 0.076819 0.797836 0.197592 \n", + "3 0.471880 0.399721 0.758196 0.665568 0.931542 0.448124 \n", + "4 0.129955 0.364114 0.428224 0.365442 0.847818 0.588319 " + ] + }, + "execution_count": 147, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "files = os.listdir(\"../data\")\n", + "\n", + "file_lst=[f for f in files if f.endswith(\".csv\")]\n", + "file_lst.append(f)\n", + "\n", + "print(file_lst)\n", + "\n", + "pd.read_csv(\"../data/\"+ f)" + ] }, { "cell_type": "markdown", @@ -167,10 +510,73 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 148, "metadata": {}, - "outputs": [], - "source": [] + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/usr/local/lib/python3.7/site-packages/ipykernel_launcher.py:1: FutureWarning: Sorting because non-concatenation axis is not aligned. 
A future version\n", + "of pandas will change to not sort by default.\n", + "\n", + "To accept the future behavior, pass 'sort=False'.\n", + "\n", + "To retain the current behavior and silence the warning, pass 'sort=True'.\n", + "\n", + " \"\"\"Entry point for launching an IPython kernel.\n" + ] + }, + { + "data": { + "text/html": [ + "
" + ], + "text/plain": [ + " 0\n", + "0 ..." + ] + }, + "execution_count": 148, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data_frame=[pd.concat([pd.read_csv(\"../data/\"+ f) for f in file_lst ])]\n", + "df = pd.DataFrame(data_frame) \n", + "#df.to_csv(r'../data/test.csv')\n", + "\n", + "df.head(10)" + ] }, { "cell_type": "markdown", @@ -231,7 +637,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.7.0" + "version": "3.7.4" } }, "nbformat": 4, From a199ba7bb1bca3557dbb15e9f524045999418585 Mon Sep 17 00:00:00 2001 From: evill013 <53126587+evill013@users.noreply.github.com> Date: Sat, 7 Sep 2019 15:44:44 -0400 Subject: [PATCH 05/38] SOLVED --- .../your-code/my-code-python-project.ipynb | 394 ++++++++++++++++++ 1 file changed, 394 insertions(+) create mode 100644 module-1/python-project/your-code/my-code-python-project.ipynb diff --git a/module-1/python-project/your-code/my-code-python-project.ipynb b/module-1/python-project/your-code/my-code-python-project.ipynb new file mode 100644 index 000000000..9da6aab92 --- /dev/null +++ b/module-1/python-project/your-code/my-code-python-project.ipynb @@ -0,0 +1,394 @@ +{ + "cells": [ + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [], + "source": [ + "# define rooms and items\n", + "\n", + "couch = {\n", + " \"name\": \"couch\",\n", + " \"type\": \"furniture\",\n", + "}\n", + "\n", + "door_a = {\n", + " \"name\": \"door a\",\n", + " \"type\": \"door\",\n", + "}\n", + "\n", + "key_a = {\n", + " \"name\": \"key for door a\",\n", + " \"type\": \"key\",\n", + " \"target\": door_a,\n", + "}\n", + "\n", + "piano = {\n", + " \"name\": \"piano\",\n", + " \"type\": \"furniture\",\n", + "}\n", + "\n", + "game_room = {\n", + " \"name\": \"game room\",\n", + " \"type\": \"room\",\n", + "}\n", + "\n", + "outside = {\n", + " \"name\": \"outside\"\n", + "}\n", + "\n", + "queen_bed = {\n", + " \"name\": \"queen bed\",\n", + 
" \"type\": \"furniture\",\n", + "}\n", + "\n", + "door_b = {\n", + " \"name\": \"door b\",\n", + " \"type\": \"door\",\n", + "}\n", + "\n", + "door_c = {\n", + " \"name\": \"door c\",\n", + " \"type\": \"door\",\n", + "}\n", + "\n", + "key_b = {\n", + " \"name\": \"key for door b\",\n", + " \"type\": \"key\",\n", + " \"target\": door_b,\n", + "}\n", + "\n", + "bed_room1 = {\n", + " \"name\": \"bedroom 1\",\n", + " \"type\": \"room\",\n", + "}\n", + "\n", + "\n", + "double_bed = {\n", + " \"name\": \"double bed\",\n", + " \"type\": \"furniture\",\n", + "}\n", + "\n", + "dresser = {\n", + " \"name\": \"dresser\",\n", + " \"type\": \"furniture\",\n", + "}\n", + "\n", + "door_d = {\n", + " \"name\": \"door d\",\n", + " \"type\": \"door\",\n", + "}\n", + "key_c = {\n", + " \"name\": \"key for door c\",\n", + " \"type\": \"key\",\n", + " \"target\": door_c,\n", + "}\n", + "key_d = {\n", + " \"name\": \"key for door d\",\n", + " \"type\": \"key\",\n", + " \"target\": door_d,\n", + "}\n", + "\n", + "bed_room2 = {\n", + " \"name\": \"bedroom 2\",\n", + " \"type\": \"room\",\n", + "}\n", + "\n", + "dinning_table = {\n", + " \"name\": \"dinning table\",\n", + " \"type\": \"furniture\",\n", + "}\n", + "\n", + "living_room = {\n", + " \"name\": \"living room\",\n", + " \"type\": \"room\",\n", + "}\n", + "all_rooms = [game_room,bed_room1,bed_room2,living_room]\n", + "all_doors = [door_a,door_b,door_c,door_d]\n", + "\n", + "# define which items/rooms are related\n", + "\n", + "object_relations = {\n", + " \"game room\": [couch, piano,door_a],\n", + " \"bedroom 1\": [queen_bed,door_a,door_b,door_c],\n", + " \"bedroom 2\": [double_bed,dresser,door_b],\n", + " \"living room\": [dinning_table,door_d],\n", + " \"piano\": [key_a],\n", + " \"queen bed\": [key_b],\n", + " \"double bed\":[key_c],\n", + " \"dresser\":[key_d],\n", + " \"outside\": [[door_a],[door_b],[door_c],[door_d]],\n", + " \"door a\": [game_room,bed_room1],\n", + " \"door b\": [bed_room1,bed_room2],\n", + " \"door c\": 
[bed_room1,living_room],\n", + " \"door d\": [living_room,outside]\n", + "}\n", + "\n", + "# define game state. Do not directly change this dict. \n", + "# Instead, when a new game starts, make a copy of this\n", + "# dict and use the copy to store gameplay state. This \n", + "# way you can replay the game multiple times.\n", + "\n", + "INIT_GAME_STATE = {\n", + " \"current_room\": game_room,\n", + " \"keys_collected\": [],\n", + " \"target_room\": outside\n", + "}" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [], + "source": [ + "def linebreak():\n", + " \"\"\"\n", + " Print a line break\n", + " \"\"\"\n", + " print(\"\\n\\n\")\n", + "\n", + "def start_game():\n", + " \"\"\"\n", + " Start the game\n", + " \"\"\"\n", + " print(\"You wake up on a couch and find yourself in a strange house with no windows which you have never been to before. You don't remember why you are here and what had happened before. You feel some unknown danger is approaching and you must get out of the house, NOW!\")\n", + " play_room(game_state[\"current_room\"])\n", + "\n", + "def play_room(room):\n", + " \"\"\"\n", + " Play a room. First check if the room being played is the target room.\n", + " If it is, the game will end with success. Otherwise, let player either \n", + " explore (list all items in this room) or examine an item found here.\n", + " \"\"\"\n", + " game_state[\"current_room\"] = room\n", + " if(game_state[\"current_room\"] == game_state[\"target_room\"]):\n", + " print(\"Congrats! You escaped the room!\")\n", + " else:\n", + " print(\"You are now in \" + room[\"name\"])\n", + " intended_action = input(\"What would you like to do? 
Type 'explore' or 'examine'?\").strip()\n", + " if intended_action == \"explore\":\n", + " explore_room(room)\n", + " play_room(room)\n", + " elif intended_action == \"examine\":\n", + " examine_item(input(\"What would you like to examine?\").strip())\n", + " else:\n", + " print(\"Not sure what you mean. Type 'explore' or 'examine'.\")\n", + " play_room(room)\n", + " linebreak()\n", + "\n", + "def explore_room(room):\n", + " \"\"\"\n", + " Explore a room. List all items belonging to this room.\n", + " \"\"\"\n", + " items = [i[\"name\"] for i in object_relations[room[\"name\"]]]\n", + " print(\"You explore the room. This is \" + room[\"name\"] + \". You find \" + \", \".join(items))\n", + "\n", + "def get_next_room_of_door(door, current_room):\n", + " \"\"\"\n", + " From object_relations, find the two rooms connected to the given door.\n", + " Return the room that is not the current_room.\n", + " \"\"\"\n", + " connected_rooms = object_relations[door[\"name\"]]\n", + " for room in connected_rooms:\n", + " if(not current_room == room):\n", + " return room\n", + "\n", + "def examine_item(item_name):\n", + " \"\"\"\n", + " Examine an item which can be a door or furniture.\n", + " First make sure the intended item belongs to the current room.\n", + " Then check if the item is a door. Tell player if key hasn't been \n", + " collected yet. Otherwise ask player if they want to go to the next\n", + " room. If the item is not a door, then check if it contains keys.\n", + " Collect the key if found and update the game state. At the end,\n", + " play either the current or the next room depending on the game state\n", + " to keep playing.\n", + " \"\"\"\n", + " current_room = game_state[\"current_room\"]\n", + " next_room = \"\"\n", + " output = None\n", + " \n", + " for item in object_relations[current_room[\"name\"]]:\n", + " if(item[\"name\"] == item_name):\n", + " output = \"You examine \" + item_name + \". 
\"\n", + " if(item[\"type\"] == \"door\"):\n", + " have_key = False\n", + " for key in game_state[\"keys_collected\"]:\n", + " if(key[\"target\"] == item):\n", + " have_key = True\n", + " if(have_key):\n", + " output += \"You unlock it with a key you have.\"\n", + " next_room = get_next_room_of_door(item, current_room)\n", + " else:\n", + " output += \"It is locked but you don't have the key.\"\n", + " else:\n", + " if(item[\"name\"] in object_relations and len(object_relations[item[\"name\"]])>0):\n", + " item_found = object_relations[item[\"name\"]].pop()\n", + " game_state[\"keys_collected\"].append(item_found)\n", + " output += \"You find \" + item_found[\"name\"] + \".\"\n", + " else:\n", + " output += \"There isn't anything interesting about it.\"\n", + " print(output)\n", + " break\n", + "\n", + " if(output is None):\n", + " print(\"The item you requested is not found in the current room.\")\n", + " \n", + " if(next_room and input(\"Do you want to go to the next room? Ener 'yes' or 'no'\").strip() == 'yes'):\n", + " play_room(next_room)\n", + " else:\n", + " play_room(current_room)" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "You wake up on a couch and find yourself in a strange house with no windows which you have never been to before. You don't remember why you are here and what had happened before. You feel some unknown danger is approaching and you must get out of the house, NOW!\n", + "You are now in game room\n", + "What would you like to do? Type 'explore' or 'examine'?examine\n", + "What would you like to examine?piano\n", + "You examine piano. You find key for door a.\n", + "You are now in game room\n", + "What would you like to do? Type 'explore' or 'examine'?examine\n", + "What would you like to examine?door a\n", + "You examine door a. You unlock it with a key you have.\n", + "Do you want to go to the next room? 
Ener 'yes' or 'no'yes\n", + "You are now in bedroom 1\n", + "What would you like to do? Type 'explore' or 'examine'?explore\n", + "You explore the room. This is bedroom 1. You find queen bed, door a, door b, door c\n", + "You are now in bedroom 1\n", + "What would you like to do? Type 'explore' or 'examine'?examine\n", + "What would you like to examine?queen bed\n", + "You examine queen bed. You find key for door b.\n", + "You are now in bedroom 1\n", + "What would you like to do? Type 'explore' or 'examine'?examine\n", + "What would you like to examine?door b\n", + "You examine door b. You unlock it with a key you have.\n", + "Do you want to go to the next room? Ener 'yes' or 'no'yes\n", + "You are now in bedroom 2\n", + "What would you like to do? Type 'explore' or 'examine'?explore\n", + "You explore the room. This is bedroom 2. You find double bed, dresser, door b\n", + "You are now in bedroom 2\n", + "What would you like to do? Type 'explore' or 'examine'?examine\n", + "What would you like to examine?double bed\n", + "You examine double bed. You find key for door c.\n", + "You are now in bedroom 2\n", + "What would you like to do? Type 'explore' or 'examine'?examine\n", + "What would you like to examine?dresser\n", + "You examine dresser. You find key for door d.\n", + "You are now in bedroom 2\n", + "What would you like to do? Type 'explore' or 'examine'?explore\n", + "You explore the room. This is bedroom 2. You find double bed, dresser, door b\n", + "You are now in bedroom 2\n", + "What would you like to do? Type 'explore' or 'examine'?examine\n", + "What would you like to examine?door b\n", + "You examine door b. You unlock it with a key you have.\n", + "Do you want to go to the next room? Ener 'yes' or 'no'yes\n", + "You are now in bedroom 1\n", + "What would you like to do? Type 'explore' or 'examine'?examine\n", + "What would you like to examine?door c\n", + "You examine door c. 
You unlock it with a key you have.\n", + "Do you want to go to the next room? Ener 'yes' or 'no'yes\n", + "You are now in living room\n", + "What would you like to do? Type 'explore' or 'examine'?examine\n", + "What would you like to examine?door d\n", + "You examine door d. You unlock it with a key you have.\n", + "Do you want to go to the next room? Ener 'yes' or 'no'yes\n", + "Congrats! You escaped the room!\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n" + ] + } + ], + "source": [ + "game_state = INIT_GAME_STATE.copy()\n", + "\n", + "start_game()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.4" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} From 7c77931db3e82b98e9f54a57a5ef93a6d05ac767 Mon Sep 17 00:00:00 2001 From: evill013 <53126587+evill013@users.noreply.github.com> Date: Thu, 26 Sep 2019 12:17:36 -0400 Subject: [PATCH 06/38] SOLVED --- .../your-code/main.ipynb | 606 +++++++++++------- 1 file changed, 391 insertions(+), 215 deletions(-) diff --git a/module-1/lab-list-comprehensions/your-code/main.ipynb b/module-1/lab-list-comprehensions/your-code/main.ipynb index 
02922ca2c..052e09937 100644 --- a/module-1/lab-list-comprehensions/your-code/main.ipynb +++ b/module-1/lab-list-comprehensions/your-code/main.ipynb @@ -11,7 +11,7 @@ }, { "cell_type": "code", - "execution_count": 38, + "execution_count": 1, "metadata": {}, "outputs": [], "source": [ @@ -29,7 +29,7 @@ }, { "cell_type": "code", - "execution_count": 39, + "execution_count": 2, "metadata": {}, "outputs": [ { @@ -54,7 +54,7 @@ }, { "cell_type": "code", - "execution_count": 40, + "execution_count": 3, "metadata": {}, "outputs": [ { @@ -79,7 +79,7 @@ }, { "cell_type": "code", - "execution_count": 41, + "execution_count": 4, "metadata": {}, "outputs": [], "source": [ @@ -97,7 +97,7 @@ }, { "cell_type": "code", - "execution_count": 42, + "execution_count": 5, "metadata": {}, "outputs": [ { @@ -142,18 +142,20 @@ " 0.80930099,\n", " 0.50962849,\n", " 0.94555126,\n", - " 0.33364763,\n", " 0.33364763]" ] }, - "execution_count": 42, + "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "nump_lst=[val for sublist in a for val in sublist]\n", - "nump_lst.append(val)\n", + "nump_lst = []\n", + "\n", + "for x in a:\n", + " for elmt in x:\n", + " nump_lst.append(elmt)\n", "\n", "nump_lst" ] @@ -167,7 +169,7 @@ }, { "cell_type": "code", - "execution_count": 43, + "execution_count": 6, "metadata": {}, "outputs": [ { @@ -191,19 +193,22 @@ " 0.9976855,\n", " 0.80930099,\n", " 0.50962849,\n", - " 0.94555126,\n", - " 0.33364763]" + " 0.94555126]" ] }, - "execution_count": 43, + "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "nump_list=[val for sublist in a for val in sublist if val >= 0.5]\n", - "nump_list.append(val)\n", + "nump_list=[]\n", "\n", + "for sublist in a:\n", + " for i in sublist:\n", + " if i >= 0.5:\n", + " nump_list.append(i)\n", + " \n", "nump_list" ] }, @@ -216,7 +221,7 @@ }, { "cell_type": "code", - "execution_count": 44, + "execution_count": 7, "metadata": {}, "outputs": [ { @@ -243,9 +248,9 
@@ " [[0.07529684, 0.46470767, 0.47984544],\n", " [0.65368638, 0.14901286, 0.23760688]]])\n", "\n", - "new_nplst=[val for lst in b for sublist in lst for val in sublist]\n", + "new_nplst=[i for lst in b for sublist in lst for i in sublist]\n", "\n", - "new_nplst.append(val)\n", + "new_nplst.append(i)\n", "\n", "print(new_nplst)" ] @@ -259,7 +264,7 @@ }, { "cell_type": "code", - "execution_count": 118, + "execution_count": 8, "metadata": {}, "outputs": [ { @@ -287,7 +292,7 @@ }, { "cell_type": "code", - "execution_count": 147, + "execution_count": 9, "metadata": { "scrolled": true }, @@ -296,9 +301,33 @@ "name": "stdout", "output_type": "stream", "text": [ - "['sample_file_0.csv', 'sample_file_1.csv', 'sample_file_2.csv', 'sample_file_3.csv', 'sample_file_4.csv', 'sample_file_5.csv', 'sample_file_6.csv', 'sample_file_7.csv', 'sample_file_8.csv', 'sample_file_9.csv', 'test.csv', 'sample_file_9.txt']\n" + "['sample_file_0.csv', 'sample_file_1.csv', 'sample_file_2.csv', 'sample_file_3.csv', 'sample_file_4.csv', 'sample_file_5.csv', 'sample_file_6.csv', 'sample_file_7.csv', 'sample_file_8.csv', 'sample_file_9.csv']\n" ] - }, + } + ], + "source": [ + "files = os.listdir(\"../data\")\n", + "\n", + "file_lst=[]\n", + "for f in files:\n", + " if f.endswith(\".csv\"):\n", + " file_lst.append(f)\n", + "\n", + "print(file_lst)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 7. Use a list comprehension and the Pandas `read_csv` and `concat` methods to read all CSV files in the */data* directory and combine them into a single data frame. Display the top 10 rows of the resulting data frame." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ { "data": { "text/html": [ @@ -345,118 +374,233 @@ " \n", " \n", " 0\n", - " 0.215190\n", - " 0.155352\n", - " 0.160848\n", - " 0.807736\n", - " 0.363587\n", - " 0.899832\n", - " 0.146754\n", - " 0.094802\n", - " 0.705133\n", - " 0.882762\n", - " 0.773320\n", - " 0.687745\n", - " 0.016789\n", - " 0.340725\n", - " 0.984182\n", - " 0.985461\n", - " 0.412044\n", - " 0.867894\n", - " 0.113432\n", - " 0.349845\n", + " 0.734751\n", + " 0.195362\n", + " 0.734309\n", + " 0.598184\n", + " 0.763433\n", + " 0.263434\n", + " 0.868066\n", + " 0.058092\n", + " 0.753502\n", + " 0.587513\n", + " 0.311608\n", + " 0.178356\n", + " 0.182922\n", + " 0.147631\n", + " 0.391188\n", + " 0.816049\n", + " 0.749068\n", + " 0.293260\n", + " 0.937828\n", + " 0.880858\n", + " \n", + " \n", + " 1\n", + " 0.772607\n", + " 0.445391\n", + " 0.249642\n", + " 0.787922\n", + " 0.598583\n", + " 0.827238\n", + " 0.624126\n", + " 0.601524\n", + " 0.688753\n", + " 0.338870\n", + " 0.081595\n", + " 0.471474\n", + " 0.267443\n", + " 0.453351\n", + " 0.800716\n", + " 0.045749\n", + " 0.683793\n", + " 0.389789\n", + " 0.016787\n", + " 0.503695\n", + " \n", + " \n", + " 2\n", + " 0.226428\n", + " 0.268764\n", + " 0.694262\n", + " 0.622335\n", + " 0.063843\n", + " 0.122683\n", + " 0.815625\n", + " 0.584542\n", + " 0.032594\n", + " 0.589775\n", + " 0.764350\n", + " 0.650973\n", + " 0.565705\n", + " 0.691784\n", + " 0.265223\n", + " 0.739031\n", + " 0.560394\n", + " 0.334802\n", + " 0.517694\n", + " 0.646110\n", + " \n", + " \n", + " 3\n", + " 0.362748\n", + " 0.495430\n", + " 0.113876\n", + " 0.594149\n", + " 0.612522\n", + " 0.625204\n", + " 0.864050\n", + " 0.260279\n", + " 0.528873\n", + " 0.168043\n", + " 0.715929\n", + " 0.677014\n", + " 0.175735\n", + " 0.632370\n", + " 0.926715\n", + " 0.085675\n", + " 0.120525\n", + " 0.141746\n", + " 0.771144\n", + " 0.489660\n", + " \n", + " \n", + " 4\n", + " 0.033415\n", + " 
0.340433\n", + " 0.464971\n", + " 0.363737\n", + " 0.025815\n", + " 0.434129\n", + " 0.415163\n", + " 0.892210\n", + " 0.381701\n", + " 0.415264\n", + " 0.790801\n", + " 0.696930\n", + " 0.819751\n", + " 0.944029\n", + " 0.869965\n", + " 0.041723\n", + " 0.819140\n", + " 0.676051\n", + " 0.109349\n", + " 0.872947\n", + " \n", + " \n", + " 0\n", + " 0.276827\n", + " 0.260054\n", + " 0.942397\n", + " 0.113187\n", + " 0.781355\n", + " 0.475740\n", + " 0.152061\n", + " 0.250324\n", + " 0.147078\n", + " 0.162984\n", + " 0.977025\n", + " 0.509619\n", + " 0.593212\n", + " 0.911839\n", + " 0.257645\n", + " 0.386457\n", + " 0.696932\n", + " 0.069162\n", + " 0.952291\n", + " 0.286542\n", " \n", " \n", " 1\n", - " 0.895544\n", - " 0.955196\n", - " 0.089925\n", - " 0.827555\n", - " 0.089071\n", - " 0.642883\n", - " 0.996052\n", - " 0.879020\n", - " 0.421837\n", - " 0.412141\n", - " 0.858513\n", - " 0.217091\n", - " 0.176157\n", - " 0.551236\n", - " 0.834378\n", - " 0.419535\n", - " 0.041431\n", - " 0.602258\n", - " 0.984628\n", - " 0.516899\n", + " 0.995885\n", + " 0.158381\n", + " 0.244274\n", + " 0.962163\n", + " 0.651900\n", + " 0.930665\n", + " 0.577190\n", + " 0.087914\n", + " 0.960261\n", + " 0.580840\n", + " 0.194616\n", + " 0.661459\n", + " 0.674085\n", + " 0.049326\n", + " 0.785803\n", + " 0.315645\n", + " 0.495355\n", + " 0.232135\n", + " 0.549324\n", + " 0.572232\n", " \n", " \n", " 2\n", - " 0.413752\n", - " 0.693052\n", - " 0.789796\n", - " 0.929164\n", - " 0.536191\n", - " 0.439769\n", - " 0.773474\n", - " 0.982074\n", - " 0.876955\n", - " 0.633154\n", - " 0.279005\n", - " 0.483317\n", - " 0.908288\n", - " 0.756172\n", - " 0.462130\n", - " 0.289892\n", - " 0.145233\n", - " 0.076819\n", - " 0.797836\n", - " 0.197592\n", + " 0.641917\n", + " 0.821055\n", + " 0.392437\n", + " 0.782617\n", + " 0.510762\n", + " 0.428320\n", + " 0.017324\n", + " 0.680720\n", + " 0.340412\n", + " 0.462513\n", + " 0.785776\n", + " 0.251949\n", + " 0.032847\n", + " 0.995700\n", + " 
0.816563\n", + " 0.735692\n", + " 0.435998\n", + " 0.430411\n", + " 0.531757\n", + " 0.489528\n", " \n", " \n", " 3\n", - " 0.728001\n", - " 0.348156\n", - " 0.935787\n", - " 0.851163\n", - " 0.444573\n", - " 0.715080\n", - " 0.988408\n", - " 0.210332\n", - " 0.732133\n", - " 0.892383\n", - " 0.216893\n", - " 0.367595\n", - " 0.846208\n", - " 0.240111\n", - " 0.471880\n", - " 0.399721\n", - " 0.758196\n", - " 0.665568\n", - " 0.931542\n", - " 0.448124\n", + " 0.806532\n", + " 0.569258\n", + " 0.148175\n", + " 0.809987\n", + " 0.459632\n", + " 0.735762\n", + " 0.730664\n", + " 0.934502\n", + " 0.080322\n", + " 0.763502\n", + " 0.398504\n", + " 0.027637\n", + " 0.409665\n", + " 0.942846\n", + " 0.133256\n", + " 0.157158\n", + " 0.929446\n", + " 0.402791\n", + " 0.685976\n", + " 0.246594\n", " \n", " \n", " 4\n", - " 0.134942\n", - " 0.875931\n", - " 0.273505\n", - " 0.207588\n", - " 0.080696\n", - " 0.717396\n", - " 0.033930\n", - " 0.646837\n", - " 0.888722\n", - " 0.922742\n", - " 0.176593\n", - " 0.861333\n", - " 0.389451\n", - " 0.695244\n", - " 0.129955\n", - " 0.364114\n", - " 0.428224\n", - " 0.365442\n", - " 0.847818\n", - " 0.588319\n", + " 0.311185\n", + " 0.501165\n", + " 0.365979\n", + " 0.782807\n", + " 0.776795\n", + " 0.797199\n", + " 0.791946\n", + " 0.847157\n", + " 0.771811\n", + " 0.233944\n", + " 0.522344\n", + " 0.053030\n", + " 0.208551\n", + " 0.824354\n", + " 0.588567\n", + " 0.604341\n", + " 0.232964\n", + " 0.229109\n", + " 0.022881\n", + " 0.479022\n", " \n", " \n", "\n", @@ -464,154 +608,186 @@ ], "text/plain": [ " 0 1 2 3 4 5 6 \\\n", - "0 0.215190 0.155352 0.160848 0.807736 0.363587 0.899832 0.146754 \n", - "1 0.895544 0.955196 0.089925 0.827555 0.089071 0.642883 0.996052 \n", - "2 0.413752 0.693052 0.789796 0.929164 0.536191 0.439769 0.773474 \n", - "3 0.728001 0.348156 0.935787 0.851163 0.444573 0.715080 0.988408 \n", - "4 0.134942 0.875931 0.273505 0.207588 0.080696 0.717396 0.033930 \n", + "0 0.734751 0.195362 0.734309 0.598184 
0.763433 0.263434 0.868066 \n", + "1 0.772607 0.445391 0.249642 0.787922 0.598583 0.827238 0.624126 \n", + "2 0.226428 0.268764 0.694262 0.622335 0.063843 0.122683 0.815625 \n", + "3 0.362748 0.495430 0.113876 0.594149 0.612522 0.625204 0.864050 \n", + "4 0.033415 0.340433 0.464971 0.363737 0.025815 0.434129 0.415163 \n", + "0 0.276827 0.260054 0.942397 0.113187 0.781355 0.475740 0.152061 \n", + "1 0.995885 0.158381 0.244274 0.962163 0.651900 0.930665 0.577190 \n", + "2 0.641917 0.821055 0.392437 0.782617 0.510762 0.428320 0.017324 \n", + "3 0.806532 0.569258 0.148175 0.809987 0.459632 0.735762 0.730664 \n", + "4 0.311185 0.501165 0.365979 0.782807 0.776795 0.797199 0.791946 \n", "\n", " 7 8 9 10 11 12 13 \\\n", - "0 0.094802 0.705133 0.882762 0.773320 0.687745 0.016789 0.340725 \n", - "1 0.879020 0.421837 0.412141 0.858513 0.217091 0.176157 0.551236 \n", - "2 0.982074 0.876955 0.633154 0.279005 0.483317 0.908288 0.756172 \n", - "3 0.210332 0.732133 0.892383 0.216893 0.367595 0.846208 0.240111 \n", - "4 0.646837 0.888722 0.922742 0.176593 0.861333 0.389451 0.695244 \n", + "0 0.058092 0.753502 0.587513 0.311608 0.178356 0.182922 0.147631 \n", + "1 0.601524 0.688753 0.338870 0.081595 0.471474 0.267443 0.453351 \n", + "2 0.584542 0.032594 0.589775 0.764350 0.650973 0.565705 0.691784 \n", + "3 0.260279 0.528873 0.168043 0.715929 0.677014 0.175735 0.632370 \n", + "4 0.892210 0.381701 0.415264 0.790801 0.696930 0.819751 0.944029 \n", + "0 0.250324 0.147078 0.162984 0.977025 0.509619 0.593212 0.911839 \n", + "1 0.087914 0.960261 0.580840 0.194616 0.661459 0.674085 0.049326 \n", + "2 0.680720 0.340412 0.462513 0.785776 0.251949 0.032847 0.995700 \n", + "3 0.934502 0.080322 0.763502 0.398504 0.027637 0.409665 0.942846 \n", + "4 0.847157 0.771811 0.233944 0.522344 0.053030 0.208551 0.824354 \n", "\n", " 14 15 16 17 18 19 \n", - "0 0.984182 0.985461 0.412044 0.867894 0.113432 0.349845 \n", - "1 0.834378 0.419535 0.041431 0.602258 0.984628 0.516899 \n", - "2 0.462130 0.289892 
0.145233 0.076819 0.797836 0.197592 \n", - "3 0.471880 0.399721 0.758196 0.665568 0.931542 0.448124 \n", - "4 0.129955 0.364114 0.428224 0.365442 0.847818 0.588319 " + "0 0.391188 0.816049 0.749068 0.293260 0.937828 0.880858 \n", + "1 0.800716 0.045749 0.683793 0.389789 0.016787 0.503695 \n", + "2 0.265223 0.739031 0.560394 0.334802 0.517694 0.646110 \n", + "3 0.926715 0.085675 0.120525 0.141746 0.771144 0.489660 \n", + "4 0.869965 0.041723 0.819140 0.676051 0.109349 0.872947 \n", + "0 0.257645 0.386457 0.696932 0.069162 0.952291 0.286542 \n", + "1 0.785803 0.315645 0.495355 0.232135 0.549324 0.572232 \n", + "2 0.816563 0.735692 0.435998 0.430411 0.531757 0.489528 \n", + "3 0.133256 0.157158 0.929446 0.402791 0.685976 0.246594 \n", + "4 0.588567 0.604341 0.232964 0.229109 0.022881 0.479022 " ] }, - "execution_count": 147, + "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "files = os.listdir(\"../data\")\n", - "\n", - "file_lst=[f for f in files if f.endswith(\".csv\")]\n", - "file_lst.append(f)\n", - "\n", - "print(file_lst)\n", + "data_frame=pd.concat([pd.read_csv(\"../data/\"+ f) for f in file_lst ])\n", + "pd.DataFrame(data_frame) \n", + "#df.to_csv(r'../data/test.csv')\n", "\n", - "pd.read_csv(\"../data/\"+ f)" + "data_frame.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 7. Use a list comprehension and the Pandas `read_csv` and `concat` methods to read all CSV files in the */data* directory and combine them into a single data frame. Display the top 10 rows of the resulting data frame." + "### 8. Use a list comprehension to select and print the column numbers for columns from the data set whose median is less than 0.48." 
] }, { "cell_type": "code", - "execution_count": 148, + "execution_count": 11, "metadata": {}, "outputs": [ { - "name": "stderr", + "name": "stdout", "output_type": "stream", "text": [ - "/usr/local/lib/python3.7/site-packages/ipykernel_launcher.py:1: FutureWarning: Sorting because non-concatenation axis is not aligned. A future version\n", - "of pandas will change to not sort by default.\n", - "\n", - "To accept the future behavior, pass 'sort=False'.\n", - "\n", - "To retain the current behavior and silence the warning, pass 'sort=True'.\n", - "\n", - " \"\"\"Entry point for launching an IPython kernel.\n" + "['1', '9', '12']\n" ] - }, - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
0
0...
\n", - "
" - ], - "text/plain": [ - " 0\n", - "0 ..." - ] - }, - "execution_count": 148, - "metadata": {}, - "output_type": "execute_result" } ], "source": [ - "data_frame=[pd.concat([pd.read_csv(\"../data/\"+ f) for f in file_lst ])]\n", - "df = pd.DataFrame(data_frame) \n", - "#df.to_csv(r'../data/test.csv')\n", - "\n", - "df.head(10)" + "selected_columns = [col for col in data_frame if data_frame[col].mean() < 0.48]\n", + "print(selected_columns)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 8. Use a list comprehension to select and print the column numbers for columns from the data set whose median is less than 0.48." + "### 9. Use a list comprehension to add a new column (20) to the data frame whose values are the values in column 19 minus 0.1. Display the top 10 rows of the resulting data frame." ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 12, "metadata": {}, - "outputs": [], - "source": [] + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "0 0.780858\n", + "1 0.403695\n", + "2 0.546110\n", + "3 0.389660\n", + "4 0.772947\n", + "0 0.186542\n", + "1 0.472232\n", + "2 0.389528\n", + "3 0.146594\n", + "4 0.379022\n", + "0 0.703393\n", + "1 0.864889\n", + "2 0.213071\n", + "3 0.773029\n", + "4 0.568447\n", + "0 0.113938\n", + "1 0.210951\n", + "2 0.296453\n", + "3 -0.073048\n", + "4 0.831145\n", + "0 0.472150\n", + "1 -0.039694\n", + "2 -0.051260\n", + "3 0.460742\n", + "4 0.077401\n", + "0 0.845250\n", + "1 0.015311\n", + "2 0.482192\n", + "3 0.428075\n", + "4 0.727487\n", + "0 0.451858\n", + "1 0.586294\n", + "2 0.103699\n", + "3 0.639383\n", + "4 0.567051\n", + "0 0.195344\n", + "1 0.477262\n", + "2 -0.097404\n", + "3 0.826007\n", + "4 0.692851\n", + "0 0.868681\n", + "1 0.866198\n", + "2 0.624486\n", + "3 0.737695\n", + "4 0.298914\n", + "0 0.249845\n", + "1 0.416899\n", + "2 0.097592\n", + "3 0.348124\n", + "4 0.488319\n", + "Name: 20, dtype: float64\n" + ] + } + ], + "source": [ + 
"data_frame ['20']= [cell-0.1 for cell in data_frame['19']]\n", + "print(data_frame['20'])" + ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 9. Use a list comprehension to add a new column (20) to the data frame whose values are the values in column 19 minus 0.1. Display the top 10 rows of the resulting data frame." + "### 10. Use a list comprehension to extract and print all values from the data set that are between 0.7 and 0.75." ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 86, "metadata": {}, - "outputs": [], - "source": [] + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[0.7726065618884856, 0.9958848354862676, 0.8065316312867377, 0.948664301130462, 0.9341361023782744, 0.7570384010450522, 0.9013813844892271, 0.7638227991824795, 0.9561683913109984, 0.8091290158551179, 0.9360652845266276, 0.8955436373659004, 0.8210553167704127, 0.8039263164643641, 0.9189637493493394, 0.8162825701475874, 0.8965896767032822, 0.7909438089282793, 0.8545959675361845, 0.7676037591340263, 0.9551964568754536, 0.8759310591478083, 0.9423969993645016, 0.9182696050929382, 0.9166552105313126, 0.9540565244826162, 0.8026630492814062, 0.8333676394186291, 0.7977301593436705, 0.8923284352936591, 0.9593545317726128, 0.8505283589030188, 0.807188895572174, 0.7897963029599847, 0.9357868966513616, 0.7879222598307901, 0.9621633670380298, 0.7826166033304552, 0.8099868091360504, 0.7828069868371588, 0.7752339688339761, 0.8533873364185985, 0.8376860918849494, 0.9771886139272962, 0.92192270004039, 0.9752685500281912, 0.8177851323554917, 0.8578024096347506, 0.9184018818607405, 0.7525175693305356, 0.8077364370775799, 0.8275552972468996, 0.929163515820231, 0.8511625939208394, 0.7634334207035142, 0.7813550645836627, 0.7767950291165509, 0.7551203080129055, 0.9117988168842972, 0.8034908279201619, 0.7606875740566486, 0.8835049031816071, 0.7618047467529835, 0.8261716403125922, 0.8514243308071603, 0.8272382506031565, 
0.9306647449477456, 0.7971987987858091, 0.9716089362466106, 0.8190320451738211, 0.7642091848730479, 0.8293465033900734, 0.9577397691597084, 0.8869213339995774, 0.8155857108157181, 0.7863019458267585, 0.8397564083916338, 0.8889377387462668, 0.962085889248278, 0.8998323533564464, 0.8680663549990865, 0.8156248348531779, 0.8640498341918672, 0.7919455450866845, 0.9102079618289258, 0.9414125645626396, 0.9724292380028491, 0.9447582771063441, 0.9809157775823522, 0.9101860556639954, 0.9684654942198184, 0.9803877212263026, 0.9600664601991564, 0.9960524391565686, 0.7734736363278485, 0.9884082519395649, 0.8922104131249807, 0.9345023904998434, 0.8471569562536997, 0.8712035961024672, 0.8752586492257234, 0.9868895989992108, 0.8083697481962028, 0.7747604230976926, 0.7778847226476526, 0.7660061435457575, 0.7742685105146389, 0.9509701540481998, 0.9373500528853604, 0.8556734272408868, 0.7643789400964608, 0.9454621278249584, 0.9215078483469727, 0.9807003100165752, 0.8790198844792824, 0.9820737039870038, 0.7535023641545986, 0.9602606770551084, 0.7718114664973373, 0.9323875156439116, 0.9384436225941936, 0.898233985786227, 0.7756233623158909, 0.7926918683952956, 0.9736146616584688, 0.9985283282262558, 0.9123481445545524, 0.8769548324923689, 0.8887221433812952, 0.7635021345768744, 0.8675113261037845, 0.9569491000630076, 0.7831069316157523, 0.847337370368865, 0.8958104669248041, 0.8921827161950341, 0.8827620542316649, 0.8923825308653724, 0.9227421284008488, 0.7643495353716243, 0.7908008486938785, 0.9770254336405146, 0.7857758997289559, 0.8870537312874786, 0.8133086138445641, 0.8117399389362686, 0.7881982261001286, 0.7500072865719334, 0.7687252329274414, 0.8324307593657718, 0.8895011009135932, 0.8646641441272116, 0.8986325771219401, 0.7733197144915908, 0.8585132170719993, 0.848865583423898, 0.7661529839830151, 0.99348146349248, 0.924226259547408, 0.7903119731466789, 0.9622141734634174, 0.9021571279934334, 0.9440772327200532, 0.7765520162518851, 0.8613334765924818, 0.8197511267505602, 
0.9153418393802571, 0.8564368122654824, 0.8478796266730341, 0.940414111517426, 0.8412100553830514, 0.8984636884271819, 0.9316232786251768, 0.7565927719674911, 0.9082883777050392, 0.8462078521626053, 0.944028613970866, 0.9118389221659232, 0.9957004548575084, 0.942845729801833, 0.8243542373861713, 0.7657571843839385, 0.8508208979364613, 0.8488117117792991, 0.8789220036417789, 0.8323370924450664, 0.7734564609970822, 0.9532282003696234, 0.9407298955058142, 0.8247887205530644, 0.991854969105204, 0.8415754262938553, 0.7884974781268034, 0.8409380093100868, 0.7702954351316351, 0.9084985510834864, 0.7561716566060438, 0.8007161479739965, 0.9267149333179154, 0.8699647196644326, 0.7858030110348141, 0.8165630269275471, 0.9312638809606996, 0.948127217802303, 0.7984665743201926, 0.9600301286589044, 0.9333774298018972, 0.8210990892928189, 0.8446554617758595, 0.9841824171540198, 0.83437813647431, 0.8160493911980281, 0.97977691003667, 0.8489259164852261, 0.9696566391533, 0.9714518998942214, 0.7890782517599441, 0.9572423023495398, 0.8108067566208851, 0.9721353959388824, 0.8186063635991061, 0.87723888899012, 0.7669665288649764, 0.9519073414777496, 0.7660331528173858, 0.9578236531102852, 0.7727975101727471, 0.9854608748806004, 0.8191396818327873, 0.9294460292904924, 0.9547833146802479, 0.8227749935788521, 0.9928973285511736, 0.9475593173591972, 0.8938414589624106, 0.954667818131447, 0.8069639236247445, 0.9736839804909522, 0.9683240132606706, 0.7581961610392675, 0.8589556278865201, 0.8866435207534104, 0.7731550667497674, 0.8561348022144614, 0.7528657050578258, 0.8073996779008903, 0.9045926452518674, 0.9822559636900514, 0.792803388861299, 0.867894228815596, 0.93782845947314, 0.7711443458405843, 0.9522914787919003, 0.8415392556348382, 0.8770484476450304, 0.9628267209870766, 0.8303889161870508, 0.8160364023085047, 0.8855156771852954, 0.7739529437132698, 0.8036286169027608, 0.7944259173248532, 0.8913815362537028, 0.9846278416523276, 0.7978360408802792, 0.9315422182707142, 
0.8478184311696535, 0.8808575048234396, 0.8729470926088846, 0.8033930409580708, 0.9648886658100324, 0.8730292489443628, 0.93114474390249, 0.9452500269680332, 0.8274869207330491, 0.9260066079467686, 0.7928514808864792, 0.9686807909262536, 0.9661983776596004, 0.8376946446827493, 0.7808575048234396, 0.7729470926088846, 0.8648886658100324, 0.7730292489443629, 0.83114474390249, 0.8452500269680332, 0.8260066079467686, 0.8686807909262536, 0.8661983776596004]\n" + ] + } + ], + "source": [ + "values=[value for index in data_frame for value in data_frame[index] if 0.7 <= value >= 0.75]\n", + "print(values)\n" + ] }, { - "cell_type": "markdown", + "cell_type": "code", + "execution_count": 78, "metadata": {}, - "source": [ - "### 10. Use a list comprehension to extract and print all values from the data set that are between 0.7 and 0.75." - ] + "outputs": [], + "source": [] }, { "cell_type": "code", From 12fb4c0cffd50b46e7a8345d92ca7ca6d04b9c2d Mon Sep 17 00:00:00 2001 From: evill013 <53126587+evill013@users.noreply.github.com> Date: Thu, 26 Sep 2019 12:22:02 -0400 Subject: [PATCH 07/38] SOLVED --- .../lab-advanced-regex/your-code/main.ipynb | 820 ++++++++++++++---- 1 file changed, 642 insertions(+), 178 deletions(-) diff --git a/module-1/lab-advanced-regex/your-code/main.ipynb b/module-1/lab-advanced-regex/your-code/main.ipynb index b898da503..052e09937 100644 --- a/module-1/lab-advanced-regex/your-code/main.ipynb +++ b/module-1/lab-advanced-regex/your-code/main.ipynb @@ -4,327 +4,791 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Advanced Regular Expressions Lab\n", + "# List Comprehensions Lab\n", "\n", - "Complete the following set of exercises to solidify your knowledge of regular expressions." + "Complete the following set of exercises to solidify your knowledge of list comprehensions." 
] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 1, "metadata": {}, "outputs": [], "source": [ - "import re" + "import os\n", + "import numpy as np\n", + "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 1. Use a regular expression to find and extract all vowels in the following text." + "### 1. Use a list comprehension to create and print a list of consecutive integers starting with 1 and ending with 50." ] }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50]\n" + ] + } + ], "source": [ - "text = \"This is going to be a sentence with a good number of vowels in it.\"" + "lst= [x for x in range(1,51)]\n", + "print(lst)" ] }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 2. Use a regular expression to find and extract all occurrences and tenses (singular and plural) of the word \"puppy\" in the text below." + "### 2. Use a list comprehension to create and print a list of even numbers starting with 2 and ending with 200." 
] }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200]\n" + ] + } + ], "source": [ - "text = \"The puppy saw all the rest of the puppies playing and wanted to join them. I saw this and wanted a puppy of my own!\"" + "even_lst=[x for x in range(2,201,2)]\n", + "print (even_lst)" ] }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 3. Use a regular expression to find and extract all tenses (present and past) of the word \"run\" in the text below." + "### 3. Use a list comprehension to create and print a list containing all elements of the 10 x 4 Numpy array below." 
] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 4, "metadata": {}, "outputs": [], "source": [ - "text = \"I ran the relay race the only way I knew how to run it.\"" + "a = np.array([[0.84062117, 0.48006452, 0.7876326 , 0.77109654],\n", + " [0.44409793, 0.09014516, 0.81835917, 0.87645456],\n", + " [0.7066597 , 0.09610873, 0.41247947, 0.57433389],\n", + " [0.29960807, 0.42315023, 0.34452557, 0.4751035 ],\n", + " [0.17003563, 0.46843998, 0.92796258, 0.69814654],\n", + " [0.41290051, 0.19561071, 0.16284783, 0.97016248],\n", + " [0.71725408, 0.87702738, 0.31244595, 0.76615487],\n", + " [0.20754036, 0.57871812, 0.07214068, 0.40356048],\n", + " [0.12149553, 0.53222417, 0.9976855 , 0.12536346],\n", + " [0.80930099, 0.50962849, 0.94555126, 0.33364763]])\n" ] }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[0.84062117,\n", + " 0.48006452,\n", + " 0.7876326,\n", + " 0.77109654,\n", + " 0.44409793,\n", + " 0.09014516,\n", + " 0.81835917,\n", + " 0.87645456,\n", + " 0.7066597,\n", + " 0.09610873,\n", + " 0.41247947,\n", + " 0.57433389,\n", + " 0.29960807,\n", + " 0.42315023,\n", + " 0.34452557,\n", + " 0.4751035,\n", + " 0.17003563,\n", + " 0.46843998,\n", + " 0.92796258,\n", + " 0.69814654,\n", + " 0.41290051,\n", + " 0.19561071,\n", + " 0.16284783,\n", + " 0.97016248,\n", + " 0.71725408,\n", + " 0.87702738,\n", + " 0.31244595,\n", + " 0.76615487,\n", + " 0.20754036,\n", + " 0.57871812,\n", + " 0.07214068,\n", + " 0.40356048,\n", + " 0.12149553,\n", + " 0.53222417,\n", + " 0.9976855,\n", + " 0.12536346,\n", + " 0.80930099,\n", + " 0.50962849,\n", + " 0.94555126,\n", + " 0.33364763]" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ - "### 4. 
Use a regular expression to find and extract all words that begin with the letter \"r\" from the previous text." + "nump_lst = []\n", + "\n", + "for x in a:\n", + " for elmt in x:\n", + " nump_lst.append(elmt)\n", + "\n", + "nump_lst" ] }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 5. Use a regular expression to find and substitute the letter \"i\" for the exclamation marks in the text below." + "### 4. Add a condition to the list comprehension above so that only values greater than or equal to 0.5 are printed." ] }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[0.84062117,\n", + " 0.7876326,\n", + " 0.77109654,\n", + " 0.81835917,\n", + " 0.87645456,\n", + " 0.7066597,\n", + " 0.57433389,\n", + " 0.92796258,\n", + " 0.69814654,\n", + " 0.97016248,\n", + " 0.71725408,\n", + " 0.87702738,\n", + " 0.76615487,\n", + " 0.57871812,\n", + " 0.53222417,\n", + " 0.9976855,\n", + " 0.80930099,\n", + " 0.50962849,\n", + " 0.94555126]" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ - "text = \"Th!s !s a sentence w!th spec!al characters !n !t.\"" + "nump_list=[]\n", + "\n", + "for sublist in a:\n", + " for i in sublist:\n", + " if i >= 0.5:\n", + " nump_list.append(i)\n", + " \n", + "nump_list" ] }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 6. Use a regular expression to find and extract words longer than 4 characters in the text below." + "### 5. Use a list comprehension to create and print a list containing all elements of the 5 x 2 x 3 Numpy array below." 
] }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[0.55867166, 0.06210792, 0.08147297, 0.82579068, 0.91512478, 0.06833034, 0.05440634, 0.65857693, 0.30296619, 0.06769833, 0.96031863, 0.51293743, 0.09143215, 0.71893382, 0.45850679, 0.58256464, 0.59005654, 0.56266457, 0.71600294, 0.87392666, 0.11434044, 0.8694668, 0.65669313, 0.10708681, 0.07529684, 0.46470767, 0.47984544, 0.65368638, 0.14901286, 0.23760688, 0.33364763]\n" + ] + } + ], "source": [ - "text = \"This sentence has words of varying lengths.\"" + "b = np.array([[[0.55867166, 0.06210792, 0.08147297],\n", + " [0.82579068, 0.91512478, 0.06833034]],\n", + "\n", + " [[0.05440634, 0.65857693, 0.30296619],\n", + " [0.06769833, 0.96031863, 0.51293743]],\n", + "\n", + " [[0.09143215, 0.71893382, 0.45850679],\n", + " [0.58256464, 0.59005654, 0.56266457]],\n", + "\n", + " [[0.71600294, 0.87392666, 0.11434044],\n", + " [0.8694668 , 0.65669313, 0.10708681]],\n", + "\n", + " [[0.07529684, 0.46470767, 0.47984544],\n", + " [0.65368638, 0.14901286, 0.23760688]]])\n", + "\n", + "new_nplst=[i for lst in b for sublist in lst for i in sublist]\n", + "\n", + "new_nplst.append(i)\n", + "\n", + "print(new_nplst)" ] }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 7. Use a regular expression to find and extract all occurrences of the letter \"b\", some letter(s), and then the letter \"t\" in the sentence below." + "### 5. Add a condition to the list comprehension above so that the last value in each subarray is printed, but only if it is less than or equal to 0.5." 
] }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[0.08147297, 0.06833034, 0.30296619, 0.45850679, 0.11434044, 0.10708681, 0.47984544, 0.23760688, 0.33364763]\n" + ] + } + ], "source": [ - "text = \"I bet the robot couldn't beat the other bot with a bat, but instead it bit me.\"" + "new_numplst=[sublist[-1] for lst in b for sublist in lst if sublist[-1] <= 0.5]\n", + "\n", + "new_numplst.append(sublist[-1])\n", + "\n", + "print(new_numplst)" ] }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 8. Use a regular expression to find and extract all words that contain either \"ea\" or \"eo\" in them." + "### 6. Use a list comprehension to select and print the names of all CSV files in the */data* directory." ] }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 9, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "['sample_file_0.csv', 'sample_file_1.csv', 'sample_file_2.csv', 'sample_file_3.csv', 'sample_file_4.csv', 'sample_file_5.csv', 'sample_file_6.csv', 'sample_file_7.csv', 'sample_file_8.csv', 'sample_file_9.csv']\n" + ] + } + ], "source": [ - "text = \"During many of the peaks and troughs of history, the people living it didn't fully realize what was unfolding. 
But we all know we're navigating breathtaking history: Nearly every day could be — maybe will be — a book.\"\n" + "files = os.listdir(\"../data\")\n", + "\n", + "file_lst=[]\n", + "for f in files:\n", + " if f.endswith(\".csv\"):\n", + " file_lst.append(f)\n", + "\n", + "print(file_lst)\n" ] }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 9. Use a regular expression to find and extract all the capitalized words in the text below individually." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "text = \"Teddy Roosevelt and Abraham Lincoln walk into a bar.\"" + "### 7. Use a list comprehension and the Pandas `read_csv` and `concat` methods to read all CSV files in the */data* directory and combine them into a single data frame. Display the top 10 rows of the resulting data frame." ] }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
<!-- HTML table rendering omitted; the same 10 rows x 20 columns appear in the text/plain output -->
" + ], + "text/plain": [ + " 0 1 2 3 4 5 6 \\\n", + "0 0.734751 0.195362 0.734309 0.598184 0.763433 0.263434 0.868066 \n", + "1 0.772607 0.445391 0.249642 0.787922 0.598583 0.827238 0.624126 \n", + "2 0.226428 0.268764 0.694262 0.622335 0.063843 0.122683 0.815625 \n", + "3 0.362748 0.495430 0.113876 0.594149 0.612522 0.625204 0.864050 \n", + "4 0.033415 0.340433 0.464971 0.363737 0.025815 0.434129 0.415163 \n", + "0 0.276827 0.260054 0.942397 0.113187 0.781355 0.475740 0.152061 \n", + "1 0.995885 0.158381 0.244274 0.962163 0.651900 0.930665 0.577190 \n", + "2 0.641917 0.821055 0.392437 0.782617 0.510762 0.428320 0.017324 \n", + "3 0.806532 0.569258 0.148175 0.809987 0.459632 0.735762 0.730664 \n", + "4 0.311185 0.501165 0.365979 0.782807 0.776795 0.797199 0.791946 \n", + "\n", + " 7 8 9 10 11 12 13 \\\n", + "0 0.058092 0.753502 0.587513 0.311608 0.178356 0.182922 0.147631 \n", + "1 0.601524 0.688753 0.338870 0.081595 0.471474 0.267443 0.453351 \n", + "2 0.584542 0.032594 0.589775 0.764350 0.650973 0.565705 0.691784 \n", + "3 0.260279 0.528873 0.168043 0.715929 0.677014 0.175735 0.632370 \n", + "4 0.892210 0.381701 0.415264 0.790801 0.696930 0.819751 0.944029 \n", + "0 0.250324 0.147078 0.162984 0.977025 0.509619 0.593212 0.911839 \n", + "1 0.087914 0.960261 0.580840 0.194616 0.661459 0.674085 0.049326 \n", + "2 0.680720 0.340412 0.462513 0.785776 0.251949 0.032847 0.995700 \n", + "3 0.934502 0.080322 0.763502 0.398504 0.027637 0.409665 0.942846 \n", + "4 0.847157 0.771811 0.233944 0.522344 0.053030 0.208551 0.824354 \n", + "\n", + " 14 15 16 17 18 19 \n", + "0 0.391188 0.816049 0.749068 0.293260 0.937828 0.880858 \n", + "1 0.800716 0.045749 0.683793 0.389789 0.016787 0.503695 \n", + "2 0.265223 0.739031 0.560394 0.334802 0.517694 0.646110 \n", + "3 0.926715 0.085675 0.120525 0.141746 0.771144 0.489660 \n", + "4 0.869965 0.041723 0.819140 0.676051 0.109349 0.872947 \n", + "0 0.257645 0.386457 0.696932 0.069162 0.952291 0.286542 \n", + "1 0.785803 0.315645 0.495355 
0.232135 0.549324 0.572232 \n", + "2 0.816563 0.735692 0.435998 0.430411 0.531757 0.489528 \n", + "3 0.133256 0.157158 0.929446 0.402791 0.685976 0.246594 \n", + "4 0.588567 0.604341 0.232964 0.229109 0.022881 0.479022 " + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ - "### 10. Use a regular expression to find and extract all the sets of consecutive capitalized words in the text above." + "data_frame=pd.concat([pd.read_csv(\"../data/\"+ f) for f in file_lst ])\n", + "pd.DataFrame(data_frame) \n", + "#df.to_csv(r'../data/test.csv')\n", + "\n", + "data_frame.head(10)" ] }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 11. Use a regular expression to find and extract all the quotes from the text below.\n", - "\n", - "*Hint: This one is a little more complex than the single quote example in the lesson because there are multiple quotes in the text.*" + "### 8. Use a list comprehension to select and print the column numbers for columns from the data set whose median is less than 0.48." ] }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "['1', '9', '12']\n" + ] + } + ], "source": [ - "text = 'Roosevelt says to Lincoln, \"I will bet you $50 I can get the bartender to give me a free drink.\" Lincoln says, \"I am in!\"'\n" + "selected_columns = [col for col in data_frame if data_frame[col].mean() < 0.48]\n", + "print(selected_columns)\n" ] }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 12. Use a regular expression to find and extract all the numbers from the text below." + "### 9. 
Use a list comprehension to add a new column (20) to the data frame whose values are the values in column 19 minus 0.1. Display the top 10 rows of the resulting data frame." ] }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "0 0.780858\n", + "1 0.403695\n", + "2 0.546110\n", + "3 0.389660\n", + "4 0.772947\n", + "0 0.186542\n", + "1 0.472232\n", + "2 0.389528\n", + "3 0.146594\n", + "4 0.379022\n", + "0 0.703393\n", + "1 0.864889\n", + "2 0.213071\n", + "3 0.773029\n", + "4 0.568447\n", + "0 0.113938\n", + "1 0.210951\n", + "2 0.296453\n", + "3 -0.073048\n", + "4 0.831145\n", + "0 0.472150\n", + "1 -0.039694\n", + "2 -0.051260\n", + "3 0.460742\n", + "4 0.077401\n", + "0 0.845250\n", + "1 0.015311\n", + "2 0.482192\n", + "3 0.428075\n", + "4 0.727487\n", + "0 0.451858\n", + "1 0.586294\n", + "2 0.103699\n", + "3 0.639383\n", + "4 0.567051\n", + "0 0.195344\n", + "1 0.477262\n", + "2 -0.097404\n", + "3 0.826007\n", + "4 0.692851\n", + "0 0.868681\n", + "1 0.866198\n", + "2 0.624486\n", + "3 0.737695\n", + "4 0.298914\n", + "0 0.249845\n", + "1 0.416899\n", + "2 0.097592\n", + "3 0.348124\n", + "4 0.488319\n", + "Name: 20, dtype: float64\n" + ] + } + ], "source": [ - "text = \"There were 30 students in the class. Of the 30 students, 14 were male and 16 were female. Only 10 students got A's on the exam.\"\n" + "data_frame ['20']= [cell-0.1 for cell in data_frame['19']]\n", + "print(data_frame['20'])" ] }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 13. Use a regular expression to find and extract all the social security numbers from the text below." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "text = \"\"\"\n", - "Henry's social security number is 876-93-2289 and his phone number is (847)789-0984.\n", - "Darlene's social security number is 098-32-5295 and her phone number is (987)222-0901.\n", - "\"\"\"" + "### 10. Use a list comprehension to extract and print all values from the data set that are between 0.7 and 0.75." ] }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, + "execution_count": 86, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[0.7726065618884856, 0.9958848354862676, 0.8065316312867377, 0.948664301130462, 0.9341361023782744, 0.7570384010450522, 0.9013813844892271, 0.7638227991824795, 0.9561683913109984, 0.8091290158551179, 0.9360652845266276, 0.8955436373659004, 0.8210553167704127, 0.8039263164643641, 0.9189637493493394, 0.8162825701475874, 0.8965896767032822, 0.7909438089282793, 0.8545959675361845, 0.7676037591340263, 0.9551964568754536, 0.8759310591478083, 0.9423969993645016, 0.9182696050929382, 0.9166552105313126, 0.9540565244826162, 0.8026630492814062, 0.8333676394186291, 0.7977301593436705, 0.8923284352936591, 0.9593545317726128, 0.8505283589030188, 0.807188895572174, 0.7897963029599847, 0.9357868966513616, 0.7879222598307901, 0.9621633670380298, 0.7826166033304552, 0.8099868091360504, 0.7828069868371588, 0.7752339688339761, 0.8533873364185985, 0.8376860918849494, 0.9771886139272962, 0.92192270004039, 0.9752685500281912, 0.8177851323554917, 0.8578024096347506, 0.9184018818607405, 0.7525175693305356, 0.8077364370775799, 0.8275552972468996, 0.929163515820231, 0.8511625939208394, 0.7634334207035142, 0.7813550645836627, 0.7767950291165509, 0.7551203080129055, 0.9117988168842972, 0.8034908279201619, 0.7606875740566486, 0.8835049031816071, 0.7618047467529835, 
0.8261716403125922, 0.8514243308071603, 0.8272382506031565, 0.9306647449477456, 0.7971987987858091, 0.9716089362466106, 0.8190320451738211, 0.7642091848730479, 0.8293465033900734, 0.9577397691597084, 0.8869213339995774, 0.8155857108157181, 0.7863019458267585, 0.8397564083916338, 0.8889377387462668, 0.962085889248278, 0.8998323533564464, 0.8680663549990865, 0.8156248348531779, 0.8640498341918672, 0.7919455450866845, 0.9102079618289258, 0.9414125645626396, 0.9724292380028491, 0.9447582771063441, 0.9809157775823522, 0.9101860556639954, 0.9684654942198184, 0.9803877212263026, 0.9600664601991564, 0.9960524391565686, 0.7734736363278485, 0.9884082519395649, 0.8922104131249807, 0.9345023904998434, 0.8471569562536997, 0.8712035961024672, 0.8752586492257234, 0.9868895989992108, 0.8083697481962028, 0.7747604230976926, 0.7778847226476526, 0.7660061435457575, 0.7742685105146389, 0.9509701540481998, 0.9373500528853604, 0.8556734272408868, 0.7643789400964608, 0.9454621278249584, 0.9215078483469727, 0.9807003100165752, 0.8790198844792824, 0.9820737039870038, 0.7535023641545986, 0.9602606770551084, 0.7718114664973373, 0.9323875156439116, 0.9384436225941936, 0.898233985786227, 0.7756233623158909, 0.7926918683952956, 0.9736146616584688, 0.9985283282262558, 0.9123481445545524, 0.8769548324923689, 0.8887221433812952, 0.7635021345768744, 0.8675113261037845, 0.9569491000630076, 0.7831069316157523, 0.847337370368865, 0.8958104669248041, 0.8921827161950341, 0.8827620542316649, 0.8923825308653724, 0.9227421284008488, 0.7643495353716243, 0.7908008486938785, 0.9770254336405146, 0.7857758997289559, 0.8870537312874786, 0.8133086138445641, 0.8117399389362686, 0.7881982261001286, 0.7500072865719334, 0.7687252329274414, 0.8324307593657718, 0.8895011009135932, 0.8646641441272116, 0.8986325771219401, 0.7733197144915908, 0.8585132170719993, 0.848865583423898, 0.7661529839830151, 0.99348146349248, 0.924226259547408, 0.7903119731466789, 0.9622141734634174, 0.9021571279934334, 0.9440772327200532, 
0.7765520162518851, 0.8613334765924818, 0.8197511267505602, 0.9153418393802571, 0.8564368122654824, 0.8478796266730341, 0.940414111517426, 0.8412100553830514, 0.8984636884271819, 0.9316232786251768, 0.7565927719674911, 0.9082883777050392, 0.8462078521626053, 0.944028613970866, 0.9118389221659232, 0.9957004548575084, 0.942845729801833, 0.8243542373861713, 0.7657571843839385, 0.8508208979364613, 0.8488117117792991, 0.8789220036417789, 0.8323370924450664, 0.7734564609970822, 0.9532282003696234, 0.9407298955058142, 0.8247887205530644, 0.991854969105204, 0.8415754262938553, 0.7884974781268034, 0.8409380093100868, 0.7702954351316351, 0.9084985510834864, 0.7561716566060438, 0.8007161479739965, 0.9267149333179154, 0.8699647196644326, 0.7858030110348141, 0.8165630269275471, 0.9312638809606996, 0.948127217802303, 0.7984665743201926, 0.9600301286589044, 0.9333774298018972, 0.8210990892928189, 0.8446554617758595, 0.9841824171540198, 0.83437813647431, 0.8160493911980281, 0.97977691003667, 0.8489259164852261, 0.9696566391533, 0.9714518998942214, 0.7890782517599441, 0.9572423023495398, 0.8108067566208851, 0.9721353959388824, 0.8186063635991061, 0.87723888899012, 0.7669665288649764, 0.9519073414777496, 0.7660331528173858, 0.9578236531102852, 0.7727975101727471, 0.9854608748806004, 0.8191396818327873, 0.9294460292904924, 0.9547833146802479, 0.8227749935788521, 0.9928973285511736, 0.9475593173591972, 0.8938414589624106, 0.954667818131447, 0.8069639236247445, 0.9736839804909522, 0.9683240132606706, 0.7581961610392675, 0.8589556278865201, 0.8866435207534104, 0.7731550667497674, 0.8561348022144614, 0.7528657050578258, 0.8073996779008903, 0.9045926452518674, 0.9822559636900514, 0.792803388861299, 0.867894228815596, 0.93782845947314, 0.7711443458405843, 0.9522914787919003, 0.8415392556348382, 0.8770484476450304, 0.9628267209870766, 0.8303889161870508, 0.8160364023085047, 0.8855156771852954, 0.7739529437132698, 0.8036286169027608, 0.7944259173248532, 0.8913815362537028, 
0.9846278416523276, 0.7978360408802792, 0.9315422182707142, 0.8478184311696535, 0.8808575048234396, 0.8729470926088846, 0.8033930409580708, 0.9648886658100324, 0.8730292489443628, 0.93114474390249, 0.9452500269680332, 0.8274869207330491, 0.9260066079467686, 0.7928514808864792, 0.9686807909262536, 0.9661983776596004, 0.8376946446827493, 0.7808575048234396, 0.7729470926088846, 0.8648886658100324, 0.7730292489443629, 0.83114474390249, 0.8452500269680332, 0.8260066079467686, 0.8686807909262536, 0.8661983776596004]\n" + ] + } + ], "source": [ - "### 14. Use a regular expression to find and extract all the phone numbers from the text below." + "values=[value for index in data_frame for value in data_frame[index] if 0.7 <= value >= 0.75]\n", + "print(values)\n" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 78, "metadata": {}, "outputs": [], "source": [] }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 15. Use a regular expression to find and extract all the formatted numbers (both social security and phone) from the text below." 
- ] - }, { "cell_type": "code", "execution_count": null, @@ -349,7 +813,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.7.0" + "version": "3.7.4" } }, "nbformat": 4, From bf1684e0235ad83c3788529024be75742463d630 Mon Sep 17 00:00:00 2001 From: evill013 <53126587+evill013@users.noreply.github.com> Date: Thu, 26 Sep 2019 12:23:17 -0400 Subject: [PATCH 08/38] SOLVED --- .../lab-advanced-regex/your-code/main.ipynb | 849 ++++++------------ 1 file changed, 286 insertions(+), 563 deletions(-) diff --git a/module-1/lab-advanced-regex/your-code/main.ipynb b/module-1/lab-advanced-regex/your-code/main.ipynb index 052e09937..6a9361765 100644 --- a/module-1/lab-advanced-regex/your-code/main.ipynb +++ b/module-1/lab-advanced-regex/your-code/main.ipynb @@ -4,790 +4,513 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# List Comprehensions Lab\n", + "# Advanced Regular Expressions Lab\n", "\n", - "Complete the following set of exercises to solidify your knowledge of list comprehensions." + "Complete the following set of exercises to solidify your knowledge of regular expressions." ] }, { "cell_type": "code", - "execution_count": 1, + "execution_count": 5, "metadata": {}, "outputs": [], "source": [ - "import os\n", - "import numpy as np\n", - "import pandas as pd" + "import re" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 1. Use a list comprehension to create and print a list of consecutive integers starting with 1 and ending with 50." + "### 1. Use a regular expression to find and extract all vowels in the following text." 
] }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 6, + "metadata": {}, + "outputs": [], + "source": [ + "text = \"This is going to be a sentence with a good number of vowels in it.\"\n" + ] + }, + { + "cell_type": "code", + "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50]\n" + "['i', 'i', 'o', 'i', 'o', 'e', 'a', 'e', 'e', 'e', 'i', 'a', 'o', 'o', 'u', 'e', 'o', 'o', 'e', 'i', 'i']\n" ] } ], "source": [ - "lst= [x for x in range(1,51)]\n", - "print(lst)" + "print(re.findall('[aeiou]',text))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 2. Use a list comprehension to create and print a list of even numbers starting with 2 and ending with 200." + "### 2. Use a regular expression to find and extract all occurrences and tenses (singular and plural) of the word \"puppy\" in the text below." + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [], + "source": [ + "text = \"The puppy saw all the rest of the puppies playing and wanted to join them. 
I saw this and wanted a puppy of my own!\"" ] }, { "cell_type": "code", - "execution_count": 3, + "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "[2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200]\n" + "['puppy', 'puppies', 'puppy']\n" ] } ], "source": [ - "even_lst=[x for x in range(2,201,2)]\n", - "print (even_lst)" + "print(re.findall('[Pp]upp(?:y|ies)',text))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 3. Use a list comprehension to create and print a list containing all elements of the 10 x 4 Numpy array below." + "### 3. Use a regular expression to find and extract all tenses (present and past) of the word \"run\" in the text below."
] }, { "cell_type": "code", - "execution_count": 4, + "execution_count": 23, "metadata": {}, "outputs": [], "source": [ - "a = np.array([[0.84062117, 0.48006452, 0.7876326 , 0.77109654],\n", - " [0.44409793, 0.09014516, 0.81835917, 0.87645456],\n", - " [0.7066597 , 0.09610873, 0.41247947, 0.57433389],\n", - " [0.29960807, 0.42315023, 0.34452557, 0.4751035 ],\n", - " [0.17003563, 0.46843998, 0.92796258, 0.69814654],\n", - " [0.41290051, 0.19561071, 0.16284783, 0.97016248],\n", - " [0.71725408, 0.87702738, 0.31244595, 0.76615487],\n", - " [0.20754036, 0.57871812, 0.07214068, 0.40356048],\n", - " [0.12149553, 0.53222417, 0.9976855 , 0.12536346],\n", - " [0.80930099, 0.50962849, 0.94555126, 0.33364763]])\n" + "text = \"I ran the relay race the only way I knew how to run it.\"" ] }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 24, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "['ran', 'run']\n" + ] + } + ], + "source": [ + "print(re.findall('[Rr][au]n',text))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 4. Use a regular expression to find and extract all words that begin with the letter \"r\" from the previous text." + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "['ran', 'relay', 'race', 'run']\n" + ] + } + ], + "source": [ + "print(re.findall('r[a-z]+',text))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 5. Use a regular expression to find and substitute the letter \"i\" for the exclamation marks in the text below."
+ ] }, { "cell_type": "code", + "execution_count": 38, + "metadata": {}, + "outputs": [], + "source": [ + "text = \"Th!s !s a sentence w!th spec!al characters !n !t.\"" + ] + }, + { + "cell_type": "code", + "execution_count": 40, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "This is a sentence with special characters in it.\n" + ] + } + ], + "source": [ + "print(re.sub('!','i',text))\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 6. Use a regular expression to find and extract words longer than 4 characters in the text below." + ] + }, + { + "cell_type": "code", + "execution_count": 50, + "metadata": {}, + "outputs": [], + "source": [ + "text = \"This sentence has words of varying lengths.\"" + ] + }, + { + "cell_type": "code", + "execution_count": 52, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "[0.84062117,\n", - " 0.48006452,\n", - " 0.7876326,\n", - " 0.77109654,\n", - " 0.44409793,\n", - " 0.09014516,\n", - " 0.81835917,\n", - " 0.87645456,\n", - " 0.7066597,\n", - " 0.09610873,\n", - " 0.41247947,\n", - " 0.57433389,\n", - " 0.29960807,\n", - " 0.42315023,\n", - " 0.34452557,\n", - " 0.4751035,\n", - " 0.17003563,\n", - " 0.46843998,\n", - " 0.92796258,\n", - " 0.69814654,\n", - " 0.41290051,\n", - " 0.19561071,\n", - " 0.16284783,\n", - " 0.97016248,\n", - " 0.71725408,\n", - " 0.87702738,\n", - " 0.31244595,\n", - " 0.76615487,\n", - " 0.20754036,\n", - " 0.57871812,\n", - " 0.07214068,\n", - " 0.40356048,\n", - " 0.12149553,\n", - " 0.53222417,\n", - " 0.9976855,\n", - " 0.12536346,\n", - " 0.80930099,\n", - " 0.50962849,\n", - " 0.94555126,\n", - " 0.33364763]" + "['sentence', 'words', 'varying', 'lengths']" ] }, - "execution_count": 5, + "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "nump_lst = []\n", - "\n", - "for x in a:\n", - " for elmt in x:\n", - " nump_lst.append(elmt)\n", - "\n", - 
"nump_lst" + "re.findall('[A-Za-z]{5,}',text)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 4. Add a condition to the list comprehension above so that only values greater than or equal to 0.5 are printed." + "### 7. Use a regular expression to find and extract all occurrences of the letter \"b\", some letter(s), and then the letter \"t\" in the sentence below." ] }, { "cell_type": "code", - "execution_count": 6, + "execution_count": 63, + "metadata": {}, + "outputs": [], + "source": [ + "text = \"I bet the robot couldn't beat the other bot with a bat, but instead it bit me.\"" + ] + }, + { + "cell_type": "code", + "execution_count": 64, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "[0.84062117,\n", - " 0.7876326,\n", - " 0.77109654,\n", - " 0.81835917,\n", - " 0.87645456,\n", - " 0.7066597,\n", - " 0.57433389,\n", - " 0.92796258,\n", - " 0.69814654,\n", - " 0.97016248,\n", - " 0.71725408,\n", - " 0.87702738,\n", - " 0.76615487,\n", - " 0.57871812,\n", - " 0.53222417,\n", - " 0.9976855,\n", - " 0.80930099,\n", - " 0.50962849,\n", - " 0.94555126]" + "['bet', 'bot', 'beat', 'bot', 'bat', 'but', 'bit']" ] }, - "execution_count": 6, + "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "nump_list=[]\n", - "\n", - "for sublist in a:\n", - " for i in sublist:\n", - " if i >= 0.5:\n", - " nump_list.append(i)\n", - " \n", - "nump_list" + "re.findall('b[a-z]+t',text)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 5. Use a list comprehension to create and print a list containing all elements of the 5 x 2 x 3 Numpy array below." + "### 8. Use a regular expression to find and extract all words that contain either \"ea\" or \"eo\" in them."
] }, { "cell_type": "code", - "execution_count": 7, + "execution_count": 100, + "metadata": {}, + "outputs": [], + "source": [ + "text = \"During many of the peaks and troughs of history, the people living it didn't fully realize what was unfolding. But we all know we're navigating breathtaking history: Nearly every day could be — maybe will be — a book.\"\n" + ] + }, + { + "cell_type": "code", + "execution_count": 101, "metadata": {}, "outputs": [ { - "name": "stdout", - "output_type": "stream", - "text": [ - "[0.55867166, 0.06210792, 0.08147297, 0.82579068, 0.91512478, 0.06833034, 0.05440634, 0.65857693, 0.30296619, 0.06769833, 0.96031863, 0.51293743, 0.09143215, 0.71893382, 0.45850679, 0.58256464, 0.59005654, 0.56266457, 0.71600294, 0.87392666, 0.11434044, 0.8694668, 0.65669313, 0.10708681, 0.07529684, 0.46470767, 0.47984544, 0.65368638, 0.14901286, 0.23760688, 0.33364763]\n" - ] + "data": { + "text/plain": [ + "['peaks', 'people', 'realize', 'breathtaking', 'Nearly']" + ] + }, + "execution_count": 101, + "metadata": {}, + "output_type": "execute_result" } ], "source": [ - "b = np.array([[[0.55867166, 0.06210792, 0.08147297],\n", - " [0.82579068, 0.91512478, 0.06833034]],\n", - "\n", - " [[0.05440634, 0.65857693, 0.30296619],\n", - " [0.06769833, 0.96031863, 0.51293743]],\n", - "\n", - " [[0.09143215, 0.71893382, 0.45850679],\n", - " [0.58256464, 0.59005654, 0.56266457]],\n", - "\n", - " [[0.71600294, 0.87392666, 0.11434044],\n", - " [0.8694668 , 0.65669313, 0.10708681]],\n", - "\n", - " [[0.07529684, 0.46470767, 0.47984544],\n", - " [0.65368638, 0.14901286, 0.23760688]]])\n", - "\n", - "new_nplst=[i for lst in b for sublist in lst for i in sublist]\n", - "\n", - "new_nplst.append(i)\n", - "\n", - "print(new_nplst)" + "re.findall('\w*e[ao]\w*',text)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 5. Add a condition to the list comprehension above so that the last value in each subarray is printed, but only if it is less than or equal to 0.5." + "### 9.
Use a regular expression to find and extract all the capitalized words in the text below individually." + ] + }, + { + "cell_type": "code", + "execution_count": 116, + "metadata": {}, + "outputs": [], + "source": [ + "text = \"Teddy Roosevelt and Abraham Lincoln walk into a bar.\"" ] }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 117, "metadata": {}, "outputs": [ { - "name": "stdout", - "output_type": "stream", - "text": [ - "[0.08147297, 0.06833034, 0.30296619, 0.45850679, 0.11434044, 0.10708681, 0.47984544, 0.23760688, 0.33364763]\n" - ] + "data": { + "text/plain": [ + "['Teddy', 'Roosevelt', 'Abraham', 'Lincoln']" + ] + }, + "execution_count": 117, + "metadata": {}, + "output_type": "execute_result" } ], "source": [ - "new_numplst=[sublist[-1] for lst in b for sublist in lst if sublist[-1] <= 0.5]\n", - "\n", - "new_numplst.append(sublist[-1])\n", - "\n", - "print(new_numplst)" + "re.findall('[A-Z][a-z]+',text)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 6. Use a list comprehension to select and print the names of all CSV files in the */data* directory." + "### 10. Use a regular expression to find and extract all the sets of consecutive capitalized words in the text above." 
] }, { "cell_type": "code", - "execution_count": 9, - "metadata": { - "scrolled": true - }, + "execution_count": 118, + "metadata": {}, "outputs": [ { - "name": "stdout", - "output_type": "stream", - "text": [ - "['sample_file_0.csv', 'sample_file_1.csv', 'sample_file_2.csv', 'sample_file_3.csv', 'sample_file_4.csv', 'sample_file_5.csv', 'sample_file_6.csv', 'sample_file_7.csv', 'sample_file_8.csv', 'sample_file_9.csv']\n" - ] + "data": { + "text/plain": [ + "['Teddy Roosevelt', 'Abraham Lincoln']" + ] + }, + "execution_count": 118, + "metadata": {}, + "output_type": "execute_result" } ], "source": [ - "files = os.listdir(\"../data\")\n", - "\n", - "file_lst=[]\n", - "for f in files:\n", - " if f.endswith(\".csv\"):\n", - " file_lst.append(f)\n", - "\n", - "print(file_lst)\n" + "re.findall('[A-Z][a-z]+ [A-Z][a-z]+',text)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 7. Use a list comprehension and the Pandas `read_csv` and `concat` methods to read all CSV files in the */data* directory and combine them into a single data frame. Display the top 10 rows of the resulting data frame." + "### 11. Use a regular expression to find and extract all the quotes from the text below.\n", + "\n", + "*Hint: This one is a little more complex than the single quote example in the lesson because there are multiple quotes in the text.*" + ] + }, + { + "cell_type": "code", + "execution_count": 123, + "metadata": {}, + "outputs": [], + "source": [ + "text = 'Roosevelt says to Lincoln, \"I will bet you $50 I can get the bartender to give me a free drink.\" Lincoln says, \"I am in!\"'\n" ] }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 125, "metadata": {}, "outputs": [ { "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " 
\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
[HTML rendering of the 10-row, 20-column data frame preview; the same values are listed in the text/plain output]
\n", - "
" - ], "text/plain": [ - " 0 1 2 3 4 5 6 \\\n", - "0 0.734751 0.195362 0.734309 0.598184 0.763433 0.263434 0.868066 \n", - "1 0.772607 0.445391 0.249642 0.787922 0.598583 0.827238 0.624126 \n", - "2 0.226428 0.268764 0.694262 0.622335 0.063843 0.122683 0.815625 \n", - "3 0.362748 0.495430 0.113876 0.594149 0.612522 0.625204 0.864050 \n", - "4 0.033415 0.340433 0.464971 0.363737 0.025815 0.434129 0.415163 \n", - "0 0.276827 0.260054 0.942397 0.113187 0.781355 0.475740 0.152061 \n", - "1 0.995885 0.158381 0.244274 0.962163 0.651900 0.930665 0.577190 \n", - "2 0.641917 0.821055 0.392437 0.782617 0.510762 0.428320 0.017324 \n", - "3 0.806532 0.569258 0.148175 0.809987 0.459632 0.735762 0.730664 \n", - "4 0.311185 0.501165 0.365979 0.782807 0.776795 0.797199 0.791946 \n", - "\n", - " 7 8 9 10 11 12 13 \\\n", - "0 0.058092 0.753502 0.587513 0.311608 0.178356 0.182922 0.147631 \n", - "1 0.601524 0.688753 0.338870 0.081595 0.471474 0.267443 0.453351 \n", - "2 0.584542 0.032594 0.589775 0.764350 0.650973 0.565705 0.691784 \n", - "3 0.260279 0.528873 0.168043 0.715929 0.677014 0.175735 0.632370 \n", - "4 0.892210 0.381701 0.415264 0.790801 0.696930 0.819751 0.944029 \n", - "0 0.250324 0.147078 0.162984 0.977025 0.509619 0.593212 0.911839 \n", - "1 0.087914 0.960261 0.580840 0.194616 0.661459 0.674085 0.049326 \n", - "2 0.680720 0.340412 0.462513 0.785776 0.251949 0.032847 0.995700 \n", - "3 0.934502 0.080322 0.763502 0.398504 0.027637 0.409665 0.942846 \n", - "4 0.847157 0.771811 0.233944 0.522344 0.053030 0.208551 0.824354 \n", - "\n", - " 14 15 16 17 18 19 \n", - "0 0.391188 0.816049 0.749068 0.293260 0.937828 0.880858 \n", - "1 0.800716 0.045749 0.683793 0.389789 0.016787 0.503695 \n", - "2 0.265223 0.739031 0.560394 0.334802 0.517694 0.646110 \n", - "3 0.926715 0.085675 0.120525 0.141746 0.771144 0.489660 \n", - "4 0.869965 0.041723 0.819140 0.676051 0.109349 0.872947 \n", - "0 0.257645 0.386457 0.696932 0.069162 0.952291 0.286542 \n", - "1 0.785803 0.315645 0.495355 
0.232135 0.549324 0.572232 \n", - "2 0.816563 0.735692 0.435998 0.430411 0.531757 0.489528 \n", - "3 0.133256 0.157158 0.929446 0.402791 0.685976 0.246594 \n", - "4 0.588567 0.604341 0.232964 0.229109 0.022881 0.479022 " + "[' \"', '\" ', ' \"', '\"']" ] }, - "execution_count": 10, + "execution_count": 125, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "data_frame=pd.concat([pd.read_csv(\"../data/\"+ f) for f in file_lst ])\n", - "pd.DataFrame(data_frame) \n", - "#df.to_csv(r'../data/test.csv')\n", - "\n", - "data_frame.head(10)" + "re.findall(' ?\" ?',text)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 8. Use a list comprehension to select and print the column numbers for columns from the data set whose median is less than 0.48." + "### 12. Use a regular expression to find and extract all the numbers from the text below." + ] + }, + { + "cell_type": "code", + "execution_count": 128, + "metadata": {}, + "outputs": [], + "source": [ + "text = \"There were 30 students in the class. Of the 30 students, 14 were male and 16 were female. Only 10 students got A's on the exam.\"\n" ] }, { "cell_type": "code", - "execution_count": 11, + "execution_count": 129, "metadata": {}, "outputs": [ { - "name": "stdout", - "output_type": "stream", - "text": [ - "['1', '9', '12']\n" - ] + "data": { + "text/plain": [ + "['30', '30', '14', '16', '10']" + ] + }, + "execution_count": 129, + "metadata": {}, + "output_type": "execute_result" } ], "source": [ - "selected_columns = [col for col in data_frame if data_frame[col].mean() < 0.48]\n", - "print(selected_columns)\n" + "re.findall('\\d?\\d',text)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 9. Use a list comprehension to add a new column (20) to the data frame whose values are the values in column 19 minus 0.1. Display the top 10 rows of the resulting data frame." + "### 13. 
Use a regular expression to find and extract all the social security numbers from the text below." + ] + }, + { + "cell_type": "code", + "execution_count": 141, + "metadata": {}, + "outputs": [], + "source": [ + "text = \"\"\"\n", + "Henry's social security number is 876-93-2289 and his phone number is (847)789-0984.\n", + "Darlene's social security number is 098-32-5295 and her phone number is (987)222-0901.\n", + "\"\"\"" ] }, { "cell_type": "code", - "execution_count": 12, + "execution_count": 142, "metadata": {}, "outputs": [ { - "name": "stdout", - "output_type": "stream", - "text": [ - "0 0.780858\n", - "1 0.403695\n", - "2 0.546110\n", - "3 0.389660\n", - "4 0.772947\n", - "0 0.186542\n", - "1 0.472232\n", - "2 0.389528\n", - "3 0.146594\n", - "4 0.379022\n", - "0 0.703393\n", - "1 0.864889\n", - "2 0.213071\n", - "3 0.773029\n", - "4 0.568447\n", - "0 0.113938\n", - "1 0.210951\n", - "2 0.296453\n", - "3 -0.073048\n", - "4 0.831145\n", - "0 0.472150\n", - "1 -0.039694\n", - "2 -0.051260\n", - "3 0.460742\n", - "4 0.077401\n", - "0 0.845250\n", - "1 0.015311\n", - "2 0.482192\n", - "3 0.428075\n", - "4 0.727487\n", - "0 0.451858\n", - "1 0.586294\n", - "2 0.103699\n", - "3 0.639383\n", - "4 0.567051\n", - "0 0.195344\n", - "1 0.477262\n", - "2 -0.097404\n", - "3 0.826007\n", - "4 0.692851\n", - "0 0.868681\n", - "1 0.866198\n", - "2 0.624486\n", - "3 0.737695\n", - "4 0.298914\n", - "0 0.249845\n", - "1 0.416899\n", - "2 0.097592\n", - "3 0.348124\n", - "4 0.488319\n", - "Name: 20, dtype: float64\n" - ] + "data": { + "text/plain": [ + "['876-93-2289', '098-32-5295']" + ] + }, + "execution_count": 142, + "metadata": {}, + "output_type": "execute_result" } ], "source": [ - "data_frame ['20']= [cell-0.1 for cell in data_frame['19']]\n", - "print(data_frame['20'])" + "re.findall('\\d+-\\d{2}-\\d+',text)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 10. 
Use a list comprehension to extract and print all values from the data set that are between 0.7 and 0.75." + "### 14. Use a regular expression to find and extract all the phone numbers from the text below." ] }, { "cell_type": "code", - "execution_count": 86, + "execution_count": 146, "metadata": {}, "outputs": [ { - "name": "stdout", - "output_type": "stream", - "text": [ - "[0.7726065618884856, 0.9958848354862676, 0.8065316312867377, 0.948664301130462, 0.9341361023782744, 0.7570384010450522, 0.9013813844892271, 0.7638227991824795, 0.9561683913109984, 0.8091290158551179, 0.9360652845266276, 0.8955436373659004, 0.8210553167704127, 0.8039263164643641, 0.9189637493493394, 0.8162825701475874, 0.8965896767032822, 0.7909438089282793, 0.8545959675361845, 0.7676037591340263, 0.9551964568754536, 0.8759310591478083, 0.9423969993645016, 0.9182696050929382, 0.9166552105313126, 0.9540565244826162, 0.8026630492814062, 0.8333676394186291, 0.7977301593436705, 0.8923284352936591, 0.9593545317726128, 0.8505283589030188, 0.807188895572174, 0.7897963029599847, 0.9357868966513616, 0.7879222598307901, 0.9621633670380298, 0.7826166033304552, 0.8099868091360504, 0.7828069868371588, 0.7752339688339761, 0.8533873364185985, 0.8376860918849494, 0.9771886139272962, 0.92192270004039, 0.9752685500281912, 0.8177851323554917, 0.8578024096347506, 0.9184018818607405, 0.7525175693305356, 0.8077364370775799, 0.8275552972468996, 0.929163515820231, 0.8511625939208394, 0.7634334207035142, 0.7813550645836627, 0.7767950291165509, 0.7551203080129055, 0.9117988168842972, 0.8034908279201619, 0.7606875740566486, 0.8835049031816071, 0.7618047467529835, 0.8261716403125922, 0.8514243308071603, 0.8272382506031565, 0.9306647449477456, 0.7971987987858091, 0.9716089362466106, 0.8190320451738211, 0.7642091848730479, 0.8293465033900734, 0.9577397691597084, 0.8869213339995774, 0.8155857108157181, 0.7863019458267585, 0.8397564083916338, 0.8889377387462668, 0.962085889248278, 0.8998323533564464, 0.8680663549990865, 
0.8156248348531779, 0.8640498341918672, 0.7919455450866845, 0.9102079618289258, 0.9414125645626396, 0.9724292380028491, 0.9447582771063441, 0.9809157775823522, 0.9101860556639954, 0.9684654942198184, 0.9803877212263026, 0.9600664601991564, 0.9960524391565686, 0.7734736363278485, 0.9884082519395649, 0.8922104131249807, 0.9345023904998434, 0.8471569562536997, 0.8712035961024672, 0.8752586492257234, 0.9868895989992108, 0.8083697481962028, 0.7747604230976926, 0.7778847226476526, 0.7660061435457575, 0.7742685105146389, 0.9509701540481998, 0.9373500528853604, 0.8556734272408868, 0.7643789400964608, 0.9454621278249584, 0.9215078483469727, 0.9807003100165752, 0.8790198844792824, 0.9820737039870038, 0.7535023641545986, 0.9602606770551084, 0.7718114664973373, 0.9323875156439116, 0.9384436225941936, 0.898233985786227, 0.7756233623158909, 0.7926918683952956, 0.9736146616584688, 0.9985283282262558, 0.9123481445545524, 0.8769548324923689, 0.8887221433812952, 0.7635021345768744, 0.8675113261037845, 0.9569491000630076, 0.7831069316157523, 0.847337370368865, 0.8958104669248041, 0.8921827161950341, 0.8827620542316649, 0.8923825308653724, 0.9227421284008488, 0.7643495353716243, 0.7908008486938785, 0.9770254336405146, 0.7857758997289559, 0.8870537312874786, 0.8133086138445641, 0.8117399389362686, 0.7881982261001286, 0.7500072865719334, 0.7687252329274414, 0.8324307593657718, 0.8895011009135932, 0.8646641441272116, 0.8986325771219401, 0.7733197144915908, 0.8585132170719993, 0.848865583423898, 0.7661529839830151, 0.99348146349248, 0.924226259547408, 0.7903119731466789, 0.9622141734634174, 0.9021571279934334, 0.9440772327200532, 0.7765520162518851, 0.8613334765924818, 0.8197511267505602, 0.9153418393802571, 0.8564368122654824, 0.8478796266730341, 0.940414111517426, 0.8412100553830514, 0.8984636884271819, 0.9316232786251768, 0.7565927719674911, 0.9082883777050392, 0.8462078521626053, 0.944028613970866, 0.9118389221659232, 0.9957004548575084, 0.942845729801833, 0.8243542373861713, 
0.7657571843839385, 0.8508208979364613, 0.8488117117792991, 0.8789220036417789, 0.8323370924450664, 0.7734564609970822, 0.9532282003696234, 0.9407298955058142, 0.8247887205530644, 0.991854969105204, 0.8415754262938553, 0.7884974781268034, 0.8409380093100868, 0.7702954351316351, 0.9084985510834864, 0.7561716566060438, 0.8007161479739965, 0.9267149333179154, 0.8699647196644326, 0.7858030110348141, 0.8165630269275471, 0.9312638809606996, 0.948127217802303, 0.7984665743201926, 0.9600301286589044, 0.9333774298018972, 0.8210990892928189, 0.8446554617758595, 0.9841824171540198, 0.83437813647431, 0.8160493911980281, 0.97977691003667, 0.8489259164852261, 0.9696566391533, 0.9714518998942214, 0.7890782517599441, 0.9572423023495398, 0.8108067566208851, 0.9721353959388824, 0.8186063635991061, 0.87723888899012, 0.7669665288649764, 0.9519073414777496, 0.7660331528173858, 0.9578236531102852, 0.7727975101727471, 0.9854608748806004, 0.8191396818327873, 0.9294460292904924, 0.9547833146802479, 0.8227749935788521, 0.9928973285511736, 0.9475593173591972, 0.8938414589624106, 0.954667818131447, 0.8069639236247445, 0.9736839804909522, 0.9683240132606706, 0.7581961610392675, 0.8589556278865201, 0.8866435207534104, 0.7731550667497674, 0.8561348022144614, 0.7528657050578258, 0.8073996779008903, 0.9045926452518674, 0.9822559636900514, 0.792803388861299, 0.867894228815596, 0.93782845947314, 0.7711443458405843, 0.9522914787919003, 0.8415392556348382, 0.8770484476450304, 0.9628267209870766, 0.8303889161870508, 0.8160364023085047, 0.8855156771852954, 0.7739529437132698, 0.8036286169027608, 0.7944259173248532, 0.8913815362537028, 0.9846278416523276, 0.7978360408802792, 0.9315422182707142, 0.8478184311696535, 0.8808575048234396, 0.8729470926088846, 0.8033930409580708, 0.9648886658100324, 0.8730292489443628, 0.93114474390249, 0.9452500269680332, 0.8274869207330491, 0.9260066079467686, 0.7928514808864792, 0.9686807909262536, 0.9661983776596004, 0.8376946446827493, 0.7808575048234396, 
0.7729470926088846, 0.8648886658100324, 0.7730292489443629, 0.83114474390249, 0.8452500269680332, 0.8260066079467686, 0.8686807909262536, 0.8661983776596004]\n" - ] + "data": { + "text/plain": [ + "['(847)789-0984', '(987)222-0901']" + ] + }, + "execution_count": 146, + "metadata": {}, + "output_type": "execute_result" } ], "source": [ - "values=[value for index in data_frame for value in data_frame[index] if 0.7 <= value >= 0.75]\n", - "print(values)\n" + "re.findall('\\W\\d{3}\\W\\d{3}-\\d+',text)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 15. Use a regular expression to find and extract all the formatted numbers (both social security and phone) from the text below." ] }, { "cell_type": "code", - "execution_count": 78, + "execution_count": 150, "metadata": {}, - "outputs": [], - "source": [] + "outputs": [ + { + "data": { + "text/plain": [ + "[' 876-93-2289', '(847)789-0984', ' 098-32-5295', '(987)222-0901']" + ] + }, + "execution_count": 150, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "re.findall('\\W?\\d+\\W?\\d+-\\d+',text)" + ] }, { "cell_type": "code", From df579e519ff753862a86f58563dd725eb5ba38aa Mon Sep 17 00:00:00 2001 From: evill013 <53126587+evill013@users.noreply.github.com> Date: Thu, 26 Sep 2019 12:25:41 -0400 Subject: [PATCH 09/38] SOLVED --- .../your-code/Q1.ipynb | 108 ++- .../your-code/Q2.ipynb | 125 ++- .../your-code/main.ipynb | 819 ++++++++++++++++-- 3 files changed, 946 insertions(+), 106 deletions(-) diff --git a/module-1/lab-functional-programming/your-code/Q1.ipynb b/module-1/lab-functional-programming/your-code/Q1.ipynb index 8b07d3db6..557d8e79f 100644 --- a/module-1/lab-functional-programming/your-code/Q1.ipynb +++ b/module-1/lab-functional-programming/your-code/Q1.ipynb @@ -19,24 +19,32 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "# Import required libraries\n", - "\n", + "import os\n", 
"# Define function\n", "def get_bow_from_docs(docs, stop_words=[]):\n", - " \n", + "\n", " # In the function, first define the variables you will use such as `corpus`, `bag_of_words`, and `term_freq`.\n", - " \n", - " \n", - " \n", + " #docs = ['doc1.txt', 'doc2.txt', 'doc3.txt']\n", + " bag_of_words = ['a', 'am', 'at', 'cool', 'i', 'ironhack', 'is', 'love', 'student']\n", + " term_freq = [\n", + " [0, 0, 0, 1, 0, 1, 1, 0, 0],\n", + " [0, 0, 0, 0, 1, 1, 0, 1, 0],\n", + " [1, 1, 1, 0, 1, 1, 0, 0, 1],\n", + "] \n", " \"\"\"\n", " Loop `docs` and read the content of each doc into a string in `corpus`.\n", " Remember to convert the doc content to lowercases and remove punctuation.\n", " \"\"\"\n", - "\n", + " corpus = []\n", + " for file in docs:\n", + " lines=open(file).readline()\n", + " corpus.append(lines)\n", + " corpus=[i.replace('.','').lower() for i in corpus]\n", " \n", " \n", " \"\"\"\n", @@ -45,23 +53,26 @@ " In addition, check if each term is in the `stop_words` array. Only append the term to `bag_of_words`\n", " if it is not a stop word.\n", " \"\"\"\n", - "\n", - " \n", - " \n", + " bag_of_words = []\n", + " for str in corpus:\n", + " split_str=str.split()\n", + " for word in split_str:\n", + " if word not in bag_of_words:\n", + " if word not in stop_words:\n", + " bag_of_words.append(word)\n", + " \n", + " \n", " \n", " \"\"\"\n", " Loop `corpus` again. For each doc string, count the number of occurrences of each term in `bag_of_words`. 
\n", " Create an array for each doc's term frequency and append it to `term_freq`.\n", " \"\"\"\n", + " term_freq = []\n", "\n", - " \n", - " \n", - " # Now return your output as an object\n", - " return {\n", - " \"bag_of_words\": bag_of_words,\n", - " \"term_freq\": term_freq\n", - " }\n", - " " + " for sentence in corpus:\n", + " term_freq.append([(sentence.split()).count(term) for term in bag_of_words])\n", + "\n", + " return {\"bag of words\":bag_of_words,\"term_freq\":term_freq}\n" ] }, { @@ -75,13 +86,23 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 24, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{'bag of words': ['ironhack', 'is', 'cool', 'i', 'love', 'am', 'a', 'student', 'at'], 'term_freq': [[1, 1, 1, 0, 0, 0, 0, 0, 0], [1, 0, 0, 1, 1, 0, 0, 0, 0], [1, 0, 0, 1, 0, 1, 1, 1, 1]]}\n" + ] + } + ], "source": [ - "# Define doc paths array\n", - "docs = []\n", - "\n", + "# Define doc paths list\n", + "path = '../../lab-string-operations/your-code/'\n", + "files = ['doc1.txt', 'doc2.txt', 'doc3.txt']\n", + "docs = [path + f for f in files]\n", + "stop_words\n", "# Obtain BoW from your function\n", "bow = get_bow_from_docs(docs)\n", "\n", @@ -100,12 +121,21 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 25, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "frozenset({'fill', 'find', 'beside', 'became', 'besides', 'noone', 'up', 'less', 'we', 'nowhere', 'too', 'under', 'within', 'hereby', 'whereupon', 'detail', 'above', 'i', 'than', 'yourselves', 'whereafter', 'of', 'ourselves', 'thin', 'it', 'un', 'own', 'whose', 'anyway', 'amount', 'becomes', 'most', 'thick', 'me', 'side', 'only', 'bill', 'thence', 'keep', 'one', 'per', 'many', 'both', 'anyhow', 'against', 'nothing', 'this', 'wherever', 'each', 'some', 'something', 'please', 'off', 'do', 'rather', 'by', 'which', 'from', 
'ltd', 'beforehand', 'along', 'everywhere', 'must', 'and', 'eg', 'their', 'without', 'even', 'but', 'if', 'my', 'whence', 'being', 'once', 'seem', 'part', 'therefore', 'either', 'indeed', 'may', 'cant', 'anywhere', 'already', 'found', 'see', 'seemed', 'ie', 'same', 'everyone', 'the', 'still', 'while', 'herself', 'three', 'was', 'over', 'during', 'much', 'what', 'these', 'for', 'has', 'not', 'onto', 'moreover', 'sometimes', 'also', 'are', 'have', 'how', 'never', 'or', 'bottom', 'via', 'themselves', 'when', 'might', 'amongst', 'she', 'ever', 'always', 'upon', 'whole', 'yourself', 'therein', 'hasnt', 'at', 'his', 'towards', 'itself', 'seeming', 'inc', 'anything', 'where', 'since', 'between', 'other', 'together', 'namely', 'twenty', 'ours', 'whereas', 'done', 'you', 'none', 'your', 'around', 'nor', 'among', 'nevertheless', 'serious', 'an', 'us', 'until', 'whither', 'name', 'whenever', 'hence', 'behind', 'call', 'take', 'more', 'her', 'former', 'whatever', 'before', 'any', 'move', 'fifty', 'others', 'throughout', 'been', 'due', 'he', 'a', 'sixty', 'below', 'out', 'latterly', 'those', 'further', 'almost', 'again', 'someone', 'thereupon', 'on', 'full', 'get', 'cannot', 'system', 'de', 'eleven', 'who', 'otherwise', 'hereafter', 'become', 'next', 'about', 'whereby', 'forty', 'hers', 'that', 'anyone', 'somewhere', 'though', 'every', 'after', 'should', 'becoming', 'elsewhere', 'except', 'enough', 'whoever', 'empty', 'him', 'top', 'made', 'put', 'now', 'eight', 'describe', 'well', 'six', 'meanwhile', 'go', 'had', 'four', 'latter', 'perhaps', 'there', 'wherein', 'can', 'seems', 'couldnt', 'nine', 'here', 'third', 'few', 'another', 'con', 'mostly', 'thereafter', 'sometime', 'whether', 'first', 'myself', 'am', 'through', 'would', 'as', 'else', 'will', 'such', 'whom', 'thereby', 'front', 'all', 'to', 'were', 'down', 'nobody', 'fire', 're', 'because', 'give', 'show', 'with', 'interest', 'be', 'is', 'why', 'several', 'very', 'mill', 'fifteen', 'least', 'formerly', 'last', 'thru', 
'no', 'two', 'beyond', 'somehow', 'yet', 'its', 'cry', 'often', 'them', 'afterwards', 'although', 'then', 'they', 'sincere', 'our', 'so', 'everything', 'in', 'back', 'thus', 'however', 'twelve', 'ten', 'yours', 'could', 'etc', 'hundred', 'into', 'across', 'mine', 'amoungst', 'hereupon', 'herein', 'co', 'toward', 'alone', 'himself', 'neither', 'five'})\n" + ] + } + ], "source": [ "from sklearn.feature_extraction import stop_words\n", - "print(stop_words.ENGLISH_STOP_WORDS)" + "stop_words= stop_words.ENGLISH_STOP_WORDS\n", + "print(stop_words)" ] }, { @@ -128,13 +158,20 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 26, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{'bag of words': ['ironhack', 'cool', 'love', 'student'], 'term_freq': [[1, 1, 0, 0], [1, 0, 1, 0], [1, 0, 0, 1]]}\n" + ] + } + ], "source": [ - "bow = get_bow_from_docs(bow, stop_words.ENGLISH_STOP_WORDS)\n", - "\n", - "print(bow)" + "bow2 = get_bow_from_docs(docs, stop_words)\n", + "print(bow2)" ] }, { @@ -146,6 +183,13 @@ "```{'bag_of_words': ['ironhack', 'cool', 'love', 'student'], 'term_freq': [[1, 1, 0, 0], [1, 0, 1, 0], [1, 0, 0, 1]]}```" ] }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, { "cell_type": "code", "execution_count": null, @@ -170,7 +214,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.6.6" + "version": "3.7.4" } }, "nbformat": 4, diff --git a/module-1/lab-functional-programming/your-code/Q2.ipynb b/module-1/lab-functional-programming/your-code/Q2.ipynb index f50f442f7..96e1f9675 100644 --- a/module-1/lab-functional-programming/your-code/Q2.ipynb +++ b/module-1/lab-functional-programming/your-code/Q2.ipynb @@ -15,12 +15,60 @@ }, { "cell_type": "code", - "execution_count": 60, + "execution_count": 27, "metadata": {}, "outputs": [], "source": [ + "import re\n", + "import os\n", 
"# Define your string handling functions below\n", - "# Minimal 3 functions\n" + "# Minimal 3 functions\n", + "#corpus=\"
  • UX/UI...Design Bootcamp? (Full-Time)