19 changes: 19 additions & 0 deletions BenBrandt_DATA400_Idea.md
@@ -0,0 +1,19 @@
# DATA400 Mini-Project Idea 1
## Ben Brandt

For my mini-project, I plan on scraping housing data from Zillow for my zip code and some neighboring zip codes in order to predict house prices.

### Tractable Data
I want the typical information that someone looking for a home might consider when deciding whether to buy a house, all of which is available directly on the Zillow website. This includes numerical values such as the number of bedrooms, number of bathrooms, square footage, and lot acreage. There is also a categorical variable I would like to include: the type of house (e.g., townhouse, single-family residence). Together, these serve as the predictor variables for the response variable, which is the price of the house.
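As a sketch of how one listing might be encoded, the numeric fields pass through directly and the categorical home type is one-hot encoded. The field names here are my own illustration, not Zillow's actual schema:

```python
# Hypothetical feature encoding for one listing. Field names are
# illustrative placeholders, not Zillow's real schema.
HOME_TYPES = ["townhouse", "single_family", "condo"]

def encode_listing(listing):
    """Turn a raw listing dict into (features, price).

    Numeric predictors pass through unchanged; the categorical
    home type is one-hot encoded against HOME_TYPES.
    """
    features = [
        listing["bedrooms"],
        listing["bathrooms"],
        listing["sqft"],
        listing["lot_acres"],
    ]
    # One-hot encode the home type.
    features += [1 if listing["home_type"] == t else 0 for t in HOME_TYPES]
    return features, listing["price"]

example = {
    "bedrooms": 3, "bathrooms": 2.5, "sqft": 1850,
    "lot_acres": 0.12, "home_type": "townhouse", "price": 540_000,
}
x, y = encode_listing(example)
print(x, y)  # → [3, 2.5, 1850, 0.12, 1, 0, 0] 540000
```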

### Data Retrieval
I anticipate getting this data by scraping the website myself. Using a list of zip codes, I can navigate through all the available homes in each zip code and use XPath expressions to extract the fields I need. In practice, this would be very similar to the DATA200 GoodReads project.
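The per-listing extraction step could look roughly like the sketch below, using the limited XPath support in Python's standard-library `ElementTree`. The markup and paths here are placeholders; Zillow's real page structure differs, and a live scraper would likely need `lxml` or Selenium to fetch and parse actual pages:

```python
# Sketch of extracting fields from one listing card with XPath-style
# queries. SAMPLE_CARD is invented, well-formed markup for illustration.
import xml.etree.ElementTree as ET

SAMPLE_CARD = """
<article>
  <span class="price">$540,000</span>
  <ul>
    <li class="beds">3</li>
    <li class="baths">2</li>
    <li class="sqft">1850</li>
  </ul>
</article>
"""

def parse_card(fragment):
    """Extract price, beds, baths, and sqft from one listing card."""
    root = ET.fromstring(fragment)
    price_text = root.find(".//span[@class='price']").text
    return {
        "price": int(price_text.replace("$", "").replace(",", "")),
        "beds": int(root.find(".//li[@class='beds']").text),
        "baths": int(root.find(".//li[@class='baths']").text),
        "sqft": int(root.find(".//li[@class='sqft']").text),
    }

print(parse_card(SAMPLE_CARD))
# → {'price': 540000, 'beds': 3, 'baths': 2, 'sqft': 1850}
```

Looping this over every listing page for each zip code in the list would build up the full dataset.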

### Specification of Model
I think the best model for this project is a random forest. This model is versatile and usable in most situations, and in this case its feature-importance scores will highlight which variables matter most. Since random forests are relatively robust to multicollinearity, there is also less concern about predictors being correlated with one another.
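A minimal sketch of the modeling step, assuming scikit-learn and a dataset already encoded as numeric features; the synthetic data below is a stand-in for the real scraped listings:

```python
# Random forest price model on synthetic stand-in data (not real
# Zillow listings), showing the feature-importance readout.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    rng.integers(1, 6, n),        # bedrooms
    rng.integers(1, 4, n),        # bathrooms
    rng.uniform(800, 4000, n),    # square footage
    rng.uniform(0.05, 1.0, n),    # lot acreage
])
# Toy price: driven mostly by square footage, plus bedrooms and noise.
y = 200 * X[:, 2] + 50_000 * X[:, 0] + rng.normal(0, 20_000, n)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X, y)

# Importances sum to 1; larger values mean the variable mattered more.
for name, imp in zip(["beds", "baths", "sqft", "lot"], model.feature_importances_):
    print(f"{name}: {imp:.3f}")
```

On real data, this readout is what would show whether, say, square footage or lot size drives prices in a given zip code.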

### Implications for Stakeholders
There are many stakeholders who could be affected by this project. The first and most obvious are people currently looking for a home in the Northern Virginia area who want an estimated price given their specifications. A second group is people selling their homes: they can use the model to see what their home is worth and set a price accordingly. A third is the realty companies selling these homes, which can use the model to better understand what types of homes are being sold and at what price points, and adjust their strategy accordingly.

### Ethical, Legal, and Societal Implications
There are some legal and societal implications of this project, both positive and negative. In terms of legality, Zillow likely does not welcome people scraping its site, and doing so may violate its terms of service. There is also the question of homeowners' privacy, but as long as addresses and photos are not included, direct exposure of individual homeowners should be minimal. Potential negative societal impacts include reduced affordability in certain areas and the encouragement of bulk investing. However, there are also societal upsides worth considering, such as price transparency and helping buyers and renters make better-informed decisions about the kind of home they buy.
16 changes: 16 additions & 0 deletions BenBrandt_DATA400_Idea.md.html
@@ -0,0 +1,16 @@
<!DOCTYPE html><html><head><meta charset="utf-8"><title>BenBrandt_DATA400_Idea.md</title><style></style></head><body id="preview">
<h1 class="code-line" data-line-start=0 data-line-end=1 ><a id="DATA400_MiniProject_Idea_1_0"></a>DATA400 Mini-Project Idea 1</h1>
<h2 class="code-line" data-line-start=1 data-line-end=2 ><a id="Ben_Brandt_1"></a>Ben Brandt</h2>
<p class="has-line-data" data-line-start="3" data-line-end="4">For my mini-project, I plan on scraping housing data from Zillow for my zip code and some neighboring zip codes in order to predict house prices.</p>
<h3 class="code-line" data-line-start=5 data-line-end=6 ><a id="Tractable_Data_5"></a>Tractable Data</h3>
<p class="has-line-data" data-line-start="6" data-line-end="7">I want the typical information that someone looking for a home might consider when deciding whether to buy a house, all of which is available directly on the Zillow website. This includes numerical values such as the number of bedrooms, number of bathrooms, square footage, and lot acreage. There is also a categorical variable I would like to include: the type of house (e.g., townhouse, single-family residence). Together, these serve as the predictor variables for the response variable, which is the price of the house.</p>
<h3 class="code-line" data-line-start=8 data-line-end=9 ><a id="Data_Retrieval_8"></a>Data Retrieval</h3>
<p class="has-line-data" data-line-start="9" data-line-end="10">I anticipate getting this data by scraping the website myself. Using a list of zip codes, I can navigate through all the available homes in each zip code and use XPath expressions to extract the fields I need. In practice, this would be very similar to the DATA200 GoodReads project.</p>
<h3 class="code-line" data-line-start=11 data-line-end=12 ><a id="Specification_of_Model_11"></a>Specification of Model</h3>
<p class="has-line-data" data-line-start="12" data-line-end="13">I think the best model for this project is a random forest. This model is versatile and usable in most situations, and in this case its feature-importance scores will highlight which variables matter most. Since random forests are relatively robust to multicollinearity, there is also less concern about predictors being correlated with one another.</p>
<h3 class="code-line" data-line-start=14 data-line-end=15 ><a id="Implications_of_Stakeholders_14"></a>Implications for Stakeholders</h3>
<p class="has-line-data" data-line-start="15" data-line-end="16">There are many stakeholders who could be affected by this project. The first and most obvious are people currently looking for a home in the Northern Virginia area who want an estimated price given their specifications. A second group is people selling their homes: they can use the model to see what their home is worth and set a price accordingly. A third is the realty companies selling these homes, which can use the model to better understand what types of homes are being sold and at what price points, and adjust their strategy accordingly.</p>
<h3 class="code-line" data-line-start=17 data-line-end=18 ><a id="Ethical_Legal_and_Societal_Implications_17"></a>Ethical, Legal, and Societal Implications</h3>
<p class="has-line-data" data-line-start="18" data-line-end="19">There are some legal and societal implications of this project, both positive and negative. In terms of legality, Zillow likely does not welcome people scraping its site, and doing so may violate its terms of service. There is also the question of homeowners' privacy, but as long as addresses and photos are not included, direct exposure of individual homeowners should be minimal. Potential negative societal impacts include reduced affordability in certain areas and the encouragement of bulk investing. However, there are also societal upsides worth considering, such as price transparency and helping buyers and renters make better-informed decisions about the kind of home they buy.</p>

</body></html>
28 changes: 28 additions & 0 deletions presentations/test.Rmd
@@ -0,0 +1,28 @@
---
title: "My First Presentation"
subtitle: "⚔<br/>with xaringan"
author: "Ben Brandt"
institute: "Dickinson College"
date: "`r Sys.Date()`"
output:
  xaringan::moon_reader:
    css: xaringan-themer.css
    lib_dir: libs
    nature:
      highlightStyle: github
      highlightLines: true
      countIncrementalSlides: false
---
```{r xaringan-themer, include=FALSE, warning=FALSE}
library(xaringanthemer)
style_mono_accent(
base_color = "#1c5253",
header_font_google = google_font("Josefin Sans"),
text_font_google = google_font("Montserrat", "300", "300i"),
code_font_google = google_font("Fira Mono")
)
```
# My First Slide in Xaringan
**test slide**
---

172 changes: 172 additions & 0 deletions presentations/test.html
@@ -0,0 +1,172 @@
<!DOCTYPE html>
<html lang="" xml:lang="">
<head>
<title>My First Presentation</title>
<meta charset="utf-8" />
<meta name="author" content="Ben Brandt" />
<script src="libs/header-attrs-2.30/header-attrs.js"></script>
<link rel="stylesheet" href="xaringan-themer.css" type="text/css" />
</head>
<body>
<textarea id="source">
class: center, middle, inverse, title-slide

.title[
# My First Presentation
]
.subtitle[
## ⚔<br/>with xaringan
]
.author[
### Ben Brandt
]
.institute[
### Dickinson College
]
.date[
### Sys.Date()
]

---


# My First Slide in Xaringan
**test slide**
---

</textarea>
<style data-target="print-only">@media screen {.remark-slide-container{display:block;}.remark-slide-scaler{box-shadow:none;}}</style>
<script src="https://remarkjs.com/downloads/remark-latest.min.js"></script>
<script>var slideshow = remark.create({
"highlightStyle": "github",
"highlightLines": true,
"countIncrementalSlides": false
});
if (window.HTMLWidgets) slideshow.on('afterShowSlide', function (slide) {
window.dispatchEvent(new Event('resize'));
});
(function(d) {
var s = d.createElement("style"), r = d.querySelector(".remark-slide-scaler");
if (!r) return;
s.type = "text/css"; s.innerHTML = "@page {size: " + r.style.width + " " + r.style.height +"; }";
d.head.appendChild(s);
})(document);

(function(d) {
var el = d.getElementsByClassName("remark-slides-area");
if (!el) return;
var slide, slides = slideshow.getSlides(), els = el[0].children;
for (var i = 1; i < slides.length; i++) {
slide = slides[i];
if (slide.properties.continued === "true" || slide.properties.count === "false") {
els[i - 1].className += ' has-continuation';
}
}
var s = d.createElement("style");
s.type = "text/css"; s.innerHTML = "@media print { .has-continuation { display: none; } }";
d.head.appendChild(s);
})(document);
// delete the temporary CSS (for displaying all slides initially) when the user
// starts to view slides
(function() {
var deleted = false;
slideshow.on('beforeShowSlide', function(slide) {
if (deleted) return;
var sheets = document.styleSheets, node;
for (var i = 0; i < sheets.length; i++) {
node = sheets[i].ownerNode;
if (node.dataset["target"] !== "print-only") continue;
node.parentNode.removeChild(node);
}
deleted = true;
});
})();
// add `data-at-shortcutkeys` attribute to <body> to resolve conflicts with JAWS
// screen reader (see PR #262)
(function(d) {
let res = {};
d.querySelectorAll('.remark-help-content table tr').forEach(tr => {
const t = tr.querySelector('td:nth-child(2)').innerText;
tr.querySelectorAll('td:first-child .key').forEach(key => {
const k = key.innerText;
if (/^[a-z]$/.test(k)) res[k] = t; // must be a single letter (key)
});
});
d.body.setAttribute('data-at-shortcutkeys', JSON.stringify(res));
})(document);
(function() {
"use strict"
// Replace <script> tags in slides area to make them executable
var scripts = document.querySelectorAll(
'.remark-slides-area .remark-slide-container script'
);
if (!scripts.length) return;
for (var i = 0; i < scripts.length; i++) {
var s = document.createElement('script');
var code = document.createTextNode(scripts[i].textContent);
s.appendChild(code);
var scriptAttrs = scripts[i].attributes;
for (var j = 0; j < scriptAttrs.length; j++) {
s.setAttribute(scriptAttrs[j].name, scriptAttrs[j].value);
}
scripts[i].parentElement.replaceChild(s, scripts[i]);
}
})();
(function() {
var links = document.getElementsByTagName('a');
for (var i = 0; i < links.length; i++) {
if (/^(https?:)?\/\//.test(links[i].getAttribute('href'))) {
links[i].target = '_blank';
}
}
})();
// adds .remark-code-has-line-highlighted class to <pre> parent elements
// of code chunks containing highlighted lines with class .remark-code-line-highlighted
(function(d) {
const hlines = d.querySelectorAll('.remark-code-line-highlighted');
const preParents = [];
const findPreParent = function(line, p = 0) {
if (p > 1) return null; // traverse up no further than grandparent
const el = line.parentElement;
return el.tagName === "PRE" ? el : findPreParent(el, ++p);
};

for (let line of hlines) {
let pre = findPreParent(line);
if (pre && !preParents.includes(pre)) preParents.push(pre);
}
preParents.forEach(p => p.classList.add("remark-code-has-line-highlighted"));
})(document);</script>

<script>
slideshow._releaseMath = function(el) {
var i, text, code, codes = el.getElementsByTagName('code');
for (i = 0; i < codes.length;) {
code = codes[i];
if (code.parentNode.tagName !== 'PRE' && code.childElementCount === 0) {
text = code.textContent;
if (/^\\\((.|\s)+\\\)$/.test(text) || /^\\\[(.|\s)+\\\]$/.test(text) ||
/^\$\$(.|\s)+\$\$$/.test(text) ||
/^\\begin\{([^}]+)\}(.|\s)+\\end\{[^}]+\}$/.test(text)) {
code.outerHTML = code.innerHTML; // remove <code></code>
continue;
}
}
i++;
}
};
slideshow._releaseMath(document);
</script>
<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
(function () {
var script = document.createElement('script');
script.type = 'text/javascript';
script.src = 'https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-MML-AM_CHTML';
if (location.protocol !== 'file:' && /^https?:/.test(script.src))
script.src = script.src.replace(/^https?:/, '');
document.getElementsByTagName('head')[0].appendChild(script);
})();
</script>
</body>
</html>