Fall 2020

Agenda

  • Introduction
    • Syllabus
    • Assignments
      • Homework
      • Labs
      • Data Project
      • Final exam
      • Meetup Presentation
    • The DATA606 R Package
    • Using R Markdown

Introduction

A little about me:

  • Assistant Professor at CUNY in Data Science and Information Systems
  • Principal Investigator for a Department of Education Grant (part of their FIPSE First in the World program) to develop a Diagnostic Assessment and Achievement of College Skills (www.DAACS.net)
  • Authored over a dozen R packages including:
  • Specialize in propensity score methods. Three new methods/R packages developed include:

Also a Father…

Runner…

And photographer.

Syllabus

Syllabus and course materials are here: https://fall2020.data606.net

The site is built using the Blogdown R package and hosted on GitHub. Each page of the site has an “Improve this page” link at the bottom right. Use that link to start a pull request on GitHub.

We will use Blackboard primarily for submitting assignments. Please submit:

  • A PDF or link to the built HTML (e.g. RPubs, GitHub)

PDFs are preferred for the homework as there is some LaTeX formatting in the R Markdown files. The tinytex R package helps with installing LaTeX, but you can also install LaTeX using MiKTeX (for Windows) or BasicTeX (for Mac). See this page for more information: https://fall2020.data606.net/course-overview/software/
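As a rough sketch of the tinytex route, the commands below install the R package from CRAN and then download the TinyTeX LaTeX distribution. This is one common way to set things up, not necessarily the exact steps on the course software page, so treat it as a starting point.

```r
# Install the tinytex R package from CRAN if it is missing
if (!requireNamespace('tinytex', quietly = TRUE)) {
  install.packages('tinytex')
}

# Download and install the TinyTeX LaTeX distribution.
# This needs an internet connection and only has to be run once per machine.
tinytex::install_tinytex()
```

After this finishes, restart RStudio so that knitting to PDF can find the new LaTeX installation.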

Meetups

We will have meetups on Wednesday evenings at 8:30pm.

Meetups will be recorded and made available the next day on the course website.

Though attendance is not strictly required, I expect everyone to watch the recordings during the week. I use the meetups to convey important information and announcements. Students who attend the meetups tend to do well on the assignments.

Please note: Students who participate in this class with their camera on or use a profile image are agreeing to have their video or image recorded solely for the purpose of creating a record for students enrolled in the class to refer to, including those enrolled students who are unable to attend live. If you are unwilling to consent to have your profile or video image recorded, be sure to keep your camera off and do not use a profile image. Likewise, students who un-mute during class and participate orally are agreeing to have their voices recorded. If you are not willing to consent to have your voice recorded during class, you will need to keep your mute button activated and communicate exclusively using the “chat” feature, which allows students to type questions and comments live.

Start | End | Topic
Wednesday, August 26, 2020 | Sunday, August 30, 2020 | Chapter 1 - Intro to Data
Monday, August 31, 2020 | Sunday, September 06, 2020 | Chapter 2 - Summarizing Data
Monday, September 07, 2020 | Sunday, September 13, 2020 | Chapter 3 - Probability
Monday, September 14, 2020 | Sunday, September 27, 2020 | Chapter 4 - Distributions
Monday, September 28, 2020 | Sunday, October 04, 2020 | Chapter 5 - Foundation for Inference
Monday, October 05, 2020 | Sunday, October 11, 2020 | Chapter 6 - Inference for Categorical Data
Monday, October 12, 2020 | Sunday, October 18, 2020 | Chapter 7 - Inference for Numerical Data
Monday, October 19, 2020 | Sunday, November 01, 2020 | Chapter 8 - Linear Regression
Monday, November 02, 2020 | Sunday, November 29, 2020 | Chapter 9 - Multiple and Logistic Regression
Monday, November 30, 2020 | Sunday, December 06, 2020 | Intro to Bayesian Analysis
Wednesday, December 09, 2020 | Sunday, December 13, 2020 | Final Exam

Assignments

  • DAACS (5%)
  • Homework (20%)
  • Labs (40%)
    • Labs are designed to introduce you to doing statistics with R.
    • Answer the questions in the main text as well as the “On Your Own” section.
  • Data Project (15%)
    • This allows you to analyze a dataset of your choosing. Projects will be shared with the class. This provides an opportunity for everyone to see different approaches to analyzing different datasets.
    • Proposal is due October 25th (5%); Final project is due December 9th (15%).
  • Final exam (15%)
  • Meetup Presentation (5%)
    • Present one practice problem during our weekly meetups. Sign up using the Google Spreadsheet.
    • Please select odd-numbered questions only!

Communication

The DATA606 R Package

The package can be installed from GitHub using the devtools package.

devtools::install_github('jbryer/DATA606')

Download the Setup.R script here: https://github.com/jbryer/DATA606Fall2020/blob/master/R/Setup.R

Important Functions

  • library('DATA606') - Load the package
  • vignette(package='DATA606') - Lists vignettes in the DATA606 package
  • vignette('os4') - Loads a PDF of the OpenIntro Statistics book
  • data(package='DATA606') - Lists data available in the package
  • getLabs() - Returns a list of the available labs
  • viewLab('Lab1') - Opens Lab1 in the default web browser
  • startLab('Lab1') - Starts Lab1 (copies to getwd()), opens the Rmd file
  • shiny_demo() - Lists available Shiny apps
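To show how the functions listed above might fit together, here is a minimal sketch of a first session. It assumes the DATA606 package is already installed and that R is running interactively (for example, in RStudio), since viewLab and startLab open a browser and an editor. The has_data606 variable is only a helper for this example.

```r
# Check whether the course package is installed (helper for this example only)
has_data606 <- requireNamespace('DATA606', quietly = TRUE)

if (has_data606 && interactive()) {
  library('DATA606')   # load the package
  print(getLabs())     # list the available labs
  viewLab('Lab1')      # preview Lab1 in the default web browser
  startLab('Lab1')     # copy Lab1 to getwd() and open the Rmd file
} else {
  message('Install DATA606 and run this in an interactive R session.')
}
```

Because startLab copies files into the current working directory, it may be worth setting the working directory to your course folder first.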

Using R Markdown

R Markdown files are provided for all the labs and homework.

  • You can download R Markdown template files for the homework by right-clicking and selecting “Save file as…” from the Homework page.
  • You can start a lab using the DATA606::startLab function.

You can also create new R Markdown files in RStudio by clicking File > New File > R Markdown.

For more information about R Markdown, check out the RStudio page at https://rmarkdown.rstudio.com/
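For anyone curious what an R Markdown file looks like outside the RStudio menus, the sketch below writes a small file and knits it from the console with rmarkdown::render. The file name and contents are made up for illustration, and HTML output is used only because it does not need LaTeX. Swapping in pdf_document should work once LaTeX is installed.

```r
# Three backticks, built with strrep() so this example is easy to copy and paste
fence <- strrep('`', 3)

# Contents of a small, hypothetical R Markdown file:
# a YAML header, one line of text, and one R code chunk
rmd_lines <- c(
  '---',
  'title: "Example"',
  'output: html_document',
  '---',
  '',
  'Some text with *Markdown* formatting.',
  '',
  paste0(fence, '{r}'),
  'summary(cars)',
  fence
)

# Save the file in a temporary directory
rmd_file <- file.path(tempdir(), 'example.Rmd')
writeLines(rmd_lines, rmd_file)

# Knit the file, but only if rmarkdown and pandoc are available
if (requireNamespace('rmarkdown', quietly = TRUE) && rmarkdown::pandoc_available()) {
  out_file <- rmarkdown::render(rmd_file, quiet = TRUE)
}
```

Clicking the Knit button in RStudio does essentially the same thing for the file that is open in the editor.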