STAT 427/627 Statistical Machine Learning
Fall 2024 -- Tuesday 5:30 - 8:00 pm (section 001), Wednesday 5:30 - 8:00 pm (section 002)

Instructor: Michael Baron
Office: Meyers Building, DMTI-106D (East Campus)
Phone: (202) 885-3130
Email: baron at american.edu
Office hours: Tuesday and Wednesday 4:00-5:15 pm in DMTI 106D
Teaching Assistant: TBD
TA email: TBD
TA office hours: TBD

R corner

Download R from this site and install it on your system.
The same site also contains various R manuals and other help.

      Classroom labs

Datasets

Data sources for the final project

  • A good collection of real data sets suitable for this project is in the Machine Learning UCI Repository.
  • A diverse collection of datasets from Hawkes prepared for your projects and supplied with project ideas.
  • A huge collection of data sets is linked to this data mining metasite.
  • If you are interested, you may get tons of Government data.
  • Air and space exploration? Here are NASA databases.

    Antiracism and Social Justice

  • Detailed demographics data in the U.S.A. from the US Census Bureau
  • COVID-19 and Race and Ethnicity from the US Centers for Disease Control and Prevention
  • COVID-19 and Race and Ethnicity from the COVID Tracking Project
  • Income disparity from the US Census Bureau
  • Poverty data from the US Census Bureau
  • Health insurance coverage from the US Census Bureau
  • Household income from the US Census Bureau
  • Race and Economic Opportunity Data Tables from the US Census Bureau
  • Labor Force Statistics from the Current Population Survey from the US Bureau of Labor Statistics
  • Race and Origin of Victims and Offenders, the National Crime Victimization Survey from the US Dept of Justice Office of Justice Programs
  • Racial profiling, arrests, citations, warnings - police data from the US Data.gov
  • Unemployment, poverty, educational attainment for the U.S. States and counties from the US Dept. of Agriculture
  • Data sources for studies on racial justice and health equity from the UCLA Center for the Study of Racism, Social Justice & Health.

    COVID-19 data

  • Humanitarian Data Exchange (HDX) is a metasite that publishes and updates complete COVID-19 data from the World Health Organization, Metabiota, Global Health 50/50, Assessment Capacities Project (ACAPS), and others. Location: https://data.humdata.org/, https://data.humdata.org/event/covid-19

  • Johns Hopkins University publishes detailed, up-to-date COVID-19 data on confirmed, recovered, tested, and fatal cases by country, state, and main outbreak location on HDX and GitHub. Location: https://data.humdata.org/dataset/novel-coronavirus-2019-ncov-cases, https://github.com/CSSEGISandData/COVID-19

  • GitHub, Inc., publishes and updates databases and accompanying software packages on the COVID-19 pandemic outbreak. Location:
    https://github.com/datasets/covid-19,
    https://github.com/github/covid19-dashboard,
    https://github.com/ImperialCollegeLondon/covid19model,
    https://github.com/neherlab/covid19_scenarios,
    https://github.com/nytimes/covid-19-data
    and by country: Italy, Japan, India, etc.

  • CEBM/Oxford data

    Handouts

    Food for thought



    Questions/comments/suggestions? Write to baron@american.edu