INTA 6450: Data Analytics and Security

Instructional Team

Jeff Borowitz
Instructor
Emma Shumway
Head TA

Overview

This course explores the foundations of big data and data analytics, including their roots in computing technology and statistics. It examines the underlying technical challenges and statistical assumptions used to understand relationships in a variety of applied fields, with a focus on fraud detection and communication monitoring. It also engages with the social implications of the increased knowledge, surveillance, and behavioral prediction made possible by big data, and with the ethical tradeoffs involved. While the course includes an analytics project, no prior technical experience is required.

[Update: New Data Analytics Project on Industry Data]

The course has a data partnership with the mobile app security company NowSecure. This partnership, which will require students to sign an NDA, allows students to develop and hone their data analytics skills on real world cybersecurity data.

NowSecure scans all Android and iOS apps available on public app stores by performing automated static, dynamic, interactive, and API security testing, combining code analysis with dynamic tooling. The dynamic tools involve running each app through a series of automated actions on actual phone hardware and recording detailed logs and results across a variety of components.

INTA 6450 students can choose to complete a course project using data analytics to find new security vulnerabilities and/or patterns in vulnerabilities across real apps. Students will gain detailed, hands-on experience with a fast-growing part of cybersecurity: mobile app security. They can potentially find real and important security vulnerabilities affecting the apps installed on millions of devices.

Students choosing to work on these projects will have the opportunity to ask questions and get feedback from NowSecure professionals on everything from data definitions to issues, challenges, and extensions related to their project.

This course is not foundational and does not count toward any specializations at present, but it can be counted as a free elective.

Course Goals

Once completed, you should have the following capabilities:

  1. Familiarity and exposure
    • Demonstrate familiarity with hardware trends underlying the rise of big data
    • Demonstrate familiarity with software trends underlying the rise of big data
    • List specific links between big data technologies that affect our security as a society
  2. Reasoning about computing technology and about models
    • Articulate a strategy for defining and algorithmically finding a specific type of wrongdoing
    • Identify problems that technologies will likely solve in the future
    • Identify problems that technologies likely can't solve in the future
  3. Technical execution of code
    • Use R programming language to perform statistical analysis
    • Use Python to find the most common words in a book
    • Use Python to query an email database
    • Use Python to algorithmically identify emails of interest
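
To give a sense of the scale of the Python exercises named above, here is a minimal sketch of finding the most common words in a text. It is an illustration only, not the course's actual assignment: the function name `most_common_words`, the tokenizing rule, and the sample sentence are all assumptions made for this example.

```python
import re
from collections import Counter


def most_common_words(text, n=5):
    """Return the n most frequent words in text as (word, count) pairs."""
    # Assumption: a "word" is a run of lowercase letters or apostrophes
    # after lowercasing; a real exercise may tokenize differently.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)


# Hypothetical sample text standing in for the contents of a book.
sample = "The cat and the dog and the bird"
print(most_common_words(sample, n=2))
```

For a full book, the same function could be applied to the text read from a file.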

Sample Syllabi

Summer 2024 syllabus (PDF)
Spring 2024 syllabus (PDF)
Spring 2022 syllabus (PDF)

Note: Sample syllabi are provided for informational purposes only. For the most up-to-date information, consult the official course documentation.

Course Videos

You can view the lecture videos for this course here.

Before Taking This Class...

Suggested Background Knowledge

This course requires no specific background knowledge. Many students with no programming background and no background in statistics have found this class approachable and learned a lot. At the same time, upper level PhD students in economics and undergrad/masters students in a range of subjects at Georgia Tech who have modest to considerable computing skills have also benefited from the wide-ranging survey of data analytic methods, the open ended project work, and the discussions around how data analytics fit into problems related to security and society.

Technical Requirements and Software

We do some exercises in R and Python. We host Jupyter notebooks for the students for Python, and we’ve had students run their own R/RStudio installations. The course project is to find “wrongdoing” among the 250k Enron emails.
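
As a rough illustration of what algorithmically flagging emails can look like, the sketch below marks messages whose body contains any term from a keyword list. Everything here is hypothetical: the `flag_emails` helper, the dictionary layout with `id` and `body` fields, the keywords, and the sample messages are invented for this example and do not reflect the course's data format or its recommended method.

```python
def flag_emails(emails, keywords):
    """Return (email id, matched keywords) for emails containing any keyword."""
    flagged = []
    for email in emails:
        body = email["body"].lower()
        # Simple substring match; assumes keywords are given in lowercase.
        hits = [k for k in keywords if k in body]
        if hits:
            flagged.append((email["id"], hits))
    return flagged


# Hypothetical messages, not real Enron data.
emails = [
    {"id": 1, "body": "Lunch on Friday?"},
    {"id": 2, "body": "Please shred the documents before the audit."},
]
print(flag_emails(emails, ["shred", "audit"]))
```

Keyword matching is only a starting point. It produces false positives and misses anything phrased in unexpected terms, which is part of why defining "wrongdoing" precisely matters before searching for it.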

Academic Integrity

All Georgia Tech students are expected to uphold the Georgia Tech Academic Honor Code. This course may impose additional academic integrity stipulations; consult the official course documentation for more information.