Coding Project: Interprocedural Dataflow

Due: December 18th, 4:00 PM

For this assignment, you will extend the dataflow code from your previous work. You will extend the analysis such that it can do a basic interprocedural dataflow analysis over the ICFG, rather than over the CFG. The analysis target will be the same as the prior projects, detailed here.

Note on Academic Integrity / Collaboration

The code should be substantially your own. Although you may use auxiliary libraries, the fundamental tasks described must be performed by code you wrote for this assignment. You may work with 1 partner on this assignment. For credit on the assignment, each submission should be no more than incidentally similar in design to any other and each member of a submitting team must be able to explain their solution in detail. Each member of the team should be indicated in a file called COLLAB.txt included in the submitted tarball, which should also disclose any references or third-party code used as part of completing the assignment.

Analysis Details

Your analysis should consist of a flow-sensitive, context-insensitive, interprocedural dataflow analysis. The analysis should be a complete leak detector (i.e. no false negatives). The analysis is very coarse: it will simply build the gmod/gref sets for any global variables, and consider (only) those variables to hold secret data on a function call. An input with a leak detected should cause the program to print the single line leak

SOURCE and SINK

The analysis will still consider the only source of secret data to be an externally-defined special function named SOURCE and the only sink of data to be an externally-defined special function named SINK (these functions have no other behavior). These two functions will be the only externally-defined functions, all others will be declared in the input file.

Consider the following examples:

Example 1

@a = global i32 0

define void @foo() {
  store i32 1, ptr @a, align 4
  ret void
}

define dso_local i32 @main() {
  call void @foo()
  %1 = load i32, ptr @a, align 4
  call void @SINK(i32 %1)
  ret i32 0
}

declare i32 @SOURCE(...)
declare void @SINK(i32)
This code should print out FLOW, because the analysis would detect that a is in the gmod set of foo, and a flows to SINK in main. Note that this is a false positive, because although a is modified, it does not contain SOURCE data - that's ok.

Example 2

@a = global i32 0

define void @foo() {
  ret void
}

define dso_local i32 @main() {
  call void @foo()
  %1 = load i32, ptr @a, align 4
  call void @SINK(i32 %1)
  ret i32 0
}

declare i32 @SOURCE(...)
declare void @SINK(i32)
This code should print out NO FLOW, because the analysis would detect that although there is a call to foo, the only value that flows to SINK is a, which is not in the gmod set of foo.

Example 3

@b = global i32 0, align 4
@a = global i32 0, align 4

define i32 @main() {
  %1 = call i32 (...) @SOURCE()
  store i32 %1, ptr @b, align 4
  %2 = load i32, ptr @a, align 4
  call void @SINK(i32 noundef %2)
  ret i32 0
}

declare i32 @SOURCE(...)
declare void @SINK(i32 noundef)

This code should print out NO FLOW, because although there is a call to SOURCE, it only puts secret data in b, and it the variable a that flows to SINK, not b.

Your Submission

You should submit a gzipped tarball named P4.tgz containing a single directory named P4. Within P4 you should include whatever code you need to complete the project as well as your COLLAB.txt.

For the purpose of grading, your code should be able to be tested with the following sequence of commands on the cycle servers:

tar -xf P4.tgz
cd P4
make
./dataflow test1.ll
where test1.ll is a file containing code in our analysis target language again, referenced here.

Command-line compliance

As described above, you may use any coding language of your choice to complete this assignment. However, the above sequence of commands must work without error for full credit. For example, if you are using python, you might include a Makefile that immediately exits 0 (since there is no code to build). Furthermore, you should either include a helper script named graph that launches the python interpreter with its argument.