Tuesday, December 4, 2012

Advanced Algorithms

Course Description


Algorithm design and analysis is a fundamental and important part of computer science. This course introduces students to advanced techniques for the design and analysis of algorithms, and explores a variety of applications.

The topics and applications that we plan to cover include hashing, bloom filters, scheduling, network design, online load balancing, algorithms in machine learning, boosting (in the context of learning), Markov chains and the MCMC method, byzantine agreement, internet algorithms, and nearest neighbor algorithms. Enroute, we will encounter various useful ideas, including randomization, probabilistic analysis, amortized analysis, competitive analysis, eigenvalues, linear and semi-definite programming, high dimensional geometry, and random walks.

Download Lectures in PDF 

1.  09/05 W   [PDF]   Intro, greedy algorithms: scheduling, MST. (K&T §4, §5) 
2.  09/07 F     [PDF]   Set cover, Divide & Conquer, Dynamic programming. (K&T §5, §6, §11.3) 
3.  09/10 M   [PDF]   Dynamic programming. (K&T §6) 
4.  09/12 W   [PDF]   Network flow. (K&T §7) 
5.  09/14 F     [PDF]   Network flow applications, matchings. (K&T §7) 
6.  09/17 M   [PDF]   Randomized algorithms, Karger's min-cut algorithm. (K&T §13) 
         Here are some lecture notes by Avrim Blum on how to speed-up Karger's algorithm. 
7.  09/19 W   [PDF]   Randomized load balancing and hashing. (K&T §13.10, §13.6, M&R §8.4, §8.5) 
8.  09/21 F     [PDF]   Bloom filters, NP-completeness. (M&R §8.4, §8.5, M&U §5.5) 
         See also, this survey on the applications of bloom filters by Broder & Mitzenmacher. 
9.  10/01 M   [PDF]   NP-completeness contd., Approximation algorithms. (K&T §8, Vaz. §1) 
10.         10/03 W   [PDF]   Approximation via local search. 
11.         10/08 M   [PDF]   Linear programming, LP rounding. (Vaz. §14) 
12.         10/10 W   [PDF]   Randomized rounding, concentration bounds. (M&R §3.2, §4.1, §4.2) 
13.         10/15 M   [PDF]   Randomized rounding (contd.), LP duality. (M&R §4.2, Vaz. §12) 
14.         10/17 W   [PDF]   LP duality, Primal-dual algorithms. (Vaz. §12, 15) 
15.         10/19 F     [PDF]   Primal-dual algorithms. (Vaz. §15) 
16.         10/22 M   [PDF]   Semi-definite Programming. (Vaz. §26) 
17.         10/24 W   [PDF]   SDP (contd.), Streaming algorithms. 
18.         10/26 F     [PDF]   Streaming algorithms (contd.).          See this survey by Muthu Muthukrishnan for some motivation behind, and math used in, streaming algorithms. 
19.         10/29 M   [PDF]   Online algorithms & competitive analysis. (B&E-Y §1) 
         Here is a nice presentation by Pat Riley & Elly Winner about different approaches to evaluating online algorithms. 
20.         10/31 W   [PDF]   Caching/Paging, k-server problem. (B&E-Y §3, §4, §10) 
21.         11/05 M   [PDF]   Caching lower bound based on Yao's principle, Work function algorithm. (B&E-Y §8.4, §10, §12) 
         For a complete analysis of the work function and other k-server algorithms, see these detailed lecture notes (lectures 5-9) by Yair Bartal. 
22.         11/07 W   [PDF]   Work function (contd.), Online learning: regret minimization & the weighted majority algorithm. 
23.         11/12 M   [PDF]   Mistake bound model, winnow & perceptron algorithms. 
24.         11/14 W   [PDF]   MB model contd., PAC model. (K&V §1, §2) 
25.         11/16 F     [PDF]   PAC model, Occam's razor. (K&V §1, §2) 
26.         11/19 M   [PDF]   Boosting in the PAC framework. (K&V §4) 
27.         11/26 M   [PDF]   Random Walks & Markov chains. Cover time, hitting time. (M&R §6) 
28.         12/07 F     [PDF]   Random Walks & Markov chains: the resistance method, and mixing time. (M&R §6) 

Thursday, March 22, 2012

Scientific Programming

Objectives: This course provide a general overview of programming concepts, and specific introduction to Matlab, IDL(Interactive Data Language) and FORTRAN Language.  

I. INTRODUCTORY CONCEPTS 

 Week 1
Introduction: Unix background/overview, basic data structures.
PPT 1 | PDF
PPT 2 | PDF 

Week 2
Working in a Unix(-like) environment: Unix terminology, essential commands, shells(what they are/types/scripting),text editor(s).
PPT 2 | PDF 

Week 3
Network related topics: tunneling, vpn, ssh, ftp, scp, creating/maintaining a simple webpage (html), brief look at data format types.
The first part of class I'll cover the Unix/Linux commands left from last lecture.
The second part will be a demo illustrating many of these commands to transfer and work with data over a network.
Part of the process involves a nice opportunity to use a shell script.

PPT | PDF 

Week 4
Introduction to computer science: discrete logic, problem solving methodology, algorithm design, program structure, good programming practices, development/debugging tips.
PPT 

II. FORTRAN 90/95

Week 5

Introduction: Creating a basic program in Fortran.
Key | PDF | PPT
Example code
ILikePie.f90ClimateModel.f90TestingInts.f90Statistics.f90
Homework #1 PDF

Week 6
Step 2: Control Structures and Multi-Dimensional Arrays.
Key | PDF | PPT
Example code
OscarsIF.f90OscarsCase.f90OscarCount.f90MultiArray.f90OscarCount2.f90OscarCount3.f90
Homework #2 PDF

Week 7

Week 8

III. IDL: INTERACTIVE DATA LANGUAGE

Week 9
Week 9
Introduction to IDL.
Key | PDF | PPT
Example code

Week 11
Contour Mapping.
Key | PDF | PPT
Example code
cbar.procindex.procontour1.procontour2.procontour3.pro
legend.promake_postscript.proplotting_colors.prowrapping.pro
Data sets
data.txt is a 41 x 41 block of data representing "elevation" the data set covers the area 24.0 N to 48 N and -122.0 W to -72 W with 0.6 degrees between each grid point in the meridional direction and 0.75 degrees between each grid poin in the zonal direction.
This dataset was obtained from D. Fanning

tas_gfdl.nc is a global surface temperature dataset from the Climate of the 20th Century simulations produced with the GFDL CM 2.0 GCM. This dataset was obtained from the PCMDI website. 

Week 12
Optimization and Statistical Techniques.
Key | PDF | PPT
Example code
cindex.procolors_anomalies.proeof_example.proglobal_temp.propower_spec_example.pro
read_had_global.proread_sun_spot.prorm_seasonal_cycle.prormv_season.pro
temp_trend.pro temp_vs_sunspot.pro
Data sets
GP_precip.txt is timeseires of monthly mean Great Plains Preciptation
Extending from 1901-2002. This data set was derived from the CRU TS 2.1 preciptation data set.
hadcrut3gl.txt is a global combined land and  marine [sea surface temperature (SST) anomalies from HadSST2, see Rayner et al., 2006] temperature anomalies on a 5 x 5 degree grid-box basis.
anomalies are taken relative to the 1961-1990 base. The timeseires extends from
1850-present. For each year, the first row represents the monthly mean temperature anomalies and the second row represents the percent of the globe with data.
nino3.4_index.txt is a timeseires of the Nino 3.4 index. The timeseires extends from 1870-2002
sunspot_activity is a timeseries of the monthly mean international sunspot number from
http://www.ngdc.noaa.gov/stp/SOLAR/ftpsunspotnumber.html  The timeseries extends from 1850-present.
zg_ukmo.nc is a global geopotential height dataset from the Climate of the 20th Century simulations produced with the HadCM3 GCM. This dataset was obtained from the PCMDI website.

IV. MATLAB: MATRIX LABORATORY

Week 13
Introduction to MATLAB.
PDF | PPT

Week 14
Week 15
Week 16
MATLAB Extras.
PDF | PPT
Example code
AIRC_map_all.mAIRC_map_all.matAIRC_map_all.tif