SIO 221B, Data Analysis
Professor Sarah Gille
Homework #5, problem 3
I found a data set of pressure at Darwin, Australia, from 1980 to 2002. Darwin pressure is
frequently used in the Southern Oscillation Index, which tracks the El Niño cycle. The data
clearly have an annual cycle and appear to have some slower oscillation as well. First I
removed the mean and the annual cycle by fitting a best-fit sinusoid with a one-year period to the data.
For my model, with t in years, I used:
M1: average (constant term)   9.8830
M2: cos(2πt)                 -3.3265
M3: sin(2πt)                 -0.5529
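The three-parameter fit above can be sketched as an ordinary least-squares problem. This is a minimal sketch, not the code used for the homework: the function name `fit_annual_cycle`, the monthly sampling, and the noise level are assumptions, and the data are synthetic stand-ins built from the coefficients listed above rather than the real Darwin record.

```python
import numpy as np

def fit_annual_cycle(t_years, y):
    """Least-squares fit of y ~ m1 + m2*cos(2*pi*t) + m3*sin(2*pi*t), t in years."""
    A = np.column_stack([
        np.ones_like(t_years),          # M1: constant (average)
        np.cos(2 * np.pi * t_years),    # M2: annual cosine
        np.sin(2 * np.pi * t_years),    # M3: annual sine
    ])
    m, *_ = np.linalg.lstsq(A, y, rcond=None)
    residual = y - A @ m                # what is left after removing the fit
    return m, residual

# Synthetic stand-in data (NOT the Darwin record): assumed monthly samples, 23 years.
rng = np.random.default_rng(0)
t = np.arange(23 * 12) / 12.0
y = (9.8830
     - 3.3265 * np.cos(2 * np.pi * t)
     - 0.5529 * np.sin(2 * np.pi * t)
     + 0.5 * rng.standard_normal(t.size))   # assumed noise level

m, residual = fit_annual_cycle(t, y)
```

The returned `residual` is the series that the later, longer-period model would be fitted to.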
The fit (figure 2) appears very good! However, the L2 norm of the model misfit is
colossal: 1,685. (what are units here?)
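On the units question: the L2 norm of the residual carries the same units as the pressure data, while the sum of squared residuals carries those units squared. Both grow with the number of samples, so a large value does not by itself mean a poor fit. The write-up does not say which of the two the 1,685 is. A minimal sketch of the three common misfit measures follows; the helper name `misfit_summary` and the toy residual are illustrative assumptions.

```python
import numpy as np

def misfit_summary(residual):
    """Return (L2 norm, sum of squares, RMS) of a residual vector."""
    residual = np.asarray(residual, dtype=float)
    l2 = np.linalg.norm(residual)        # sqrt(sum r^2): data units, grows with N
    sum_sq = np.sum(residual ** 2)       # data units squared, grows with N
    rms = l2 / np.sqrt(residual.size)    # data units, comparable across record lengths
    return l2, sum_sq, rms

# Toy residual, chosen only so the arithmetic is easy to follow.
l2, sum_sq, rms = misfit_summary([3.0, -4.0, 0.0, 0.0])
```

Dividing by the square root of the number of samples gives an RMS misfit that can be compared directly with the fitted annual amplitude.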
Then I wanted to find the period of the slower oscillation. The remaining signal
did not offer many clues, so I built a large model using sine and cosine terms with periods
of 2 to 20 years, in one-year steps.
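A model of this kind can be sketched by stacking one cosine and one sine column per period into a design matrix. This is a minimal sketch under assumptions: the function name `harmonic_design_matrix`, the monthly sampling, and the synthetic test signal are not from the homework. Periods that are a large fraction of a record of roughly 22 years give nearly collinear columns, so it may be worth checking the condition number before interpreting individual coefficients.

```python
import numpy as np

def harmonic_design_matrix(t_years, periods_years):
    """Columns: cos(2*pi*t/P) for every period P, then sin(2*pi*t/P) for every P."""
    cos_cols = [np.cos(2 * np.pi * t_years / p) for p in periods_years]
    sin_cols = [np.sin(2 * np.pi * t_years / p) for p in periods_years]
    return np.column_stack(cos_cols + sin_cols)

periods = np.arange(2, 21)                 # 2, 3, ..., 20 years (19 periods)
t = np.arange(23 * 12) / 12.0              # assumed monthly samples, 23 years
A = harmonic_design_matrix(t, periods)     # 38 columns: 19 cosine + 19 sine

# Synthetic residual (NOT the Darwin data): a pure 5-year cosine.
y = 0.5 * np.cos(2 * np.pi * t / 5.0)
m, *_ = np.linalg.lstsq(A, y, rcond=None)
fit = A @ m

# A large condition number signals that coefficients can trade off against
# each other even when the fitted curve itself is well determined.
cond = np.linalg.cond(A)
```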
The model parameters for the cosine terms are:
Period (yr)   M
2.0000 0.0671
3.0000 -0.0967
4.0000 -0.0156
5.0000 -0.1597
6.0000 0.0193
7.0000 0.0441
8.0000 0.0370
9.0000 -0.0730
10.0000 0.0207
11.0000 -0.0101
12.0000 -1.4716
13.0000 -6.1314
14.0000 -15.0349
15.0000 -35.8864
16.0000 95.2218
17.0000 -112.3644
18.0000 -104.2515
19.0000 -51.9291
20.0000 5.6438
The model parameters for the sine terms are:
Period (yr)   M
2.0000 -0.0967
3.0000 0.0622
4.0000 0.0250
5.0000 -0.0167
6.0000 -0.0806
7.0000 0.1052
8.0000 0.0606
9.0000 -0.0985
10.0000 0.1068
11.0000 -0.2496
12.0000 0.0171
13.0000 -1.6673
14.0000 14.6509
15.0000 -39.2123
16.0000 27.8723
17.0000 68.7997
18.0000 -52.8989
19.0000 34.2019
20.0000 -13.8460
The largest coefficients are at periods of 16 to 18 years. The fit of the
model clearly doesn't represent the large fluctuations in pressure difference, and the
residual doesn't appear to have much periodicity remaining.
The L2 norm of the misfit for this second model is 1,510, about the same as for the first model (1,685).
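As a check on the 16 to 18 year statement, the cosine and sine coefficients at each period can be combined into a single amplitude, sqrt(cos^2 + sin^2). The sketch below uses the coefficients exactly as tabulated above; the variable names are illustrative.

```python
import numpy as np

periods = np.arange(2, 21)  # years, matching the tables above

# Cosine coefficients, periods 2..20 years (copied from the table).
cos_m = np.array([
    0.0671, -0.0967, -0.0156, -0.1597, 0.0193, 0.0441, 0.0370, -0.0730,
    0.0207, -0.0101, -1.4716, -6.1314, -15.0349, -35.8864, 95.2218,
    -112.3644, -104.2515, -51.9291, 5.6438,
])

# Sine coefficients, periods 2..20 years (copied from the table).
sin_m = np.array([
    -0.0967, 0.0622, 0.0250, -0.0167, -0.0806, 0.1052, 0.0606, -0.0985,
    0.1068, -0.2496, 0.0171, -1.6673, 14.6509, -39.2123, 27.8723,
    68.7997, -52.8989, 34.2019, -13.8460,
])

amplitude = np.hypot(cos_m, sin_m)                   # amplitude per period
peak_period = int(periods[np.argmax(amplitude)])     # period with largest amplitude
top_three = sorted(int(p) for p in periods[np.argsort(amplitude)[-3:]])
```

With the tabulated values, the largest amplitude falls at 17 years, followed by 18 and 16 years. These amplitudes are far larger than the annual amplitude from the first model, which suggests the long-period terms largely cancel one another rather than each representing a real oscillation.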