
Dealing with Data I: A ``simple'' linear fit

The first true test of any scientific theory is whether or not people can use it to make accurate predictions. Calculus, being the study of quantities that change, provides the language and the mathematical tools to discuss and understand change in a precise, quantitative way. An important prerequisite to using calculus to analyze ``real-world'' situations is having a good understanding of the basic ``elementary'' functions (polynomials, logarithms, trigonometric functions, and all their compositions, inverses, etc.).

With an understanding of the calculus of the basic functions, it is often possible to formulate mathematical models of (idealized versions of) phenomena in one of two ways: First, enough might be understood about the phenomenon so that a mathematical formulation of it is directly attainable. For example, Newton's second law of motion, that force is the derivative of momentum, where momentum is the product of mass and velocity, is such a model. At the other extreme are models which are derived purely empirically -- a large quantity of data is collected and one searches for the appropriate formula to match the data with reasonable accuracy. Many economic models are derived in this manner.

More often, however, mathematical models are developed with a combination of the two approaches: one has some basic understanding of a phenomenon, enough to restrict the class of functions appropriate to model it. Very often, one knows enough that the functions are determined except for a few parameters, such as the coefficients of a polynomial or some other kind of multiplicative factor. Then, experimental data are used to determine the values of the missing parameters. Many of the ``constants'', ``coefficients'' and ``numbers'' one encounters in science (e.g., rate constants of chemical reactions, half-lives of radioactive elements, coefficients of thermal conductivity, the gravitational constant, etc.) started out as the last unknown parameters in a mathematical model, which had to be determined by collecting experimental data.

In several of the problems in this course, you will renew your acquaintance with some of the basic functions of mathematics and science, learn some basic data analysis techniques, apply them to discover some basic physical laws, and then make some predictions of your own.

Linear fits: In many situations, researchers want to understand how some quantity will change when another quantity is varied. A simple example of this might be the following sports-physics experiment: A basketball is dropped from different heights, and the height of the (first) bounce is measured. What is the relationship between the height of the drop and the height of the bounce?

Dr. DeTurck collected the following data in his garage:
Height of Drop (in.)   Height of Bounce (in.)
36                     25
40                     29
40                     28.5
44                     31.5
44                     32
48                     35
52                     38
56                     42
60                     46

To get ready for our subsequent analysis, we use Maple to make a list of the drop heights and the corresponding bounce heights:




#Make an ordered list of data points
#
drop:=[36,40,40,44,44,48,52,56,60]:
bounce:=[25,29,28.5,31.5,32,35,38,42,46]:

The square brackets indicate to Maple that the set of numbers is an ordered list. The two statements end with colons, rather than semicolons, so that there will be no output from them (because in this case, Maple would just parrot back the input).
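
As a quick check that the lists were entered correctly, the following sketch uses two built-in list operations: nops counts the entries of a list, and square brackets after a list name select a single entry. These commands end with semicolons, so Maple prints their results.

nops(drop);
drop[1], bounce[1];

The first command should return 9, and the second should return the pair 36, 25 (the first drop height and its bounce height).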

It will be helpful to have Maple make what statisticians call a ``scatter plot'' of the data points. To plot points from a list, Maple expects a single ordered list containing the x-coordinate of the first point followed by the y-coordinate of the first point, followed by the x-coordinate of the second point, etc. Because we will need to do this often, we will define a Maple procedure that takes an ordered list of x values and another ordered list of y values and merges them into one list to which Maple's plot routine can be applied. Here is the definition:


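#Interleave two lists of equal length into [x1,y1,x2,y2,...]
lmerge:=proc(x,y) local i,b;
b:=NULL;   #start with an empty sequence
for i from 1 to nops(x) do b:=b,x[i],y[i]; od;   #append each pair
[b]   #return the sequence as a list
end: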

If you type this in yourself (it is also available via the net, as described above), be sure on Windows machines to use "Shift-Enter" between the lines of the program, so that Maple doesn't try to execute what you've typed so far before you are ready.

To test your typing, see if you get the same result:


lmerge(drop,bounce);

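The result should be the list [36, 25, 40, 29, 40, 28.5, 44, 31.5, 44, 32, 48, 35, 52, 38, 56, 42, 60, 46], which has each drop height and the corresponding bounce height right next to each other, for use with plot. So try plotting the data (``style=POINT'' keeps Maple from connecting the dots, and ``symbol=cross'' draws each point as a cross):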


plot(lmerge(drop,bounce), style=POINT, symbol=cross);

The data looks pretty linear, but how do we find the line that ``best'' describes it? There are several different definitions of ``best'' in use. We will be using the so-called ``least-squares'' fit, discussed in class. For our drop-bounce data, the least-squares line is obtained as follows. NOTE: Ignore the following two lines if you downloaded the leastsquares program.


#Ignore the following 2 lines if you downloaded ``leastsquares''.
with(stats,fit);
fit[leastsquare[[x,y],y=a*x+b]]([drop,bounce]);

If using Dr. DeTurck's program, type the following instead:

leastsquares(drop,bounce);

For either method, your result should be

y = .8502604294 x - 5.567708941
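
The slope and intercept of the fit can be cross-checked by hand. The sketch below assumes the usual definition of the least-squares line, the one that minimizes the sum of the squared vertical distances between the data points and the line (the version discussed in class may be phrased differently). Under that assumption the slope and intercept have closed forms in terms of sums over the data. The names n, sx, sy, sxx, sxy, slopecheck and intcheck are introduced here only for illustration.

#Accumulate the sums needed for the closed-form least-squares line
n := nops(drop):
sx := 0: sy := 0: sxx := 0: sxy := 0:
for i from 1 to n do
sx := sx + drop[i];
sy := sy + bounce[i];
sxx := sxx + drop[i]^2;
sxy := sxy + drop[i]*bounce[i];
od:
#Slope and intercept of the least-squares line
slopecheck := (n*sxy - sx*sy)/(n*sxx - sx^2);
intcheck := (sy - slopecheck*sx)/n;

Both values should agree closely with the coefficients of the fitted line (a slope near .85026 and an intercept near -5.5677); small differences in the last digits come from rounding.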

Now we can plot the data and the line to see how well Maple did with fitting the data. First, let's assign the right-hand side of the equation Maple returned to a name.

fitlin := .8502604294*x - 5.567708941;

Next we build the plot of the line and the plot of the data and store each in a variable (what is actually saved is the set of commands that tell Maple how to make the plot).

fitplot := plot(fitlin,x=35..60):
pointplot := plot(lmerge(drop,bounce), style=POINT, symbol=cross):

Now we can display the plots we've saved. Before we do that, we have to have Maple load the display command.

with(plots,display):
display({fitplot,pointplot});
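
One way to judge the fit numerically is to look at the residuals, taken here to mean each measured bounce height minus the height the line predicts for the same drop. The following is a minimal sketch, assuming that fitlin, drop, bounce and lmerge are defined as above and that x has not been assigned a value. The name resid is introduced only for this illustration.

#Build the list of residuals: measured bounce minus predicted bounce
resid := NULL:
for i from 1 to nops(drop) do
resid := resid, bounce[i] - subs(x=drop[i],fitlin);
od:
resid := [resid];
#Plot the residuals against the drop heights
plot(lmerge(drop,resid), style=POINT, symbol=cross);

If a line is a reasonable model, the residuals should be small and scattered on both sides of zero with no obvious pattern.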

Problem 1: The following table (from the World Almanac) gives the winning heights (in inches) in the Olympic pole vault:
Year   Height (in.)     Year   Height (in.)
1896   130              1952   179
1900   130              1956   179.5
1904   137.75           1960   185
1908   146              1964   200.75
1912   155.5            1968   212.5
1920   161              1972   216.5
1924   155.5            1976   216.5
1928   165.25           1980   227.5
1932   169.75           1984   226.25
1936   171.25           1988   237.25
1948   169.25           1992   228.25

Fit this data with a least-squares line. What interpretation do you give to the slope of your line? Using your linear model, predict the winning height in the 1996 Olympics and in the 2096 Olympics. According to your model, in what year will pole vaulters be able to ``leap tall buildings in a single bound''? (The Empire State Building is 1250 feet tall.)

Comment on the reasonableness of your model (including comments about the residuals).
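
The two kinds of question asked in Problem 1 (predicting a value from the line, and finding the input that gives a chosen value) can be sketched with the drop-bounce line from earlier. This assumes fitlin is defined as above and x has not been assigned a value. The numbers 50 and 40 are chosen only for illustration.

#Predicted bounce height for a 50-inch drop
subs(x=50, fitlin);
#Drop height for which the line predicts a 40-inch bounce
solve(fitlin=40, x);

The first command evaluates the line at a 50-inch drop (about 36.9 inches of bounce); the second finds the drop height at which the line predicts a 40-inch bounce (about 53.6 inches).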

Problem 2: In a Physics 1 experiment, students measure the period of a pendulum (i.e., the amount of time the pendulum takes to swing back and forth) as a function of its length. One group of students obtained the following data:
Length (cm)   Period (sec)     Length (cm)   Period (sec)
6.5           0.51             24.4          1.01
11.0          0.67             26.5          1.08
13.2          0.73             30.6          1.13
15.0          0.79             34.3          1.25
18.1          0.89             37.5          1.28
23.0          0.98             41.5          1.33

As you did in the first problem, find the least squares line that best fits this data. Compute and plot the residuals -- explain why these indicate that a different model is needed.





larryg@upenn5.hep.upenn.edu
Tue Jul 30 19:35:03 EDT 1996