Use of Phrogram for a Science Fair Project

  •  10-31-2006, 5:22 PM

    Attachment: 2007sciencefair.zip
    This tale offers some insight into how a 4th grader can be encouraged (well, guided) to use Phrogram to improve his chances of placing in the local, and possibly regional, science fair. My son decided that his topic for next year's science fair would be "What is the hottest time of day?" The normal approach would have the child read a thermometer regularly over a 1-2 week period to acquire the data. But let's take the easy way out and use measurements that someone else has made, specifically those from the local weather station. One can download local weather information from local universities or, in some cases, from the National Weather Service. We looked at what was available and downloaded about 60 years' worth of data from the NOAA Climatological Web site. Surf to http://www7.ncdc.noaa.gov/IPS/LCDPubs?action=getstate to get started.

    Now we have some hourly data points: not the 50-100 points that one might need for the simple project, but 307,505 hourly temperature readings covering 33.7 years! That is something that will catch a science fair judge's attention. So how to set things up for the son to do his analysis? Well, Excel might be able to process the data in 10 chunks, but it would be a lot of work to figure out VB and teach him how to write macros. Logo wasn't going to do much better at handling large sets of numbers, but the new kid called KPL looked interesting... Why, even a past master of FORTRAN IV could probably figure out enough to guide the son along the proper analysis path. We sat down and played with the examples in KPL 1.1 to begin to understand some of the basics of programming in this modern era. But what we needed for the science fair project was something that would read and write files. Along came the KPL 2.0 betas with File I/O, and they were used to develop the data analysis program.

    So what logic is needed to analyze the weather data? We are looking for the hottest hour of each day. So the program needs to open the data file, read a line, and extract the date, time, and temp. It checks whether the temp for that line is the hottest so far for the day and, if so, remembers it along with the time. Then it reads a new line and checks whether a new day has started. If it has, the program writes out the previous day's hottest temperature. Keep the cycle churning until all of the data is read, and you have an output that consists of a day, the time, and the hottest temperature.
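
    For readers who want to see that loop spelled out, here is a minimal sketch in Python rather than Phrogram. It is not the program in the attachment. The three-field line layout (date, time, temp) and the name daily_hottest are assumptions made for illustration only.

```python
from typing import Iterable, Iterator, Tuple


def daily_hottest(lines: Iterable[str]) -> Iterator[Tuple[str, str, int]]:
    """Yield (date, time, temp) for the hottest reading of each day.

    Assumes each line looks like "YYYYMMDD HHMM temp" (a hypothetical
    layout) and that the lines are in chronological order.
    """
    current_day = None
    best_time = None
    best_temp = None
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        day, time, temp = fields[0], fields[1], int(fields[2])
        if day != current_day:
            # A new day has started: report the previous day, if any.
            if current_day is not None:
                yield current_day, best_time, best_temp
            current_day, best_time, best_temp = day, time, temp
        elif temp > best_temp:
            # Hotter than anything seen so far today: remember it.
            best_time, best_temp = time, temp
    # The last day in the file has no following day to trigger the write.
    if current_day is not None:
        yield current_day, best_time, best_temp


sample = [
    "19700101 1300 30",
    "19700101 1400 34",
    "19700101 1500 33",
    "19700102 1300 28",
    "19700102 1600 31",
]
results = list(daily_hottest(sample))
```

    On the five sample lines this gives 1400 at 34 for the first day and 1600 at 31 for the second. Because the comparison is a strict "greater than", a tie goes to the earlier hour. To run it on a real file, pass an open file object in place of the sample list.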

    Now there is one thing I have overlooked in my discussion so far. One MUST deal with a number of data FORMATting issues. The original data had many, many columns. For a beginning programmer, dealing with arrays of numbers is not the easiest thing to grasp. Locating and extracting the 24th member of an array wasn't something he understood. But one can use a text editor or Excel to remove all of the extra data, leaving only a date/time/hour stamp and the temp. That simplifies things a lot. We chose to separate the data using spaces, but a tab would have been better. Why? Well, entries like '-13' left fewer spaces, and that tripped up the condition used to extract the temperature string. The test program would hang up with an error, and one would have to manually locate the errant data line to determine the cause of the problem. But hey, that was a useful exercise for him to do. :-) Counting spaces was not much fun, but it introduced the concept of debugging. And speaking of debugging, some of the data was missing and had been replaced by "stars." Nothing like comparing a high temperature against a character to hang up the program. The solution was to replace all of the missing data with a bogus temperature reading. Since the question was to find the hottest temperature, the missing data was replaced by a value of -100.
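
    Both snags can also be handled while reading instead of by editing the data file. The sketch below is Python, not what we did in Phrogram, and the three-field layout and the names parse_temp and parse_line are made up for the example. Splitting on any run of whitespace means a '-13' no longer shifts the columns, and a field made only of stars is turned into the same bogus -100.

```python
MISSING = -100  # the bogus reading used in place of missing data


def parse_temp(field: str, missing: int = MISSING) -> int:
    """Convert a temperature field to an int; all-star fields become `missing`."""
    if field and field.strip("*") == "":
        return missing
    return int(field)


def parse_line(line: str):
    """Split a "date time temp" line on any whitespace (spaces or tabs)."""
    day, time, temp_field = line.split()
    return day, time, parse_temp(temp_field)


examples = [
    parse_line("19700101 0500   -13"),   # extra spaces before a negative value
    parse_line("19700101 0600 ***"),     # missing reading marked with stars
    parse_line("19700101\t0700\t7"),     # tab-separated works the same way
]
```

    A line that does not have exactly three fields raises a ValueError here, which at least points straight at the errant line instead of leaving one to count spaces.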

    How to teach the son the various principles needed to solve the problem? We started out working on the included samples, such as File IO, to read and display a small example version of the dataset. The next example worked with substrings to properly extract the year, day, and time, displaying them on the console. Next, the If/then example was used to work out the logic needed to determine whether an hourly reading was in the current day or a new day. Finally, we went back to File IO and learned how to write the items we were seeing in the console out to a file. Once we had an example that worked on a test data set, I sat the son down to apply these things within a NEW program to work on the real dataset. He stepped through the process of starting with the default shell program, then adding the read file, comparison, and results components in a manner similar to what was done on the test data set. Hopefully doing it this way gives him a better understanding of what is done within the program to find the data within the dataset. Processing the dataset took about 20 seconds.

    Now the analysis of the data isn't completed yet. There are 12,279 data points (1 per day) to be massaged. He imported the output file into Excel. I added a binning macro from some obscure website that would allow him to more easily count how many days were hottest at a particular hour. After binning, he did his usual playing with the graphing functions to find his result.
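
    The binning step itself is just a tally of days per hour. As a rough illustration, here is the same count done in Python with collections.Counter instead of the Excel macro. The (date, time, temp) tuples and the name count_by_hour are assumptions for the example, and the three rows are invented, not real results.

```python
from collections import Counter


def count_by_hour(daily_results):
    """Count how many days had their hottest reading in each hour.

    `daily_results` is an iterable of (date, time, temp) tuples where
    time is a string like "1600"; the first two characters are the hour.
    """
    return Counter(time[:2] for _day, time, _temp in daily_results)


daily = [
    ("19700101", "1400", 34),
    ("19700102", "1600", 31),
    ("19700103", "1600", 29),
]
counts = count_by_hour(daily)
for hour, days in sorted(counts.items()):
    print(hour, days)
```

    The printed table (hour, number of days) is what gets graphed as a bar chart.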

    Now, based upon his graphs, he can say that the hottest time of the day here in Indianapolis is around 4pm. A graph of the highest temperature vs. day allowed him to show the seasonal variations over the 3 decades. The nature of the graph could be used to imply that Indy resides in a temperate four-season area. Hypotheses about what the plot would look like for Miami or Anchorage may add bonus points in the eyes of the judges.

    What's next? He may do a follow-up analysis and find the coldest hour of the day. Switching the -100 to 100 in the datasets and altering the program to locate the lowest number should take only a few minutes. Once again, this would earn bonus points with the judges for following up on an initial observation.
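
    To show how small that change is, here is a sketch of the coldest-hour version, again in Python and again only an illustration. It assumes the readings are already (date, time, temp) tuples in chronological order with missing values set to 100. The name daily_coldest and the sample numbers are invented.

```python
from itertools import groupby


def daily_coldest(readings):
    """Yield the coldest (date, time, temp) reading for each day.

    Missing readings are assumed to be marked with 100, so they can
    never win a "lowest temperature" comparison.
    """
    # groupby collects consecutive readings that share the same date.
    for _day, group in groupby(readings, key=lambda r: r[0]):
        yield min(group, key=lambda r: r[2])


readings = [
    ("19700101", "0400", 12),
    ("19700101", "0500", 100),  # missing reading
    ("19700101", "0600", 9),
    ("19700102", "0300", 100),  # missing reading
    ("19700102", "0700", 15),
]
coldest = list(daily_coldest(readings))
```

    With these five readings the coldest hours come out as 0600 on the first day and 0700 on the second, and the 100 placeholders are never picked.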

    The final aspect of exhibiting the science fair project comes down to presenting and discussing the project with the judges. 4th grade boys are at a distinct disadvantage as the competing girls are generally much better at talking about their work. But then he managed to spend 20 minutes this spring talking to a judge about using reference spots for measuring light intensity from microwave plasma balls. Long live our singed microwave.

    Attached is a recent iteration of his program, a partial data set, and the Excel file being used to plot the results.

    Science rules.
