Re: group data statistical functions
- From: John C Nash <nashjc uottawa ca>
- To: gnumeric-list gnome org, udippel uniten edu my
- Subject: Re: group data statistical functions
- Date: Sat, 20 Sep 2008 08:46:24 -0400
Having just retired from teaching such things for over 3 decades, I must
admit feeling very tired from trying to get the "formulas" purged from
textbooks.
The problem is that the grouped data can have descriptive statistics
(and order statistics too) that are rather poor approximations to the
actual values from raw data. Thus it is quite important that the user
really does decide which approximation should be used. This is even more
the case when the ranges of the bins are not equal --- and published
statistics are VERY bad this way.
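To make the point concrete, here is a minimal sketch of the usual textbook midpoint approximation to the mean, set against the mean of the raw data. The observations, the bin edges and the name `grouped_mean` are all invented for illustration; they are not the data from this thread, and the midpoint rule is only one of several approximations one might choose.

```python
from statistics import mean

# Hypothetical raw observations (invented for illustration).
raw = [1.1, 1.2, 1.3, 2.9, 3.0, 7.5, 9.8, 9.9]

# Unequal-width bins, assumed here to be half-open intervals [lo, hi).
bins = [(0.0, 2.0), (2.0, 4.0), (4.0, 10.0)]

def grouped_mean(data, bins):
    """Midpoint approximation: treat every value in a bin as its midpoint."""
    total = 0.0
    n = 0
    for lo, hi in bins:
        count = sum(1 for x in data if lo <= x < hi)
        total += count * (lo + hi) / 2
        n += count
    return total / n

raw_mean = mean(raw)
approx_mean = grouped_mean(raw, bins)
print(f"raw mean: {raw_mean:.4f}  grouped (midpoint) mean: {approx_mean:.4f}")
```

With a wide last bin whose contents sit well away from its midpoint, the two figures differ noticeably, which is the sort of thing the user has to decide about.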
The data presented are integers, so the grouped and raw data should
produce the same results, but we don't know a priori whether data
supposedly at "2" is from integers or numbers anywhere in
(1.5, 2.5] or [1.5, 2.5) or [1,2) even.
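A small sketch of why the interval convention matters: the same raw values are counted as "2" under each of the readings mentioned above. The raw values and the helper name `count_label_2` are hypothetical, chosen so that some values fall exactly on bin edges.

```python
# Hypothetical raw values (invented); several sit exactly on bin boundaries.
raw = [1.0, 1.2, 1.5, 1.5, 2.0, 2.5]

def count_label_2(data, convention):
    """Count the values that would be reported at "2" under one convention."""
    if convention == "integer":        # only exact integer 2
        return sum(1 for x in data if x == 2)
    if convention == "(1.5, 2.5]":     # open on the left, closed on the right
        return sum(1 for x in data if 1.5 < x <= 2.5)
    if convention == "[1.5, 2.5)":     # closed on the left, open on the right
        return sum(1 for x in data if 1.5 <= x < 2.5)
    if convention == "[1, 2)":         # label taken as the upper edge of the bin
        return sum(1 for x in data if 1 <= x < 2)
    raise ValueError(f"unknown convention: {convention}")

conventions = ["integer", "(1.5, 2.5]", "[1.5, 2.5)", "[1, 2)"]
counts = {c: count_label_2(raw, c) for c in conventions}
for c, k in counts.items():
    print(f"{c:>12}: {k} value(s) reported at 2")
```

Each convention gives a different count for the same data, so any statistic computed from the grouped table inherits that ambiguity.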
A grouped descriptive statistic almost needs a page of documentation per
number. The exam results on any question I gave to otherwise smart
students on this topic were always below 50%. It's not rocket science,
though: once you realize what is going on, you can quickly figure out
whether it is worth doing.
Sorry if this seems a bit negative, but the dangers are rather like
providing working flight controls for the kid in the back of the 777.
Offline I can provide ways to do it pretty quickly.
JN