1. 2009 School Certificate results in 6 subjects for 4500 students (27,000 entries)
This data dates from before the start of my studies. It is crude and non-experimental, but it is significant because half of the students in these examinations had been schooled with laptops for over a year and half had not. The data includes school (36 schools), gender, postcode, socio-economic status (SES) by postcode, plus some 'value-added' data. To this I have added Laptop Yes/No and Mac/PC variables. After an intensive two weeks of multiple regression training at ACSPRI in January, I have since performed various multiple regressions on this data using PASW Statistics 18 (SPSS), with the prime model being:
Exam score = f(gender, SES, school, subject, LaptopY/N)
As previously noted this is crude, non-experimental data. Points to consider:
- SES by postcode comes from the Australian Bureau of Statistics SEIFA (Socio-Economic Indexes for Areas). SEIFA 4 (the Parent Education and Employment category) has been used, since it accounts for the most variance
- I originally hoped to use the 'value-added' data as my dependent variable. However, it was generated by an external statistician and already incorporates SES by postcode, gender, etc. As such there is massive multicollinearity, it has next to no correlation with my model, and I do not want to have to justify someone else's construct, whereas examination scores are a matter of record
- the Mac/PC variable was originally included but appeared to have no bearing; this will be reported on
- dummy variables were used for the 6 subjects and 36 schools
- importantly, the model above, which appeared to be the most all-encompassing for the data available, also provided the best goodness of fit (e.g. filtering down to Science scores alone decreased it)
- the coefficient of determination R2 is only 0.166, so the first dilemma is what to make of this. Some people can get a bit hysterical about R2 values; a rule of thumb I have heard is that the sciences would hope for R2 > 0.80 and the social sciences for at least R2 > 0.30. Either way I am working with 0.166, though this will not form the main analysis of my research. As a colleague pointed out, "we are not so much concerned about the fuzziness of the cloud as trying to find the best path through it"
- an added complication is that the coefficients for several of the schools have poor or very poor significance. The model does not take into account any variation in how 1:1 was deployed. Should I remove schools with alternative deployments or poor significance?
- with educational analysis being so complicated and full of overlapping variables, it has been suggested that I use multi-level modelling instead, but that is a road I would rather not go down. Any advice would be greatly appreciated!
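To make the dummy-coding and R2 points above concrete, here is a minimal Python sketch with invented data (this is an illustration only, not the actual SPSS workflow; all column values are made up):

```python
# Sketch (hypothetical data): dummy-coding a categorical predictor and
# computing R^2, mirroring pieces of the model
#   Exam score = f(gender, SES, school, subject, LaptopY/N)

def dummy_code(values, reference):
    """One-hot encode a categorical column, dropping the reference level
    to avoid the dummy-variable trap (perfect multicollinearity)."""
    levels = sorted(set(values) - {reference})
    rows = [[1 if v == lev else 0 for lev in levels] for v in values]
    return rows, levels

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(observed) / len(observed)
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - ss_res / ss_tot

subjects = ["Science", "Maths", "English", "Science"]
rows, levels = dummy_code(subjects, reference="English")
print(levels)  # ['Maths', 'Science']
print(rows)    # [[0, 1], [1, 0], [0, 0], [0, 1]]
```

With 6 subjects and 36 schools this yields 5 and 35 dummy columns respectively, which is why school coefficients can individually have weak significance even when the model as a whole fits.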
2. 2010 School Certificate results in 6 subjects for 4500 students (27,000 entries)
This is the same data set as above, but for 2010. The same model and multiple regression were applied. Points to consider:
- only 35 schools this time
- some of the SES data is missing but I should have this shortly
- at present, with missing SES data, R2 is only 0.147.
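Presumably the cases with missing SES are being dropped listwise before fitting (the usual default for regression in SPSS). A minimal Python sketch of that step, with invented records and field names:

```python
# Sketch (hypothetical records): listwise deletion of cases with a
# missing SES value before regression. Field names are illustrative.

def drop_missing(records, field="ses"):
    """Keep only cases where the given field is present (listwise deletion)."""
    return [r for r in records if r.get(field) is not None]

cases = [
    {"student": 1, "ses": 1050, "score": 78},
    {"student": 2, "ses": None, "score": 81},  # missing SEIFA value
    {"student": 3, "ses": 980,  "score": 69},
]
complete = drop_missing(cases)
print(len(complete))  # 2
```

Once the missing SEIFA values arrive, refitting on the fuller sample may shift the R2 slightly, so the 0.147 figure is provisional.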
3. 2010 Student and Teacher Surveys
A couple of months prior to their School Certificate examinations I surveyed 1200 students (60% response rate) and 47 Science teachers (64% response rate) from 14 schools. These 14 schools had relatively similar 1:1 laptop deployments, but for 7 of them this was their first cohort to sit the SC with 1:1 laptops, while for the other 7 it was their second. The surveys examined each student's perception of their laptop use at home, at school and in Science, and each Science teacher's perception of their professional development, their own use and that of their students. I am now using the data to calculate efficacy scores for individual students and teachers. Ultimately, in combination with the examination data, I hope to perform multiple regression analyses with the model:
Exam score = f(gender, SES, school, subject, LaptopY/N, Student Efficacy, Teacher Efficacy)
This should yield a far better goodness of fit, as it should be a better model of the impact of 1:1 laptops.
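One simple way to turn survey responses into an efficacy score is the mean of the Likert items, with negatively worded items reverse-scored first. The sketch below assumes a 1-5 scale; the item names, the scale and which items are reversed are all invented for illustration:

```python
# Sketch (hypothetical survey items): efficacy score as the mean of
# Likert items, reverse-scoring negatively worded ones first.

LIKERT_MAX = 5  # assumed 1-5 response scale

def efficacy_score(responses, reversed_items=()):
    """Mean of Likert items after reverse-scoring negatively worded ones."""
    adjusted = [
        (LIKERT_MAX + 1 - v) if name in reversed_items else v
        for name, v in responses.items()
    ]
    return sum(adjusted) / len(adjusted)

student = {"q1_use_home": 4, "q2_use_school": 5, "q3_avoid_laptop": 2}
score = efficacy_score(student, reversed_items={"q3_avoid_laptop"})
print(round(score, 2))  # 4.33
```

Whatever the exact construction, the resulting score simply enters the regression as one more continuous predictor alongside gender, SES and the dummies.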
4. Data yet to be Obtained and Analysed
I still intend to:
- survey 2011 SC and HSC Science students and teachers, calculate efficacies, combine with the 2011 results and regress (the 2011 HSC will be the only instance where half the students will have been schooled with a laptop for 3+ years while the other half will have received conventional instruction)
- compare performance data historically by school and LaptopY/N
- perform side analyses and report on the survey data
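The planned historical comparison by school and LaptopY/N amounts to grouped means. A minimal stdlib sketch with invented rows (field names are illustrative):

```python
# Sketch (hypothetical rows): mean exam score grouped by school and
# laptop status, the shape of the planned historical comparison.
from collections import defaultdict

def group_means(rows, keys=("school", "laptop")):
    """Mean of 'score' within each (school, laptop) group."""
    sums = defaultdict(lambda: [0.0, 0])
    for r in rows:
        k = tuple(r[x] for x in keys)
        sums[k][0] += r["score"]
        sums[k][1] += 1
    return {k: s / n for k, (s, n) in sums.items()}

rows = [
    {"school": "A", "laptop": "Y", "score": 80},
    {"school": "A", "laptop": "Y", "score": 70},
    {"school": "A", "laptop": "N", "score": 60},
]
print(group_means(rows))  # {('A', 'Y'): 75.0, ('A', 'N'): 60.0}
```

Raw group means will of course confound school effects with laptop effects, which is exactly what the regression model is there to untangle.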
So, in a nutshell, that is where I am at to date. Obviously I need to catch up on my reading (255 articles and publications accrued so far) and start backing my data up with theory. Any observations and/or interviews will happen later, if at all. Again, any advice or comments would be keenly sought. Thanks!