The original OU (UK) course M347 was delivered online. In this HKMU course, however, the content will mainly be delivered through printed units, with the PDF files uploaded to the course Online Learning Environment (OLE). You will therefore need to be aware of some special features that you may come across as you work through this adapted HKMU print-based course.
Firstly, you should ignore references to the course code M347 in the units and assume that they refer to STAT S347. The word 'module' means the course.
Secondly, in the original OU online course, animations and screencasts were linked directly into the units. When you come across references to these in STAT S347, please be aware that these items can be found on the course OLE rather than through a Web link.
In addition, you may sometimes find icons in the units referring to certain terms. As these relate only to the original materials, please ignore them.
In this section, you will find further details about working through the course.
Study units
A summary of the contents of the units is provided below.
Unit 1
Unit 1 is largely a review unit. For some of you it may be wholly a review unit; for others it may introduce one or two additional things that you have not previously studied in detail. Unit 1 reviews some parts of the statistical background that you should already have, or can readily attain, in order to study the rest of the course; it also reviews the main mathematical techniques that will be employed during the course. (Review of other relevant elements of statistical background will be delayed until nearer to the time they are used, in Units 4 and 8, the first units of Blocks 2 and 3.)
The style of Unit 1 and the quantity and role of its extra exercises are somewhat different from later units, as described in the Unit 1 introduction.
The other two units in Block 1 comprise an introduction to the theory of continuous distributions, that is, models for quantities that vary randomly on a continuous scale.
Unit 2
Unit 2 specifically concerns models for 'univariate' continuous random variables. 'Univariate' is the statistician's favoured word for 'one-dimensional'. The unit concerns a number of basic properties of univariate continuous distributions, many of which you are probably already familiar with. (No real problem if you aren't; the unit takes things pretty much from scratch.)
In Unit 2, however, these properties are developed in a more mathematical manner than you might have seen before. You will develop the skills to calculate, and derive formulae for, the associated quantities yourself. Calculus, especially integration, will be particularly important here. (This is one of the mathematical techniques reviewed in Unit 1.)
Unit 3
In Unit 3, the mathematical structure of 'multivariate' continuous distributions will be explored. 'Multivariate' is statistician-speak for 'multi-dimensional'. Once there is more than one variable involved, new issues arise about the way variables depend on one another. The joint behaviour of collections of variables is important. Multivariate variables therefore have joint distributions. These are defined in Unit 3, although some important aspects of joint distributions are still univariate, such as distributions of individual variables (so-called 'marginal distributions') and distributions of individual variables conditional on the values of other variables (so-called 'conditional distributions').
Dependence between variables can partly be understood through the concepts of 'covariance' and 'correlation', which are also investigated in this unit. Another statistical notion that you are probably already aware of, concerned with dependence structure, is regression (and allied methods). This will not be studied until Block 4.
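As a rough illustration only (not taken from the course units), the sample versions of covariance and correlation can be computed as in the Python sketch below. The function names and the data are hypothetical, and the unit itself works with the corresponding quantities defined for distributions rather than samples.

```python
import math

def covariance(xs, ys):
    """Sample covariance of two equal-length lists (divisor n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Sample correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical data in which y is an exact linear function of x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
cov_xy = covariance(xs, ys)
cor_xy = correlation(xs, ys)
```

Because correlation is covariance rescaled to lie between -1 and 1, an exact increasing linear relationship such as this one gives a correlation of 1 (up to rounding).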
Many of the general issues concerning dependence between several variables are present in the bivariate case, so much of Unit 3 takes place in this context, with extension to the full multivariate case only towards its end. (Yes, you have the idea by now: 'bivariate' is statistician-speak for 'bi-dimensional', or 'two-dimensional'.)
Be warned that Unit 3 is the longest unit in the whole course, although you are given correspondingly more time in which to study it.
Unit 4
Block 2 starts in Unit 4 with a review of the key concepts of classical statistical inference. All of these concepts are introduced without mathematical detail, the purpose being to review the basic ideas of classical inference before Units 5–7 delve into the underlying mathematical statistics. Unit 4 is quite short; indeed, it is the shortest unit in the whole course.
Unit 5
It is often desirable to give a single estimate for an unknown parameter. The process of obtaining such a single estimate is known as 'point estimation' and is the subject of Unit 5. The main method of point estimation considered in Unit 5 is 'maximum likelihood estimation', in which a single estimate for a parameter is found by maximizing the 'likelihood function' using calculus. A number of general properties associated with point estimation are also considered in Unit 5.
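As a minimal sketch of the idea (an illustration chosen here, not an example from the units), suppose the observations are modelled as exponential with unknown rate. The log-likelihood is n log(rate) - rate * (sum of the data); setting its derivative to zero gives the estimate n / (sum of the data). The Python below, with hypothetical data and function names, compares this calculus result with a crude grid search over the log-likelihood.

```python
import math

def exp_log_likelihood(rate, data):
    """Log-likelihood of an exponential model with the given rate."""
    return len(data) * math.log(rate) - rate * sum(data)

def exp_mle(data):
    """Closed-form maximiser obtained by differentiating the log-likelihood."""
    return len(data) / sum(data)

# Hypothetical observations (for illustration only).
data = [0.8, 1.9, 0.4, 2.6, 1.3]

rate_hat = exp_mle(data)

# Crude numerical check: evaluate the log-likelihood on a grid of rates
# from 0.001 to 5.000 and keep the rate with the largest value.
grid = [k / 1000 for k in range(1, 5001)]
rate_grid = max(grid, key=lambda r: exp_log_likelihood(r, data))
```

The grid search should agree with the closed-form estimate to within the grid spacing, which is a useful sanity check whenever a maximiser has been derived by hand.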
Unit 6
Because of sampling variability, it is almost certain that a single estimate of a parameter is not equal to the true value of the parameter. It may therefore be desirable to obtain instead an interval of values which one is confident contains the true value of the parameter. Such intervals are known as 'confidence intervals' and are explored in Unit 6. The majority of Unit 6 is taken up with the related concept of 'hypothesis tests', which consider evidence for and against contrasting hypotheses about the true value, or range of values, of a parameter. General, principled approaches are taken to derive hypothesis tests and confidence intervals in this unit, in contrast, perhaps, to the more ad hoc or specific approaches that you might have encountered before (and will have been briefly reminded of in Unit 4).
Unit 7
Finally in Block 2, Unit 7 explores 'asymptotic theory', which describes the behaviour of quantities based on a sample of observations of a random variable as the sample size gets large. Asymptotic theory is very important to statistics as it provides the theoretical justification for many of the approximations that are applied to practical problems, some of which (such as the Central Limit Theorem) you are already familiar with. As well as a number of important general results, asymptotic properties of maximum likelihood estimation are investigated in some detail in this unit.
Unit 7 is probably the most 'abstractly mathematical' in STAT S347, and so might have a rather different feel from most other units in the course. (If you don't like it too much, be assured that the rest of the course is not written in the same vein.)
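To give a feel for the kind of approximation that asymptotic theory justifies, the following Python sketch simulates the Central Limit Theorem. It is an illustration under assumed settings (uniform observations on (0, 1), sample size 30, 5000 repetitions, a fixed seed), none of which come from the units, and the function name is hypothetical.

```python
import math
import random

def standardised_means(n, reps, seed=0):
    """Simulate `reps` sample means of n Uniform(0, 1) values, standardised
    using the known mean 1/2 and variance 1/12 of that distribution."""
    rng = random.Random(seed)
    mu = 0.5
    sigma = math.sqrt(1.0 / 12.0)
    results = []
    for _ in range(reps):
        mean = sum(rng.random() for _ in range(n)) / n
        results.append((mean - mu) / (sigma / math.sqrt(n)))
    return results

z_values = standardised_means(n=30, reps=5000)

# If the normal approximation is good, roughly 95% of the standardised
# means should fall between -1.96 and 1.96.
proportion_inside = sum(abs(z) < 1.96 for z in z_values) / len(z_values)
```

The proportion should be close to 0.95, even though the individual observations are far from normally distributed.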
Units 8 and 9
Bayes' Theorem plays a major role in the Bayesian approach to statistics, where it is used to combine the information contained in the data with any information external to the data. Unit 8 focuses on the process of using Bayes' Theorem to combine the two sources of information about any unknown parameters. How this combined information can then be used for Bayesian inference is the subject of Unit 9.
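As a small sketch of how the two sources of information are combined, consider a standard textbook case chosen here for illustration (it is not claimed to be the example used in Unit 8): a Beta(a, b) prior for a success probability, updated with the number of successes in a fixed number of independent trials. Bayes' Theorem then gives a Beta(a + successes, b + failures) posterior. The function names and numbers below are hypothetical.

```python
def beta_binomial_update(a, b, successes, trials):
    """Posterior Beta parameters after observing `successes` in `trials`,
    starting from a Beta(a, b) prior on the success probability."""
    failures = trials - successes
    return a + successes, b + failures

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

# Hypothetical prior belief and data.
prior = (2.0, 2.0)                      # prior mean 0.5
posterior = beta_binomial_update(*prior, successes=7, trials=10)

prior_mean = beta_mean(*prior)
posterior_mean = beta_mean(*posterior)  # lies between 0.5 and 7/10
```

The posterior mean falls between the prior mean and the observed proportion of successes, which shows in miniature how the external information and the data are blended.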
Unit 10
Bayesian statistics can be computationally difficult to implement in practice. Unit 10 explores a computational technique known as Markov chain Monte Carlo. This technique is the principal reason why Bayesian statistics became computationally feasible, and consequently popular, in the late 20th century.
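For orientation, here is a minimal sketch of one Markov chain Monte Carlo method, a random-walk Metropolis sampler. It is offered as an assumption about the flavour of the technique rather than as the algorithm presented in Unit 10; the target (a standard normal density), the step size, the seed and all names are illustrative choices.

```python
import math
import random

def metropolis(log_target, start, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target.

    log_target returns the log of the target density up to an additive
    constant, so the normalising constant is never needed.
    """
    rng = random.Random(seed)
    x = start
    log_p = log_target(x)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)       # symmetric proposal
        log_p_new = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(x)).
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
        samples.append(x)                         # repeat x if rejected
    return samples

# Illustrative target: standard normal, known only up to a constant.
draws = metropolis(lambda x: -0.5 * x * x, start=0.0, n_samples=20000)
draw_mean = sum(draws) / len(draws)
draw_var = sum((d - draw_mean) ** 2 for d in draws) / (len(draws) - 1)
```

The sampler only ever uses ratios of the target density, which is why methods of this kind suit Bayesian posteriors whose normalising constants are hard to compute.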
Unit 11
Block 4 starts, in Unit 11, with a detailed study of linear regression with a single explanatory variable. The random variation in the response variable over and above the contribution to its value made by the explanatory variable is here modelled by a normal distribution.
Linear regression with one explanatory variable is treated first in a classical manner (resulting in various formulae with which you might already be familiar), and then in a Bayesian manner. In this unit, the Bayesian treatment is restricted to a specific improper prior, which leads to the same results as the classical case, albeit with a different interpretation.
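The familiar classical formulae presumably include the least-squares estimates of the slope and intercept. A short Python sketch of these is given below for illustration; the data and function name are hypothetical, and the notation may differ from that used in Unit 11.

```python
def least_squares_line(xs, ys):
    """Least-squares estimates (intercept, slope) for the line
    y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data lying close to the line y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
intercept, slope = least_squares_line(xs, ys)
```

For these data the fitted slope is about 1.99 and the intercept about 0.05, and the fitted line passes through the point of means, as a least-squares line always does.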
Unit 12
Unit 12 extends the ideas of Unit 11 to 'multiple regression', the name given to linear regression with two or more explanatory variables.
This unit is a little shorter than other 'full length' (whatever that might be!) units.
Unit 13
The normality assumption made in Units 11 and 12 is, in some ways, rather limiting, and Unit 13 considers two important extensions to such regression models.
- The 'General Linear Model', which despite its grand title is a relatively minor extension of multiple regression, discards normality, but continues with normality-related 'least squares' parameter estimation. The focus here is more on the structure of the explanatory variables and how to cope with awkward, but practically important, cases (such as the effects of 'treatments' in medical, industrial and agricultural experiments).
- The 'Generalised Linear Model', another grand title, is a more far-reaching extension of regression. It allows a much wider variety of response variable types in the same unified framework, no longer needing to be normally distributed or even continuous; responses can even be binary.
Unit 14
Unit 14 considers further Bayesian linear modelling, incorporating more general prior information than used in Unit 11. This unit will explore the Bayesian approach to modelling each of the following:
- linear regression models with one explanatory variable;
- multiple regression models;
- the generalised linear model.
This closing unit is between a half and two-thirds of the length of most units in the course.
Format of the units
The units contain the following elements:
- Main course text
- Exercises — see the subsection below for more details
- Examples — to assist with your learning
- Bold terms within text — to highlight important terms
- Boxed material — to emphasize important material
- Animations — see the subsection below for more details
- Screencasts and audio — see the subsection below for more details.
Exercises
Throughout STAT S347 there are many exercises integrated into each unit. To help your learning, you should try to do each exercise as you come to it, so always keep a pen and paper to hand. The solutions to each part of an exercise can be found at the end of the unit.
A note on accuracy: It is worth noting that statisticians are pretty relaxed about the number of decimal places or significant figures to which numerical answers are given. In the numerical exercises in the text, a desired number of decimal places is sometimes specified. Often, however, you are not asked to display your answers to a given number of decimal places. In such cases, you should use a sensible number, displaying your final answer in a way that is consistent with your intermediate working. Remember, there is no sense in being ultra-exact in (most of) your numerical computations when the statistical modelling process concerns numerous assumptions that are rarely exactly true in practice. Modelling approximations therefore correspond to approximations in answers to real-life questions, the inaccuracies of which far outweigh mathematical worries about, say, the fifth or sixth decimal place.
Having said this, if you are asked to give your answer to a specified number of decimal places in an assignment or the exam, do follow the instructions you are given.
Animations
Many of the units contain animations which are designed to help your understanding of various aspects of STAT S347. The animations can be viewed in the Course Materials section of the course OLE. To enhance your learning, we recommend that you view the animations as you come to them.
If an animation has an associated audio description, you will need to have speakers or headphones connected to your computer in order to hear the audio.
Screencasts and audio
Several units have screencasts, which are short audio-visual presentations explaining a particular aspect of a unit. You will need to have speakers or headphones connected to your computer to hear the screencast. When a screencast is referred to in a unit, turn to the Course Materials section of the OLE to find and play it.
Extra exercises
In addition to the exercises which are integrated into the units and which you should attempt, each unit also has a set of extra exercises which is optional. The extra exercises for each unit can be found in PDF form on the course OLE. Please note that these are in soft copy format only and you will not be sent a printed version. If you feel that you would like to have some extra practice with a particular topic, then it would be a good idea to have a look at the extra exercises.
You might also find the extra exercises useful for your revision. But please do not feel that you need to do all (or indeed any!) of the extra exercises: they are there as additional help for those students who would like to use them. The extra exercises have been written so that you can 'dip into' them and do as many (or as few) as you wish to do. As such, even though many of the extra exercises do follow on from each other, they are written as 'stand-alone' exercises and each extra exercise is written on the assumption that students may not have done any of the previous extra exercises.
Optional material
Some units (specifically Units 5, 6, 9 and 13) have some optional material associated with them. This material has been uploaded to the OLE for completeness of the course for those of you who would like to see proofs of some of the more difficult results in these units. It should be emphasized, though, that you do not have to look at this material and it certainly won't be assessed: the optional material is most definitely optional!
The OLE
As mentioned previously, you will need to use the OLE to access some of the course materials. In addition, you can use the OLE to submit your assignments, view course announcements and communicate with your tutor and fellow students on the course discussion board.
Presentation Schedule
The Presentation Schedule for this course can be found on the course OLE. It shows you how long to spend on each unit and when to attend tutorials and submit assignments.
Equipment needed
Calculator
You will need a calculator with basic mathematical functions (exp, log, square root, etc.), but not necessarily with statistical functions. You will be allowed to bring a calculator into the examination, but only an HKMU-approved model. A list of approved calculator models can be found on the STAT S347 OLE.
Home computer
You will also need to have access to a computer with an Internet connection to access the course OLE.