© 2004 2005 2006 2007 2008 2009 Raazesh Sainudiin. © 2008 2009 Dominic Lee.

This is a course about **computational statistical experiments** using **Sage**.

Official Description:
The power of modern computers has unleashed new ways of thinking about statistics and implementing statistical solutions. This course introduces the student to computational techniques with uses ranging from exploratory data analysis to statistical inference. These techniques are now widely used and are fast becoming indispensable in the modern statistical toolkit. The course will provide the student with a sound understanding of the computational methods and hands-on experience in implementing and using them. Topics may include generation of random variables, Monte Carlo integration and importance sampling, bootstrap methods, Markov chain Monte Carlo, kernel density estimation and regression, classification and regression trees.

**Who does computational statistical experiments?**

A *statistical experimenter* is a person who conducts a *statistical experiment* (for simplicity, from now on, these will be called *experimenters* and *experiments*). Roughly, an experiment is an action with an empirically observable outcome (data) that cannot necessarily be predicted with certainty, in the sense that a repetition of the experiment may result in a different outcome. Most quantitative scientists, engineers, managers and decision-makers are experimenters if they apply statistical principles to further their current understanding or theory of an empirically observable real-world phenomenon (simply, a phenomenon). Roughly, furthering your understanding or theory is done by improving your mathematical model (rigorous cartoon) of the phenomenon on the basis of its compatibility with the observed data or outcome of the experiment. In this sense, an experimenter attempts to learn about a phenomenon through the outcome of an experiment. An experimenter is often a decision-maker, scientist or engineer, and vice versa.
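To make the idea of an experiment concrete, here is a minimal sketch in plain Python (which also runs inside Sage). It assumes a toy experiment, tossing a fair coin ten times, that is not part of the original text; the function name `toss_coin_experiment` is purely illustrative. Repeating the experiment may produce a different outcome, which is exactly what makes the outcome impossible to predict with certainty.

```python
import random

def toss_coin_experiment(num_tosses, rng):
    """Perform one experiment: toss a fair coin num_tosses times.

    Returns the observed outcome (the data) as a list of 'H' and 'T'.
    """
    return [rng.choice(["H", "T"]) for _ in range(num_tosses)]

# An unseeded generator, so the data may differ from run to run.
rng = random.Random()

first_outcome = toss_coin_experiment(10, rng)
second_outcome = toss_coin_experiment(10, rng)  # a repetition of the experiment

print(first_outcome)
print(second_outcome)
print("Same outcome both times?", first_outcome == second_outcome)
```

Passing a fixed seed, for example `random.Random(42)`, would make the simulated data reproducible, which is often useful when checking a computational experiment.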

Technological advances have fundamentally intertwined computers with most experiments today. First, our instrumentational capacity to observe an empirical phenomenon, by means of automated data gathering (sensing) and representation (storage and retrieval), is steadily increasing. Second, our computational capability to process statistical information or to make decisions using such massive data sets is also steadily increasing. Thus, our recent technological advances are facilitating computationally intensive statistical experiments based on possibly massive amounts of empirical observations, in a manner that was not viable a decade ago. Hence, a successful decision-maker, scientist or engineer in most specialisations today is a **computational statistical experimenter**, i.e. *a statistical experimenter who understands the information structures used to represent her data as well as the statistical algorithms used to process her administrative, scientific or engineering decisions*.

A computational statistical experimenter has to *tell a machine what to do* with her observations or data. In order to efficiently command a computing machine she has to master *the art of computer programming*. Programming alone is not enough. Statistical experimenters use a mathematically formal way of thinking about their experiments. They use set theory, probability theory and other branches of pure and applied mathematics through established statistical theory to reach their administrative, scientific and engineering decisions from their data. This course is designed to help you take the first steps along this path.
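As a small illustration of how set theory and probability theory can be expressed to a machine, the sketch below represents a sample space and an event as Python sets. The two-toss coin example and the helper name `probability` are assumptions made here for illustration, and equally likely outcomes are assumed throughout.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered outcomes of tossing a coin twice.
sample_space = set(product("HT", repeat=2))

# Event: the subset of outcomes containing at least one head.
at_least_one_head = {outcome for outcome in sample_space if "H" in outcome}

def probability(event, space):
    """Probability of an event when all outcomes in space are equally likely."""
    return Fraction(len(event & space), len(space))

print(probability(at_least_one_head, sample_space))  # 3/4
```

Using `Fraction` keeps the answer exact rather than a floating-point approximation.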

**What is Sage and why are we using it?**

Sage is a free open-source mathematics software system licensed under the GPL. Sage can be used to study mathematics and statistics, including algebra, calculus, elementary to very advanced number theory, cryptography, commutative algebra, group theory, combinatorics, graph theory, exact linear algebra, optimization, interactive data visualization, randomized or Monte Carlo algorithms, scientific and statistical computing and much more. It combines various software packages into an integrative learning, teaching and research experience that is well suited for novice as well as professional researchers.

Sage is a set of software libraries built on top of Python, a widely used general-purpose programming language. Sage greatly enhances Python's already mathematically friendly nature. Python is one of the languages used at Google, the US National Aeronautics and Space Administration (NASA), the US Jet Propulsion Laboratory (JPL), Industrial Light and Magic, YouTube, and other leading entities in industry and the public sector. Scientists, engineers, and mathematicians often find it well suited for their work. For a more thorough rationale for Sage, see "Why Sage?" and "Success Stories, Testimonials and News Articles". Jump-start your motivation by taking a Sage Feature Tour right now!

This course has been redesigned into a light-weight version. See Rationale for Course Redesign, presented to the University Centre for Teaching and Learning. The past version of the course is archived below.

CC: This work (all of its contents) is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 New Zealand License.

- Syllabus and Lectures
- Assignment
- Supplementary Materials
- Student projects from previous years:
  - Computational Statistical Experiments: STAT 218 - 07S2 (C) Student Projects Report UCDMS 2008/5
  - Computational Statistical Experiments: STAT 218 - 08S2 (C) Student Projects Report UCDMS 2009/5