Monthly Seminars

Informal, general interest seminars on topics in statistical data analysis

 

2021 - December

2021 in Graphs

This has been a year unlike any other, well, except last year. It has been a year of hardship but also a year of opportunity and unexpected trends. In this talk the Stats Central team will channel Alan Kohler to collate a set of funky graphs that take a light-hearted, almost inappropriately upbeat, look at The Year That Was. This will also be an opportunity to explore different techniques for visualising data and presenting ideas in creative ways.

The talk will be about 30 minutes long and followed by discussion.

Speaker: Professor David Warton, Director of UNSW Stats Central

Date:       Thursday 9 December 2021

Time:       2.30-3.30pm

Location: Online

2021 - November

How to avoid p-hacking, HARKing and other statistical sins

Most published research findings are not reproducible ('the reproducibility crisis'). This is a polite way to say these research findings are wrong. A set of related practices called p-hacking or HARKing, as well as cherry-picking, fishing expeditions, data dredging or data mining, have been named as a major cause of the reproducibility crisis in research.

These practices are incredibly common, and are usually carried out by well-meaning researchers who want to extract as much information from their data as possible and do not realise they are doing the wrong thing. Most worryingly, they are often taught in statistics courses. In this seminar I will describe what these practices are, why they lead to bad research, and how to (very easily) avoid them.

The talk will be about 30 minutes long and followed by discussion.

Speaker: Gordana Popovic, Statistical Consultant, UNSW Stats Central

Date:       Thursday 18 November 2021

Time:       2.30-3.30pm

Location: Online

Slides for the presentation are HERE

View video

2021 - October

Bootstrapping: How to use data to get out of a tight spot

Statistical analyses often rely on assumptions about the underlying data. In many situations, it is questionable whether these are justified. Instead of blindly trusting the assumptions, bootstrapping makes the data work harder to help us obtain reliable results. This seminar will discuss how the bootstrap procedure works and how you can use it for your own data analyses.

The talk will be about 30 minutes long and followed by discussion.

Speaker: Peter Humburg, Statistical Consultant, UNSW Stats Central

Date:       Thursday 21 October 2021

Time:       2.30-3.30pm

Location: Online

Slides for the presentation are HERE
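As a rough illustration of the bootstrap idea described in the abstract above, the sketch below resamples a small data set with replacement and reads a percentile confidence interval for the mean off the resampled estimates. It is not taken from the seminar slides: the function name bootstrap_ci, the sample values and the choice of the percentile method are all assumptions made for this example.

```python
import random
import statistics


def bootstrap_ci(data, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a statistic (illustrative helper)."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        # Draw n observations from the data with replacement
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(stat(resample))
    estimates.sort()
    # Take the alpha/2 and 1 - alpha/2 quantiles of the bootstrap estimates
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Made-up measurements, used only to show the call
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 7.1, 3.9, 4.4]
low, high = bootstrap_ci(sample)
print(f"mean = {statistics.mean(sample):.2f}, 95% bootstrap CI = ({low:.2f}, {high:.2f})")
```

The same helper can be pointed at another statistic, for example stat=statistics.median, which is where resampling tends to be most useful because simple formulas for the standard error are harder to come by.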

View video

2021 - September

Four strategies for dealing with multiple comparisons

Multiple hypotheses may be generated by multiple treatment arms, heterogeneous treatment effects, or measuring multiple outcome variables. In a hypothesis testing framework, using p < 0.05 as a criterion for declaring significance, it can be easy to get spurious results when many hypotheses are tested. This talk will discuss 4 things you can do when faced with multiple comparisons, covering the difference between controlling the family-wise error rate and the false discovery rate; the Bonferroni-Holm adjustment; the Benjamini-Hochberg adjustment; strategies for multiple outcome variables; and strategies for correlated multiple comparisons.

The talk will be about 30 minutes long and followed by discussion.

Speaker: Eve Slavich, Statistical Consultant, UNSW Stats Central

Date:       Thursday 23 September 2021

Time:       2.30-3.30pm

Location: Online

Slides for the presentation are HERE
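To make the two adjustments named in the abstract above concrete, here is a minimal sketch of Bonferroni-Holm adjusted p-values (which control the family-wise error rate) and Benjamini-Hochberg adjusted p-values (which control the false discovery rate). The function names and the four p-values are invented for illustration and are not from the seminar.

```python
def holm_adjust(pvalues):
    """Bonferroni-Holm adjusted p-values, returned in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # smallest p-value first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1
        value = min(1.0, (m - rank) * pvalues[i])
        running_max = max(running_max, value)  # keep adjusted values monotone
        adjusted[i] = running_max
    return adjusted


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)  # largest first
    adjusted = [0.0] * m
    running_min = 1.0
    for k, i in enumerate(order):
        rank = m - k  # 1-based rank of this p-value in ascending order
        value = min(1.0, pvalues[i] * m / rank)
        running_min = min(running_min, value)  # keep adjusted values monotone
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from four tests
pvals = [0.01, 0.04, 0.03, 0.005]
holm = holm_adjust(pvals)
bh = bh_adjust(pvals)
print("Holm:", [round(p, 4) for p in holm])
print("BH:  ", [round(p, 4) for p in bh])
```

With these inputs the Holm values are never smaller than the Benjamini-Hochberg values, which reflects the usual trade-off: controlling the family-wise error rate is the stricter requirement.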

View video

2021 - August

A Gentle Introduction to Survival Analysis

When we are interested in the length of time until an event first occurs, we might not be able to observe the individuals for long enough to see them all have the event. This phenomenon, known as censoring, can be handled using survival analysis methods. In this seminar, I will introduce the basic concepts and methods used in survival analysis, and give examples of some of its many extensions for dealing with more complicated scenarios.

The talk will be about 30 minutes long and followed by discussion.

Speaker: Mark Donoghoe, Statistical Consultant, UNSW Stats Central

Date:       Thursday 19 August 2021

Time:       2.30-3.30pm

Location: Online

Slides for the presentation are HERE

View video

2021 - July

Survey and Questionnaire Design: Practical advice for researchers

Data collection by surveys and questionnaires is an efficient and popular method. Researchers can quickly collect many observations, especially with online survey administration. But if items on the instrument aren’t crafted with care, data quality can be poor. This seminar will provide recommendations for survey and question design that will help ensure that your survey and questionnaire data help you answer your research questions.

The talk will be about 30 minutes long and followed by discussion.

Date: Thursday 15 July 2021

Time: 2.30-3.30pm

Speaker: Nancy Briggs, Senior Statistical Consultant, UNSW Stats Central

Location: Online

Slides for the presentation are HERE

View video

2021 - June

T-test a special case of regression? Strange but true …

There are so many statistical tests - t-test, ANOVA, ANCOVA, regression ... ! Which one should I use? Many tests are really just different versions of one method, a "linear model". We'll talk about why it makes life a little easier to think about them in this way.

The talk will be about 30 minutes long and followed by discussion.

Date: Thursday 17 June 2021

Time: 2.30-3.30pm

Speaker: Peter Geelan-Small, Statistical Consultant, UNSW Stats Central

Location: Online

Slides for the presentation are HERE
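The claim in the title above can be checked numerically. The sketch below computes the classic equal-variance two-sample t statistic and, separately, the t statistic for the slope in a simple linear regression of the outcome on a 0/1 group indicator; the two agree up to rounding error. The data and function names are made up for this example, and the equivalence shown is for the pooled-variance t-test (not Welch's unequal-variance version).

```python
import math
import statistics


def pooled_t(group_a, group_b):
    """Two-sample t statistic assuming equal variances (group_b minus group_a)."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * statistics.variance(group_a)
                  + (nb - 1) * statistics.variance(group_b)) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(group_b) - statistics.mean(group_a)) / se


def slope_t(x, y):
    """t statistic for the slope in the least-squares fit y = intercept + slope * x."""
    n = len(x)
    xbar, ybar = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)  # residual variance divided by Sxx
    return slope / se


# Hypothetical measurements for two groups
control = [5.1, 4.8, 6.0, 5.5, 5.2]
treated = [6.3, 6.9, 5.8, 7.1, 6.4]

# Code group membership as a 0/1 dummy variable and stack the outcomes
x = [0] * len(control) + [1] * len(treated)
y = control + treated

t_from_test = pooled_t(control, treated)
t_from_regression = slope_t(x, y)
print(f"t-test: {t_from_test:.6f}, regression slope: {t_from_regression:.6f}")
```

The regression slope is simply the difference between the two group means, and the residual variance of the regression is the pooled variance, which is why the two statistics coincide.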

View video

2021 - May

Practical study design

Note: This is a special 2-hour seminar, offered for the first time.

Good study design is crucial for answering your research questions. No amount of post-processing or statistical expertise can compensate for poor or inadequate study design. I will review basic concepts like confounding, controls and randomization; show you how to estimate an appropriate sample size for your question; talk about how to conduct good observational studies; and describe how to use blocking in your study design, so that you can get more power with a smaller sample size.

The talk will be about 2 hours long including discussion.

Date: Thursday 20 May 2021

Time: 2.30-4.30pm

Speaker: Gordana Popovic, Statistical Consultant, UNSW Stats Central

Location: Online

Slides for the presentation are HERE

View video

2021 - April

It all depends: Interaction terms in regression

If the effect of an independent variable on the response variable depends on some other variable, then you are in interaction land. When does a study question call for interaction terms? What does a categorical variable interacting with a categorical variable mean? And how about two continuous variables that interact, or a continuous variable interacting with a categorical one? How can I interpret my results if my interaction was significant or not significant? We’ll look at lots of ways to plot out the results of a regression that has included an interaction between 2 (or more) variables.

The talk will be about 30 minutes long and followed by discussion.

Speaker: Eve Slavich, Statistical Consultant, UNSW Stats Central

Date:       Thursday 29 April 2021

Time:       2.30 - 3.30pm

Location: Online

Slides for the presentation are HERE

View video

2021 - March

A practical guide to meta-analysis

Systematic reviews and meta-analyses are appearing with increasing frequency in the literature. This seminar will outline the steps needed to perform a meta-analysis, as a guideline. The types of data, effect measures, heterogeneity and publication bias in meta-analysis will be reviewed through examples and illustrated with forest plots and funnel plots.

The talk will be about 30 minutes long and followed by discussion.

Speaker: Zhixin Liu, Statistical Consultant, UNSW Stats Central

Date:       Thursday 18 March 2021

Time:       2.30-3.30pm

Location: Online

Slides for the presentation are HERE

View video

2021 - February

Like Cats and Dogs – Why model selection and inference just can’t get along

A common problem facing researchers following data collection is that it is unclear which of the (possibly many) variables should be included in the analysis. While this process can be challenging in its own right, it raises another, often more problematic, issue. The resulting model can no longer be used for statistical inference to answer research questions. In this seminar, we will take a closer look at why that is the case and discuss possible ways out of the dilemma.

The talk will be about 30 minutes long and followed by discussion.

Speaker: Peter Humburg, Statistical Consultant, UNSW Stats Central

Date:       Thursday 18 February 2021

Time:       2.30-3.30pm

Location: Online

Slides for the presentation are HERE

View video

2020 - December

2020 in Graphs

This has been a year unlike any other, well, unless you view it in terms of its sporting outcomes (Melbourne and Richmond won… again!).  It has been a year with plenty of hardship and tragedy, but also a year with plenty of opportunity and unexpected trends.  In this talk the Stats Central team will channel Alan Kohler to collate a set of funky graphs that take a light-hearted, almost inappropriately upbeat, look at The Year That Was.  This will also be an opportunity to explore different techniques for visualising data and presenting ideas in creative ways.

The talk will be about 30 minutes long and followed by discussion.

Speaker: Prof David Warton, Director of UNSW Stats Central

Date:       Thursday 10 December 2020

Time:       2.30-3.30pm

Location: Online

Slides for the presentation are HERE

View video

2020 - November

When Your Research Meets a Global Pandemic

This year, just about everything we do as researchers has been affected by COVID-19. Distancing measures have made it difficult for some people to pursue their research. This talk will highlight some of the problems the consultants at Stats Central have encountered when helping people throughout 2020. I will discuss some of the issues that may – or may not! – lead to changes in your study.

The talk will be about 30 minutes long and followed by discussion.

Speaker: Nancy Briggs, Senior Statistical Consultant, UNSW Stats Central

Date:       Thursday 19 November 2020

Time:       2.30-3.30pm

Location: Online

Slides for the presentation are HERE

View video

2020 - October

Methods for ranking and quantifying the importance of predictor variables

We all love to rank and order things. A common question after regression is “how do I know which variable is affecting my response variable more?”. For example, do extremes of temperature matter more than the mean (for plant growth)? Does sociodemographic index matter more than high school grade (for graduate outcomes)?

We discuss the options for answering this question, which depend on the model type and include R²-partitioning methods, regression coefficients and model averaging.

The talk will be about 30 minutes long and followed by discussion.

Speaker: Eve Slavich, Statistical Consultant, UNSW Stats Central

Date:       Thursday 22 October 2020

Time:       2.30-3.30pm

Location: Online

Slides for the presentation are HERE

View video

2020 - September

Visualise high dimensional data with the tourr R package

When we have many variables, we are told the only way to visualize them is two or maybe three at a time with scatterplots and boxplots. Turns out that is baloney. The tourr package in R uses a technique called projection pursuit which allows you to visualise datasets containing 5, 10, even 20 dimensions. It feels a bit like walking around your data, hence the name tour. Touring your data lets you explore clusters in high dimensions, look at variable importance and dependence between variables, and see outliers. If you know how to use R then touring your data is very simple.

The talk will be about 30 minutes long and followed by discussion.

Speaker: Gordana Popovic, Statistical Consultant, UNSW Stats Central

Date:       Thursday 17 September 2020

Time:       2.30-3.30pm

Location: Online

Slides for the presentation are HERE

View video

2020 - August

Variable selection: too many variables? what next?

There are various techniques for finding the most parsimonious statistical regression model. Not all methods can be recommended without any qualification. This talk will look at a number of variable selection methods and suggest some general guidelines.

The talk will be about 30 minutes long and followed by discussion.

Speaker: Peter Geelan-Small, Statistical Consultant, UNSW Stats Central

Date:       Thursday 13 August 2020

Time:       2.30-3.30pm

Location: Online

Slides for the presentation are HERE

View video

2020 - July

Residuals in linear models: more than just what’s left over

Researchers commonly use linear models in analysing their data. How do we know that a model we use is adequate and appropriate? Residuals, the differences between what we see in our data and what a model predicts, can be used to diagnose problems with our model and lead us to improve our analysis. This talk will provide an overview of the types of residuals in a general context, including general and generalized linear models.

The talk will be about 30 minutes long and followed by discussion.

Speaker: Nancy Briggs, Senior Statistical Consultant, UNSW Stats Central

Date:       Thursday 16 July 2020

Time:       2.30-3.30pm

Location: Online

Slides for the presentation are HERE

View video

 

2020 - June

Effect size: p value is not enough, measures of magnitude matters!

With the recognition that a p-value is not enough for a research inquiry, it is important to report measures of magnitude, in the form of effect sizes, in the results. Effect sizes are also often needed in sample size calculation and meta-analysis. In this talk, we will explain what effect size is, the types of effect size, and how to define and calculate it under different scenarios, with examples.

The talk will be about 30 minutes long and followed by discussion.

Speaker: Zhixin Liu, Statistical Consultant, UNSW Stats Central

Date:       Thursday 11 June 2020

Time:       2.30-3.30pm

Location: Online

Slides for the presentation are HERE
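As one small example of the kind of effect size the abstract above refers to, the sketch below computes Cohen's d, the difference between two group means divided by their pooled standard deviation. Cohen's d is chosen here only as a familiar illustration; the seminar may have covered other measures, and the data and function name are invented.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Standardised mean difference (group_b minus group_a) using the pooled SD."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * statistics.variance(group_a)
                  + (nb - 1) * statistics.variance(group_b)) / (na + nb - 2)
    return (statistics.mean(group_b) - statistics.mean(group_a)) / math.sqrt(pooled_var)


# Hypothetical scores for a control group and a treatment group
control = [1, 2, 3, 4, 5]
treatment = [3, 4, 5, 6, 7]

d = cohens_d(control, treatment)
print(f"Cohen's d = {d:.3f}")
```

Unlike a p-value, this number does not shrink or grow just because the sample size changes: it describes how far apart the groups are in standard deviation units.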

View video

 

2020 - May

Mind your Ps: How to use and interpret p-values

Various branches of science are experiencing a "reproducibility crisis", with the (mis)use of p-values being widely identified as a major factor. As a result, the American Statistical Association released an official statement containing six principles underlying the proper use and interpretation of p-values. We will discuss each of the six principles and provide some advice on applying them in your work.

The talk will be about 30 minutes long and followed by discussion.

Speaker: Mark Donoghoe, Statistical Consultant, UNSW Stats Central

Date: Thursday 14 May 2020

Time: 2.30-3.30pm

Location: Online

Slides for the presentation are HERE

View video

 

2020 - April

Now there are two of them! Why you shouldn't dichotomise your variables

In many fields, it is common practice to dichotomise continuous variables prior to analysis. It may seem like a good idea that makes your life easier, but did you know that your results may suffer? We will take a closer look at the effect that dichotomising variables may have on results and discuss alternatives.

The talk will be about 30 minutes long and followed by discussion.

Speaker: Peter Humburg, Statistical Consultant, UNSW Stats Central

Date: Thursday 9 April 2020

Time: 2.30-3.30pm

Location: Online

Slides for the presentation are HERE

View video

 

2020 - February

Paired Data

Paired data crops up in many contexts. When two measurements are made on the same experimental unit, the data values are paired - for example, pre- and post-activity with the same subjects; measurements from two litter mates; data from each hand of a person. Paired data is clearly not independent and this dependence must be accounted for in any statistical analysis. This talk will look at how to analyse some examples of paired data, including continuous and discrete outcome measures.

The talk will be about 30 minutes long and will be followed by time for discussion. We'll move on after our talk to Hacky Hour (Penny Lane Café, 3.00 pm to 4.00 pm) where people can get help on statistics and bioinformatics, high performance computing and other data-related things.

Speaker: Peter Geelan-Small, Statistical Consultant

Date and Time:  Thursday, 13 February 2020, 2.30 pm to 3.00 pm

Location:  Mathews Theatre C (K-D23-303) | Kensington Campus

Slides for the presentation are HERE.

 

2019 - November

2019 in Graphs

As we come towards the end of 2019, we will look back on the year that was – politics, trade wars, climate change and S25 – through the lens of data visualisation!

The talk will be about 30 minutes long and will be followed by time for discussion. We'll move on after our talk to Hacky Hour (Penny Lane Café, 3.00 pm to 4.00 pm) where people can get help on statistics and bioinformatics, high performance computing and other data-related things.

Speaker: Prof. David Warton, Director of Stats Central. David is a professor of statistics specialising in ecological and environmental statistics.

Date and Time:  Thursday, 14th November 2019, 2.30 pm to 3.00 pm

Location:  Central Lecture Block, Theatre 2, Kensington Campus, UNSW Sydney

Slides for the presentation are HERE.

 

2019 - October

The Odd Thing About Odds Ratios

The odds ratio is a commonly used effect size measure for binary responses, but it has attracted some criticism. This seminar will attempt to demystify odds ratios, discuss their advantages and disadvantages, and present possible alternatives.

The talk will be about 30 minutes long and will be followed by time for discussion. We'll move on after our talk to Hacky Hour (Penny Lane Café, 3.00 pm to 4.00 pm) where people can get help on statistics and bioinformatics, high performance computing and other data-related things.

Speaker:   Mark Donoghoe, Statistical Consultant, Stats Central

Date and Time:  Thursday, 10th October 2019, 2.30 pm to 3.00 pm

Location:  Central Lecture Block, Theatre 2, Kensington Campus, UNSW Sydney

Slides for the presentation are HERE.
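For readers who want to see the odds ratio next to the kind of alternatives the abstract above alludes to, here is a minimal sketch that computes the odds ratio, the risk ratio and the risk difference from the same two-group summary. The counts and the function name are invented for illustration, and the risk ratio and risk difference are assumed examples of alternatives rather than a record of what the seminar presented.

```python
def two_by_two_summary(events_a, n_a, events_b, n_b):
    """Compare group A with group B for a binary outcome (illustrative helper)."""
    risk_a = events_a / n_a
    risk_b = events_b / n_b
    odds_a = risk_a / (1 - risk_a)  # odds = p / (1 - p)
    odds_b = risk_b / (1 - risk_b)
    return {
        "odds_ratio": odds_a / odds_b,
        "risk_ratio": risk_a / risk_b,
        "risk_difference": risk_a - risk_b,
    }


# Hypothetical trial: 20 of 100 events in group A, 10 of 100 in group B
summary = two_by_two_summary(20, 100, 10, 100)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

In this example the odds ratio is further from 1 than the risk ratio, which is one source of confusion when an odds ratio is read as if it were a ratio of risks.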

2019 - September

The wonderful geometry of regression models, conditional relationships and "controlling for" variables

Have you ever wondered how regression models can “control for” variables when assessing effects? This seminar will attempt to demystify this concept by explaining underlying geometry with pictures and demonstrations.

The talk will be about 30 minutes long and will be followed by time for discussion. We'll move on after our talk to Hacky Hour (Penny Lane Café, 3.00 pm to 4.00 pm) where people can get help on statistics and bioinformatics, high performance computing and other data-related things.

Speaker:   Gordana Popovic, Statistical Consultant, Stats Central. 

Date and Time:  Thursday, 19th September 2019, 2.30 pm to 3.00 pm

Location:  Mathews Theatre C, Kensington Campus (D23), UNSW Sydney

Slides for the presentation are HERE.

 

2019 - August

No difference does not imply equivalence: misuse of P values in equivalence/non-inferiority testing

There has been growing interest in studies to determine if new therapies have equivalent or non-inferior efficacy compared with standard therapy. These studies are called equivalence/non-inferiority studies. This talk will describe the concepts and statistical methods involved in testing equivalence/non-inferiority, and how this differs from superiority testing. We will demonstrate with examples the setup of specific margins and null hypotheses, the use and interpretation of confidence intervals, and how to avoid the misuse of P values in equivalence/non-inferiority testing.

The talk will be about 30 minutes long and will be followed by time for discussion. We'll move on after our talk to Hacky Hour (Penny Lane Café, 3.00 pm to 4.00 pm) where people can get help on statistics and bioinformatics, high performance computing and other data-related things.

Speaker:   Zhixin Liu, Statistical Consultant, Stats Central. 

Date and Time:  Thursday, 8th August 2019, 2.30 pm to 3.00 pm. Please note the new starting time for our seminars!

Location:  Central Lecture Block, Theatre 5, Kensington Campus, UNSW Sydney

Slides for the presentation are HERE.

 

2019 - July

Dealing with missing data in your research

Missing data occurs in almost all research, even well-designed and controlled studies. Missing data can reduce power and result in a biased estimate of your effect of interest. In this talk, I will review the mechanisms that give rise to missing data.  I will also discuss some of the strategies available to address missingness, such as value substitution and deletion and more advanced methods such as imputation and maximum likelihood.

The talk will be about 30 minutes long.

Speaker:  Nancy Briggs, Senior Statistical Consultant and Manager, Stats Central. 

Date and Time:  Thursday, 11th July 2019, 2.30 pm to 3.00 pm. Please note the new starting time for our seminars!

Location:  Central Lecture Block, Theatre 1, Kensington Campus, UNSW Sydney

Slides for the presentation are HERE.

 

2019 - June

Visualising data - making sure your graph is worth 1,000 words

How can you make a picture of your data to make its message clear? You need good graphs to analyse data well and communicate your results effectively. This talk focusses on principles of effective data visualisation. Graphing principles and different types of graphs will be demonstrated using the R statistical package, but the basic principles apply to making graphs in any software package.

The talk will be about 30 minutes long and followed by discussion.

Speaker: Peter Geelan-Small, Statistical Consultant, Stats Central.

Date and Time:  Thursday, 13 June 2019, 3-4 pm

Location:  Central Lecture Block (E19), Theatre 6, Kensington Campus UNSW Sydney

Slides for the presentation are HERE.

2019 - May

Three C's of causal consideration: confounding, collinearity and colliders

A useful talk about deciding on the inclusion/exclusion of variables based on certain causal relationships or large levels of correlation.

The talk will be about 30 minutes long and followed by discussion.

Speaker:  Ben Maslen, Statistical Consultant. Ben works at the interface of statistics and ecology using a wide variety of statistical techniques and has particular expertise in models for multivariate abundance data.

Date and Time:  Thursday, 23 May 2019, 3-4 pm

Location:  Central Lecture Block (E19), Theatre 3, Kensington Campus UNSW Sydney

Slides for the presentation are HERE.

2019 - April

How do you deal with count data?

We will talk about techniques for dealing with data obtained when we count things and various properties that these types of data have. Some topics we will discuss are mean-variance relationships, overdispersion and underdispersion, as well as a variety of models for data obtained from counts (such as Poisson and negative binomial models and models for binomial successes).

The talk will be about 30 minutes long and followed by discussion.

Speaker:  Ben Maslen, Statistical Consultant. Ben works at the interface of statistics and ecology using a wide variety of statistical techniques and has particular expertise in models for multivariate abundance data.

Date and Time:  Thursday, 11 April 2019, 3-4 pm

Location:  Central Lecture Block (E19), Theatre 1, Kensington Campus UNSW Sydney

Slides for the presentation are HERE.

2019 - March

What can Data Science do for you?

Ever wonder how investment companies improve investment returns? What tools manufacturers use to improve their productivity? How e-commerce companies can increase their revenue? (Spoiler alert: they use Data Science!)

This is an introductory seminar on Data Science from a Computer Science perspective. Data science is a multi-disciplinary field that requires skills from mathematics, computer science and business. I will cover topics including:

  • The basics of Data Science
  • How to apply Data Science to your projects and
  • Requirements and challenges when using Data Science

The talk will be about 30 minutes long.

Speaker: Raymond Wong, Stats Central and Associate Professor at the School of Computer Science and Engineering. Raymond's research interests include: big data management, XML and semi-structured data, data mining and analytics, mobile technologies and service computing

Date and Time:  Tuesday, 12th March 2019, 3.00 to 4.00 pm

Location:  NewSouth Global Theatre, Webster Building (G14), Room 127 | UNSW Kensington Campus

Slides for the presentation are HERE

2019 - February

Analysis of Pretest-Posttest Data: It’s not as straightforward as you might think!

Experimental designs comparing group differences in change over two time points are common in many areas of research. A pre-post design is a simple way to test the effect of an intervention on mean outcomes, but the statistical analysis does pose some questions for the researcher. This talk will outline some of the analysis options available to analyse two-group, pre-post data, including repeated measures analysis of variance, analysis of covariance, change scores and mixed models.

The talk will be about 30 minutes long and will be followed by (free!) afternoon tea and time for discussion.

Speaker: Nancy Briggs, Senior Statistical Consultant and Manager, Stats Central. Nancy's research interests include: the application of latent variable models, multilevel models and related models to problems in public health, medical research and behavioural sciences; Bayesian statistics; clinical trials.

Date and Time:  Thursday, 14th February 2019, 3.00 to 4.00 pm

Location:  Ainsworth Building (J17), Room G03 | UNSW Kensington Campus

Important! Please register by Thursday, 14th February, 10.00 am, so we can cater properly for afternoon tea!

Slides for the presentation are HERE