Statistical Analysis for Small and Big Data

An interactive Compendium based on Reproducible Computing

Start here to understand how to use the handbook effectively and what practical skills you will build across chapters.

Author

Patrick Wessa

Published

24 March, 2026

This work is provided “as is”, without warranty of any kind, express or implied, including but not limited to warranties of merchantability, fitness for a particular purpose, and noninfringement.

All rights reserved. No part of this publication may be reproduced, distributed, or transmitted in any form or by any means, including photocopying, recording, or other electronic or mechanical methods, without the prior written permission of the publisher, except in the case of brief quotations embodied in critical reviews and certain other noncommercial uses permitted by copyright law.

Preface

Science never solves a problem without creating ten more.

– George Bernard Shaw

This handbook began as teaching material: a structured guide for students to learn statistical analysis through explanation, examples, and practice. Over time it evolved into something more open. While it still serves the classroom, it now aims to be a public, web-native compendium for anyone who wants to learn or apply statistics.

A large part of the content (its ideas, tone, and structure) reflects my own work. But I have chosen to make this a hybrid handbook: one that explicitly allows different LLMs to help improve existing sections and add new material, code, and examples. The result is not a fixed textbook but a living document that can grow and adapt.

I see this as an experiment in collaboration between human and artificial intelligence: a way to build an interactive compendium that is more complete, more practical, and more responsive than a traditional book. The goal is to combine careful human judgment with the speed and breadth of modern AI, while keeping transparency and reproducibility at the core.

What this handbook focuses on:

core statistical concepts (descriptive, inferential, modeling, and time series),
practical methods with real data examples,
reproducible workflows that link explanations to computation,
interactive tools (Shiny apps) that allow immediate experimentation.

These topics were chosen because they form the smallest set of ideas and tools that reliably supports real statistical work, from basic exploration to modeling and decision-making.

I would like to thank the many people who helped me create earlier versions of this handbook: everyone who supported, edited, proofread, and tested those drafts. I also thank two LLMs (Claude and ChatGPT) for their tireless bug-fixing, typo-hunting, refactoring, and the occasional late-night rescue mission.

24 March, 2026, Patrick Wessa