1  The Recordings

Modified

April 26, 2026

Open the four .mat files, walk through the eleven channels, and explain what each one carries — including a small puzzle the data documentation doesn’t resolve.

1.1 What’s inside a .mat file

Every session has two top-level fields: fs (the sampling rate) and y (the data, channels × time). There are eleven channels sampled at 256 Hz, and each session runs for about three and three-quarter minutes (225 to 230 s), consistent with the roughly four-minute runs reported in Guger et al. (2012).

What each row carries:

  • CH1 is a linear ramp — the sample-time index, not a recorded signal.
  • CH2–CH9 are the eight EEG channels, five parieto-occipital and three occipital (PO7, PO3, POz, PO4, PO8, O1, Oz, O2).
  • CH10 sits at zero between trials and steps to one of {9, 10, 12, 15} during stimulation, encoding the active LED frequency in Hz.
  • CH11 is g.tec’s LDA prediction — our baseline competitor for everything that follows.
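
The row layout listed above can be checked programmatically rather than by eye. The following is a minimal sketch, not part of the original analysis: check_layout is a hypothetical helper, and the demo array is a synthetic stand-in for mat["y"], so nothing here depends on the real files.

```python
import numpy as np

def check_layout(y, fs=256):
    """Sanity-check the assumed row layout of a (channels x time) array."""
    ramp_steps = np.diff(y[0])
    return {
        "n_channels": y.shape[0],
        "duration_s": y.shape[1] / fs,
        # CH1 should grow by a constant step if it really is a time ramp.
        "ch1_is_ramp": bool(np.allclose(ramp_steps, ramp_steps[0])),
        # CH10 should only contain 0 plus the LED frequencies in Hz.
        "trigger_values": sorted(float(v) for v in np.unique(y[9])),
    }

# Synthetic stand-in for mat["y"]: 4 s of data with two stimulation blocks.
fs = 256
n = 4 * fs
demo = np.zeros((11, n))
demo[0] = np.arange(n) / fs      # CH1: ramp (units assumed, only linearity is tested)
demo[9, fs:2 * fs] = 15          # CH10: one 15 Hz block
demo[9, 3 * fs:] = 9             # CH10: one 9 Hz block
report = check_layout(demo, fs)
print(report)
```

On a real session one would pass mat["y"] and expect trigger_values to be 0 plus the four frequencies 9, 10, 12, and 15.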

One thing that isn’t in data_description.txt: CH11’s non-zero values are small integers — class indices, not frequencies. The class-to-frequency mapping comes from data/subject_1_fvep_led_training_1_result2d.PNG, which shows the g.tec analysis tool with the four classes laid out as 1 → 15 Hz, 2 → 12, 3 → 10, 4 → 9 (highest class index → lowest frequency). Always audit every file the organizers shipped, not just the obvious-looking text one.
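
To compare CH11 against CH10 the class indices have to be translated into Hz first. A minimal sketch of that translation, using the mapping read off the screenshot above; the names CLASS_TO_HZ and classes_to_hz are illustrative, and the code assumes CH11 only ever contains the integers 0 to 4.

```python
import numpy as np

# Mapping read off the g.tec screenshot: class index -> LED frequency in Hz.
CLASS_TO_HZ = {1: 15, 2: 12, 3: 10, 4: 9}

def classes_to_hz(ch11):
    """Convert CH11 class indices to Hz; 0 (no output) stays 0."""
    ch11 = np.asarray(ch11).astype(int)
    # Build a lookup table so the conversion is a single indexing operation.
    lookup = np.zeros(max(CLASS_TO_HZ) + 1, dtype=int)
    for cls, hz in CLASS_TO_HZ.items():
        lookup[cls] = hz
    return lookup[ch11]

converted = classes_to_hz([0, 1, 2, 3, 4]).tolist()
print(converted)
```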

1.2 The four recordings

The dataset only ships four files, so we’ll plot all of them. EEG panels are clipped to ±50 µV (clinical-EEG display convention); the amplifier-startup transient at t = 0 is left visible as a truncated spike.

CH11 also fires intermittently — it only outputs a class when the LDA’s confidence (presumably) crosses a threshold, so per-sample agreement against CH10 isn’t the right way to evaluate the baseline. That needs trial-level evaluation, deferred to Ch 7.
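
To make the idea of trial-level evaluation concrete, here is one possible sketch: cut the trigger into trials, then take a majority vote over the non-zero CH11 samples inside each trial. The majority-vote rule and the helper names are assumptions for illustration, not necessarily what Ch 7 does, and the code relies on CH10 returning to zero between trials.

```python
import numpy as np

def trial_bounds(trigger):
    """Return (start, stop) sample indices of each non-zero run; stop is exclusive."""
    active = np.asarray(trigger) != 0
    # Pad with False on both sides so runs touching the edges are still closed.
    padded = np.concatenate(([False], active, [False]))
    edges = np.flatnonzero(np.diff(padded.astype(int)))
    return list(zip(edges[::2], edges[1::2]))

def trial_votes(trigger, pred):
    """For each trial, return (true_hz, majority class or None if CH11 never fired)."""
    trigger = np.asarray(trigger)
    pred = np.asarray(pred).astype(int)
    votes = []
    for start, stop in trial_bounds(trigger):
        fired = pred[start:stop]
        fired = fired[fired != 0]          # ignore samples with no LDA output
        vote = int(np.bincount(fired).argmax()) if fired.size else None
        votes.append((int(trigger[start]), vote))
    return votes

# Toy example: two trials; the classifier fires in the first and stays silent in the second.
trigger = [0, 0, 15, 15, 15, 15, 0, 0, 9, 9, 9, 9, 0]
pred    = [0, 0,  0,  1,  1,  3, 0, 0, 0, 0, 0, 0, 0]
votes = trial_votes(trigger, pred)
print(votes)
```

A trial with no output at all is reported as None rather than as an error, which keeps "did not answer" separate from "answered wrongly".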

Code
from pathlib import Path
import numpy as np
import scipy.io
import matplotlib.pyplot as plt

DATA_DIR = Path("data")

LABELS = [
    "CH1: sample time",
    "CH2: PO7", "CH3: PO3", "CH4: POz", "CH5: PO4",
    "CH6: PO8", "CH7: O1",  "CH8: Oz",  "CH9: O2",
    "CH10: trigger (Hz)",
    "CH11: g.tec LDA",
]

def plot_session(path, out_dir=Path("images")):
    """Print a session summary and plot every channel against time."""
    mat = scipy.io.loadmat(path)
    fs = int(mat["fs"][0, 0])
    y = mat["y"]  # channels × time
    n_channels, n_samples = y.shape
    t = np.arange(n_samples) / fs

    print(f"File:          {path.name}")
    print(f"Sampling rate: {fs} Hz")
    print(f"Shape:         {y.shape}  ({n_channels} channels × {n_samples} samples)")
    print(f"Duration:      {n_samples / fs:.1f} s")

    # One panel per channel, all sharing the time axis.
    fig, axes = plt.subplots(n_channels, 1, figsize=(14, 11), sharex=True)
    for i, ax in enumerate(axes):
        ax.plot(t, y[i, :], lw=0.4)
        ax.set_ylabel(LABELS[i], fontsize=8, rotation=0, ha="right", va="center")
        ax.tick_params(labelsize=7)
        # Rows 1–8 are the EEG channels (CH2–CH9); clip them to ±50 µV.
        if 1 <= i <= 8:
            ax.set_ylim(-50, 50)
    axes[-1].set_xlabel("Time (s)")
    fig.suptitle(path.name, fontsize=10)
    fig.tight_layout()
    # Create the output folder first; savefig fails if it does not exist.
    out_dir.mkdir(parents=True, exist_ok=True)
    fig.savefig(out_dir / f"01-recordings_{path.stem}.png", dpi=200, bbox_inches="tight")
    plt.show()

1.2.1 Subject 1 x Run 1

Code
plot_session(DATA_DIR / "subject_1_fvep_led_training_1.mat")
File:          subject_1_fvep_led_training_1.mat
Sampling rate: 256 Hz
Shape:         (11, 57728)  (11 channels × 57728 samples)
Duration:      225.5 s
Figure 1.1: Subject 1, run 1.

Twenty stimulation blocks on CH10, evenly split across the four frequencies. CH11 is the standout: it only ever outputs class 3 (10 Hz under the mapping above), regardless of which LED is flashing. A constant output cannot discriminate between stimuli, so the LDA is effectively broken in this session. We’ll come back to it in Ch 7 when we evaluate the baseline trial by trial.
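
The block count does not have to be read off the plot. A minimal sketch that counts stimulation blocks per frequency by looking for rising edges on the trigger; blocks_per_frequency is a hypothetical helper, and the toy trigger below stands in for y[9].

```python
from collections import Counter

import numpy as np

def blocks_per_frequency(trigger):
    """Count stimulation blocks per frequency from a CH10-style trigger row."""
    trigger = np.asarray(trigger)
    # A block starts where the trigger is non-zero and the previous sample was zero.
    previous = np.concatenate(([0], trigger[:-1]))
    onsets = np.flatnonzero((trigger != 0) & (previous == 0))
    return Counter(int(trigger[i]) for i in onsets)

# Toy trigger: two 15 Hz blocks and one 9 Hz block.
toy_trigger = [0, 15, 15, 0, 0, 9, 9, 9, 0, 15, 15, 0]
counts = blocks_per_frequency(toy_trigger)
print(dict(counts))
```

On a real session the expectation from the text is five blocks for each of the four frequencies.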

1.2.2 Subject 1 x Run 2

Code
plot_session(DATA_DIR / "subject_1_fvep_led_training_2.mat")
File:          subject_1_fvep_led_training_2.mat
Sampling rate: 256 Hz
Shape:         (11, 58112)  (11 channels × 58112 samples)
Duration:      227.0 s
Figure 1.2: Subject 1, run 2.

Same trial structure — twenty balanced blocks. CH11 now fires three of the four classes; class 2 (12 Hz, per the mapping above) never appears, even though the trigger does step to 12 Hz five times. The LDA isn’t broken in the same way as run 1, but it’s still missing a class.
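
Which classes CH11 emits, and how often it fires at all, can be summarized in a few lines. This is a sketch under the assumption that CH11 holds 0 for "no output" and the integers 1 to 4 otherwise; ch11_summary is an illustrative name, and the toy vector stands in for y[10].

```python
import numpy as np

def ch11_summary(pred, n_classes=4):
    """Report which classes appear in CH11 and what fraction of samples is non-zero."""
    pred = np.asarray(pred).astype(int)
    fired = pred[pred != 0]
    present = sorted(set(fired.tolist()))
    missing = [c for c in range(1, n_classes + 1) if c not in present]
    return {
        "present": present,
        "missing": missing,
        "fired_fraction": fired.size / pred.size,
    }

# Toy vector in which class 2 never appears.
summary = ch11_summary([0, 0, 1, 1, 3, 0, 4, 0])
print(summary)
```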

1.2.3 Subject 2 x Run 1

Code
plot_session(DATA_DIR / "subject_2_fvep_led_training_1.mat")
File:          subject_2_fvep_led_training_1.mat
Sampling rate: 256 Hz
Shape:         (11, 58757)  (11 channels × 58757 samples)
Duration:      229.5 s
Figure 1.3: Subject 2, run 1.

CH11 covers all four classes here and fires for a much larger fraction of samples than in either subject-1 session, so the LDA is at least producing its full range of outputs in this run. The flip side is poor data quality. PO3 and O1 hit the ±50 µV clip throughout, and Oz and O2 degrade further past the ~120 s mark: by the second half of the session their panels are essentially solid blocks of clipped signal. Only POz, PO4, and PO8 stay reasonably contained. Sustained clipping on that many channels usually points to electrode contact issues (high impedance, sweat bridges) rather than brain signal, which is why this run isn’t the running example for later chapters.
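
The visual impression of clipping can be backed with a number: the fraction of samples per channel at or beyond the ±50 µV display limit, split into first and second half of the session. The sketch below assumes the EEG rows are in µV and uses synthetic noise in place of y[1:9]; clip_fraction and clip_report are hypothetical helpers.

```python
import numpy as np

EEG_NAMES = ["PO7", "PO3", "POz", "PO4", "PO8", "O1", "Oz", "O2"]

def clip_fraction(eeg, limit_uv=50.0):
    """Fraction of samples per channel whose magnitude reaches the display limit."""
    return (np.abs(eeg) >= limit_uv).mean(axis=1)

def clip_report(eeg, limit_uv=50.0):
    """Per-channel clip fraction for the first and second half of the recording."""
    half = eeg.shape[1] // 2
    first = clip_fraction(eeg[:, :half], limit_uv)
    second = clip_fraction(eeg[:, half:], limit_uv)
    return {name: (float(a), float(b)) for name, a, b in zip(EEG_NAMES, first, second)}

# Synthetic stand-in for y[1:9]: quiet noise, with Oz made noisy in the second half.
rng = np.random.default_rng(0)
eeg = rng.normal(0.0, 10.0, size=(8, 1000))
eeg[6, 500:] *= 20
clip = clip_report(eeg)
for name, (first_half, second_half) in clip.items():
    print(f"{name}: first half {first_half:.2f}, second half {second_half:.2f}")
```

A channel whose second-half fraction is far above its first-half fraction matches the "degrades past the midpoint" pattern described above.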

1.2.4 Subject 2 x Run 2

Code
plot_session(DATA_DIR / "subject_2_fvep_led_training_2.mat")
File:          subject_2_fvep_led_training_2.mat
Sampling rate: 256 Hz
Shape:         (11, 57697)  (11 channels × 57697 samples)
Duration:      225.4 s
Figure 1.4: Subject 2, run 2.

Twenty trials, all four classes on CH11, EEG channels mostly quiet within the clip. The cleanest of the four for downstream work — and the one we’ll lean on as the running example in later chapters.