I recently got a Raspberry Pi 4 to use as a Pi-hole and wanted to find out whether it would benefit from heat sinks. Most answers online agree that they are not needed but don’t hurt. What I couldn’t find was a proper before-and-after comparison under real-life conditions, so I ran one myself.

Experiment Setup

I run the Raspberry Pi in the official case and manage it headless over SSH. I prepared a script and a cron job to log the CPU temperature every 10 minutes and append the result to a CSV file. After running the Pi-hole without heat sinks for over a month, I applied thermal tape and heat sinks to the main components. I left it running for a similar amount of time and then compared the measurements.
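The logging script itself is not shown in this post, so here is a minimal sketch of what such a logger could look like. It assumes the temperature is read from the Linux sysfs file /sys/class/thermal/thermal_zone0/temp, which reports millidegrees Celsius, and that rows are appended to cpu_temp.csv without a header. The function names and the script location are hypothetical.

```python
from datetime import datetime
from pathlib import Path

# Assumption: the SoC temperature is exposed here in millidegrees Celsius
SENSOR_PATH = Path("/sys/class/thermal/thermal_zone0/temp")
# Assumption: same file name as the CSV loaded in the analysis
LOG_PATH = Path("cpu_temp.csv")


def read_cpu_temp(sensor_path=SENSOR_PATH):
    """Return the CPU temperature in degrees Celsius."""
    millidegrees = int(Path(sensor_path).read_text().strip())
    return millidegrees / 1000


def log_temperature(sensor_path=SENSOR_PATH, log_path=LOG_PATH):
    """Append one 'timestamp,temperature' row (no header) to the CSV log."""
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    temperature = read_cpu_temp(sensor_path)
    with open(log_path, "a") as log_file:
        log_file.write(f"{timestamp},{temperature:.3f}\n")


# On the Pi, the script would end with a call such as:
# log_temperature()
```

Saved as, say, /home/pi/log_temp.py with the final call enabled, a crontab entry like `*/10 * * * * python3 /home/pi/log_temp.py` would run it every 10 minutes.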

no_heatsink.jpg Raspberry Pi without heat sinks

with_heatsink.jpg Raspberry Pi with heat sinks


I got 11 880 measurements in total, each with a timestamp and a temperature.

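# Notebook cell: install the libraries used below
!pip install --upgrade --quiet numpy pandas scipy
import numpy as np
import pandas as pd
from scipy.stats import mannwhitneyu
# The log has no header row: column 0 is the timestamp, column 1 the temperature in °C
df = pd.read_csv("cpu_temp.csv", names=["time", "temperature"], parse_dates=[0])
df.sample(5)  # peek at a few random rows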
                     time  temperature
6911  2021-06-05 20:30:01       58.426
8925  2021-06-19 20:10:01       66.218
4317  2021-05-18 20:00:01       61.835
2480  2021-05-06 01:50:01       59.887
5833  2021-05-29 08:50:02       56.478

When applying the heat sinks, I noted the time so that I would know where to split the logs. There are 5 721 measurements without heat sinks and 6 159 with them.

first_heatsink_ts = pd.Timestamp("2021-05-28 14:10:02")
no_heatsink_df = df[df["time"] < first_heatsink_ts]
with_heatsink_df = df[df["time"] >= first_heatsink_ts]
Temperature summary (° C):

        without heat sinks   with heat sinks
count          5721.000000       6159.000000
mean             60.055207         59.664731
std               1.695518          2.435301
min              53.556000         43.329000
25%              58.913000         57.939000
50%              59.887000         59.400000
75%              60.861000         61.348000
max              67.679000         67.679000

The average temperature is indeed lower with the heat sinks (59.7° C vs 60.1° C). As I have several thousand data points, I can check whether the difference is also statistically significant. An appropriate test in this case is the Wilcoxon rank-sum test (also known as the Mann–Whitney U test, not to be confused with the Wilcoxon signed-rank test), which compares two independent samples without assuming that the data are normally distributed.

The p-value of the one-sided test turned out to be basically zero, so I reject the null hypothesis and feel confident that the difference is not due to chance.

res = mannwhitneyu(no_heatsink_df["temperature"], with_heatsink_df["temperature"], alternative="greater")
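A p-value alone says little about how large the difference is. As a rough illustration, the sketch below runs the same one-sided test on synthetic data and also estimates the probability that a random reading from the first sample is higher than a random reading from the second, computed from the rank sums. The arrays `before` and `after` are made-up stand-ins, not my measurements, and `prob_superiority` is a hypothetical helper name.

```python
import numpy as np
from scipy.stats import mannwhitneyu, rankdata


def prob_superiority(x, y):
    """Estimate P(X > Y) + 0.5 * P(X == Y) from the rank sum of x."""
    x = np.asarray(x)
    y = np.asarray(y)
    ranks = rankdata(np.concatenate([x, y]))  # ties get average ranks
    # U statistic of the first sample: its rank sum minus the minimum possible
    u_x = ranks[: len(x)].sum() - len(x) * (len(x) + 1) / 2
    return u_x / (len(x) * len(y))


# Synthetic stand-ins for the two temperature series (NOT the measured data)
rng = np.random.default_rng(seed=0)
before = rng.normal(loc=60.0, scale=1.7, size=5000)
after = rng.normal(loc=59.6, scale=2.4, size=5000)

res = mannwhitneyu(before, after, alternative="greater")
effect = prob_superiority(before, after)
print(f"p-value: {res.pvalue:.3g}, P(before > after): {effect:.3f}")
```

A value near 0.5 means the two samples overlap almost completely, even when the p-value is tiny because of the large sample size.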


The temperature difference is statistically significant but practically insignificant. For low loads such as the Pi-hole, the temperature stays around a reasonable 60° C. A difference of less than one degree Celsius is not worth the hassle of applying the heat sinks, although it might be for heavier loads.


While I am happy with the results, a more scientifically accurate experiment would use multiple machines running in similar conditions. The weaknesses of my setup include:

  • As I started the experiment in the spring and finished in the summer, the temperature could have been affected by the weather and the heating in my apartment.
  • The measurements with the heat sinks were taken later and therefore with more up-to-date and possibly more efficient software versions.
  • I might have placed the thermal tape and the heat sinks poorly.
  • There are different kinds of cases and thermal management solutions with possibly different results.
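One way to get a feeling for the first weakness in the list above is to look at daily averages on both sides of the split: a gradual drift over the weeks would point to the season rather than the heat sinks. The sketch below shows the idea on synthetic data. The `demo` frame and the `split` timestamp are made up for illustration, and `daily_means` is a hypothetical helper; the same function could be applied to the real `df`.

```python
import numpy as np
import pandas as pd


def daily_means(frame):
    """Average the 10-minute readings per calendar day."""
    return frame.set_index("time")["temperature"].resample("D").mean()


# Synthetic stand-in for the log: 14 days of readings every 10 minutes
rng = np.random.default_rng(seed=1)
times = pd.date_range("2021-05-20", periods=6 * 24 * 14, freq="10min")
demo = pd.DataFrame(
    {"time": times, "temperature": 60 + rng.normal(0, 1.5, len(times))}
)

means = daily_means(demo)
split = pd.Timestamp("2021-05-27")  # made-up split point for the demo
print("daily means before split:")
print(means[means.index < split].round(2))
print("daily means after split:")
print(means[means.index >= split].round(2))
```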
tags: Data   Python   Statistics