Abstract

This report seeks to answer the following question: Do black vehicles have a greater mean wheel diameter than white vehicles? The underlying, safe assumption in this regard is that there is no significant difference in mean wheel diameter between black vehicles and white vehicles (null-hypothesis). This paper reports on an observational, statistical investigation that was performed based on two independent, random samples for vehicles (one sample for black vehicles and the other for white vehicles). The samples were drawn from a very specific population of vehicles and great care was taken to ensure that the samples were as close as possible to a simple random samples (SRS). The samples’ distributions facilitated in utilising the t-test. The statistical investigation suggests a moderate relationship between the reported vehicle colours and mean wheel sizes. So, can the choice of vehicle colour lead the consumer to favour a different wheel size? This report suggests that it might not be the case.

Introduction

Picking a vehicle colour is never an easy choice and buyers are usually trapped between their emotions and their practicality with many of them wondering, “…should I go for the bad-ass black or the more sensible and safe white?” . According to news.com.au around 15% of all vehicles on Australian roads are black and 21% white . On the surface it seems to be a close split between the adventurous buyers and the practical buyers. Black is usually associated with high performance and/or luxurious vehicles as opposed to white vehicles being associated with average performance and/or standard vehicles. Apart from vehicle performance, there is also a safety aspect to consider – white is a more visible colour. Then, there are some practical aspects to consider – black vehicles tend to show dirt easier (or so it is perceived by the general populous*) and black vehicles have excessively warmer interiors since black doesn’t reflect any sunlight. The choice between a black or white coloured vehicle is a subtle social difference that may or may not lead to the average buyer’s choice of wheels fitted to the vehicle. This then, is also the main motivation for this investigation. From this background information and the general assumption that black vehicles have a greater average wheel diameter than white vehicles, the following hypotheses are drawn:

Null hypothesis: H0 : µblack = µwhite

Alternative hypothesis: HA : µblack > µwhite

µblack = mean rim diameter for black vehicles.

µwhite = mean rim diameter for white vehicles.

*The only vehicle which hides dirt the best is most likely a dirt coloured vehicle…? [Author’s note]

This document reports on a statistical method that evaluates the above stated hypotheses and, in the process, seeks to better understand the correlation between black and white vehicles and those two groups’ wheel sizes (diameter of the rim). Two random data samples (one for black vehicles and another for white vehicles) were collected by ways of observation and several measures were put in place to minimize bias and sampling errors.

Data

The following table lists the details of the two independent samples captured during the study.

Table 1 – Data Details

Details
Study type	Observational
Population	All motor vehicles parked in the parking lots of the Westfields Mall in Knox City on Wednesday morning, 1 August 2018, between 10am and 11am.
Sample sizes	Sample 1: 20 black vehicles. Sample 2: 20 white vehicles.
Variables	Explanatory variable: Vehicle colour – categorical (nominal). Response variables: Wheel rim diameter – quantitative (numerical).
Exclusion or groupings	None.
Covariates	Gender of vehicle owner (The person who bought the vehicle). Weather on the day of the sample being taken. Night time sampling. It may be considered that individuals with higher earning capacity visit malls in the evening purely for entertainment reasons. Higher earning individuals may be considered to have a different vehicle purchasing profile than lower earning individuals.
Data collectors	Person B: White vehicles. Person A: Black vehicles

The wheel rim diameter (response variable - b in Table 1) is read directly from the sampled vehicle’s tyre code. E.g. from the wheel code: 185/65R15, the R15 implies that the rim diameter is 15 inches. This then, is the number recorded for the sample. The following procedures were put in place to reduce bias. Firstly, a very specific geographical boundary was chosen for the population - Westfields mall parking lot at Knox City centre. Secondly, a narrow timeslot was set for sampling. It was impossible to ensure that the population did not change over the course of the data collection time slot (10am -11am). The only way to ensure that, would have been to lock down the parking lot for an hour. That would have caused panic, mayhem and widespread riots and probably landed the data capturers in custody. So, a small margin of population turnover, i.e. an error of approximately 3%, must be taken into consideration for vehicles entering and existing during the data collection time slot. The exact number of parking spaces at Westfields mall, as sourced from the centre’s management, is 6000. A random number generator was then utilized to generate 20 random numbers from a uniform distribution - one set for black vehicles and another set for white vehicles. The random numbers correlate to physical locations in the parking lots. If neither a black nor white vehicle was parked in the randomly chosen position, then the next random number is selected and so forth. For a parking lot with 6000 spaces it is impractical to try and capture more than 20 vehicles in 1 hour, hence the sample size of 20 was chosen. In this regard the experimental setup had to sacrifice specificity to enhance the SRS aspect of the samples.

Methods

The R-Studio analytical software package, version 1.1.453, was utilized to analyse the data. The captured results were entered into R-data-frames and initial EDA was done utilising the following R commands: c, table, qqnorm, hist, boxplot and summary. Initial EDA suggested that both the samples’ distributions were normal. Since two independent samples, which both conforms to normal distributions is used in this analysis, the t-test was chosen to determine whether there is a significant statistical difference between the two samples .

To reject the null-hypothesis and to avoid committing a type I error, a significance level of 5% is chosen (α = 0.05). By utilising the following R commands: mean, sd and var, both the samples’ means, standard deviations and variances were calculated. From these calculations it was noticed that the variances for the two samples differed and therefore the t.test function’s parameter: var.equal=false. The setup of the null-hypothesis resulted in setting the t.test function’s parameter: alternative=”greater”. From this statistical test method and by utilising the listed R commands a conclusive statistical result was determined.

Results and Discussion

From Table 1, which lists the basic statistical properties of the two samples, it can be observed that the sample for the black vehicles has a higher mean than the sample for the white vehicles and that the two samples have different variances. The first part of the latter observation adds support towards the alternative hypothesis.

Table 2 – Black and White Sample statistics

	Min	1st Qu	Median	Mean	3rd Qu	Max	Variance
White Vehicles	14.0	16.0	17.0	16.9	18.0	20.0	2.20
Black Vehicles	15.0	16.0	17.0	17.2	18.0	20.0	2.06

From Figure 1, it can be observed that the 1st and 3rd quartiles of both samples are very much the same and that there are no outliers for any of the two samples.

A screenshot of a social media post Description generated with very high confidence

Figure 1 – Black and White boxplots

From Figure 2, it can be observed that both samples have relatively normal distributions. From that observation it can be deduced that the larger population of motor vehicles from which the samples where collected, is also normally distributed .

A screenshot of a cell phone Description generated with very high confidence

Figure 2 – Black and White Quartile plots

From the results of the t-test, as listed in Table 3, it can be observed that a p-value of 0.2599 was calculated with 37.961 degrees of freedom (df = 37.961). This p-value is much larger than the initially set confidence level of α = 0.05. This is a strong indication to not reject the null-hypothesis. Therefore, there is no statistically significant difference between the two samples captured for this study.

Table 3 - T.Test Results

Parameter	Value
t	0.64979
df	37.961
p-value	0.2599
95% Confidence interval	-0.4784088 Inf

Conclusions

Even though the black vehicle sample had a higher mean value than the white vehicle sample, the t-test results conclusively showed that there is no real significant statistical difference between the two samples (p-value > α). Given the small sample sizes (20), it would be interesting to see what results a larger sample size would generate. Even though it seems that the general populous seems to think that black cars have larger wheel diameters than white cars, this statistical analysis suggests otherwise.

Appendix A

Samples

Black Vehicles	White Vehicles
No.	Random No.	Rim Diameter	No.	Random No.	Rim Diameter
1	196	17	1	110	16
2	436	15	2	816	17
3	767	15	3	2520	17
4	970	19	4	2613	19
5	1219	17	5	2753	17
6	1298	18	6	2928	15
7	1723	18	7	2944	14
8	2129	15	8	3073	19
9	2158	17	9	3507	17
10	2607	19	10	3986	16
11	3431	20	11	4356	18
12	3888	18	12	4410	17
13	4477	16	13	4692	20
14	4802	17	14	3892	15
15	5090	17	15	5038	17
16	5148	18	16	5102	16
17	5638	16	17	5310	16
18	5699	17	18	5504	18
19	5866	19	19	5884	18
20	5956	16	20	5972	16

Black-vs-White