Psychodemographics of Tea Purchase

  • IO
  • Monday, May 10, 2021

Project

The aim of this study is to gain a better understanding of tea consumption by examining how consumers' sustainability awareness shapes tea purchase decisions when tea is bought either as a gift or for household self-use.

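library(dplyr)
library(tidyr)
library(factoextra)
library(janitor)
library(DataExplorer)

# Read the SPSS file (haven is called through its namespace)
tea <- haven::read_sav("data set tea consumption-shared  data.sav")

# Keep the eight categorical columns (1-8) and the five factor-score columns (30-34)
tea <- tea %>% dplyr::select(1:8, 30:34)
tea <- janitor::clean_names(tea)

# Treat the coded categorical variables as factors
cols <- colnames(tea)[1:8]
tea[cols] <- lapply(tea[cols], factor)

# Dummy-code the factors so that every column is numeric
tea_dum <- DataExplorer::dummify(tea)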
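
To make the dummy-coding step concrete, here is a minimal base R sketch of what it does to a single factor column. It uses `model.matrix()` as a stand-in rather than `DataExplorer::dummify()`, so the resulting column names will not match those in `tea_dum`; the data frame `toy` and its values are invented for illustration.

```r
# Toy data (invented for illustration, not the tea survey)
toy <- data.frame(
  sex   = factor(c(1, 2, 2, 1)),
  score = c(-0.5, 0.2, 1.1, 0.0)
)

# "- 1" drops the intercept, so each level of sex gets its own 0/1 column.
# Note: with several factors, only the first one is fully expanded this way.
toy_dum <- as.data.frame(model.matrix(~ sex + score - 1, data = toy))

print(toy_dum)
```

Each row has exactly one of the indicator columns set to 1, while the numeric column passes through unchanged.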

EDA

Missing values

plot_intro(tea)

skimr::skim(tea)
Table 1: Data summary

Name                     tea
Number of rows           280
Number of columns        13
_______________________
Column type frequency:
  factor                 8
  numeric                5
________________________
Group variables          None

Variable type: factor

skim_variable     n_missing  complete_rate  ordered  n_unique  top_counts
purchase_purpose  0          1              FALSE    2         1: 156, 2: 124
sex               0          1              FALSE    2         2: 144, 1: 136
hometown          0          1              FALSE    5         3: 106, 2: 98, 1: 46, 5: 25
age               0          1              FALSE    6         3: 94, 2: 91, 1: 56, 4: 26
ethnic            0          1              FALSE    2         1: 266, 2: 14
education         0          1              FALSE    5         3: 148, 1: 70, 4: 43, 5: 11
job               0          1              FALSE    9         4: 76, 2: 51, 3: 47, 9: 41
monthly_income    0          1              FALSE    5         2: 102, 3: 82, 4: 39, 5: 31

Variable type: numeric

skim_variable     n_missing  complete_rate  mean  sd  p0     p25    p50    p75   p100  hist
fac1_sustain      0          1              0     1   -3.45  -0.52  0.21   0.77  1.55  ▁▂▃▇▇
fac2_brand        0          1              0     1   -2.80  -0.76  -0.04  0.81  1.82  ▁▅▇▇▆
fac3_prg          0          1              0     1   -3.36  -0.70  0.13   0.80  1.89  ▁▂▆▇▅
fac4_fashion      0          1              0     1   -3.04  -0.57  0.08   0.65  2.62  ▁▃▇▆▁
fac5_conformty    0          1              0     1   -3.27  -0.57  0.08   0.63  2.03  ▁▂▅▇▃
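
The `n_missing` and `complete_rate` columns can also be reproduced by hand, which is handy for a quick check without extra packages. The sketch below assumes a small invented data frame `toy`; it is not the tea data, which has no missing values according to the summary.

```r
# Toy data (invented) with one missing entry per column
toy <- data.frame(
  sex   = factor(c(1, 2, NA, 1)),
  score = c(0.3, NA, -1.2, 0.8)
)

n_missing     <- colSums(is.na(toy))         # missing values per column
complete_rate <- 1 - n_missing / nrow(toy)   # share of non-missing values

print(data.frame(n_missing, complete_rate))
```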

Clustering

WSS plot

factoextra::fviz_nbclust(tea_dum, kmeans, method = "wss")

Judging from the WSS plot, 4 seems to be an acceptable number of clusters.
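
For readers who want to see what the WSS criterion measures, the following base R sketch computes the total within-cluster sum of squares for a range of k on invented toy data. It is not the internals of `fviz_nbclust()`, only an illustration of the same idea; `toy` and `wss` are made-up names.

```r
set.seed(123)

# Toy data (invented): three well-separated 2-D blobs of 30 points each
toy <- rbind(
  matrix(rnorm(60, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(60, mean = 3, sd = 0.3), ncol = 2),
  matrix(rnorm(60, mean = 6, sd = 0.3), ncol = 2)
)

# Total within-cluster sum of squares for k = 1..6
wss <- sapply(1:6, function(k) {
  kmeans(toy, centers = k, nstart = 20)$tot.withinss
})

# WSS drops sharply up to the true number of groups, then flattens out
print(data.frame(k = 1:6, wss = round(wss, 1)))
```

Plotting `wss` against k gives the elbow curve; the point where the curve flattens is the usual candidate for the number of clusters.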

K-means

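set.seed(123)

# k-means needs numeric input, so cluster the dummy-coded data
# (the same data used for the WSS plot), not the factor columns in tea
tea_kmeans <- kmeans(tea_dum, 4, nstart = 50, iter.max = 100)

tea_dum$Clusters <- factor(tea_kmeans$cluster)

tea$Clusters <- factor(tea_kmeans$cluster)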
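
As a quick illustration of what the fitted object holds, here is a minimal sketch on invented numeric data. The names `toy` and `fit` are hypothetical; the calls themselves are standard base R.

```r
set.seed(123)

# Toy numeric data (invented): two clearly separated groups of 20 points
toy <- data.frame(
  f1 = c(rnorm(20, -2, 0.5), rnorm(20, 2, 0.5)),
  f2 = c(rnorm(20, -2, 0.5), rnorm(20, 2, 0.5))
)

# nstart = 50 reruns the algorithm from 50 random starts and keeps the best fit
fit <- kmeans(toy, centers = 2, nstart = 50, iter.max = 100)

print(fit$size)     # number of observations per cluster
print(fit$centers)  # cluster means on each variable

# Store the assignment as a factor so it is treated as a label, not a number
toy$Clusters <- factor(fit$cluster)
print(table(toy$Clusters))
```

Cluster numbers are arbitrary labels, so they can change between runs unless the seed is fixed.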

Plots

plot_bar(tea, by = "Clusters",
         order_bar = FALSE, by_position = "fill")

Correlation Heatmap

plot_correlation(tea_dum)

Purchase purpose 1 is related to pragmatism.

Purchase purpose 2 is related to brand and prestige chasing.
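
The kind of relationship read off the heatmap can be sketched by hand: a 0/1 indicator for each purchase purpose is correlated with a numeric score. The data below are invented, and `fac_score` is a hypothetical column name, so the numbers say nothing about the actual survey.

```r
# Toy data (invented): purpose coded 1/2 and one numeric score
toy <- data.frame(
  purchase_purpose = factor(c(1, 1, 1, 2, 2, 2)),
  fac_score        = c(0.9, 1.2, 0.4, -0.8, -1.1, -0.6)
)

# Indicator columns for each level of purchase_purpose
toy$purpose_1 <- as.numeric(toy$purchase_purpose == 1)
toy$purpose_2 <- as.numeric(toy$purchase_purpose == 2)

# Pearson correlation of a 0/1 indicator with a numeric score
# (this is the point-biserial correlation)
r1 <- cor(toy$purpose_1, toy$fac_score)
r2 <- cor(toy$purpose_2, toy$fac_score)
print(c(purpose_1 = r1, purpose_2 = r2))
```

With only two levels, the two indicators are mirror images of each other, so their correlations with any score have the same size and opposite signs.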

Cluster Analysis

Cluster 1 - Novelty-seekers

It seems that these people tend to prefer distinctive products with a special appearance and enjoy novelty when deciding on a tea purchase. It is possible that they are more likely to have sensation-seeking personalities.

Cluster 2 - Non-eco-friendly People

There is not much to say about people in this cluster other than that they tend not to care about environmental sustainability in their tea purchase decisions. This means that they are not particularly concerned about their carbon footprint or about being eco-friendly.

Cluster 3 - Gift-givers

When deciding on a tea purchase, they look for a known brand and prestige; they don’t care about the pragmatic utility of the tea. They are also more likely to purchase tea as a gift for other people than for themselves. In short, these people tend to buy prestigious tea brands for other people.

Cluster 4 - Utilitarians

They tend to choose a tea after considering its pragmatic utility, meaning they choose what is most needed and consider affordability. They are also more likely to purchase tea for themselves than as a gift for other people.
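
Cluster descriptions like the ones above are typically read off per-cluster averages. The sketch below shows one way to compute such a profile with base R's `aggregate()`. The column names `Clusters`, `fac1_sustain` and `fac2_brand` match the data, but every value here is invented, so the output does not reproduce the real cluster profiles.

```r
# Toy data (invented values): cluster labels and two factor scores
toy <- data.frame(
  Clusters     = factor(c(1, 1, 2, 2, 3, 3)),
  fac1_sustain = c(0.5, 0.7, -1.4, -1.0, 0.2, 0.0),
  fac2_brand   = c(-0.2, 0.0, 0.1, -0.1, 1.1, 0.9)
)

# Mean factor score per cluster; means far from 0 show which
# dimension sets a cluster apart from the rest
profile <- aggregate(cbind(fac1_sustain, fac2_brand) ~ Clusters,
                     data = toy, FUN = mean)
print(profile)
```

In this toy example, cluster 2 stands out for its low sustainability score and cluster 3 for its high brand score.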

Data Coding

Gender

  1. Male

  2. Female

Place of birth

  1. Northern China

  2. Eastern China

  3. Southern China

  4. Western China

  5. Central China

Age

  1. 18–24

  2. 25–34

  3. 35–44

  4. 45–54

  5. 55–64

  6. 65+

Ethnic group

  1. Han

  2. Minority group

Education background

  1. Below high school graduate

  2. High school degree

  3. Bachelor’s degree

  4. Master’s degree

  5. PhD and above

Occupation

  1. Civil servant

  2. Diplomats

  3. Professional (Educator, Engineering, IT, Doctor, Nurse, Lawyer, Consultant, Athletes)

  4. General Business Clerk

  5. Corporate Management

  6. Artists (e.g. Producer, Actor, Director, Designer)

  7. Self-employed

  8. Farmer

  9. Others

Household monthly income (RMB)

  1. Up to 3499

  2. 3500–7499

  3. 7500–12499

  4. 12500–16499

  5. 16500 and above
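
If readable labels are preferred over numeric codes in plots and tables, the coding scheme above can be attached with `factor()`. This is a minimal sketch for the age variable; the vector `age_code` is invented, and plain hyphens are used in the labels to keep the code ASCII-only.

```r
# Codes as stored in the data (toy vector, invented for illustration)
age_code <- c(3, 2, 1, 4, 3)

# Attach the labels from the coding scheme
age <- factor(age_code,
              levels = 1:6,
              labels = c("18-24", "25-34", "35-44", "45-54", "55-64", "65+"))

print(table(age))
```

Listing all six levels keeps the age groups that do not occur in the sample, which then show up with a count of zero.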