Beverage Preference Data set

March 25, 2016
Tohru Iwasaki & Tetsuo Furukawa
Department of Human Intelligence Systems
Kyushu Institute of Technology

This database contains survey data on beverage preference collected from 604 respondents.
The data record how frequently each respondent drinks each of 14 beverages in each of 11 different situations. The entire dataset is therefore represented as a 3-dimensional array, i.e., a tensor of order 3.

We collected this dataset for our research on nonlinear tensor decomposition in the field of machine learning. We hope it will serve as a benchmark for people interested in tensor data analysis, relational data analysis, recommender systems, etc.

This database also contains food and leisure preference data collected from the same respondents, so it can also be used for research on multi-view learning.

You may use this database under the following conditions.

You may use this dataset for research purposes.
You must not redistribute it without our permission.
We are NOT liable for any damages or losses arising out of or related to your use of, or inability to use, this dataset.
We would appreciate it if you acknowledged the use of the dataset in your publications by citing one of our related publications.

This data first appeared in the following paper.

T. Iwasaki and T. Furukawa
Neural Networks, Vol.77, pp.107-125, 2016.
doi:10.1016/j.neunet.2016.01.013

File contents

The archive file contains the following three files.

Readme.txt: README
Beverage604.txt: Beverage preference data
Beverage604-side.txt: Side information of respondents

This dataset was collected through a questionnaire survey of 604 Japanese respondents. Respondents who gave the same score to every question were removed in advance and are not included in this dataset. There is no missing data in this database.

The respondents were asked how frequently they drink each of 14 beverages in each of 11 situations. Each respondent therefore gave 14 x 11 = 154 scores, and the entire dataset is represented by an array of (604 respondents) x (14 beverages) x (11 situations). The chosen beverages are commonly sold at supermarkets in Japan, usually in plastic bottles.
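The properties stated above (array shape, no missing values, no respondent with identical scores everywhere) can be checked with a short sketch like the following. It assumes the ratings have already been loaded into a NumPy array; the function names are hypothetical and not part of the dataset. Note also that the removal of uniform respondents may have been based on all survey questions, not only the beverage ratings, so a non-empty result here would not necessarily indicate an error.

```python
import numpy as np


def find_constant_respondents(tensor):
    """Return indices of respondents whose ratings are all identical."""
    tensor = np.asarray(tensor)
    # Flatten each respondent's 14 x 11 matrix into one row of 154 scores.
    flat = tensor.reshape(tensor.shape[0], -1)
    return np.flatnonzero(flat.min(axis=1) == flat.max(axis=1))


def check_tensor(tensor, shape=(604, 14, 11)):
    """Summarize basic consistency checks as a dictionary."""
    tensor = np.asarray(tensor)
    return {
        "shape_ok": tensor.shape == shape,
        "no_missing": not np.isnan(tensor.astype(float)).any(),
        "constant_respondents": find_constant_respondents(tensor),
    }
```

For the full dataset, `check_tensor(X)` would be expected to report `shape_ok` and `no_missing` as true if the file matches the description in this README.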

Data Format of Beverage Preference Data (Beverage604.txt)

The data file "Beverage604.txt" consists of 604 blocks, one per respondent. Each block is a 14 x 11 matrix of ratings. A rating of 5 means that the respondent drinks the beverage frequently in that situation, and 1 means that he/she drinks it rarely. Each row corresponds to a beverage and each column to a situation.

For example, the first block is as follows.

2 5 1 1 4 5 1 3 2 3 5	# Coke
2 3 2 2 5 4 1 1 2 3 2	# Soda pop (Seven up, etc.)
1 2 1 1 3 3 2 4 2 5 4	# Ginger ale
2 3 1 2 3 3 3 1 2 2 2	# Melon soda
5 3 3 5 4 3 5 2 1 2 1	# Orange juice
3 1 3 5 4 3 3 2 1 2 1	# Apple juice
5 4 4 4 3 1 5 3 5 4 2	# Vegetable juice
4 1 5 2 1 1 4 3 3 1 3	# Black tea (with sugar)
1 3 2 2 1 1 2 5 3 5 5	# Oolong tea (Chinese tea)
5 4 5 4 1 3 4 5 5 1 5	# Green tea (Japanese tea)
4 1 5 5 2 2 5 4 4 1 4	# Cafe au lait (Coffee with milk)
1 3 3 3 2 3 3 1 1 2 1	# Lactic drink
3 5 4 3 5 5 3 5 5 5 4	# Mineral water
3 5 2 1 5 5 1 2 3 2 2	# Isotonic drink

This first block contains the scores of the first respondent. Such matrices are repeated 604 times, separated by a blank line.

The 11 situations are in the following order.

Column 1: Deskwork or studying
Column 2: Outdoor work
Column 3: Break time or teatime
Column 4: Indoor leisure (watching video, etc.)
Column 5: Sports/exercise time
Column 6: Outdoor leisure
Column 7: In the car/train
Column 8: Lunch time
Column 9: Awakening time
Column 10: Bedtime 
Column 11: Party time
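
The block layout described above can be read into a 3-dimensional array with a few lines of Python. The following is a minimal sketch, not an official loader: the function name `load_beverage_tensor` and the shortened label strings are our own choices for illustration. It relies on `numpy.loadtxt`, which skips blank lines and ignores text after `#`, so it should work whether or not the actual file carries the trailing beverage comments shown in the example block.

```python
import numpy as np

# Labels taken from the example block and the situation list in this README.
BEVERAGES = [
    "Coke", "Soda pop", "Ginger ale", "Melon soda", "Orange juice",
    "Apple juice", "Vegetable juice", "Black tea (with sugar)", "Oolong tea",
    "Green tea", "Cafe au lait", "Lactic drink", "Mineral water",
    "Isotonic drink",
]
SITUATIONS = [
    "Deskwork or studying", "Outdoor work", "Break time or teatime",
    "Indoor leisure", "Sports/exercise time", "Outdoor leisure",
    "In the car/train", "Lunch time", "Awakening time", "Bedtime",
    "Party time",
]


def load_beverage_tensor(source):
    """Return an integer array of shape (respondents, 14, 11).

    `source` is a file path or an open text file.
    """
    # Blank separator lines and '#' comments are dropped by loadtxt.
    rows = np.loadtxt(source, dtype=int, ndmin=2)
    n_bev, n_sit = len(BEVERAGES), len(SITUATIONS)
    if rows.shape[1] != n_sit or rows.shape[0] % n_bev != 0:
        raise ValueError(f"unexpected layout: {rows.shape}")
    # Every consecutive group of 14 rows is one respondent's matrix.
    return rows.reshape(-1, n_bev, n_sit)
```

With `X = load_beverage_tensor("Beverage604.txt")`, the expression `X[0, BEVERAGES.index("Coke"), SITUATIONS.index("Outdoor work")]` should give 5 if the file matches the example block above, and `X.shape` should be `(604, 14, 11)`.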

Data Format of Side Information Data (Beverage604-side.txt)

"Beverage604-side.txt" consists of 604 lines, one per respondent. Each line contains 19 items of side information. For example, lines 1-5 of the data file, which correspond to respondents 1 to 5, are as follows.

1 80 2 5 5 5 5 3 5 5 3 3 1 4 1 3 1 1 5	#Respondent 1
1 55 2 5 4 4 4 4 5 5 5 5 3 5 2 4 4 5 4	#Respondent 2
2 57 2 5 4 4 4 3 4 4 4 4 4 1 1 1 3 3 4	#Respondent 3
2 48 2 5 5 5 5 2 3 5 5 4 3 5 1 2 1 3 3	#Respondent 4
1 59 1 4 1 1 1 1 4 4 4 4 2 5 5 4 1 1 5	#Respondent 5

Columns 1-3: Respondent attributes
- Column 1: Gender (1: Male, 2: Female)
- Column 2: Age
- Column 3: Married or not (1: Not married, 2: Married)
Columns 4-13: Food preference (5 is most preferred, and 1 is least preferred.)
- Column 4: Japanese cuisine
- Column 5: Chinese cuisine
- Column 6: French cuisine
- Column 7: Italian cuisine
- Column 8: Ethnic foods
- Column 9: Japanese noodle (Soba, Udon)
- Column 10: Chinese noodle (Ramen)
- Column 11: Sweets (Western-style)
- Column 12: Sweets (Japanese-style)
- Column 13: Alcoholic beverage
Columns 14-19: Leisure preference (5 is most preferred, and 1 is least preferred.)
- Column 14: Watching TV/Video
- Column 15: PC games
- Column 16: Reading
- Column 17: Sports/exercise
- Column 18: Outdoor leisure
- Column 19: Drive
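
The column layout above can be split into its three groups as follows. This is a minimal sketch under our own naming: `load_side_info` and the dictionary keys are hypothetical, chosen only for illustration. As with the beverage file, `numpy.loadtxt` ignores anything after `#`, so trailing respondent comments, if present in the file, do not need special handling.

```python
import numpy as np


def load_side_info(source):
    """Split the 19 side-information columns into three named groups.

    `source` is a file path or an open text file.
    """
    table = np.loadtxt(source, dtype=int, ndmin=2)
    if table.shape[1] != 19:
        raise ValueError(f"expected 19 columns, got {table.shape[1]}")
    return {
        # Columns 1-3: gender, age, marital status
        "attributes": table[:, 0:3],
        # Columns 4-13: food preference ratings
        "food": table[:, 3:13],
        # Columns 14-19: leisure preference ratings
        "leisure": table[:, 13:19],
    }
```

For example, `load_side_info("Beverage604-side.txt")["attributes"][:, 1]` would give the ages of all respondents, in the same respondent order as the blocks of "Beverage604.txt".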

Contact address

If you have any questions, please contact:

Tetsuo Furukawa
furukawa@brain.kyutech.ac.jp

Department of Human Intelligence Systems
Kyushu Institute of Technology