# Data Analysis

In-Class Test No. 1

Last Name First Name Student ID

Data Description: Suppose that management at Qantas wants to understand consumers’ preferences for various flights from Sydney to Wellington, New Zealand. A marketing research firm has been commissioned to collect survey data from a representative sample of Qantas’ target market in Australia. Each respondent was asked to indicate his or her choice and buying intentions on a randomly chosen flight from Sydney to New Zealand operated by Qantas or Air New Zealand. The data from 1600 respondents are stored in flight06.sav, which you can download from UTS Online. For your convenience, a list of the variables in the data set is as follows:

1. caseid: Respondent ID, ranging from 1 to 1600.
2. fare: Return airfare between the Sydney and Wellington airports.
   - 0 = \$550
   - 1 = \$600
   - 2 = \$650
   - 3 = \$700
3. depart: Departure time from the Sydney airport.
   - 0 = Depart at 8 am
   - 1 = Depart at 9 am
   - 2 = Depart at 12 pm
   - 3 = Depart at 2 pm
4. time: Flying time.
   - 0 = Three hours
   - 1 = Three and a half hours
   - 2 = Four hours
   - 3 = Four and a half hours
5. stops: Number of stops en route.
   - 0 = Non-stop
   - 1 = One-stop
6. audio: In-flight audio service.
   - 0 = No
   - 1 = Yes
7. video: In-flight video service.
   - 0 = No
   - 1 = Yes
8. meals: In-flight free meals.
   - 0 = No
   - 1 = Yes
9. airline: Name of airline.
   - 0 = Air New Zealand
   - 1 = Qantas


10. choice: Would you choose this flight?
    - 0 = No
    - 1 = Yes
11. intent: Buying intention measured on a 10-point graphic rating scale.
    - 1 = Extremely unlikely to buy a ticket for this flight
    - 10 = Extremely likely to buy a ticket for this flight
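The coding scheme above can be captured as a small lookup table, which is convenient when code values need to be turned back into labels. The following is a minimal Python sketch; the names `CODEBOOK` and `decode` are illustrative assumptions and are not part of flight06.sav or of SPSS. The variables caseid and intent are left out because their values are already numeric.

```python
# Value labels transcribed from the variable list above.
# CODEBOOK and decode are hypothetical names used only for illustration.
CODEBOOK = {
    "fare": {0: "$550", 1: "$600", 2: "$650", 3: "$700"},
    "depart": {0: "Depart at 8 am", 1: "Depart at 9 am",
               2: "Depart at 12 pm", 3: "Depart at 2 pm"},
    "time": {0: "Three hours", 1: "Three and a half hours",
             2: "Four hours", 3: "Four and a half hours"},
    "stops": {0: "Non-stop", 1: "One-stop"},
    "audio": {0: "No", 1: "Yes"},
    "video": {0: "No", 1: "Yes"},
    "meals": {0: "No", 1: "Yes"},
    "airline": {0: "Air New Zealand", 1: "Qantas"},
    "choice": {0: "No", 1: "Yes"},
}


def decode(variable: str, code: int) -> str:
    """Return the value label for a coded response."""
    return CODEBOOK[variable][code]
```

For example, `decode("depart", 2)` returns the label for a noon departure.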

The first 25 questions are multiple-choice questions (25 marks). Please read each question carefully and use your pencil to mark the letter (by filling the corresponding oval completely) on the answer sheet that represents the most appropriate answer in your judgment. There is only one correct answer to each question. The last question is a short-answer question (5 marks).

1. The variable, fare, in its current code values provides ________ data.

a. nominal b. ordinal c. interval d. ratio

2. If we replaced its code values with actual dollar values, the variable, fare, would provide ________ data.

a. nominal b. ordinal c. interval d. ratio

3. The variable, stops, provides ________ data.

a. nominal b. ordinal c. interval d. ratio

4. The variable, choice, provides ________ data.

a. nominal b. ordinal c. interval d. ratio

5. The mean can be meaningfully estimated for:

a. all the variables in the data.

b. all the variables in the data except for caseid and depart.

c. the variable of intent only.

d. the variables of intent and stops.

6. Correlation analysis between fare and intent shows that:

a. there is no significant correlation between the two variables.

b. there is a moderate and inverse correlation.

c. there is a strong and significantly negative correlation.

d. there is a weak but significantly negative correlation.

7. Approximately what percentage of the respondents would choose the flight offered to them?


8. Management at Qantas expected that less than half of its target customers would choose the flights offered to them. What should the null hypothesis be to test this proposition? Is the test a one-tail or a two-tail test?

a. H0: population proportion >= 50%, one-tail test

b. H0: population proportion >= 50%, two-tail test

c. H0: population proportion <= 50%, one-tail test

d. H0: population proportion <= 50%, two-tail test
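For reference, the mechanics of a one-sample test on a proportion can be sketched in Python. This is a minimal sketch under stated assumptions: it uses the large-sample normal approximation, it places the "less than" claim in the alternative hypothesis (one common convention), the function name `lower_tail_proportion_test` is hypothetical, and the counts are made up rather than taken from flight06.sav.

```python
from math import sqrt
from statistics import NormalDist


def lower_tail_proportion_test(successes: int, n: int, p0: float = 0.5):
    """Large-sample z-test of H0: p >= p0 against H1: p < p0."""
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)   # standard error computed under H0
    z = (p_hat - p0) / se
    p_value = NormalDist().cdf(z)  # lower-tail probability only
    return p_hat, z, p_value


# Hypothetical counts for illustration only, not the survey results.
p_hat, z, p_value = lower_tail_proportion_test(successes=45, n=100)
```

Because the alternative points in one direction, only one tail of the normal distribution contributes to the p-value; a two-tail test would double the tail probability.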

9. Management at Qantas expected that less than half of its target customers would choose the flights offered to them. Based on the data we have, what is your test conclusion?

a. Do not reject H0 at the 0.01 level and conclude that more than half of its target customers would choose the flights.

b. Reject H0 at the 0.01 level and conclude that less than half of its target customers would choose the flights.

c. Do not reject H0 at the 0.01 level and conclude that less than half of its target customers would choose the flights.

d. Reject H0 at the 0.01 level and conclude that more than half of its target customers would choose the flights.

10. What is the 95% confidence interval (CI) for the variable of intent? What is the correct relationship between 99% CI and 95% CI?

a. 95% CI ranges from 5.2961 to 5.5007. The 99% CI would be narrower.

b. 95% CI ranges from 5.3206 to 5.4762. The 99% CI would be wider.

c. 95% CI ranges from 5.3206 to 5.4762. The 99% CI would be narrower.

d. 95% CI ranges from 5.2961 to 5.5007. The 99% CI would be wider.

11. What is your conclusion for testing the null hypothesis that buying intention measured on the 10-point graphic rating scale is 8 on average?

a. Reject H0 at the 0.01 level and conclude that the buying intention on average is 8 on the 10-point rating scale.

b. Do not reject H0 at the 0.01 level and conclude that the buying intention on average is 8 on the 10-point rating scale.

c. Reject H0 at the 0.01 level and conclude that the buying intention on average is less than 8 on the 10-point rating scale.

d. Do not reject H0 at the 0.01 level and conclude that the buying intention on average is less than 8 on the 10-point rating scale.

12. Is it appropriate to run a linear regression using intent as the dependent variable and fare as the independent variable? What is the reason?

a. Yes, because the dependent variable is already dummy-variable coded.

b. No, because the dependent variable is not interval- or ratio-scaled.

c. No, because the independent variable is nominal-scaled.

d. Yes, because both variables are interval-scaled.
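For the confidence-interval comparison in question 10, the way the confidence level drives interval width can be sketched as follows. This is a rough sketch, assuming a normal approximation (SPSS reports t-based intervals, which are nearly identical for large samples); `mean_ci` is a hypothetical helper and the summary statistics are invented, not computed from flight06.sav.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci(mean: float, sd: float, n: int, level: float = 0.95):
    """Normal-approximation confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    half_width = z * sd / sqrt(n)
    return mean - half_width, mean + half_width


# Hypothetical summary statistics for illustration only.
lo95, hi95 = mean_ci(mean=5.0, sd=2.0, n=400, level=0.95)
lo99, hi99 = mean_ci(mean=5.0, sd=2.0, n=400, level=0.99)
```

The critical value grows with the confidence level while the standard error stays fixed, so only the half-width changes between the two intervals.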


13. What is your conclusion for the independent samples t-test comparing buying intention scores between those who would choose the flight and those who would not?

a. Both the Levene’s test and t-test are significant at the 0.01 level. We conclude that the buying intention score for those who would choose the flight was significantly higher than those who would not.

b. The Levene’s test is insignificant and the t-test is significant at the 0.01 level. We conclude that the buying intention score for those who would choose the flight was significantly higher than those who would not.

c. The Levene’s test is significant and the t-test is insignificant at the 0.01 level. We conclude that the buying intention score for those who would choose the flight was significantly higher than those who would not.

d. Both the Levene’s test and t-test are insignificant at the 0.05 level. We conclude that there was no significant difference in buying intention score between those who would choose the flight and those who would not.

14. Run a crosstab (i.e., bivariate chi-square test) between stops and choice. What can you conclude?

a. The number of stops significantly affects choice (p < 0.05). Based on row percentage, customers offered with a non-stop flight were 1.08 times as likely to choose the flight as those offered with a one-stop flight.

b. The number of stops significantly affects choice (p < 0.05). Based on odds ratio, customers offered with a non-stop flight were 1.12 times as likely to choose the flight as those offered with a one-stop flight.

c. The number of stops does not seem to significantly affect choice (p > 0.05).

d. Both a and b are correct.
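The two effect measures named in question 14, a ratio of row percentages and an odds ratio, are computed differently from the same 2x2 crosstab. The sketch below illustrates the arithmetic; `compare_groups` is a hypothetical helper and the cell counts are invented for illustration, not taken from flight06.sav.

```python
def compare_groups(a_yes: int, a_no: int, b_yes: int, b_no: int):
    """Compare the choice rate of group A with group B in a 2x2 crosstab."""
    p_a = a_yes / (a_yes + a_no)        # row percentage for group A
    p_b = b_yes / (b_yes + b_no)        # row percentage for group B
    row_pct_ratio = p_a / p_b           # "times as likely" from row percentages
    odds_ratio = (a_yes / a_no) / (b_yes / b_no)
    return row_pct_ratio, odds_ratio


# Hypothetical cell counts for illustration only.
row_pct_ratio, odds_ratio = compare_groups(a_yes=60, a_no=40, b_yes=50, b_no=50)
```

With these made-up counts the row-percentage ratio is 0.60 / 0.50 = 1.2 while the odds ratio is 1.5 / 1.0 = 1.5, which shows that the two measures generally differ for the same table.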

15. Run a crosstab (i.e., bivariate chi-square test) between depart and choice. What can you conclude?

a. Departure time significantly affects choice (p < 0.01). It appears that customers prefer flights that depart at 8 am or 9 am.

b. Departure time significantly affects choice (p < 0.01). It appears that customers prefer flights that depart at 8 am.

c. Departure time significantly affects choice (p < 0.01). It appears that customers prefer flights that depart at noon.

d. Departure time significantly affects choice (p < 0.01). It appears that customers prefer flights that depart at noon or 2 pm.

16. Run a crosstab (i.e., bivariate chi-square test) between meals and choice. What can you conclude?

a. In-flight meals significantly affected choice (p < 0.05). Based on row percentage, customers offered with free meal flight were 1.36 times as likely to choose the flight as those offered with no free meal flight.

b. In-flight meals did not significantly affect choice (p > 0.05).

c. In-flight meals significantly affected choice (p < 0.05). Based on row percentage, customers offered with free meal flight were 1.57 times as likely to choose the flight as those offered with no free meal flight.

d. In-flight meals significantly affected choice (p < 0.05). Based on row percentage, customers offered with free meal flight were 1.16 times as likely to choose the flight as those offered with no free meal flight.


17. Run a one-way ANOVA between fare and intent. What can you conclude?

a. No conclusion can be made because the ANOVA test is invalid.

b. There is no relationship whatsoever between airfare and buying intention (p > 0.05).

c. Since Levene’s test is significant (p = 0.035), Robust Tests of Equality of Means should be used. Since both Welch and Brown-Forsythe tests are highly significant (p < 0.01), H0: Equal Means should be rejected and airfare significantly affects buying intention. Tukey HSD tests show that there are highly significant differences in buying intention among the four airfare levels (p < 0.01).

d. Since airfare is on an interval scale, ANOVA is not appropriate.

18. Run a one-way ANOVA between depart and intent. What can you conclude?

a. No conclusion can be made because the ANOVA test is invalid.

b. Since departure time is on an interval scale, ANOVA is not appropriate.

c. Since Levene’s test is significant (p < 0.01), we can conclude that departure time significantly affects buying intention.

d. Since Levene’s test is highly significant (p < 0.01), Robust Tests of Equality of Means should be used. Since both Welch and Brown-Forsythe tests are highly significant (p < 0.01), H0: Equal Means should be rejected and departure time significantly affects buying intention. Tukey HSD tests show customers are more likely to book flights that depart at 8 am.

19. Run a simple linear regression between stops and intent. What can you conclude?

a. There is a strong inverse relationship between number of stops and buying intention.

b. There is no significant relationship between number of stops and buying intention.

c. Since VIF = 1, no interpretation of the regression model should be made because of the multicollinearity problem.

d. No conclusion can be made because the regression analysis is inappropriate.

20. Run a linear regression using intent as the dependent variable and fare and time as the independent variables. What can you conclude?

a. While both return airfare and flying time significantly affect buying intention (p < 0.01), airfare has relatively more influence than flying time.

b. While both return airfare and flying time significantly affect buying intention (p < 0.01), flying time has relatively more influence than airfare.

c. Both return airfare and flying time have significant and equal impact on buying intention.

d. No conclusion can be made because the regression analysis is inappropriate.

21. Recode fare into a different variable called fareraw (i.e., to replace the four code values: 0, 1, 2, & 3 with raw values: \$550, \$600, \$650, & \$700). Assess the possible non-linear relationships between fareraw and intent via SPSS Analyze – Regression – Curve Estimation procedure. What can you conclude?

a. The linear model has the highest R-square of 0.514.

b. The quadratic model has the highest R-square of 0.514.

c. The logarithmic model has the highest R-square of 0.514.

d. The inverse model has the highest R-square of 0.514.

22. Do mean correction for fareraw (i.e., fare1 = fareraw – 625) and create the quadratic term (i.e., fare2 = fare1**2). Then assess two regression models (linear and quadratic models) among fare1, fare2, and intent using the hierarchical regression procedure. What can you conclude?

a. The linear model is the best prediction model.

b. The quadratic model is the best prediction model; however, this model suffers from a multicollinearity problem.

c. The quadratic model is the best prediction model, and this model does not have a multicollinearity problem.

d. None of the above.
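The recoding and mean correction described in questions 21 and 22 can be sketched in Python as follows. This is an illustrative sketch only: it uses a balanced toy sample with one case per fare level (an assumption; the actual distribution of fare levels in flight06.sav is not given here), and `FARE_DOLLARS` and `pearson_r` are hypothetical names. It compares how strongly the linear and quadratic terms correlate before and after subtracting 625.

```python
from math import sqrt

# Code-to-dollar mapping taken from the variable list for fare.
FARE_DOLLARS = {0: 550, 1: 600, 2: 650, 3: 700}


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


fare_codes = [0, 1, 2, 3]                       # toy sample: one case per level
fareraw = [FARE_DOLLARS[c] for c in fare_codes]  # recode into dollar values
fare1 = [f - 625 for f in fareraw]              # mean-corrected fare
fare2 = [f ** 2 for f in fare1]                 # quadratic term

r_uncentered = pearson_r(fareraw, [f ** 2 for f in fareraw])
r_centered = pearson_r(fare1, fare2)
```

In this balanced toy sample the raw fare and its square are almost perfectly correlated, whereas the centered term and its square are uncorrelated, which is the usual motivation for mean correction before adding a quadratic term.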


23. First do a dummy-variable coding for depart using the last group as the reference group and then run a linear regression between the dummy variables (depart1, depart2, depart3) and intent. What can you conclude?

a. The overall model is significant (p < 0.05); however, the significant difference is between 8 am flights and 2 pm flights.

b. The overall model is significant (p < 0.05); however, the significant difference is between 9 am flights and 2 pm flights.

c. No conclusion can be made because the regression analysis is inappropriate.

d. The overall model is significant (p < 0.05); however, the significant difference is between 12 pm flights and 2 pm flights.

24. Run a multiple linear regression using intent as the dependent variable and depart1, depart2, depart3, time, stops, audio, video, meals, airline, and fareraw as the independent variables. What can you conclude?

a. The overall model is highly significant (R-square = 0.832, p < 0.01) but residual plots show serious violations of the underlying regression assumptions.

b. The overall model is highly significant (R-square = 0.832, p < 0.01) but the histogram indicates that the standardized residuals are not normally distributed, suggesting the model does not fit the data well.

c. No conclusion can be made because the regression analysis is inappropriate.

d. The overall model is highly significant (R-square = 0.832, p < 0.01) and the residual plots suggest that the model fits the data reasonably well.

25. Based on the linear regression model you get from Q24, evaluate the following two flights: (1) Flight A: Air New Zealand flight that departs at 8 am, no in-flight audio & video entertainment, no free meals, with one stop at Auckland, total flying time is 3.5 hours and costs \$600; (2) Flight B: Qantas non-stop flight that departs at noon with in-flight audio & video entertainment, plus free meals, total flying time is 3 hours and costs \$650. Which of the two flights is more likely to be chosen? Why?

a. Flight B is more likely to be chosen because the predicted buying intention score for Flight B (6.64) is higher than that for Flight A (5.72).

b. Flight A is more likely to be chosen because the predicted buying intention score for Flight A (6.64) is higher than that for Flight B (5.72).

c. No conclusion can be made because the regression analysis is inappropriate.

d. Both flights have an equal chance of being chosen because the predicted buying intention score for both flights is the same.

26. This short-answer question (5 marks) is related to Q24 & Q25.

a. What is the advantage of using fareraw over the use of fare?

b. In Q25, what would be the conclusion if we reduced the airfare for Flight B from \$650 to \$580?

c. Which of the eight airline attributes is perceived to be most important?

d. Which of the eight airline attributes is perceived to be 2nd most important?

e. Which of the eight airline attributes is perceived to be least important?
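The dummy-variable coding from question 23 and the flight profiles from question 25 can be laid out as predictor rows, as in the rough Python sketch below. Several things here are assumptions: the helper names are hypothetical, the mapping of depart1, depart2, and depart3 to the 8 am, 9 am, and 12 pm groups is one plausible reading of "last group as the reference group", and time enters as its code value (0 to 3) because the Q24 model uses the variable time as coded. No regression coefficients are supplied; they must come from your own SPSS output.

```python
def dummy_code_depart(depart: int) -> dict:
    """Dummy-code depart with the last group (3 = 2 pm) as the reference.

    Assumed naming: depart1 <-> code 0 (8 am), depart2 <-> code 1 (9 am),
    depart3 <-> code 2 (12 pm). The reference group gets all zeros.
    """
    return {f"depart{k + 1}": int(depart == k) for k in range(3)}


def flight_row(depart, time, stops, audio, video, meals, airline, fareraw):
    """Build one row of predictors in the layout of the Q24 model."""
    row = dummy_code_depart(depart)
    row.update(time=time, stops=stops, audio=audio, video=video,
               meals=meals, airline=airline, fareraw=fareraw)
    return row


def predict_intent(intercept: float, coefs: dict, row: dict) -> float:
    """Linear prediction: intercept plus the sum of coefficient * value."""
    return intercept + sum(coefs[name] * value for name, value in row.items())


# Profiles transcribed from Q25 using the code values in the variable list.
flight_a = flight_row(depart=0, time=1, stops=1, audio=0, video=0,
                      meals=0, airline=0, fareraw=600)
flight_b = flight_row(depart=2, time=0, stops=0, audio=1, video=1,
                      meals=1, airline=1, fareraw=650)
```

Given an intercept and a dictionary of unstandardized coefficients keyed by the same variable names, `predict_intent` returns the predicted buying intention for either flight; changing `fareraw` in `flight_b` to 580 gives the scenario in Q26b.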