P2-8: Investigating Music Track Liking in the Halo of Album Covers
Oleg Lesota, Anna Hausberger, Ivanna Pshenychna, Oleksandr Shvydanenko, Olha Yehorova, Markus Schedl
Subjects: Music retrieval systems ; Human-centered MIR ; User behavior analysis and mining, user modeling ; Music recommendation and playlist generation ; Music interfaces and services ; Applications ; Open Review ; Cognitive MIR ; Knowledge-driven approaches to MIR
Presented In-person
4-minute short-format presentation
Research on music retrieval and recommendation often neglects the fact that a user’s response to a music track depends on contextual factors, such as the composition of the results list, the design of the user interface or the additional media displayed. However, a body of psychological research suggests that human perception and decision making can be strongly influenced by contextual factors. In particular, an initial positive aesthetic impression of a product may influence a buyer's perception of its features unrelated to appearance, such as utility or reliability, which is a manifestation of a cognitive bias called the halo effect. The work at hand investigates whether an album cover shown to the listener during playback can create a halo effect, influencing the listener's liking of the track. We approach this question by means of a two-stage user study. In the first stage, participants individually rated a series of album covers and music snippets. In the second stage, they were presented with music tracks and album covers (from those they indicated as unfamiliar to them at the first stage) arranged in pairs, such that their least liked tracks were shown with their most liked album covers and vice versa. The results show that displaying an appealing album cover while playing a music track may result in a higher rating of the track and vice versa. We also observe an indication of halo effect created by the perceived degree of matching between the music tracks and the album cover shown.
Q2 ( I am an expert on the topic of the paper.)
Disagree
Q3 ( The title and abstract reflect the content of the paper.)
Agree
Q4 (The paper discusses, cites and compares with all relevant related work.)
Disagree
Q5 ( Please justify the previous choice (Required if “Strongly Disagree” or “Disagree” is chosen, otherwise write "n/a"))
The paper could also discuss machine learning (ML) work that uses album cover images together with audio for downstream tasks such as music classification in a multi-modal setup. For instance, the following two works fall into this category:
Sergio Oramas, Francesco Barbieri, Oriol Nieto, Xavier Serra: Multimodal Deep Learning for Music Genre Classification. Trans. Int. Soc. Music. Inf. Retr. 1(1): 4-21 (2018)
Igor Vatolkin, Cory McKay: Multi-Objective Investigation of Six Feature Source Types for Multi-Modal Music Classification. Trans. Int. Soc. Music. Inf. Retr. 5(1): 1-19 (2022)
Q6 (Readability and paper organization: The writing and language are clear and structured in a logical manner.)
Agree
Q7 (The paper adheres to ISMIR 2025 submission guidelines (uses the ISMIR 2025 template, has at most 6 pages of technical content followed by “n” pages of references or ethical considerations, references are well formatted). If you selected “No”, please explain the issue in your comments.)
Yes
Q8 (Relevance of the topic to ISMIR: The topic of the paper is relevant to the ISMIR community. Note that submissions of novel music-related topics, tasks, and applications are highly encouraged. If you think that the paper has merit but does not exactly match the topics of ISMIR, please do not simply reject the paper but instead communicate this to the Program Committee Chairs. Please do not penalize the paper when the proposed method can also be applied to non-music domains if it is shown to be useful in music domains.)
Agree
Q9 (Scholarly/scientific quality: The content is scientifically correct.)
Agree
Q11 (Novelty of the paper: The paper provides novel methods, applications, findings or results. Please do not narrowly view "novelty" as only new methods or theories. Papers proposing novel musical applications of existing methods from other research fields are considered novel at ISMIR conferences.)
Agree
Q12 (The paper provides all the necessary details or material to reproduce the results described in the paper. Keep in mind that ISMIR respects the diversity of academic disciplines, backgrounds, and approaches. Although ISMIR has a tradition of publishing open datasets and open-source projects to enhance the scientific reproducibility, ISMIR accepts submissions using proprietary datasets and implementations that are not sharable. Please do not simply reject the paper when proprietary datasets or implementations are used.)
Disagree
Q13 (Pioneering proposals: This paper proposes a novel topic, task or application. Since this is intended to encourage brave new ideas and challenges, papers rated “Strongly Agree” and “Agree” can be highlighted, but please do not penalize papers rated “Disagree” or “Strongly Disagree”. Keep in mind that it is often difficult to provide baseline comparisons for novel topics, tasks, or applications. If you think that the novelty is high but the evaluation is weak, please do not simply reject the paper but carefully assess the value of the paper for the community.)
Disagree (Standard topic, task, or application)
Q14 (Reusable insights: The paper provides reusable insights (i.e. the capacity to gain an accurate and deep understanding). Such insights may go beyond the scope of the paper, domain or application, in order to build up consistent knowledge across the MIR community.)
Agree
Q15 (Please explain your assessment of reusable insights in the paper.)
As contextual material, the album cover can affect the perception/appreciation of the music while listening. This insight opens up a number of research directions, both in music psychology and MIR applications such as recommender systems.
Q16 ( Write ONE line (in your own words) with the main take-home message from the paper.)
Album cover art co-presented during music listening can affect listeners' perception of music.
Q17 (This paper is of award-winning quality.)
No
Q19 (Potential to generate discourse: The paper will generate discourse at the ISMIR conference or have a large influence/impact on the future of the ISMIR community.)
Agree
Q20 (Overall evaluation (to be completed before the discussion phase): Please first evaluate before the discussion phase. Keep in mind that minor flaws can be corrected, and should not be a reason to reject a paper. Please familiarize yourself with the reviewer guidelines at https://ismir.net/reviewer-guidelines.)
Weak accept
Q21 (Main review and comments for the authors (to be completed before the discussion phase). Please summarize strengths and weaknesses of the paper. It is essential that you justify the reason for the overall evaluation score in detail. Keep in mind that belittling or sarcastic comments are not appropriate.)
Summary
The work investigates the halo/horn effect of album cover art, presented as contextual material during music listening, on music preference (liking). A two-stage experiment is designed: the first stage collects each participant's preferences for music and album cover art independently as a baseline, and in the second stage participants rate each track again while it is paired with the album cover whose first-stage preference differs most from that of the track. The results indicate that the change in track preference between the two stages correlates positively with the preference for the paired album cover, which implies a halo/horn effect of the album cover on the music. Additionally, the perceived match between audio and cover is recorded in the second stage; it shows a smaller but still positive effect on the change in preference, which might suggest that matching acts as a latent factor.
Major Comments
Strengths
- The work investigates a novel topic. The effectiveness of album cover art has already been studied considerably, but mainly in machine learning (ML) downstream applications in the MIR field, where it generally improves automatic (machine) music understanding. To my knowledge, psychological studies of human perception, which are comparatively rare in MIR, could not only explain this positive effect in applications but also build a fundamental understanding of the phenomenon. The experimental design is reasonably sound, and the results show that the effect exists (despite some concerns about noise, discussed under Weaknesses below). This has several interesting implications for applications and opens some future research directions.
- The paper reads well in general.
Weaknesses
- The experimental design could be improved:
- The primary material can be more representative:
- The study album covers are heavily sampled, but in a skewed way that filters out ones containing any text. This results in about 1% of the covers from the study population, potentially implying that the remaining samples might be atypical. An alternative approach might be uniform sampling followed by anonymization (i.e., anonymization of title/artist/album name text), which might be better for the distribution of images.
- Music data might also not be the most representative, as it includes the least popular songs from each popular genre. Popularity might be confounded with qualities of musical/production aspects of the song (i.e., songs can be unpopular due to their poor production, or musically too atypical, etc.). This could make the remaining study materials less representative.
- The experimental procedure could be less noisy by controlling a few factors:
- Participants are allowed to listen to each song for between 5 and 15 seconds, which might add variability to the results because the stimulus duration varies arbitrarily per participant.
- As it is not indicated that the participants are drawn from a US representative population (an option one can choose on the Prolific platform), I assume there is a chance that the participant distribution is less representative.
Minor Comments
- p3.l219 "In the second stage, ... a song snippet.": Is this randomized? Or is the sorting described in the previous paragraph applied?
- p3.l258 "The agreement score ... computed analogously.": This seems to be identical to the definition of precision@K, which is widely used in the recommender systems literature.
- p3.l264 "To ensure that ... over the 100 runs.": The variance of these individual bootstrap values would also be interesting.
- p3.l270 "... Spearman's rank correlation ... second stage ($a_s^1$)": Why is Spearman's rho used? It may be due to the standardization, which transforms the integer scale to something closer to a Gaussian distribution. However, as Kendall's Tau is non-parametric, I assume the result would be almost identical.
- p4.l328 "This suggests that ... affected track ranking.": There is no control group (i.e., coming to the second stage and doing the same thing as the first round, without being presented with album covers). There might be a baseline fluctuation because people can forget things.
- p4.l332 "... (with the threshold 0.05) ...": I assume it is the threshold on p-value, but it could be better clarified.
- p5.l440 "... to knew ...": It seems "new" is intended here.
- p6. "5. LIMITATIONS & FUTURE WORK": It could be interesting to check (as a side study) how much participants generally care about album covers in their daily lives. Some, such as music fans, might be keen to look at the album cover, while others might not even see it when using streaming services. Controlling for this factor might give a more precise result.
- p6.l474 "The study at hand ... streaming platforms.": Is it a representative US population (from Prolific option)?
- p6.l489 "Future work could ... first listening session.": Also, the experimental design could add a third session, to see if the effect decreases?
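The minor comments on p3.l258 and p3.l270 can be illustrated with a minimal sketch. It assumes that the agreement score is the overlap between two top-K lists (the reviewer's reading, not a definition confirmed by the paper) and uses made-up toy ratings. The helpers `overlap_at_k`, `spearman_rho`, and `kendall_tau_a` are hypothetical names written for this illustration only. The second half shows how Spearman's rho and Kendall's tau could be computed side by side on the same data, so that the two can be compared rather than assumed to agree.

```python
from itertools import combinations
from statistics import mean


def overlap_at_k(ranking_a, ranking_b, k):
    """Share of the top-k items of ranking_a that also appear in the top-k of ranking_b.

    Assumption: this mirrors precision@K with ranking_b's top-k as the "relevant" set.
    """
    return len(set(ranking_a[:k]) & set(ranking_b[:k])) / k


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            result[order[t]] = shared_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the (tie-averaged) ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs; ties count as neither."""
    n = len(x)
    score = 0
    for i, j in combinations(range(n), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        score += (product > 0) - (product < 0)
    return score / (n * (n - 1) / 2)


# Toy data (made up for illustration, not taken from the paper).
stage_one = [1, 2, 3, 4, 5]
stage_two = [2, 1, 4, 3, 5]
rho = spearman_rho(stage_one, stage_two)
tau = kendall_tau_a(stage_one, stage_two)
top2_agreement = overlap_at_k(["a", "b", "c", "d"], ["c", "a", "d", "b"], 2)
print(f"rho={rho:.2f} tau={tau:.2f} overlap@2={top2_agreement:.2f}")
```

On this toy example the two coefficients share a sign but differ in magnitude, so reporting both would make the comparison explicit.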
Q22 (Final recommendation (to be completed after the discussion phase) Please give a final recommendation after the discussion phase. In the final recommendation, please do not simply average the scores of the reviewers. Note that the number of recommendation options for reviewers is different from the number of options here. We encourage you to take a stand, and preferably avoid “weak accepts” or “weak rejects” if possible.)
Accept
Q23 (Meta-review and final comments for authors (to be completed after the discussion phase))
Summary of the reviews
Strengths:
- The topic is novel, relevant, and interesting, and opens new research directions
- The experimental design overall is sound and reproducible to some extent
- The presentation of the work is generally good and easy to follow
For improvement:
- A few methodological limitations can be found:
- Experimental materials (e.g., selection of songs and album covers) need more elaboration and justification
- The participant population could be more representative
- It would benefit from a more in-depth survey of related work, connected to the methodological choices
Other notable points brought up by reviewers are:
- Explicit mention of the ethics approval procedure seems to be missing
- The work could offer the experimental design as a product of the work, which can be reused for future works in an extended context
- Presentation of the result section could be improved (e.g., using a correlation matrix)
Overall comment on the decision
The reviewers all agreed that the work conducted a novel and interesting study, opening future research directions. The experimental design is overall sound and well-presented. Reviewers also found a few points to be improved: 1) methodological limitations such as representativeness of participants and study materials (i.e., study samples of covers and songs) and 2) related work, including engineering-centered MIR papers involving the album covers as part of data. The overall decision was unanimous, indicating that all reviewers agreed that the work is a valuable contribution to the conference, especially with these improvements.
Q2 ( I am an expert on the topic of the paper.)
Agree
Q3 (The title and abstract reflect the content of the paper.)
Strongly agree
Q4 (The paper discusses, cites and compares with all relevant related work)
Strongly agree
Q6 (Readability and paper organization: The writing and language are clear and structured in a logical manner.)
Agree
Q7 (The paper adheres to ISMIR 2025 submission guidelines (uses the ISMIR 2025 template, has at most 6 pages of technical content followed by “n” pages of references or ethical considerations, references are well formatted). If you selected “No”, please explain the issue in your comments.)
Yes
Q8 (Relevance of the topic to ISMIR: The topic of the paper is relevant to the ISMIR community. Note that submissions of novel music-related topics, tasks, and applications are highly encouraged. If you think that the paper has merit but does not exactly match the topics of ISMIR, please do not simply reject the paper but instead communicate this to the Program Committee Chairs. Please do not penalize the paper when the proposed method can also be applied to non-music domains if it is shown to be useful in music domains.)
Strongly agree
Q9 (Scholarly/scientific quality: The content is scientifically correct.)
Agree
Q11 (Novelty of the paper: The paper provides novel methods, applications, findings or results. Please do not narrowly view "novelty" as only new methods or theories. Papers proposing novel musical applications of existing methods from other research fields are considered novel at ISMIR conferences.)
Agree
Q12 (The paper provides all the necessary details or material to reproduce the results described in the paper. Keep in mind that ISMIR respects the diversity of academic disciplines, backgrounds, and approaches. Although ISMIR has a tradition of publishing open datasets and open-source projects to enhance the scientific reproducibility, ISMIR accepts submissions using proprietary datasets and implementations that are not sharable. Please do not simply reject the paper when proprietary datasets or implementations are used.)
Agree
Q13 (Pioneering proposals: This paper proposes a novel topic, task or application. Since this is intended to encourage brave new ideas and challenges, papers rated "Strongly Agree" and "Agree" can be highlighted, but please do not penalize papers rated "Disagree" or "Strongly Disagree". Keep in mind that it is often difficult to provide baseline comparisons for novel topics, tasks, or applications. If you think that the novelty is high but the evaluation is weak, please do not simply reject the paper but carefully assess the value of the paper for the community.)
Disagree (Standard topic, task, or application)
Q14 (Reusable insights: The paper provides reusable insights (i.e. the capacity to gain an accurate and deep understanding). Such insights may go beyond the scope of the paper, domain or application, in order to build up consistent knowledge across the MIR community.)
Agree
Q15 (Please explain your assessment of reusable insights in the paper.)
i appreciate the research question tackled in this study. though i had some questions about the specifics of the method, i find the interpretations really interesting and i think they'll spark nice dialogue for both MIR folk and music psychology folk.
Q16 (Write ONE line (in your own words) with the main take-home message from the paper.)
visual components (i.e. album cover) can influence judgement of audio (music tracks).
Q17 (Would you recommend this paper for an award?)
No
Q19 (Potential to generate discourse: The paper will generate discourse at the ISMIR conference or have a large influence/impact on the future of the ISMIR community.)
Agree
Q20 (Overall evaluation: Keep in mind that minor flaws can be corrected, and should not be a reason to reject a paper. Please familiarize yourself with the reviewer guidelines at https://ismir.net/reviewer-guidelines)
Weak accept
Q21 (Main review and comments for the authors. Please summarize strengths and weaknesses of the paper. It is essential that you justify the reason for the overall evaluation score in detail. Keep in mind that belittling or sarcastic comments are not appropriate.)
i love the idea of this submission - the research question is really interesting and well-supported by background literature. I had some questions about the method: Selecting 40 album covers from 38746 - i'd like to know more about the choices. And 5 most pop + 8 least pop x 5 genres seems to be more than 40 tracks, so again, i'd love to know the details of the media selected for the study (also what 15 sec of 30 were selected). and if people are getting individualised study media in the second step, is the sample large enough (power) for the analyses?
i found the results section hard to easily engage with. but i think this piece of research will create lively discussion amongst the community.
Q2 ( I am an expert on the topic of the paper.)
Agree
Q3 (The title and abstract reflect the content of the paper.)
Strongly agree
Q4 (The paper discusses, cites and compares with all relevant related work)
Agree
Q6 (Readability and paper organization: The writing and language are clear and structured in a logical manner.)
Strongly agree
Q7 (The paper adheres to ISMIR 2025 submission guidelines (uses the ISMIR 2025 template, has at most 6 pages of technical content followed by “n” pages of references or ethical considerations, references are well formatted). If you selected “No”, please explain the issue in your comments.)
Yes
Q8 (Relevance of the topic to ISMIR: The topic of the paper is relevant to the ISMIR community. Note that submissions of novel music-related topics, tasks, and applications are highly encouraged. If you think that the paper has merit but does not exactly match the topics of ISMIR, please do not simply reject the paper but instead communicate this to the Program Committee Chairs. Please do not penalize the paper when the proposed method can also be applied to non-music domains if it is shown to be useful in music domains.)
Strongly agree
Q9 (Scholarly/scientific quality: The content is scientifically correct.)
Strongly agree
Q11 (Novelty of the paper: The paper provides novel methods, applications, findings or results. Please do not narrowly view "novelty" as only new methods or theories. Papers proposing novel musical applications of existing methods from other research fields are considered novel at ISMIR conferences.)
Agree
Q12 (The paper provides all the necessary details or material to reproduce the results described in the paper. Keep in mind that ISMIR respects the diversity of academic disciplines, backgrounds, and approaches. Although ISMIR has a tradition of publishing open datasets and open-source projects to enhance the scientific reproducibility, ISMIR accepts submissions using proprietary datasets and implementations that are not sharable. Please do not simply reject the paper when proprietary datasets or implementations are used.)
Agree
Q13 (Pioneering proposals: This paper proposes a novel topic, task or application. Since this is intended to encourage brave new ideas and challenges, papers rated "Strongly Agree" and "Agree" can be highlighted, but please do not penalize papers rated "Disagree" or "Strongly Disagree". Keep in mind that it is often difficult to provide baseline comparisons for novel topics, tasks, or applications. If you think that the novelty is high but the evaluation is weak, please do not simply reject the paper but carefully assess the value of the paper for the community.)
Agree (Novel topic, task, or application)
Q14 (Reusable insights: The paper provides reusable insights (i.e. the capacity to gain an accurate and deep understanding). Such insights may go beyond the scope of the paper, domain or application, in order to build up consistent knowledge across the MIR community.)
Agree
Q15 (Please explain your assessment of reusable insights in the paper.)
Given that "album art" is one of the types/forms of contextual media in the MIR context, the research designs and/or results might be translated or adapted to similar research with other types of contextual media.
Q16 (Write ONE line (in your own words) with the main take-home message from the paper.)
The appeal of album cover and users' ranking of music tracks are positively associated with each other.
Q17 (Would you recommend this paper for an award?)
No
Q19 (Potential to generate discourse: The paper will generate discourse at the ISMIR conference or have a large influence/impact on the future of the ISMIR community.)
Strongly agree
Q20 (Overall evaluation: Keep in mind that minor flaws can be corrected, and should not be a reason to reject a paper. Please familiarize yourself with the reviewer guidelines at https://ismir.net/reviewer-guidelines)
Strong accept
Q21 (Main review and comments for the authors. Please summarize strengths and weaknesses of the paper. It is essential that you justify the reason for the overall evaluation score in detail. Keep in mind that belittling or sarcastic comments are not appropriate.)
Strengths of this paper:
A very interesting topic that is worth the exploration.
The study has opened up further integration of psychological perspectives into MIR research and practices.
The research questions are clearly presented.
Various research design decisions are fairly well-justified, e.g., 5 seconds as the minimum duration of each snippet, participation in the second part no earlier than 36 hours, etc.
Results and their implications are clearly presented.
There is potential for further research on
Issues/potential improvement of this paper:
As mentioned in the abstract, there could be other contextual factors affecting users’ preferences of music tracks. If space allows, perhaps the literature review can briefly cover what other of these contextual factors are, and also how the study’s design attempted to minimize the confounding influences of these other factors.
Perhaps more details about the music tracks can be supplied, e.g., instrumental vs. vocal music, if with lyrics then what are the languages, etc.
What are the demographics of the participants? (It has been briefly mentioned in the Limitations section, though more demographic information should have been described earlier in the paper.) Many factors could come into play, e.g., their music listening habits/experience, the preferences over multimedia materials (e.g., visual vs. auditory)
As pinpointed in the Conclusion, the album covers were supposed to be “unfamiliar” to the users, though it is possible that some visual elements in an album cover art might be seen by the users before — even though the entire album cover is “new” to the users.
While the current way of presenting the results seems clear, the authors can consider using a correlation matrix, which would help fast readers understand the results better.
In addition to practical implications, authors can think about what methodological contributions this study (and/or their further endeavours) can make, such as the experimental design for research in similar directions (e.g., matching between contextual media and listeners’ preferences of tracks) and/or the modalities of data (e.g., incorporating multimodal data for the evaluation), etc.
Q2 ( I am an expert on the topic of the paper.)
Agree
Q3 (The title and abstract reflect the content of the paper.)
Agree
Q4 (The paper discusses, cites and compares with all relevant related work)
Agree
Q6 (Readability and paper organization: The writing and language are clear and structured in a logical manner.)
Strongly agree
Q7 (The paper adheres to ISMIR 2025 submission guidelines (uses the ISMIR 2025 template, has at most 6 pages of technical content followed by “n” pages of references or ethical considerations, references are well formatted). If you selected “No”, please explain the issue in your comments.)
Yes
Q8 (Relevance of the topic to ISMIR: The topic of the paper is relevant to the ISMIR community. Note that submissions of novel music-related topics, tasks, and applications are highly encouraged. If you think that the paper has merit but does not exactly match the topics of ISMIR, please do not simply reject the paper but instead communicate this to the Program Committee Chairs. Please do not penalize the paper when the proposed method can also be applied to non-music domains if it is shown to be useful in music domains.)
Agree
Q9 (Scholarly/scientific quality: The content is scientifically correct.)
Agree
Q10 (Please justify the previous choice (Required if "Strongly Disagree" or "Disagree" is chosen, otherwise write "n/a"))
Please do see my comments in the main review.
Q11 (Novelty of the paper: The paper provides novel methods, applications, findings or results. Please do not narrowly view "novelty" as only new methods or theories. Papers proposing novel musical applications of existing methods from other research fields are considered novel at ISMIR conferences.)
Agree
Q12 (The paper provides all the necessary details or material to reproduce the results described in the paper. Keep in mind that ISMIR respects the diversity of academic disciplines, backgrounds, and approaches. Although ISMIR has a tradition of publishing open datasets and open-source projects to enhance the scientific reproducibility, ISMIR accepts submissions using proprietary datasets and implementations that are not sharable. Please do not simply reject the paper when proprietary datasets or implementations are used.)
Agree
Q13 (Pioneering proposals: This paper proposes a novel topic, task or application. Since this is intended to encourage brave new ideas and challenges, papers rated "Strongly Agree" and "Agree" can be highlighted, but please do not penalize papers rated "Disagree" or "Strongly Disagree". Keep in mind that it is often difficult to provide baseline comparisons for novel topics, tasks, or applications. If you think that the novelty is high but the evaluation is weak, please do not simply reject the paper but carefully assess the value of the paper for the community.)
Disagree (Standard topic, task, or application)
Q14 (Reusable insights: The paper provides reusable insights (i.e. the capacity to gain an accurate and deep understanding). Such insights may go beyond the scope of the paper, domain or application, in order to build up consistent knowledge across the MIR community.)
Agree
Q15 (Please explain your assessment of reusable insights in the paper.)
The methodology as applied in the paper could, with some adaptation, be applied considering other types of context such as those mentioned in future work. The specific insights may also be used to stress the importance of visual aspects in recommendation contexts.
Q16 (Write ONE line (in your own words) with the main take-home message from the paper.)
The authors stress the importance of considering the impact that listening context (in this case, in the form of album art) may have on listeners’ overall song rating, and evaluate this halo/horn effect in a user study.
Q17 (Would you recommend this paper for an award?)
No
Q19 (Potential to generate discourse: The paper will generate discourse at the ISMIR conference or have a large influence/impact on the future of the ISMIR community.)
Disagree
Q20 (Overall evaluation: Keep in mind that minor flaws can be corrected, and should not be a reason to reject a paper. Please familiarize yourself with the reviewer guidelines at https://ismir.net/reviewer-guidelines)
Weak accept
Q21 (Main review and comments for the authors. Please summarize strengths and weaknesses of the paper. It is essential that you justify the reason for the overall evaluation score in detail. Keep in mind that belittling or sarcastic comments are not appropriate.)
This paper describes a user study looking into context of music recommendations - specifically, album art. My overall evaluation is to weakly accept this paper, as it has added value to the ISMIR community, but there is room for improvement in several aspects.
Strengths:
General topic: The investigation of the halo effect in this specific domain and context has not been done before (though is not ground-breaking), and the authors discuss how results from this study may impact future work in a broader application area.
Writing and structure: The paper is written very well and has a clear structure. It is useful that key findings are outlined separately, though it sometimes does lead to some duplicate information.
Introduction: The context and research gaps are provided well.
Method: I appreciate that this is a user study, focusing on measuring the halo/horn effect in a novel context. It is clearly described how songs and album covers were selected, and familiarity is taken into account and controlled for (though it still may be that participants recognize an artist, but not the specific song). The evaluation metrics seem appropriate.
Results: The tables and graphs are helpful in understanding the insights, which are clearly described.
Reproducibility: The paper provides all questions asked in the study, and the final dataset that was used.
Weaknesses:
Abstract: The last sentence is phrased somewhat too strongly given the presented evidence; it would be good to introduce some of the nuance that is described in the paper.
Research questions: It could be made more clear what the difference is between the two RQs and the approach to answer them. The questions are separately answered as described in the method section, with RQ1 seemingly looking into changes in ranking due to having an album cover present yes/no, and RQ2 looking into changes in ranking due to the appeal of the album cover. However, as both influence each other, it could be clarified that they are not completely independent. Also, the phrasing implies only RQ2 is measuring the halo/horn effect, but is RQ1 effectively not also describing this effect?
Method: While some choices in the study are supported (e.g., number and length of samples), it is not clear whether the method as a whole was based on any previous work. The decisions on the song-album matching, as well as the survey questions, should be validated in order to know whether they actually measure what is intended, but it is unclear if they were. How do other works measure halo/horn effect? Secondly, as the study contains several steps and data/participant filtering mechanisms, it would be helpful to include a flowchart of the process to visualize it. Thirdly, as the specific UI might have slightly influenced the study (as also mentioned in limitations), it would have been good to also include one or several screenshots.
Ethical considerations: Even though this is a user study, the authors do not mention any ethical considerations or approval of an ethical review board.
Results: Perceived matching between album and song is described in the context of RQ2 as a control question, but in the results, it is mentioned as being relevant to halo/horn effects. The introduction and RQs do not mention song/album matching in the context of these effects, so it is unclear how this survey question came about and whether it helps answering RQ2.
Discussion: There is no comparison with insights from previous work, which makes it more difficult to properly assess the contribution of the current work.
Implications: In a real-life music listening setting, listeners will likely see the album cover before actually hearing the music, as opposed to receiving both at the same time. Also, users are likely to receive recommendations in the context of a playlist, possibly also mixed in with songs/album covers that are familiar to them. It would be good if the expected changes in outcome compared to this study’s were discussed.