Ethics Review Questions

We ask that reviewers consider the following questions before answering the portions of the review form concerning ethical considerations. We encourage authors to think through these questions and address them (as appropriate) in their submitted papers. Additional advice to authors is available in the EMNLP 2022 Ethics FAQ.

Please note that this is not an exhaustive list of types of ethical issues that can arise, but rather a set of questions meant to help researchers think through possible adverse impacts. Reviewers who spot concerns unrelated to the questions in this list are encouraged to note them in the review form as well.

For papers presenting new datasets:

  • Does the paper describe how intellectual property (copyright, etc.) was respected in the data collection process?

  • Does the paper describe how participants’ privacy rights were respected in the data collection process?

  • Does the paper describe how crowd workers or other annotators were fairly compensated and how the compensation was determined to be fair?

  • Does the paper indicate that the data collection process was subjected to any necessary review by an appropriate review board?

For papers presenting new datasets AND papers presenting experiments on existing datasets:

  • Does the paper describe the characteristics of the dataset in enough detail for a reader to understand which speaker populations the technology could be expected to work for?

  • Do the claims in the paper match the experimental results, in terms of how far the results can be expected to generalize?

  • Does the paper describe the steps taken to evaluate the quality of the dataset?

For papers concerning tasks beyond language-internal matters:

  • Does the paper describe how the technology would be deployed in actual use cases?

  • Does the task the system actually carries out match how the technology would be deployed in practice?

  • Does the paper address possible harms when the technology is being used as intended and functioning correctly?

  • Does the paper address possible harms when the technology is being used as intended but giving incorrect results?

  • Does the paper address possible harms following from potential misuse of the technology?

  • If the system learns from user input once deployed, does the paper describe checks and limitations to the learning?

  • Are any of the possible harms you’ve identified likely to fall disproportionately on populations that already experience marginalization or are otherwise vulnerable?

For papers using identity characteristics (e.g., gender, race, ethnicity) as variables:

  • Does the paper rely on participants’ self-identification (rather than attributing identity characteristics to them)?

  • Does the paper motivate the range of values used for identity characteristics in terms of how they relate to the research question?

  • Does the paper discuss the ethical implications of categorizing people, either in training datasets or in the deployment of the technology?