Introduction

Unplag is a similarity detection engine used by educators and students to improve writing skills and prevent plagiarism. It scans submitted files or text for similarities with the Internet and/or file collections. Unplag is a cloud service that works as a standalone application or can be integrated into a learning management system (LMS) or other platforms via standard integration technologies (LTI/API/plugin).

Disclosure: I am a partnership manager at Unplag. We would like to develop an Unplag extension for the DSpace platform to allow repository administrators (or other relevant roles) to run on-demand plagiarism scans. The aim is to provide efficient and easy-to-use tools that help maintain high-quality content in the repository. An automatic similarity check upon file deposit can serve as another layer of quality assurance. Prior to committing to this project, Unplag approached the DSpace Community Advisory Team (DCAT). The committee suggested gathering feedback from the community using this forum.

 

 

Suggested features

Available for all (no account on Unplag is required)

Based on the Unplag engine, we would like to develop a feature for the DSpace platform that generates auxiliary information for each document stored in the repository: a) a list of citations and b) a list of references. The engine will extract this information from the document and compile a brief report that the user can view without downloading the document. We assume that having this information available at a glance will improve the research experience of users. Currently, to review citations and references, or indeed any content of the work, the user needs to download it first.

 

Available as an upgrade option (account on Unplag is required)

Based on the Unplag engine, we would like to develop a feature for the DSpace platform that produces a plagiarism (originality) report on demand for a given document, a set of documents, or even the whole repository. The report will show sources on the web that contain text similarities with the given document. For each source, the user will have a clickable HTTP link to open the source, the percentage of similarity, and a color mask applied to the original document and the sources (for easy and quick examination of similar text blocks).

The report can be used to:
a) Track the dissemination of the document (and/or its parts) on the web.
b) Identify sources that potentially plagiarize from the given document.
c) Identify sources that potentially violate the open access license under which the document has been published.

 

 

Goals

Improve the research experience by showing citations and references for a document (quick view), and provide tools for identifying sources on the web that contain similarities with the original document. This helps to track dissemination and to prevent plagiarism and/or other activities that violate open access license(s) and/or the rights of the institution or author.

 

 

Use Cases

 

 

Feedback

We would like to understand which features can be useful and in which scenarios. Please tell us how useful you find the suggested features, who might benefit from them, and in which use cases.

 

Suggested functionality

Available to DSpace users at no cost (repository scan)

The user can scan a file or multiple files for similarities across the repository. Most popular file formats are supported, including .doc, .docx, .rtf, .txt, .odt, .html and .pdf. A significant text overlap (a high similarity score) can be indicative of plagiarism, duplication of authors’ content without proper citation, or other academic integrity misconduct. An automatic similarity scan is an additional quality control measure that helps to ensure that repository content is original (free from plagiarism) and of high quality (properly cited).
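To make the notion of a similarity score more concrete, here is a minimal sketch of a repository scan. It is not Unplag's algorithm, which is not described on this page. It assumes a simple word-shingle overlap measure, and all names (shingles, similarity_percent, scan_repository) and the 50% threshold are hypothetical choices made for illustration only.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles in text (lowercased, punctuation ignored)."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity_percent(submitted, stored, k=3):
    """Percentage of the submitted text's shingles that also occur in the stored text.

    Assumption: overlap of word shingles stands in for "text similarity";
    a real engine would be considerably more sophisticated.
    """
    a = shingles(submitted, k)
    b = shingles(stored, k)
    if not a:
        # Text shorter than k words has no shingles to compare.
        return 0.0
    return 100.0 * len(a & b) / len(a)


def scan_repository(submitted, repository, threshold=50.0):
    """Return (name, score) pairs for stored documents scoring at or above
    the (hypothetical) threshold, highest score first."""
    hits = [(name, similarity_percent(submitted, text))
            for name, text in repository.items()]
    flagged = [hit for hit in hits if hit[1] >= threshold]
    return sorted(flagged, key=lambda hit: -hit[1])
```

In this sketch, repository is a plain dictionary mapping a document name to its extracted text. Extracting text from formats such as .docx or .pdf is out of scope here.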


Available to DSpace users for a fee (repository + Internet scan)

The user can scan a file or multiple files for similarities across the repository and the Internet. A significant text overlap (a high similarity score) can be indicative of plagiarism, duplication of authors’ content without proper citation, or other academic integrity misconduct. An automatic similarity scan is an additional quality control measure that helps to ensure that repository content is original (free from plagiarism) and of high quality (properly cited).

The difference between the free and paid options is that the paid option includes an Internet scan, i.e. a scan of billions of pages and documents published on the web. Since Unplag is charged by the web index provider (Microsoft Bing) for using a fresh index, this scan option is offered for a fee.

 

 

Unplag report (examples)

Similarity matches are highlighted in yellow. Citations are highlighted in blue. The information section gives basic statistics about originality and lists the sources. The user can click text matches in the report to see the relevant sources, and vice versa.

Screenshot 1. The user can open the source and see the text match (color mask) in the source.


Screenshot 2. The user can exclude citations and references from the document and browse citations.

 

Use Case

Automatic similarity check upon depositing research output - link.

 

 

Feedback

Please let us know what you think, which features you would like to see, and which use cases come to mind. We appreciate feedback on the usefulness of these features. Please do not hesitate to share suggestions, ideas, criticism and questions.