Abstract: |
Citation manipulation occurs when references are deliberately included in academic works for reasons unrelated to their scholarly merit. Rather than supporting arguments, providing context, or guiding readers, such citations serve to artificially inflate metrics such as citation counts. Manipulated citations tend to deviate from the structural patterns found in authentic citation networks. Consequently, when a network is perturbed by removing nodes or edges, manipulated citations are more likely to behave inconsistently. This paper introduces a method for detecting citation manipulation by studying how citation patterns change under random perturbations of the citation graph. The method uses the GraphSAGE algorithm to embed the perturbed graph in a Euclidean space and then uses these embeddings to reconstruct the removed edges. The approach assumes that legitimate citations are supported by a network of indirect connections, so that nodes linked by authentic citations receive nearby embeddings and their missing edges can be predicted accurately. By iteratively perturbing the graph and assessing the accuracy of edge reconstruction, the method flags as suspect those citations that are consistently reconstructed poorly, which indicates anomalous behavior. Numerical experiments validate the effectiveness of this approach in identifying anomalies within citation networks, highlighting its potential as a reliable tool for enhancing the integrity of scholarly communication. |
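
The perturb-and-reconstruct loop described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: GraphSAGE embeddings are replaced here by a much simpler stand-in, the Jaccard overlap of the two endpoints' neighborhoods in the perturbed graph, and citations are treated as undirected edges. The function name `reconstruction_scores`, its parameters (`rounds`, `drop_frac`, `seed`), and the toy graph are all hypothetical choices made for illustration.

```python
import itertools
import random
from collections import defaultdict


def reconstruction_scores(edges, rounds=200, drop_frac=0.2, seed=0):
    """Average reconstruction score of each edge over random perturbations.

    Stand-in scorer (an assumption, not GraphSAGE): the Jaccard overlap of
    the endpoints' neighbor sets in the graph with the sampled edges removed.
    """
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    totals = defaultdict(float)
    counts = defaultdict(int)
    n_drop = max(1, int(drop_frac * len(edges)))

    for _ in range(rounds):
        # Perturb: hide a random subset of edges.
        removed = rng.sample(edges, n_drop)
        removed_set = set(removed)

        # Build the undirected adjacency of the perturbed graph.
        adj = defaultdict(set)
        for u, v in edges:
            if (u, v) not in removed_set:
                adj[u].add(v)
                adj[v].add(u)

        # Reconstruct: score each hidden edge from the remaining structure.
        for u, v in removed:
            union = adj[u] | adj[v]
            score = len(adj[u] & adj[v]) / len(union) if union else 0.0
            totals[(u, v)] += score
            counts[(u, v)] += 1

    # Only edges that were hidden at least once receive a score.
    return {e: totals[e] / counts[e] for e in edges if counts[e]}


# Toy graph: two densely connected groups of five papers each, plus a single
# cross-group citation that has no indirect support.
group_a = [f"A{i}" for i in range(5)]
group_b = [f"B{i}" for i in range(5)]
edges = list(itertools.combinations(group_a, 2))
edges += list(itertools.combinations(group_b, 2))
edges.append(("A0", "B0"))

scores = reconstruction_scores(edges)
suspect = min(scores, key=scores.get)
print(suspect, scores[suspect])
```

In this toy setting the cross-group edge shares no neighbors with the rest of the graph, so its reconstruction score stays at zero in every round, while edges inside a group are usually recoverable from their common neighbors. A learned embedding such as GraphSAGE plays the same role as the overlap score but can capture longer-range indirect connections.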