How do we remove a bunch of spam users on OJS?

Open Journal Systems (OJS) is a journal management platform that is widely used by scientific publishers and institutions. Unfortunately, OJS does not enable spam protection by default, which gives spammers a wide opening to exploit the registration process of each journal.

This spam is very harmful to a journal. Some of the drawbacks are:

  1. Spam makes OJS or the journal slow, because user searches and queries on user-related tables need a lot of additional resources.
  2. The sheer number of spam accounts makes it difficult for editors to assign section editors and reviewers.
  3. Spam users can fill in information from profile data such as URLs pointing to
  4. Spam users can post inappropriate content in publicly accessible sections, which can be fatal for the journal’s reputation.
  5. A lack of spam protection can lead to exploits of OJS security vulnerabilities, which can make the site a victim of hacking.

We encountered this case when a client came to us because their editorial team had become the target of spam. To respect our clients’ privacy, we do not name the journals we have handled here.


Disclaimer

  • This article only describes what we did in the journals we have handled. It is not a recommendation or instruction on what to do to resolve problems with your own journal/database.
  • We are not responsible for what happens to your journal after you follow this article.
  • We recommend making periodic backups, plus a fresh backup before running any of the queries suggested in this article.

Knowing the Pattern of Spam Users

Before tackling this problem, it is a good idea to browse the PKP forum, since many journal managers and editors have certainly faced the same spam problem. The discussions there gave us some insight into how to fix it.

Some of the resources we found are:

Removing spam registrants

Unfortunately, the queries and table relations described in that discussion are only compatible with OJS 2, while our case uses OJS 3. The discussion also covers a case that is very different from ours. What we can learn from it, however, is the need to understand the patterns spammers follow.

A collaborative list of spam user patterns

There is a lot to take from this resource. We think the most important point is that, before deleting a user’s data, you must make sure the user does not hold an important role in the journal.

Such roles include admin, journal manager, section manager, reviewer, and author. Most spammers register as subscribers or readers, but we still assume that some of them also register as authors.

From that discussion and our own analysis of the table relations in OJS, we arrived at two points to consider when removing any user from OJS:

  1. Find the active roles; a user who holds one of them must not be deleted.
  2. Find the patterns of the spam users and delete a matching user only if that user has no submissions.

Deletion Process

1. Identify the pattern of the spam registration

After studying the records in the users and user_settings tables, we found that the spam accounts were registered in the journal with a few similar patterns. Here is what we found:

  1. The same family names repeated many times, for example “Barbosa” and “Maria”.
  2. Email addresses containing gmx, arcor, freenet, or bigstring.
  3. Names containing joao, pedro, or paulo.
  4. Many users with an email address ending in the country extension .de.

1. Getting the list of the same family name pattern

When we listed the users in the OJS system, we found that many of them used similar family names. This screenshot was taken from the user list:

We used the query below to get the family names that follow a repetitive pattern:

SELECT setting_value, COUNT(*) AS x FROM user_settings WHERE setting_name = 'familyName' GROUP BY setting_value HAVING COUNT(*) > 3 ORDER BY x DESC;

This query lists the family names that are repeated more than 3 times. We keep the users who have those family names in a variable that we will use later.

In addition, we extended the list on the assumption that the spam was registered during the previous 2 years, which led to the following query:
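
To go from repeated family names to the user IDs behind them, one option is a second lookup joined against the grouped result. The sketch below is only an illustration in Python with an in-memory SQLite table, not the script we ran: the function name, the toy rows, and the choice to count distinct users (in case a name is stored once per locale) are all assumptions.

```python
import sqlite3

def users_with_repeated_family_names(conn, min_users=4):
    """Return {family_name: [user_id, ...]} for names shared by at least min_users users."""
    rows = conn.execute(
        """
        SELECT us.setting_value, us.user_id
        FROM user_settings AS us
        JOIN (
            SELECT setting_value
            FROM user_settings
            WHERE setting_name = 'familyName'
            GROUP BY setting_value
            HAVING COUNT(DISTINCT user_id) >= ?
        ) AS repeated ON repeated.setting_value = us.setting_value
        WHERE us.setting_name = 'familyName'
        ORDER BY us.setting_value, us.user_id
        """,
        (min_users,),
    ).fetchall()
    result = {}
    for family_name, user_id in rows:
        ids = result.setdefault(family_name, [])
        if user_id not in ids:  # skip duplicates if a name has several rows per user
            ids.append(user_id)
    return result

# Toy data (hypothetical): only the columns the query needs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_settings (user_id INTEGER, setting_name TEXT, setting_value TEXT)"
)
conn.executemany(
    "INSERT INTO user_settings VALUES (?, ?, ?)",
    [(i, "familyName", "Barbosa") for i in range(1, 5)]
    + [(5, "familyName", "Smith"), (1, "givenName", "Barbosa")],
)
suspects = users_with_repeated_family_names(conn)
print(suspects)
```

The default of 4 users mirrors the "more than 3" threshold of the query above; on a real installation the same SQL would have to be adapted to the database engine in use.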

SELECT user_id FROM users WHERE YEAR(date_registered) BETWEEN '$twoYearsAgo' AND '$currentYears'

Note that we use PHP variables to hold the lists of users returned by these queries.

2. Getting the list of users based on the contained string pattern

By name
Add the users whose name contains joao, pedro, or paulo to the list of potential spam users with this query:

SELECT user_id, setting_name, setting_value FROM user_settings WHERE setting_value LIKE '%joao%' OR setting_value LIKE '%pedro%' OR setting_value LIKE '%paulo%';

By Email
Add the users whose email contains gmx, arcor, freenet, or bigstring to the list of potential spam users with this query:

SELECT user_id, email FROM users WHERE email LIKE '%gmx%' OR email LIKE '%arcor%' OR email LIKE '%freenet%' OR email LIKE '%bigstring%';

By Email that contains country extensions .de
Add the users whose email ends with the country extension .de to the list with this query:

SELECT user_id FROM users WHERE email LIKE '%.de';
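
The three pattern checks above can also be expressed as one predicate applied to rows that have already been fetched. This is a minimal Python sketch under assumptions: the function name and the sample users are hypothetical, and the lowercasing only approximates the case-insensitive matching that LIKE gives under a case-insensitive collation.

```python
# Fragments taken from the patterns described in this article.
NAME_FRAGMENTS = ("joao", "pedro", "paulo")
EMAIL_FRAGMENTS = ("gmx", "arcor", "freenet", "bigstring")
EMAIL_SUFFIXES = (".de",)

def matches_spam_pattern(name, email):
    """True if the name or email matches one of the suspected spam patterns."""
    name = name.lower()
    email = email.lower()
    return (
        any(fragment in name for fragment in NAME_FRAGMENTS)
        or any(fragment in email for fragment in EMAIL_FRAGMENTS)
        or email.endswith(EMAIL_SUFFIXES)
    )

# Hypothetical sample rows: (user_id, name, email).
users = [
    (101, "Joao Silva", "joao@example.com"),
    (102, "Anna Weber", "anna@mail.example.de"),
    (103, "Li Wei", "li.wei@example.org"),
    (104, "Sam Roe", "sam@gmx.example"),
]
suspect_ids = {uid for uid, name, email in users if matches_spam_pattern(name, email)}
print(sorted(suspect_ids))
```

A match here only marks a user as a suspect. Legitimate users can have these names or a .de address, which is why the role and submission filters that follow are needed.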

2. Filter: Identify the active role being used by the journal

Knowing the pattern of the suspected spam users is not enough to delete them. We also need to know which roles are actively used in the journal, such as section editor.

This can be obtained with a query on the OJS database:

SELECT user_group_id, COUNT(*) as total_user FROM user_user_groups GROUP BY user_group_id having total_user >= 1

Knowing which roles each user holds, we compare the suspected users against these roles: any suspected user who holds one of the active roles is excluded from the deletion list.

In the journals we handled, the allowed and actively used roles, with their user group IDs, were:

    // 'ROLE_ID_MANAGER'              = 2
    // 'ROLE_ID_SITE_ADMIN'           = 1
    // 'ROLE_ID_SUB_EDITOR'           = 5 | 6
    // 'ROLE_ID_AUTHOR'               = 14 | 15
    // 'ROLE_ID_REVIEWER'             = 16
    // 'ROLE_ID_ASSISTANT'            = 7 - 13
    // 'ROLE_ID_READER'               = 17
    // 'ROLE_ID_SUBSCRIPTION_MANAGER' = 18

Do not reuse the IDs above for your journal, because they may be different in your installation.
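
The role filter boils down to collecting every user who belongs to at least one protected user group. Here is a small Python sketch of that idea, assuming the rows of user_user_groups have already been fetched as (user_group_id, user_id) pairs. The function name and all IDs in the example are invented for illustration and are not the IDs of any real journal.

```python
def users_with_protected_roles(user_user_groups, protected_group_ids):
    """Return the set of user IDs that hold at least one protected user group."""
    protected = set(protected_group_ids)
    return {
        user_id
        for user_group_id, user_id in user_user_groups
        if user_group_id in protected
    }

# Hypothetical rows: (user_group_id, user_id). Group 99 stands for an unprotected group.
rows = [(99, 201), (14, 202), (99, 202), (2, 203)]
protected_users = users_with_protected_roles(rows, protected_group_ids=[2, 14])
print(sorted(protected_users))
```

User 202 is kept even though it also belongs to the unprotected group, because a single protected role is enough to exclude a user from deletion.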

3. Filter: Detect whether a user has any submission

We filter the list from step 1 so that no deleted user is an active user of the journal. Apart from the user role, this filter also looks at the activity of each suspected user in the stage_assignments table.

To do this, we look the user up in the stage_assignments table with a function we wrote in PHP:

    public function userHasSubmission($userId) {
        // Any row in stage_assignments means the user takes part in a submission.
        return DB::table('stage_assignments')->where('user_id', $userId)->exists();
    }

Once a suspected user is caught by neither of the two filters, meaning the user has no active role and no stage assignment, we treat that user as spam.
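
Put together, the decision is a set difference: start from the suspects and remove everyone caught by either filter. The following Python sketch assumes the three sets of user IDs have already been collected; the function name and the sample IDs are hypothetical.

```python
def confirmed_spam(suspect_ids, protected_user_ids, users_with_assignments):
    """Suspects that hold no protected role and have no stage assignment."""
    return set(suspect_ids) - set(protected_user_ids) - set(users_with_assignments)

# Hypothetical IDs: user 2 holds an active role, user 3 has a stage assignment.
to_delete = confirmed_spam(
    suspect_ids={1, 2, 3, 4},
    protected_user_ids={2},
    users_with_assignments={3},
)
print(sorted(to_delete))
```

Keeping the filters as separate sets makes it easy to review who was spared and why before anything is deleted.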

So we can delete the user.
At first, we assumed that deleting the rows from the users, user_interests, user_user_groups, and user_settings tables would be enough. After we did so, however, the user list in OJS produced an error (the page could not be loaded).

After some searching, we found a command-line tool that PKP already provides, which lets us remove a user by merging it into another account:

$ php tools/mergeUsers.php username1 username2

username1 is the target account that receives everything related to username2; the username2 account is then deleted.

For example:

$ php tools/mergeUsers.php journal_admin_username spam_username

In this way we sorted out more than 3,000 spam users registered in OJS.

Preventing new spam registrations

Having removed 5000+ spam users registered in OJS, the next step is to protect the journal from new spam registration attempts.
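
With thousands of accounts, the merge command has to be generated per spam username. The Python sketch below only builds and prints the commands as a dry run and does not execute anything. The function name and the second username are hypothetical, and the command shape is the one shown above.

```python
import shlex

def build_merge_commands(target_username, spam_usernames):
    """Build one mergeUsers command per spam username, skipping the target itself."""
    commands = []
    for username in spam_usernames:
        if username == target_username:
            continue  # never merge the target account into itself
        commands.append(["php", "tools/mergeUsers.php", target_username, username])
    return commands

commands = build_merge_commands(
    "journal_admin_username",
    ["spam_username", "another_spam_username", "journal_admin_username"],
)
for command in commands:
    print(shlex.join(command))  # review the output before running anything
```

Printing first gives a chance to review the list against a backup; the reviewed commands can then be run from the OJS installation directory.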

To protect user data from spam, one of the steps is to configure Google reCAPTCHA in your OJS.

Even with Google reCAPTCHA installed in OJS, keep in mind that it has weaknesses. The main drawback is that it does not work in countries where Google services are unavailable, such as China. Users there fail to register or log in to an OJS that has this feature enabled.

Alternatively, you can use the spam protection offered by the Akismet plugin and the Form Honeypot plugin in OJS:

These plugins are available in the plugin gallery of your OJS.

Keep in mind that, in our experience, the Form Honeypot is not very effective against spam, because several types of spam get past the honeypot challenge.

This article covers the steps we took to remove spam from OJS and to protect it with free plugins. Be aware, though, that your journal site can also fall victim to spam traffic, which is a different case from the one in this article. Such traffic can make your journal consume excessive resources and make your site slow or unavailable. Read more about that case in the article below.

How to fix slow access journals abused by fake traffic in the OJS?

Project Manager

Hendra here. I love writing and sharing knowledge about OJS. My passion is the OJS and OMP platforms and researching innovative products for them.