Why Didn't Google Scholar Index Your OJS Site, and How Can You Fix It?

Introduction

Google Scholar is one of the most reputable indexing sites, and it is free. It covers a wide range of scholarly material, including journal articles, dissertations, and books, so anyone can find and cite research from many fields of study. It also provides useful features such as filters, which let users narrow results by year, type, relevance, and other criteria, making it quick to find the information they need.

Despite being free, Google Scholar is one of the most popular indexing sites, attracting a large number of users. When a journal or OJS site is successfully indexed in Google Scholar, it significantly enhances the journal’s visibility. This, in turn, can greatly boost the journal’s popularity. For new publishers, achieving high visitor traffic is a primary goal, and being indexed on Google Scholar is a crucial step toward that.

So, how do we get our OJS journals or sites indexed in Google Scholar? Fortunately, the process is quite simple because Google Scholar’s systems operate automatically. Other indexing sites, such as Copernicus, require a manual article import/export process. Google Scholar instead uses “robots” or “crawlers” that automatically and extensively search for journal and research data available online. However, for these crawlers to work effectively, we must make sure our OJS site is accessible and that the journal metadata meets the criteria Google’s crawlers require.

Unfortunately, some publishers or journal managers are not fully aware of how Google Scholar indexing works. This can result in their OJS sites failing to be indexed or even being deindexed. So, what do we need to focus on to ensure our OJS sites are indexed by Google Scholar? We’ll cover everything you need to know in this article, so please read it all the way through.

Need better hosting that protects you from the Google Scholar blacklist? 👍
Check our hosting packages, which include hosting plus support from our expert team for the technical aspects of your OJS.

Benefits of Google Scholar Indexing

Although Google Scholar is a free indexing site, it offers several significant benefits for journals that are indexed there. Some of the advantages of having our OJS journal or site indexed in Google Scholar include:

1. Automatic Indexing

Technically, Google Scholar differs from other indexing sites, particularly in terms of storage, outreach, and database updates. Google Scholar uses its own tools, called “robots” or “crawlers,” which are responsible for accessing various journal data on the internet, storing it, and updating their database. This eliminates the need for manually importing or exporting articles, making the process more effective and efficient.

2. Easy Indexing Process

The Google Scholar indexing process is generally easier compared to other indexing sites like Scopus, DOAJ, and others. For instance, when submitting for indexing in Scopus or DOAJ, you need to provide detailed information about the editorial team, aim and scope, author guidelines, and more. In contrast, Google Scholar doesn’t have such additional requirements. Generally, all you need to do is ensure that your OJS site is accessible and running smoothly, and that your data sources or meta tags meet Google Scholar’s standards for indexing.

3. Increasing Visibility

Google Scholar is free and open, so anyone can access it easily. This allows researchers and other academics around the world to find research results there. When a journal is successfully indexed in Google Scholar, its visibility increases, and greater visibility in turn brings more visitors to the OJS site.

4. Increasing Number of Citations

Being indexed in Google Scholar gives a journal a good opportunity to increase its citation count. Google Scholar is used by millions of researchers and academics from many countries, particularly when looking for sources to cite. Once our articles are indexed there, researchers can easily find them and cite them in their own work.

5. Improving Journal Credibility

The more a journal is cited and the more indexing sites list it, Google Scholar included, the more credible it becomes. When a journal is successfully indexed in Google Scholar, it means the journal has met the requirements that have been set and is fit to be distributed widely. This credibility also helps build researchers’ trust in the quality and relevance of the journal’s content.

6. Citation Count Metric Feature

In Google Scholar we can also find the Citation Count Metric feature as shown in the following image.

This feature shows a graph of the number of citations our articles have received on Google Scholar, which makes it easier to follow how those citations develop over time and to see which research topics are currently relevant or in demand. We can also see the number of citations for each article title in the “Cited by” column, as shown in the following image:

7. Widely Used by Certain Countries in Validating Lecturers’ Scientific Work

In many countries in Asia, Africa, and the Arab world, a lecturer’s scientific work is validated by checking whether their articles are indexed in Google Scholar. Where this is mandatory, Google Scholar indexing is a must. This means that if your journal is not managed professionally and falls out of compliance with Google Scholar’s requirements, all of its article records may be delisted automatically, including records that go back many years.

What Google Scholar Requires in Crawling Data:

Google Scholar indexes article data with its automatic tools, called “robots” or “crawlers”. For this crawling to run well, an OJS site must meet several conditions.

Here are some things that are needed so that the data crawling process by Google robots can run smoothly.

1. Open Access

Make sure our site, in this case OJS, can be accessed openly. Google finds it much easier to index articles on an openly accessible site, because its crawler can reach all of the data in our OJS. At a minimum, the OJS site and its metadata must be publicly accessible without a login or any other access restriction.

2. Structured Metadata

The metadata on our OJS site must be structured, clear, and openly accessible. We need to ensure that the site displays clear metadata such as the journal name, authors, abstract, references, and publication date. We also need to ensure that Google can read the metadata format, especially the meta tags. Google itself lists the meta tags it needs for the indexing process.

Here are some examples of meta-tags found in articles in OJS.

You can enable the Google Scholar plugin which is available for free in the Plugin Gallery in OJS.
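To make the meta-tag requirement concrete, here is a minimal sketch in Python that collects the `citation_*` meta tags from an article page and reports which ones are missing. The tag names follow Google Scholar's published naming convention. The split into "required" and "recommended" tags, the sample page, and the domain `journal.example.org` are assumptions made for illustration, not output from a real OJS site.

```python
from html.parser import HTMLParser

# Assumed minimum set worth checking; adjust it to your own needs.
REQUIRED_TAGS = ("citation_title", "citation_author", "citation_publication_date")
RECOMMENDED_TAGS = ("citation_journal_title", "citation_pdf_url")


class CitationMetaParser(HTMLParser):
    """Collect every <meta name="citation_*" content="..."> tag on a page."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        content = (attributes.get("content") or "").strip()
        # Keep repeated tags (for example several authors) in document order.
        if name.startswith("citation_") and content:
            self.tags.setdefault(name, []).append(content)


def check_citation_tags(html):
    """Return (found_tags, missing_required, missing_recommended)."""
    parser = CitationMetaParser()
    parser.feed(html)
    missing_required = [t for t in REQUIRED_TAGS if t not in parser.tags]
    missing_recommended = [t for t in RECOMMENDED_TAGS if t not in parser.tags]
    return parser.tags, missing_required, missing_recommended


# Hypothetical article page, shortened to the <head> section.
SAMPLE_PAGE = """
<html><head>
<meta name="citation_journal_title" content="Example Journal of Testing">
<meta name="citation_title" content="A Hypothetical Article Title">
<meta name="citation_author" content="Doe, Jane">
<meta name="citation_author" content="Smith, John">
<meta name="citation_publication_date" content="2024/05/17">
<meta name="citation_pdf_url" content="https://journal.example.org/index.php/ej/article/download/1/1">
</head><body></body></html>
"""

tags, missing_required, missing_recommended = check_citation_tags(SAMPLE_PAGE)
print("authors:", tags["citation_author"])
print("missing required:", missing_required)
print("missing recommended:", missing_recommended)
```

To check a live article, pass the HTML of its landing page to `check_citation_tags` instead of `SAMPLE_PAGE`. An empty "missing required" list only shows that the tags are present, not that their values are correct.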

3. URL Consistency

We also need to ensure that the URLs on our OJS site are consistent. URL consistency plays a fairly important role while Google crawls the site. An inconsistent or changing URL makes it difficult for Google’s crawler to identify the data on the site or, worse, leads Google to treat the site as spam or duplicate content.

URL consistency also applies to the PDF file of each article. If the PDF contains text such as “Available online: article url”, that URL must match the actual location of the article. The consistency of other metadata in the PDF, such as the journal name, ISSN, and journal abbreviation, also affects indexing in Google Scholar. We have seen cases where this metadata differed from the metadata of the online article, and GS did not index the journal.
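One way to check this kind of consistency is to compare the URL printed in the PDF against the article's actual URL after a light normalization. The sketch below is only an illustration: the normalization rules (ignore letter case in the host name, a trailing slash, and any fragment) are our own assumption about which differences are harmless, and all URLs are hypothetical.

```python
from urllib.parse import urlsplit


def normalize_url(url):
    """Reduce a URL to a comparable form.

    The host is lowercased, a trailing slash is dropped, and the fragment is
    ignored. The scheme is kept, so http and https count as different URLs.
    """
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    query = f"?{parts.query}" if parts.query else ""
    return f"{parts.scheme}://{parts.netloc.lower()}{path}{query}"


def urls_match(article_url, url_in_pdf):
    """True when both URLs point to the same location after normalization."""
    return normalize_url(article_url) == normalize_url(url_in_pdf)


# Hypothetical values: the live article URL and two URLs found in PDF files.
article_url = "https://journal.example.org/index.php/ej/article/view/12"
same_location = "https://Journal.example.org/index.php/ej/article/view/12/"
old_domain = "http://old-domain.example.org/index.php/ej/article/view/12"

print(urls_match(article_url, same_location))
print(urls_match(article_url, old_domain))
```

The same comparison can be applied to the URL a DOI resolves to, since a DOI that points somewhere other than the article URL is one of the causes listed later in this article.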

4. Site Accessibility

Site accessibility also plays an important role during crawling and indexing. A site that loads quickly and smoothly makes it easier for Google’s crawlers to work through the data on our OJS site. Load speed matters here: for example, each page should load in less than 10 seconds.

A site that has a low load speed or is often down will potentially disrupt the data crawling process, making it difficult for Google to index the site.
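A simple way to keep an eye on both points, the response code and the load time, is a small script that requests a page and classifies the result. The following is a minimal sketch using only the Python standard library. The 10-second threshold comes from the text above, while the function names and result labels are our own.

```python
import time
import urllib.error
import urllib.request

MAX_SECONDS = 10.0  # threshold taken from the text above


def fetch_status(url, timeout=MAX_SECONDS):
    """Return (status_code, elapsed_seconds).

    status_code is None when no HTTP response arrived at all (DNS failure,
    refused connection, timeout). Redirects are followed, so the status is
    that of the final page.
    """
    started = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
            status = response.status
    except urllib.error.HTTPError as error:
        status = error.code  # the server answered, but with an error code
    except (urllib.error.URLError, OSError):
        status = None
    return status, time.monotonic() - started


def assess(status, elapsed, max_seconds=MAX_SECONDS):
    """Turn a measurement into a short verdict."""
    if status is None:
        return "unreachable"
    if status != 200:
        return f"unexpected status {status}"
    if elapsed > max_seconds:
        return "too slow"
    return "ok"


# Verdicts for a few hypothetical measurements.
print(assess(200, 1.2))
print(assess(500, 0.3))
print(assess(200, 12.0))
```

For a real check, call `assess(*fetch_status("https://journal.example.org/"))` with your own journal URL, ideally from a scheduled job, so that slow pages and error responses are noticed before a crawler runs into them.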

5. Configuration of robots.txt

In the OJS server folder we can find a file named “robots.txt”, as shown in the following image.

This is a default file that ships with OJS, so it is present in the OJS folder after installation. It contains rules that tell robots which parts of the OJS site they may crawl, so we need to make sure that these rules do not block Google’s robots.

An example of the default contents of the robots.txt file in OJS is as follows:

In general, the default rules apply to all robots from any site without exception and prohibit them from accessing the cache directory. The cache only holds temporary data, so most of its content is irrelevant to indexing.

Most importantly, we also have to make sure that our server does not block robots. Even if OJS allows robots in, that is in vain when the server itself limits or blocks robot activity, as some servers do.

If this happens, Google’s robots cannot access the OJS site, and the site will not be indexed in Google Scholar.
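To verify that a given set of robots.txt rules does not lock out Google's crawler, the rules can be tested with Python's standard `urllib.robotparser` module. The sketch below uses rules written to match the description above (all robots allowed, cache directory off limits). The exact file on your server may differ, and the URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Rules matching the description above: every robot, cache directory blocked.
DEFAULT_RULES = """\
User-agent: *
Disallow: /cache/
"""

# A misconfiguration that shuts Google's crawler out of the whole site.
BLOCKING_RULES = """\
User-agent: Googlebot
Disallow: /
"""


def allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


article = "https://journal.example.org/index.php/ej/article/view/12"
cache_file = "https://journal.example.org/cache/example.php"

print(allowed(DEFAULT_RULES, "Googlebot", article))
print(allowed(DEFAULT_RULES, "Googlebot", cache_file))
print(allowed(BLOCKING_RULES, "Googlebot", article))
```

This only tests the rules in the file. A firewall or server configuration that blocks crawlers does so regardless of robots.txt and has to be checked separately, for example in the server logs.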

6. PDF File Accessibility

Google’s robots also crawl the PDF files on our OJS site and can read the text inside them. After the crawl, the PDF file is added to the Google Scholar database. If everything succeeds, Google Scholar can show the PDF in its search results, and clicking the result leads directly to the file.

In order for Google bot to crawl PDF files on OJS properly, we need to ensure that:

The metadata in the PDF file matches the metadata in the OJS source, including the article title, author names, abstract, journal link, DOI, publication date, and other metadata.
The abstract is written as text, not embedded as an image.
The PDF file can be accessed without any restriction or blocking by the server configuration.
Any link in the PDF file is consistent and leads to the source.
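The first point, matching metadata, lends itself to an automated comparison once the values from both sides are available. The sketch below assumes you have already gathered the metadata into two dictionaries, one from OJS and one from the PDF. How the PDF values are extracted is outside its scope, and the field names and sample values are made up for illustration.

```python
import unicodedata


def normalize(value):
    """Apply Unicode NFKC, collapse whitespace, and ignore letter case."""
    text = unicodedata.normalize("NFKC", str(value))
    return " ".join(text.split()).casefold()


def metadata_mismatches(ojs_record, pdf_record, fields):
    """Return {field: (ojs_value, pdf_value)} for every field that differs."""
    mismatches = {}
    for field in fields:
        ojs_value = ojs_record.get(field, "")
        pdf_value = pdf_record.get(field, "")
        if normalize(ojs_value) != normalize(pdf_value):
            mismatches[field] = (ojs_value, pdf_value)
    return mismatches


# Hypothetical records: the journal name is misspelled in the PDF.
ojs_record = {
    "title": "A Hypothetical Article Title",
    "journal": "Example Journal of Testing",
    "issn": "1234-5678",
    "doi": "10.1234/ej.v1i1.12",
}
pdf_record = {
    "title": "A  Hypothetical article title",
    "journal": "Example Jurnal of Testing",
    "issn": "1234-5678",
    "doi": "10.1234/ej.v1i1.12",
}

print(metadata_mismatches(ojs_record, pdf_record, ["title", "journal", "issn", "doi"]))
```

Differences in spacing and capitalization are deliberately ignored here, so the report only contains substantive mismatches such as the misspelled journal name.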

Why Google Scholar Didn’t Index an OJS Site?

You may feel that the OJS site you manage is fine. Yet when you check Google Scholar, it turns out that your site is not indexed, or has even been deindexed. This is confusing, and it raises the question: “Why didn’t Google Scholar index my OJS site?”

We have analyzed what causes an OJS site to fail to be indexed in Google Scholar, based on problems our clients experienced directly and on the information Google Scholar itself publishes.

Here are some of our experiences on why your journal is not indexed in Google Scholar:

The domain and site have only been active for a short time. In our experience, Google needs about 1 to 3 months before inclusion.
Google Scholar has detected your domain as hacked, or as containing links that lead to illegal or disreputable sites such as online gambling, suspected predatory journals, prostitution, and the like.
The article was previously published in a journal or domain that was blacklisted by Google (see the previous point).
Your site has periodic downtime or is often inaccessible (frequent errors).
The DOI points to a different URL than the article URL.
Articles are posted twice on different domains. GS will treat this as plagiarism.
Your site is accessible, but something on it causes a 500 response.
Your OJS sitemap URL is not compatible. In some versions of OJS (below 3.3.0.18) the sitemap format is not compatible: it uses https where it should use http (ref: https://github.com/pkp/pkp-lib/issues/9845)
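The sitemap point can be checked with a few lines of Python. This sketch rests on an assumption: that the http/https mismatch mentioned above concerns the XML namespace declared in the sitemap, which the sitemaps.org protocol defines as `http://www.sitemaps.org/schemas/sitemap/0.9`. If the referenced issue concerns something else in your OJS version, adapt the check accordingly. The sample sitemaps are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (note: http, not https).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_namespace(xml_text):
    """Return the namespace of the sitemap's root element ('' if none)."""
    root = ET.fromstring(xml_text)
    # ElementTree renders a namespaced tag as "{namespace}localname".
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return ""


def has_standard_namespace(xml_text):
    """True when the sitemap declares the protocol's standard namespace."""
    return sitemap_namespace(xml_text) == SITEMAP_NS


# Hypothetical sitemaps that differ only in the namespace scheme.
GOOD_SITEMAP = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://journal.example.org/index.php/ej/article/view/12</loc></url>"
    "</urlset>"
)
BAD_SITEMAP = GOOD_SITEMAP.replace("http://www.sitemaps.org", "https://www.sitemaps.org")

print(has_standard_namespace(GOOD_SITEMAP))
print(has_standard_namespace(BAD_SITEMAP))
```

Note that the article URLs inside `<loc>` can and should remain https. Only the namespace declaration is compared here.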

Recommendation

Set up Google Search Console for each of your journals. Here is a tutorial for setting up Google Search Console for each of your journals.
Make sure the configuration on your server does not block robot activity, especially Google’s robots.
Any metadata contained in the article PDF file must match the source, including links that lead to the appropriate pages.
Upgrade your OJS version regularly to avoid vulnerabilities and security issues. Prefer the LTS (Long Term Support) version of OJS, because PKP continues to maintain it over the long term.

Google Scholar has also explained the common problems behind an OJS site failing to be indexed. Here are some of them.

Incorrect bibliographic meta tags. Mistakes in writing or formatting meta tags can be fatal. If such errors are widespread on an OJS site, Google cannot index it, because meta tags are one of the core inputs of the crawling process.
Site outages while the indexing system is working on the OJS site. A site that is down or unreachable interrupts the indexing process in Google Scholar. The Google bot works automatically; when it tries to access the OJS site and finds it down, it will assume that the site is inactive, and the indexing process stops.

Bots do not care whether your OJS site is down for a short time or a long time. If the bot fails to access your site, the site is considered inactive and will not be indexed. So consider carefully the capacity and performance of the server you use, so that your OJS site stays online and is ready whenever the Google bot visits.
Blocking of Google bot crawlers and slow sites. Server-side restrictions or blocks on robot activity have a very significant impact: the Google bot cannot crawl, so the site fails to be indexed in Google Scholar. Slow sites also hamper the bot and make crawling ineffective.

What Can You Do to Fix Google Scholar Indexing?

Here are some steps you can take to optimize the Google Scholar indexing process.

Give it time. If your journal is new, GS needs 1-3 months to recognize it. You need to periodically visit each article to trigger GS.
Activate the Google Scholar plugin from the plugin library.
Do not install excessive plugins.
Not all plugins available in the Plugin Gallery are free of problems. Some slow the site down or even cause errors, and this also happens with plugins written by PKP itself. So do not carelessly install just any plugin on your OJS.
Make sure your site is not overloaded and is always up. An uptime monitoring tool can help with this.
Periodically check your site’s HTTP responses to make sure they always return a 200 status code.
Make sure your OJS sitemap URLs are compatible (ref: https://github.com/pkp/pkp-lib/issues/9845)

Google Scholar Checklist

In this section, we explain what you need to check on your OJS site so that Google Scholar indexing can run smoothly. The checklist summarizes Google Scholar’s guidelines and our own experience, and it is meant to make it easier for you to review every aspect of your OJS site that matters for Google Scholar indexing.

Here are some things you can check on your OJS site to ensure that the Google Scholar data crawling and indexing process is running smoothly.

General Requirement:

The site has not stayed online in a hacked state for long (3 days is the tolerated time)
The site responds fast (no longer than 10 s); check this frequently, including through a VPN
The site always returns a 200 response; you can check it here: https://httpstatus.io
The site uses the HTTPS protocol for security
The site has no links to external illegal sites such as gambling, porn, prostitution, and others

Make sure your OJS site:

Has a valid ISSN (optional)
Has no illegal content, visible or invisible (cloaked links): https://developers.google.com/search/docs/essentials/spam-policies
Has no content duplicated from an internal or external domain
Has clear and complete metadata for each article
Makes articles accessible without requiring a login (Open Access)
Has its sitemap submitted to Google Search Console (additional)
Has images and other media files optimized for faster loading times
Uses strong passwords, especially for admin users, journal managers, and other important roles, to minimize the risk of the site being hacked
Has a compatible sitemap URL

Make sure your article’s PDF file:

The abstract must be in text format, not an image.
Links within the PDF should not differ; they must match the source.
Author information, including affiliation, must be consistent with the source.
Abbreviations must align with the source.
The ISSN displayed must match the source and be valid.
The article title should be clearly visible and match the source metadata.
Page numbers should be structured and consistent.
References should match the source.
The text must not be a scanned image; it should be in the original text format.
Images in the PDF should have sufficient resolution but remain lightweight.
Tables, graphs, and other data must be easily readable.
Use standard, legible fonts.
Include complete metadata that matches the source.

What We Have Done to Protect and Optimize the Google Scholar Indexing Process

As a provider with years of experience serving OJS clients, from new publishers to highly reputable journals, we have carried out research and analysis aimed specifically at maximizing the Google Scholar indexing process. This work is part of our dedication and commitment to publishers, especially OJS users.

We are also aware that many publishers have recently complained that their OJS sites cannot be indexed in Google Scholar. As explained above, the reasons vary widely, so our team continues to innovate to help publishers get the most out of Google Scholar indexing on their OJS sites.

Here are some things we do to maximize the Google Scholar indexing process.

1. OJS Hosting Service

We provide servers designed specifically to run OJS optimally, so your OJS site can run all of its essential functions without exception. Our servers also let Google’s bots and crawlers access the data in OJS freely, which smooths the Google Scholar indexing process.

From clients who migrated to our servers, we learned that their previous hosting was not designed for OJS and imposed several crucial constraints on it.

Drawing on our experience with thousands of clients, we built our hosting in-house to overcome these obstacles:

Optimized PHP configuration with all the PHP extensions required by OJS.
Fast-response hosting. The infrastructure is built on an nginx web server that we have optimized, for example with caching of static files, SSL handshake optimization, PHP optimization through opcache, and Redis caching for sessions.
Built-in automatic uptime monitoring that checks response times periodically without manual intervention.
Protection against scanner bots that consume OJS resources.
Increased security through an exclusive OJS platform for each client. We built a custom firewall configuration and file monitoring that automatically notify our team, so that we can secure OJS immediately if there is any suspicious activity in your OJS directory. Hosting that mixes multiple platforms for other clients is highly vulnerable to being hacked.
Integrated support. The team available from us not only helps you in terms of technical handling of hosting but also has expertise in the technical field of OJS management specifically. So if you have any problems related to
Support for character collations that display articles written in Latin, Chinese, and other scripts.
Periodic backups.
Other features to improve your journal, such as a free Letter of Acceptance plugin, a free Copernicus plugin, and a free RePEc plugin.
A ticketing system for reporting information or problems, so you can easily connect directly with our technical team.
Here is a preview of the ticketing system.

The ticketing system makes it easier for hosting clients to reach our technical team directly. Clients can describe the problems they experience in a ticket, and our technical team will help resolve them. Tickets can also include images, links, and other files, so that the information you send is clearer and more focused.

You can find more detailed information regarding the OJS Hosting Service via this link page: OJS Hosting Service by openjournaltheme

Choose the only OJS hosting that includes free support for OJS from our expert team 👍
Check our hosting packages and choose a hosting provider that understands every aspect of OJS.

2. OJS Theme and Meta Tags

We currently offer several OJS themes: Academic Pro, Academic Pro Extended, Classy, Noble, Novelty, and Unify. They are used by thousands of our clients worldwide and have proven to increase the visibility of their journals, and even to help meet the requirements set by reputable indexes such as Scopus, DOAJ, and WoS.

Here are some clients who use our OJS themes and are indexed in reputable indexes.

a. Forest and Society (indexed in Scopus Q1, DOAJ)

b. Tropical Animal Science Journal (indexed in Scopus Q2, DOAJ, Sinta 1)

c. TEFLIN Journal (indexed in Scopus Q2, DOAJ)

Our OJS themes are not only about the look of the site. Our team has built them to output meta tags correctly. Meta tags are essential to Google Scholar’s indexing process, and our themes produce them in a format that fits Google Scholar’s needs.

3. RePEc Indexing Plugin

Based on our experience, we built a plugin that simulates the article meta tags of EPrints, DSpace, and other platforms. We did this to overcome the deindexing of our clients’ journals. Our goal was to make the Google Scholar engine treat the publication as coming from another platform rather than from OJS.

In the end, we found that the plugin did not solve the problem. In another case, one journal was not indexed in GS even though its domain was old, its published articles were reliable, and its meta tags were complete, because the site suffered frequent outages. Since the journal covers economics, which fits RePEc, we set it up for RePEc indexing, and the journal was indexed well in GS after RePEc included it.

RePEc (Research Papers in Economics) is an indexing service for articles in economics, accounting, management, and related fields. When an OJS site is indexed in RePEc, the indexing results will usually also appear on Google Scholar, so your article titles can show up there thanks to RePEc.

One of our clients that was indexed in Google Scholar after setting up RePEc is JTAKEN Journals (Jurnal Tata Kelola dan Akuntabilitas Keuangan Negara), which GS had previously not indexed because of excessive downtime.

If the same problem occurs in your journal and your journal covers economics, management, accounting, or business, consider using the RePEc plugin to overcome Google Scholar deindexing. More details can be found at the following link: RePEc plugin by openjournaltheme

FAQ (Frequently Asked Questions)

My site is not available in Google Scholar even though all checklist items are fulfilled. What should I do?
Answer:
If your journal is new, runs on OJS, and fulfills the whole checklist, Google Scholar may need some time (between 1 and 3 months) to index it. The same applies to a new domain.
My site is not new, and all of its articles used to be indexed in Google Scholar. What happened?
Answer:
If you were deindexed from Google Scholar (GS), review the General Requirement section of the checklist. In our experience, the issues listed there are what may lead GS to deindex your journal.
Suppose the issues in the General Requirement section occurred on my site. What should we do?
Answer:
Let us describe one GS deindexing case from our experience. The journal site was indexed in Scopus but had issues with GS. After a thorough check, we found that the cause was a migration that had not been done professionally and had left duplicated content.
These were our past attempts to fix it:
a. We changed the domain of one journal with this issue. It was indexed by GS for a while, but after some time GS removed all of the articles again. We believe GS keeps some information about the journal’s article titles.
b. In a second case, we changed the publisher name and ISSN, but that did not fix the problem either.
c. We even created a plugin that simulates EPrints so that GS would treat the articles as published on the EPrints platform. This also failed.

After some time, we found that articles can be indexed in GS again once the journal is included in a reputable index, or once the articles are published with the same metadata in a repository system such as EPrints or DSpace.

The issues covered in the General Requirement section are the reason we created our hosting platform: we can strengthen the security measures in our infrastructure and fix issues on our clients’ journals as soon as our automatic system detects them.
Is there any quick fix?
Answer:
Currently there is no quick fix for this issue. However, as mentioned above, in some cases a journal returned to GS after it was indexed elsewhere, for example in RePEc. So don’t give up: keep running the journal and get it included in more indexes. As more indexes list your journal, your site may regain GS’s trust.
Is there any way to contact a GS representative about this kind of issue?
Answer:
A contact email exists, but we contacted the person several times in the past and never received a reply. For privacy reasons we cannot disclose that personal address here, as there is no official account for a GS representative.

Conclusion

Google Scholar is one of the indexing sites most widely used by academics around the world. It makes it easy to find millions of articles in whatever field we are searching. Indexing in Google Scholar happens automatically: a robot or crawler reaches your journal data so that it can be added to Google’s database.

On the other hand, many publishers currently find that their OJS sites fail to be indexed in Google Scholar. If problems in an OJS-based journal are not handled early, for example poorly managed server issues, unsupported plugins that produce 500 or 300 error responses, or unsupported themes, the journal risks not being indexed in GS or being blacklisted from it.

We have explained these issues in the sections above. We hope this explanation helps you understand what needs to be considered and prepared to maximize the Google Scholar indexing process on your OJS site.

About the Author

Irsyad Baihaqi

Hello I am Irsyad, OJS Support from openjournaltheme. I like to share experiences, tips and tricks, and more about OJS, OMP, and EPrints.
