WordPress Robots.txt optimization (+ XML Sitemap) – Website traffic, SEO & ranking Booster

Description

Better Robots.txt creates a WordPress virtual robots.txt and helps you boost your website’s SEO (indexing capacities, Google ranking, etc.) and your loading performance – Compatible with Yoast SEO, Google Merchant, WooCommerce and directory-based network sites (MULTISITE)

With Better Robots.txt, you can identify which search engines are allowed to crawl your website (or not), specify clear instructions about what they are allowed to do (or not) and define a crawl-delay (to protect your hosting server against aggressive scrapers). Better Robots.txt also gives you full control over your WordPress robots.txt content via the custom setting box.
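For instance, a crawl-delay comes down to a directive like the following (a minimal sketch; the 10-second value is purely illustrative, and note that Google ignores Crawl-delay):

User-agent: *
Crawl-delay: 10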

Reduce your site’s ecological footprint and the greenhouse gas (CO2) production inherent to its existence on the Web.

Quick overview:

SUPPORTED IN 7 LANGUAGES

The Better Robots.txt plugin is translated and available in: Chinese – 汉语/漢語, English, French – Français, Russian – Русский, Portuguese – Português, Spanish – Español, German – Deutsch

Did you know…

  • The robots.txt file is a simple text file placed on your web server which tells web crawlers (like Googlebot) whether they should access a file.
  • The robots.txt file controls how search engine spiders see and interact with your web pages;
  • This file, and the bots it interacts with, are fundamental parts of how search engines work;
  • The first thing a search engine crawler looks at when it visits a page is the robots.txt file;

Robots.txt is a source of SEO juice waiting to be unlocked. Try Better Robots.txt!

About the Pro version (additional features):

1. Boost your content on search engines with your sitemap!

Make sure your pages, articles, and products, even the latest ones, are taken into consideration by search engines!

The Better Robots.txt plugin was made to work with the Yoast SEO plugin (probably the best SEO Plugin for WordPress websites). It will detect if you are currently using Yoast SEO and if the sitemap feature is activated. If it is, then it will add instructions automatically into the Robots.txt file asking bots/crawlers to read your sitemap and check if you have made recent changes in your website (so that search engines can crawl the new content that is available).

If you want to add your own sitemap (or if you are using another SEO plugin), then you just have to copy and paste your Sitemap URL, and Better Robots.txt will add it into your WordPress Robots.txt.
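For illustration, the resulting directive in your robots.txt looks like this (the URL is a hypothetical placeholder; Yoast typically publishes its sitemap index at /sitemap_index.xml):

Sitemap: https://www.yourdomain.com/sitemap_index.xml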

2. Protect your data and content

Block bad bots from scraping your website and commercializing your data.

The Better Robots.txt plugin helps you block most popular bad bots from crawling and scraping your data.

When it comes to things crawling your site, there are good bots and bad bots. Good bots, like Googlebot, crawl your site to index it for search engines. Others crawl your site for more nefarious reasons, such as stripping out your content (text, prices, etc.) for republishing, downloading whole archives of your site or extracting your images. Some bots have even been reported to take down entire websites through heavy bandwidth use.

The Better Robots.txt plugin protects your website against spiders/scrapers identified as bad bots by Distil Networks.
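As an illustration, blocking a scraper in robots.txt comes down to a per-user-agent rule like the following (a sketch only; the bot name is a hypothetical placeholder, as the plugin ships its own curated list):

User-agent: SomeScraperBot
Disallow: /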

3. Hide and protect your backlinks

Stop competitors from identifying your profitable backlinks.

Backlinks, also called “inbound links” or “incoming links,” are created when one website links to another. The link to an external website is called a backlink. Backlinks are especially valuable for SEO because they represent a “vote of confidence” from one site to another. In essence, backlinks to your website are a signal to search engines that others vouch for your content.

If many sites link to the same webpage or website, search engines can infer that the content is worth linking to, and therefore also worth showing on a SERP. So, earning these backlinks generates a positive effect on a site’s ranking position or search visibility. In the SEM industry, it is very common for specialists to identify where these backlinks come from (competitors) in order to sort out the best of them and generate high-quality backlinks for their own customers.

Considering how much it takes a company to build highly profitable backlinks (time + energy + budget), letting your competitors identify and duplicate them so easily is a pure loss of efficiency.

Better Robots.txt helps you block all SEO crawlers (Ahrefs, Majestic, Semrush) so that your backlinks remain undetectable.
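As a sketch, the rules involved look like the following (AhrefsBot, MJ12bot and SemrushBot being, to our knowledge, the user agents these services identify themselves with):

User-agent: AhrefsBot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: SemrushBot
Disallow: /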

4. Avoid spam backlinks

Bots populating your website’s comment forms, telling you ‘great article,’ ‘love the info,’ ‘hope you can elaborate more on the topic soon’ or even leaving personalized comments including an author name, are legion. Spambots get more and more intelligent with time, and unfortunately, comment spam links can really hurt your backlink profile. Better Robots.txt helps you prevent these comments from being indexed by search engines.

5. SEO tools

While improving our plugin, we added shortcut links to 2 very important tools (if you are concerned with your ranking on search engines): Google Search Console & Bing Webmaster Tools. In case you are not already using them, you may now manage your website’s indexing while optimizing your robots.txt! Direct access to a mass ping tool was also added, allowing you to ping your links on more than 70 search engines.

We also created 4 shortcut links to some of the best online SEO tools, directly available in Better Robots.txt SEO PRO. So, whenever you want, you can check your site’s loading performance, analyze your SEO score, identify your current ranking on SERPs with keywords & traffic, and even scan your entire website for dead links (404, 503 errors, …), directly from the plugin.

6. Be unique

We thought we could add a touch of originality to Better Robots.txt with a feature allowing you to “customize” your WordPress robots.txt with your own unique “signature.” Most major companies in the world have personalized their robots.txt by adding proverbs (https://www.yelp.com/robots.txt), slogans (https://www.youtube.com/robots.txt) or even drawings (https://store.nike.com/robots.txt – at the bottom). And why not you too? That’s why we have dedicated a specific area on the settings page where you can write or draw whatever you want (really) without affecting your robots.txt’s efficiency.

7. Prevent robots from crawling useless WooCommerce links

We added a unique feature that lets you block specific links (“add-to-cart”, “orderby”, “filter”, cart, account, checkout, …) from being crawled by search engines. Most of these links consume a lot of CPU, memory & bandwidth on the hosting server because they are not cacheable and/or create “infinite” crawling loops, while being useless anyway. When you run an online store, optimizing your WordPress robots.txt for WooCommerce frees up processing power for the pages that really matter and boosts your loading performance.
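As a sketch, the rules this feature generates look like the following (exact paths depend on your WooCommerce and permalink settings; these are the default English slugs):

User-agent: *
Disallow: /cart/
Disallow: /checkout/
Disallow: /my-account/
Disallow: /*add-to-cart=
Disallow: /*?orderby=
Disallow: /*?filter_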

8. Avoid crawler traps

“Crawler traps” are a structural issue within a website that causes crawlers to find a virtually infinite number of irrelevant URLs. In theory, crawlers could get stuck in one part of a website and never finish crawling these irrelevant URLs. Better Robots.txt helps prevent crawler traps, which hurt your crawl budget and cause duplicate content.
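Internal search is a typical trap: every query generates a new URL. A minimal sketch of the kind of rule that closes it, assuming WordPress’s default ?s= search parameter:

User-agent: *
Disallow: /*?s=
Disallow: /search/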

9. Growth hacking tools

Today’s fastest-growing companies, like Amazon, Airbnb and Facebook, have all driven breakout growth by aligning their teams around a high-velocity testing/learning process. We are talking about growth hacking. Growth hacking is a process of rapidly experimenting with and implementing marketing and promotional strategies that are solely focused on efficient and rapid business growth. Better Robots.txt provides a list of 150+ tools available online to skyrocket your growth.

10. Robots.txt Post Meta Box for manual exclusions

This Post Meta Box lets you set “manually” whether a page should be visible (or not) on search engines, by injecting a dedicated “disallow” + “noindex” rule inside your WordPress robots.txt. Why is it an asset for your ranking on search engines? Simply because some pages are not meant to be crawled / indexed. Thank-you pages, landing pages and pages containing only forms are useful for visitors but not for crawlers, and you don’t need them to be visible on search engines. Also, some pages containing dynamic calendars (for online booking) should NEVER be accessible to crawlers because they tend to trap them into infinite crawling loops, which directly impacts your crawl budget (and your ranking).
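For illustration, the injected exclusion looks like the following (the /thank-you/ slug is a hypothetical placeholder):

# Page excluded via the Post Meta Box
User-agent: *
Disallow: /thank-you/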

11. Ads.txt & App-ads.txt crawlability

In order to ensure that ads.txt & app-ads.txt can be crawled by search engines, the Better Robots.txt plugin makes sure they are allowed by default in the robots.txt file, no matter your configuration. For your information, Authorized Digital Sellers for Web, or ads.txt, is an IAB initiative to improve transparency in programmatic advertising. You can create your own ads.txt files to identify who is authorized to sell your inventory. The files are publicly available and crawlable by exchanges, Supply-Side Platforms (SSP), and other buyers and third-party vendors. Authorized Sellers for Apps, or app-ads.txt, is an extension of the Authorized Digital Sellers standard. It expands compatibility to support ads shown in mobile apps.
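A minimal sketch of the directives that keep these files reachable whatever your other rules (for most major crawlers, a more specific Allow takes precedence over a matching Disallow):

User-agent: *
Allow: /ads.txt
Allow: /app-ads.txt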

More to come as always …

Screenshots

  • Better Robots.txt Settings Page
  • Better Robots.txt Settings Page
  • Better Robots.txt Settings Page
  • Better Robots.txt Settings Page
  • Robots.txt file output

Installation

MANUAL INSTALLATION

  1. Unzip all files into the /wp-content/plugins/better-robots-txt directory
  2. Log in to the WordPress admin panel and activate the “Better Robots.txt” plugin through the “Plugins” menu
  3. Go to “Settings > Better Robots.txt” in the left-hand menu to start working on your robots.txt file.

Frequently asked questions

The Better Robots.txt plugin is enabled, but why can’t I see any changes in the robots.txt file?

Better Robots.txt creates a WordPress virtual robots.txt file. Please make sure that your permalinks are enabled under Settings > Permalinks. If permalinks are working, then make sure that there is no physical robots.txt file on your server. Since the plugin can’t write over a physical file, you must connect via FTP and rename or delete robots.txt from your domain root directory. It is usually in the /public_html/ folder on cPanel hostings. If you can’t find your domain root directory, please ask your hosting provider for help. If the issue persists after taking these measures, please post it in the support section or send a message to support@better-robots.com

Will there be any conflict with a robots.txt which I’m already using?

If you have a physical robots.txt on your web hosting server, then this plugin will not work. As mentioned, it creates a WordPress virtual robots.txt file. Please follow the steps in the answer above if you want to use this plugin’s robots.txt file.

How do I add a sitemap to my WordPress robots.txt?

This feature is available in the Better Robots.txt Pro version, which automatically adds your sitemap to the robots.txt file. It detects the sitemap from the Yoast SEO plugin. If you are using another sitemap plugin or a manually created sitemap, you can simply add your sitemap URL in the sitemap input field. If Yoast XML sitemaps are enabled as well, you first need to disable them by going to Yoast General Settings > Features and turning off the XML sitemaps feature.

Why should I optimize my robots.txt?

And why not? Considering that robots.txt is the very first file read when your website is loaded by a browser, why not let crawlers index your content continuously? Simply adding your sitemap to robots.txt is plain common sense. Why? Have you (or your webmaster) listed your website in Google Search Console? How do you tell crawlers that you have new content to index on your website? If you want this content to be found on search engines (Google, Bing, …), it needs to be indexed. That is exactly what this instruction (adding a sitemap) is for. One last thing: the main reason this plugin exists is that 95% of the time (based on thousands of SEO analyses), the robots.txt file is either missing, empty or incorrect, simply because it is either misunderstood or forgotten. Imagine what it can do once activated and fully functional.

How can this plugin improve my website’s ranking?

In practice, this plugin improves your website’s indexing performance, which leads to a better Google ranking. How? Well, the idea for this plugin came after performing hundreds of SEO optimizations on professional and corporate websites. As mentioned earlier, 95% of the analyzed websites did not have what we could call an “optimized” robots.txt file, and while optimizing these websites, we realized that simply changing the content of this file was effectively “unlocking” them (based on daily SEMrush analyses). Since we usually work in 2 stages (time periods), this simple modification alone already had a significant impact on Google rankings, even before we started making deeper changes to the content, the site structure or the META data. The more you help search engines understand your website, the better it ranks.

How to test and validate your robots.txt?

While you can view the contents of your robots.txt by navigating to the robots.txt URL, the best way to test and validate it is through the robots.txt Tester option of Google Search Console.

Log in to your Google Search Console account. Click on robots.txt Tester, found under Crawl options. Click the Test button.

If everything is ok, the Test button will turn green and the label will change to ALLOWED. If there is a problem, the line that causes a disallow will be highlighted.

What is a virtual robots.txt file?

By default, WordPress uses a virtual robots.txt file. This means that you cannot directly edit the file or find it in the root of your directory.

The only way to view the contents of the file is to type https://www.yourdomain.com/robots.txt in your browser.

The default values of WordPress robots.txt are:

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

When you enable the “Discourage search engines from indexing this site” option under Search Engine Visibility Settings, the robots.txt becomes:

User-agent: *
Disallow: /

This basically blocks all crawlers from accessing the website.

Why Is Robots.txt Important?

There are 3 main reasons that you’d want to use a robots.txt file.

  • Block Non-Public Pages: Sometimes you have pages on your site that you don’t want indexed. For example, you might have a staging version of a page, or a login page. These pages need to exist, but you don’t want random people landing on them. This is a case where you’d use robots.txt to block these pages from search engine crawlers and bots (see the sketch after this list).
  • Maximize Crawl Budget: If you’re having a tough time getting all of your pages indexed, you might have a crawl budget problem. By blocking unimportant pages with robots.txt, Googlebot can spend more of your crawl budget on the pages that actually matter.
  • Prevent Indexing of Resources: Using meta directives can work just as well as Robots.txt for preventing pages from getting indexed. However, meta directives don’t work well for multimedia resources, like PDFs and images. That’s where robots.txt comes into play.
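A minimal sketch combining these three cases (all paths are hypothetical placeholders; the * wildcard and $ anchor are supported by Google and Bing):

User-agent: *
Disallow: /staging/
Disallow: /wp-login.php
Disallow: /*.pdf$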

You can check how many pages you have indexed in the Google Search Console.

If the number matches the number of pages that you want indexed, you don’t need to bother with a Robots.txt file.

But if that number is higher than you expected (and you notice indexed URLs that shouldn’t be indexed), then it’s time to create a robots.txt file for your website.

Robots.txt vs. Meta Directives

Why would you use robots.txt when you can block pages at the page-level with the “noindex” meta tag?

As mentioned before, the noindex tag is tricky to implement on multimedia resources, like videos and PDFs.

Also, if you have thousands of pages that you want to block, it’s sometimes easier to block the entire section of that site with robots.txt instead of manually adding a noindex tag to every single page.
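For example, a single rule can cover a whole section at once (the /archive/ path is a hypothetical placeholder):

User-agent: *
Disallow: /archive/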

There are also edge cases where you don’t want to waste any crawl budget on Google landing on pages with the noindex tag.

Important things about robots.txt

  • Robots.txt must be in the main folder, i.e., domain.com/robots.txt.
  • Each subdomain needs its own robots.txt (sub1.domain.com, sub2.domain.com, … ) while multisites require only ONE robots.txt (domain.com/multi1, domain.com/multi2, …).
  • Some crawlers can ignore robots.txt.
  • URLs and the robots.txt file are case-sensitive.
  • Crawl-delay is not honored by Google (as it has its own crawl-budget), but you can manage crawl settings in Google Search Console.
  • Validate your robots.txt file in Google Search Console and Bing Webmaster Tools.
  • Don’t rely on robots.txt blocking to deal with duplicate content. Don’t disallow pages which are redirected; crawlers won’t be able to follow the redirect.
  • The max size for a robots.txt file is 500 KB.

PS: Pagup recommends the Site Kit by Google plugin for insights & SEO performance.

Reviews

28.11.2021, 2 replies
This plugin is pretty useless unless you buy the premium version. For instance, you cannot even set the “Search Engine Visibility” option. I recommend that this plugin be withdrawn from the repository.
See all 98 reviews

Участники и разработчики

“WordPress Robots.txt optimization (+ XML Sitemap) – Website traffic, SEO & ranking Booster” is an open source project. The following people have contributed to this plugin:

Contributors

“WordPress Robots.txt optimization (+ XML Sitemap) – Website traffic, SEO & ranking Booster” has been translated into 3 languages. Thank you to the translators for their contributions.

Translate “WordPress Robots.txt optimization (+ XML Sitemap) – Website traffic, SEO & ranking Booster” into your language.

Interested in development?

Browse the code, check out the SVN repository, or subscribe to the development log by RSS.

Changelog

1.0.0

  • Initial release.

1.0.1

  • fixed plugin directory url issue
  • some text improvements

1.0.2

  • fixed some minor issues with styling
  • improved text and translation

1.1.0

  • added some major improvements
  • allow/off option changed with allow/disallow/off
  • improved overall text and french translation

1.1.1

  • fixed a bug and improved code

1.1.2

  • added new feature «Spam Backlink Blocker»

1.1.3

  • fixed a bug

1.1.4

  • added new «personalize your robots.txt» feature to add custom signature
  • added recommended seo tools to improve search engine optimization

1.1.5

  • added feature to detect physical robots.txt file and delete it if server permissions allow

1.1.6

  • added russian and chinese (simplified) languages
  • fixed bug causing redirection to better robots.txt settings page upon activating other plugins

1.1.7

  • added new feature: Top plugins for SEO performance
  • fixed plugin notices issue to dismiss for a defined period of time after being closed
  • fixed stylesheet issue to get proper updated file after plugin update (cache buster)
  • added spanish and portuguese languages

1.1.8

  • added new feature: xml sitemap detection
  • fixed translations

1.1.9

  • added new feature: loading performance for woocommerce

1.1.9.1

  • fixed a bug in disallow rules for woocommerce

1.1.9.2

  • boost your site with alt tags

1.1.9.3

  • fixed readability issues

1.1.9.4

  • fixed default robots.txt file issue upon plugin activation for first time
  • fixed php error upon saving settings and permalinks
  • refactored code

1.1.9.5

  • added clean-param for yandex bot
  • ask backlinks feature for pro users
  • avoid crawler traps feature for pro users
  • improved default robots.txt rules

1.1.9.6

  • added 150+ growth hacking tools
  • fixed layout bug
  • updated default rules

1.2.0

  • Added Post Meta Box to disable individual posts, pages and products (WooCommerce, pro only). It will add Disallow and Noindex rules in robots.txt for any page you choose to disallow from the post meta box options.

1.2.1

  • Added multisite feature for directory-based network sites (pro only). It can duplicate all default rules, Yoast sitemap, WooCommerce rules, bad bots, Pinterest bot blocker, backlinks blocker etc. with a single click for all directory-based network sites.
  • Added version timestamp for wp_register_script ‘assets/rt-script.js’

1.2.2

  • Fixed some bugs causing errors in Google Search Console
  • Text improvement

1.2.3

  • Added «Hide your robots.txt from SERPs» feature
  • Text improvements

1.2.4

  • Fixed a bug
  • Text improvements

1.2.5

  • Fixed crawl-delay issue
  • Updated translations

1.2.5.1

  • Fixed a minor issue

1.2.6

  • Security patched in freemius sdk

1.2.6.1

  • Fixed Multisite Issue for pro users

1.2.6.2

  • Fixed Yoast sitemap issue for Multisite users

1.2.6.3

  • Fixed some text

1.2.7

  • Added Baidu/Sogou/Soso/Youdao — Chinese search engines features for pro users
  • Added social media crawl feature for pro users

1.2.8

  • Notifications will be disabled for 4 months. Fixed some other minor stuff

1.2.9.2

  • Updated Freemius SDK v2.3.0
  • BIGTA recommendation

1.2.9.3

  • Fixed Undefined index error while saving MENUS for some sites
  • Removed «noindex» rule for individual posts as Google will stop supporting it from Sep 01 2019

1.3.0

  • Added 5 new rules to default config. Removed 4 old default rules which were causing some issues with WPML
  • Added a search rule to Avoid crawling traps
  • Added several new rules to Spam Backlink Blocker
  • Fixed security issues

1.3.0.1

  • VidSEO recommendation

1.3.0.2

  • Fixed some security issues
  • Added new rules to Backlink Protector (Pro only)
  • Multisite notification will be disabled permanently once dismissed

1.3.0.3

  • Fixed php notice (in php log) for $host_url variable

1.3.0.4

  • Fixed php notice (in php log) for $active_tab variable
  • Fixed some typos

1.3.0.5

  • Added option to Be part of our worldwide Movement against CoronaVirus (Covid-19)
  • Fixed several php undefined index notices (in php log) related to Step 7 and 8 options

1.3.0.6

  • 👌 IMPROVE: Updated freemius to latest version 2.3.2
  • 🐛 FIX: Some minor issues

1.3.0.7

  • 🔥 NEW: WP Google Street View promotion
  • 🐛 FIX: Some minor text issues

1.3.1.0

  • 👌 IMPROVE: Admin Notices are set to be permanently dismissed on a per-user basis.
  • 👌 IMPROVE: Top level menu for Better Robots.txt Settings
  • 🐛 FIX: Styling conflict with Norebro Theme.
  • 🐛 FIX: Undefined variables php errors for some options

1.3.2.0

  • 🐛 FIXED: XSS vulnerability.
  • 🐛 FIX: Non-static method errors
  • 👌 IMPROVE: Tested up to WordPress v5.5

1.3.2.1

  • 🐛 FIXED: Call to undefined method error.

1.3.2.2

  • 👌 IMPROVE: Update Freemius to v2.4.1

1.3.2.3

  • 👌 IMPROVE: Tested up to WordPress v5.6
  • 🐛 FIX: Get Pro URL

1.3.2.4

  • 👌 IMPROVE: Added some more rules for Woocommerce performance
  • 👌 IMPROVE: Update Freemius to v2.4.2

1.3.2.5

  • 🔥 NEW: Meta Tags for SEO promotion

1.4.0

  • 👌 IMPROVE: Refactored code to MVC
  • 👌 IMPROVE: New clean design
  • 👌 IMPROVE: Many small improvements

1.4.0.1

  • 🐛 FIX: Added trailing backslash for using trait

1.4.1

  • 🔥 NEW: Search engine visibility feature (Pro version)
  • 🔥 NEW: Image Crawlability feature (Pro version)

1.4.1.1

  • 🐛 FIX: Sitemap issue

1.4.2

  • 🐛 FIX: Bugs and improvements
  • 🔥 NEW: Option to add default WordPress Sitemap (Pro Version)
  • 🔥 NEW: Option to add All in One SEO Sitemap (Pro Version)

1.4.3

  • 🐛 FIX: Text issues

1.4.4

  • 🐛 FIX: Security fix

1.4.5

  • 🐛 FIX: PHP warning undefined index

1.4.6

  • 🐛 FIX: SECURITY PATCH. Verify nonce for CSRF attack.
  • 🐛 FIX: PHP 8.2 warning undefined index