How to Search Google Images by the Exact Size

What is a robots.txt file

If you don’t want crawlers to access sections of your site, you can create a robots.txt file with appropriate rules. A robots.txt file is a simple text file containing rules about which crawlers may access which parts of a site. For example, a site's robots.txt file may look like this:
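As a minimal sketch of such a file, the Python snippet below feeds a hypothetical set of rules to the standard library's `urllib.robotparser` and checks which URLs a crawler may fetch. The paths, the `example.com` domain, and the crawler name `examplebot` are assumptions for illustration and are not taken from any real site. This parser's matching is not guaranteed to be identical to Google's.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: every crawler is kept out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "examplebot" is a made-up crawler name used only for this check.
blocked = parser.can_fetch("examplebot", "https://example.com/private/report.html")
allowed = parser.can_fetch("examplebot", "https://example.com/index.html")

print("private page allowed:", blocked)
print("home page allowed:", allowed)
```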


How to compress a PDF file in Google Drive

Data compression means reducing the size of a file. In data transmission, the process is called source coding, because the encoding is done at the source before the data is sent or stored.

Steps to follow

Start by choosing the file you want to compress. It can come from cloud storage such as Dropbox or Google Drive, or from your computer.

  • Automatic size reduction. The file is reduced in size automatically once it is uploaded to Google Drive, at a compressed quality suitable for the web.
  • You can also shrink the file further to web or email quality.
  • When the file is ready, download the compressed PDF and view it in your computer’s web browser.

Facts About File Compression

Lossless PDF compression reduces the number of bits in the original file by encoding the same information more compactly. No information is lost, because the compression targets and eliminates only statistical redundancy.
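The idea of removing statistical redundancy without losing information can be sketched in a few lines of Python. The example uses the standard `zlib` module (DEFLATE) purely as a stand-in for illustration; it does not show how Google Drive or any particular PDF tool compresses files, and the sample data is made up.

```python
import zlib

# Highly repetitive (redundant) sample data, invented for this sketch.
original = b"The same sentence repeated many times. " * 500

compressed = zlib.compress(original, 9)   # 9 = strongest compression level
restored = zlib.decompress(compressed)

print("original bytes:  ", len(original))
print("compressed bytes:", len(compressed))
print("lossless:", restored == original)
```

Because the input is very repetitive, the compressed form is much smaller, and decompressing it gives back exactly the original bytes.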

Other PDF Compression Options

There are more advanced compression options than Google Drive. You can download and install desktop software, or compress PDF files with an online tool.

When compressing PDF files using Google Drive, you have the option to store them and preview them online or offline.

To view a PDF file offline, click the offline tab and download the file to preview it later.


Reasons to convert files to PDF

If you can, avoid sharing files in Word or PowerPoint format. Unless there is a reason for the file to stay in its original format, always convert it to PDF.

There are two primary reasons why a file should be stored and shared as a PDF file.

When you share information with the public, recipients may need to open it with different software. Not everyone uses Microsoft products such as Excel, PowerPoint, or Word, since they must be purchased as part of Microsoft Office.

Microsoft has also changed its file formats over time. Extensions such as pptx and docx belong to newer formats. While newer versions of the Office products can open an older file, an older version may not be able to open files saved in the newer formats.

Saving files as PDF ensures that anyone can open them, since you don’t need to buy Adobe Acrobat to open PDF files.


Order of precedence for user agents

Only one group is valid for a particular crawler. Google’s crawlers determine the correct group of rules by finding in the robots.txt file the group with the most specific user agent that matches the crawler’s user agent. Other groups are ignored. All non-matching text is ignored (for example, both googlebot/1.2 and googlebot* are equivalent to googlebot). The order of the groups within the robots.txt file is irrelevant.

If more than one group is declared for a specific user agent, all the rules from those groups are combined internally into a single group. User-agent-specific groups and global groups (*) are not combined.

Matching of user-agent fields

This is how the crawlers would choose the relevant group:

Group followed per crawler:

  • Googlebot News: googlebot-news follows group 1, because group 1 is the most specific group.
  • Googlebot (web): googlebot follows group 3.
  • Googlebot Images: googlebot-images follows group 2, because there is no specific googlebot-images group.
  • Googlebot News (when crawling images): When crawling images, googlebot-news follows group 1. googlebot-news doesn’t crawl the images for Google Images, so it only follows group 1.
  • Otherbot (web): Other Google crawlers follow group 2.
  • Otherbot (news): Other Google crawlers that crawl news content, but don’t identify as googlebot-news, follow group 2. Even if there is an entry for a related crawler, it is only valid if it’s specifically matching.
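The selection described above can be sketched in Python. The group layout is an assumption inferred from the descriptions: group 1 is declared for googlebot-news, group 2 for the wildcard (*), and group 3 for googlebot. The helper names `product_token` and `select_group` are hypothetical, and the exact-match logic is a simplification that reproduces these cases rather than Google's actual matcher.

```python
import re

# Assumed layout, inferred from the descriptions above:
# group 1 = googlebot-news, group 2 = * (everyone else), group 3 = googlebot.
GROUPS = {"googlebot-news": 1, "*": 2, "googlebot": 3}

def product_token(value: str) -> str:
    """Keep the leading run of letters, '_' and '-', lowercased.

    Non-matching text is dropped, so 'googlebot/1.2' and 'googlebot*'
    both become 'googlebot'. A bare '*' is kept as the wildcard.
    """
    value = value.strip()
    if value == "*":
        return "*"
    match = re.match(r"[A-Za-z_-]+", value)
    return match.group(0).lower() if match else ""

def select_group(crawler: str) -> int:
    """Return the specifically matching group, else the wildcard group."""
    return GROUPS.get(product_token(crawler), GROUPS["*"])

for name in ("googlebot-news", "googlebot", "googlebot-images", "otherbot"):
    print(name, "-> group", select_group(name))
```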

Grouping of rules

If there are multiple groups in a robots.txt file that are relevant to a specific user agent, Google’s crawlers internally merge the groups. For example:

The crawlers internally group the rules based on user agent, for example:
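A minimal sketch of this merging in Python, using hypothetical paths and a hypothetical `merge_groups` helper (neither comes from the original examples): groups declared for the same user agent are combined, while the wildcard group stays separate.

```python
from collections import defaultdict

def merge_groups(groups):
    """Combine the rules of all groups that name the same user agent."""
    merged = defaultdict(list)
    for user_agent, rules in groups:
        merged[user_agent.lower()].extend(rules)
    return dict(merged)

# Hypothetical robots.txt groups, in file order: (user agent, rules).
declared = [
    ("googlebot-news", ["disallow: /drafts"]),
    ("*", ["disallow: /tmp"]),
    ("googlebot-news", ["disallow: /archive"]),
]

merged = merge_groups(declared)
for agent, rules in merged.items():
    print(agent, rules)
```

Here the two googlebot-news groups end up as one group with both disallow rules, and the rules of the wildcard group are not mixed into it.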

Rules other than allow, disallow, and user-agent are ignored by the robots.txt parser. This means that the following robots.txt snippet is treated as one group, and thus both user-agent a and b are affected by the disallow: / rule:

When the crawlers process the robots.txt rules, they ignore the sitemap line. For example, this is how the crawlers would understand the previous robots.txt snippet:
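The effect of ignoring such lines can be sketched with a small parser. The snippet and its sitemap URL are hypothetical, and `parse_groups` is an illustrative helper written for this example, not Google's parser. Only user-agent, allow, and disallow lines are considered, so the sitemap line between the two user-agent lines does not start a new group.

```python
def parse_groups(text):
    """Return a list of (user_agents, rules) groups.

    Only user-agent, allow and disallow lines are considered; any other
    line (for example sitemap) is skipped and does not end a group.
    """
    groups = []
    agents, rules = [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if rules:                          # a rule closed the previous group
                groups.append((agents, rules))
                agents, rules = [], []
            agents.append(value.lower())
        elif field in ("allow", "disallow") and agents:
            rules.append((field, value))
    if agents:
        groups.append((agents, rules))
    return groups

# Hypothetical snippet: a sitemap line sits between two user-agent lines.
SNIPPET = """\
user-agent: a
sitemap: https://example.com/sitemap.xml

user-agent: b
disallow: /
"""

parsed = parse_groups(SNIPPET)
print(parsed)
```

Under these assumptions, user agents a and b land in the same group and share the disallow: / rule.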
