Introduction to the Deep Web

Before we get into the underworld of the deep web, let’s quickly go over the basics of the World Wide Web (WWW). The WWW is an information space where documents and other web resources are identified by URLs, interlinked by hypertext links, and accessed via the Internet. It was invented by English scientist Tim Berners-Lee in 1989, and he wrote the first web browser in 1990. Today it is commonly known simply as the Web.

The World Wide Web was central to the development of the Information Age and is the primary tool billions of people use to interact on the Internet.

Web pages are primarily text documents formatted and annotated with Hypertext Markup Language (HTML). In addition to formatted text, web pages may contain images, video and software components that are rendered in the user’s web browser as coherent pages of multimedia content.

Embedded hyperlinks permit users to navigate between web pages. Multiple web pages with a common theme, a common domain name, or both, may be called a website. Website content can largely be provided by the publisher or can be interactive where users contribute content. Websites may be mostly informative, primarily for entertainment, or largely for commercial purposes.

Two sides to everything?

Like most things, the web has two sides, and which one you see depends on the use that is made of it. There is the familiar, indexed part, the Surface Web, and the hidden part, the Deep Web. The Deep Web is often cast as the "bad" side, although most of it is ordinary content that search engines simply do not index.

The Surface Web

The surface web is the part of the web that is readily available to the general public and searchable through standard search engines such as Google, Yahoo and Bing. It is often described as being made up of static pages: HTML files that reside on a server waiting to be retrieved and do not depend on a database for their content. In practice many indexed pages are generated dynamically; what matters is that a search engine's crawler can reach them by following links.

For a static page, any change is made directly to the HTML and the new version of the page is uploaded to the server. References to the Surface Web here therefore mean common websites, that is, sites whose domains end in .com, .org, .net, or similar, and whose content does not require any special configuration to access.
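The difference between a static page and a database-backed one can be sketched in a few lines of Python. This is only an illustration: the table name, column names and rows are made up, and a real site would sit behind a web server rather than plain function calls.

```python
import sqlite3

# Static page: a fixed HTML string, standing in for a file on the server.
STATIC_PAGE = "<html><body><h1>About us</h1></body></html>"


def serve_static() -> str:
    """Return the same content on every request; editing means re-uploading."""
    return STATIC_PAGE


def serve_dynamic(db: sqlite3.Connection, author: str) -> str:
    """Build the page at request time from whatever the database holds."""
    rows = db.execute(
        "SELECT title FROM articles WHERE author = ? ORDER BY title", (author,)
    ).fetchall()
    # No HTML escaping here: acceptable only because the data is made up.
    items = "".join(f"<li>{title}</li>" for (title,) in rows)
    return f"<html><body><ul>{items}</ul></body></html>"


# Hypothetical in-memory database with invented rows.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE articles (title TEXT, author TEXT)")
db.executemany(
    "INSERT INTO articles VALUES (?, ?)",
    [("Icebergs", "ana"), ("Crawlers", "ana"), ("Forms", "ben")],
)

print(serve_static())
print(serve_dynamic(db, "ana"))
```

The static page is identical for every visitor, while the dynamic page depends on the `author` value supplied with the request.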

The usual way to picture the relative sizes of the surface web and the deep web is an iceberg. The part visible above the water is the surface web, whereas the much larger hidden portion is the deep web, which normal search engines do not reach.

So what is the Deep Web?

The deep web is the part of the web that is hidden from normal search engines. Specifically, the Deep Web refers to content hidden behind HTML forms: to get to such content, a user has to submit a form with valid input values, for example a search query or a login. The name arises from the fact that such content was thought to be beyond the reach of search engines, whose crawlers follow links rather than fill in forms. The Deep Web is also believed to be the biggest source of structured data on the Web, so accessing its contents has been a long-standing challenge in the data management community.
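To make "content hidden behind a form" concrete, here is a minimal Python sketch. It only builds the request that a form submission would send and never touches the network. The address `example.org/search` and the field names `title` and `year` are hypothetical, invented for illustration.

```python
from urllib.parse import urlencode

# Hypothetical search form: the action URL and field names are assumptions
# made up for this example, not a real site's interface.
FORM_ACTION = "https://example.org/search"
ALLOWED_FIELDS = {"title", "year"}


def build_form_query(values: dict[str, str]) -> str:
    """Return the GET URL a browser would request when the form is submitted."""
    unknown = set(values) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"not a field of this form: {sorted(unknown)}")
    return f"{FORM_ACTION}?{urlencode(values)}"


url = build_form_query({"title": "deep web", "year": "2001"})
print(url)  # https://example.org/search?title=deep+web&year=2001
```

A crawler that only follows links never produces this URL on its own, because nothing links to it; the results page exists only once someone supplies valid input values.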

Size and Scale

An important aspect of understanding the Deep Web is its scale. Michael K. Bergman published an influential whitepaper in 2001 that is still widely cited as a source of Deep Web information. In the paper, Bergman offers a set of mind-boggling estimates, all dating from that time:

  • Public information on the Deep Web was estimated to be 400 to 550 times larger than the commonly defined World Wide Web.
  • The Deep Web contained 7,500 terabytes of information, compared to 19 terabytes in the surface Web.
  • The Deep Web contained nearly 550 billion individual documents, compared to one billion on the surface Web.
  • The Deep Web was the largest growing category of new information on the Internet.
  • More than half of Deep Web content resided in topic-specific databases.
  • A full ninety-five per cent of the Deep Web was publicly accessible information, not subject to fees or subscriptions.
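As a quick sanity check, the "400 to 550 times larger" claim can be compared against the terabyte and document counts from the same list. The sketch below uses only the figures quoted above; the variable names are mine.

```python
# Figures as quoted from Bergman's 2001 whitepaper in the list above.
deep_tb, surface_tb = 7_500, 19
deep_docs, surface_docs = 550_000_000_000, 1_000_000_000

size_ratio = deep_tb / surface_tb      # ratio by stored information
doc_ratio = deep_docs / surface_docs   # ratio by number of documents

print(f"By storage:   {size_ratio:.0f}x")  # By storage:   395x
print(f"By documents: {doc_ratio:.0f}x")   # By documents: 550x
```

The two ratios, roughly 395 and 550, line up with the quoted range of 400 to 550 times.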

So how do you access the Deep Web?

The Deep Web has a scary reputation, but accessing it is really simple. Much of it needs nothing more than a normal browser and a login or a form submission. For the anonymous part, often called the dark web, all you have to do is download the Tor Browser. Tor is free software, backed by a volunteer-run network, that lets you connect to web pages anonymously. This makes it extremely difficult for anyone to track your internet activity, provided you follow the right precautions. Many deep web communities can only be accessed through the Tor network, since they are founded on anonymity, privacy, and secrecy.

Web pages on the Tor network tend to be unreliable, often going down for hours, days, or permanently. They can also be slow to load, since Tor routes your connection through several volunteer-run relays to protect your anonymity.

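The Tor Project publishes an official Tor Browser for Android; on iOS it does not provide its own browser and points users to the third-party Onion Browser instead. Unofficial Tor apps are not recommended. Similarly, Tor add-ons for other browsers are not secure and are usually not supported by the Tor Project.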

What goes on in the Deep Web?

  • Buying and selling of drugs
  • Buying and selling of arms and ammunition
  • Offers of assassins for hire
  • Purchase of fake identification/passports
  • Sourcing of ‘inside’ information by journalists
  • Instructions on how to perfectly cook a woman
  • Trade of child pornography
  • Streaming of live torture
  • Experiments on humans
  • Access to government information

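These are just some of the things reported on the deep web. Not every listing is what it claims to be: offers such as assassins for hire and live torture streams are widely regarded as scams or hoaxes.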

 

 

 
