Jul 09 20:00:21, 2023
The dark side of the internet, part 2: How do onion sites work
When people think of the Dark Web, Tor is typically the first thing that comes to mind. However, the Dark Web does not stop there: Tor is only a part of it.
Before you start reading this article, we would recommend checking out the previous one, “What is the Dark Web and what is it used for” — there, you will find a detailed account of how the Tor network works.
The Dark Web, part 1: What is the Dark Web and what is it used for
The Dark Web, part 2: How do onion sites work ← you are here
The Dark Web, part 3: What is Freenet
The Dark Web, part 4: What is I2P and how it works
The Dark Web, part 5: How to access Darknet via Tor, I2P and Freenet
What are onion services
Onion services are anonymous network services that can only be accessed over Tor. Running such a service gives your users all the security of HTTPS with the added privacy benefits of Tor Browser.
The darknet differs from the regular internet in that darknet sites have names consisting of numbers and letters and ending in the top-level domain ‘.onion’ rather than ‘.com’ or similar.
The top-level domain is the last part of a website’s name. Sites ending in ‘.onion’ can only be opened in Tor, as their addresses can be resolved only inside the Tor network. Regular website addresses are resolved by DNS servers.
A website can have a name like https://duckduckgfgjdkgjtkjgjkjefi59u49u34fji3fjzad.onion. Thus, to access any darknet website, the user needs to know its address.
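To make the ‘.onion’ rule concrete, here is a minimal sketch of how a program could decide whether a URL has to go through Tor. It assumes a Tor client is running locally on its default SOCKS port (9050). The names `is_onion_url`, `proxy_for` and `TOR_SOCKS_PROXY` are our own illustrative choices and are not part of Tor.

```python
from typing import Optional
from urllib.parse import urlsplit

# Assumption: a local Tor client listening on its default SOCKS port.
# "socks5h" asks the proxy (Tor) to resolve the hostname, not the local DNS.
TOR_SOCKS_PROXY = "socks5h://127.0.0.1:9050"


def is_onion_url(url: str) -> bool:
    """Return True if the URL's hostname ends in the .onion top-level domain."""
    host = urlsplit(url).hostname or ""
    return host.lower().endswith(".onion")


def proxy_for(url: str) -> Optional[str]:
    """Onion hosts must be reached through Tor; other hosts may go direct."""
    return TOR_SOCKS_PROXY if is_onion_url(url) else None


if __name__ == "__main__":
    for url in ("https://mysite.com/", "http://example.onion/"):
        print(url, "->", proxy_for(url) or "direct connection")
```

The point of the sketch is that an onion name is never handed to a regular DNS server: the hostname is passed to Tor as is, and Tor resolves it inside its own network.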
How onion sites and services are different from those on the regular internet
The onion network enhances user privacy and security in several ways:
Tor is an overlay network that runs on top of the regular internet. The latter is based on the TCP/IP protocols, which expose IP addresses and so make it possible to find out user locations. Tor still uses TCP/IP for transport, but onion services are not located by IP address or DNS name, so neither side of a connection learns where the other is.
Every Tor user knows that he/she receives specific content from a specific node. A third-party node cannot interfere in a request and impersonate the data source. Let’s look at this in more detail using the example of a brand-name clothing store.
Let's say a person has decided to buy clothes in a PARA store. He/she comes to the store and sees in fact two PARA stores next to each other, one selling the original clothes and the other selling counterfeit items. The owner of the second store is a fraudster and cheater. In this case, it is very easy to visit the wrong store and buy some counterfeit clothes, as the two stores look identical and no one knows which one sells the original clothing.
In the case of onion sites, no cheating is possible: specific information is associated with a specific host, so no one else can impersonate a node by supplying the same information.
As for the regular internet, this type of fraud does exist and is known as a “DNS attack” (DNS spoofing). In a DNS attack, the attacker replaces a website’s real IP address with a fake one. A user visiting the webpage lands on the fake one and cannot tell the difference, as the URL stays the same.
For example, if the DNS record for mysite.com gets spoofed, every link to the site on the internet leads to the fake version for any user who relies on that record. The attacker can also copy the original website’s design so that the fake site looks the same and the user can’t spot the difference.
The fake site can contain malware or a phishing program with which the fraudster would attempt to steal user information. However, there are various methods of protection against DNS attacks, so it is quite difficult to spoof a website’s DNS record.
End-to-end encryption protects the traffic that goes from sender to receiver on the Tor network, so no one, including the intermediate nodes, can know what information is being sent over the encrypted channel.
This method of protection also exists on the regular internet: traffic is protected using TLS encryption (HTTPS). However, unlike in Tor, not all resources on the regular internet are automatically protected by this protocol.
Communication between the user and onion website
Onion services are TCP-based network services that are available over Tor only and provide mutual anonymity: the Tor client is anonymous to the server, and the server is anonymous to the client.
Clients access onion sites via onion domains that only work within the Tor network.
There are six Tor nodes between the client and the resource. The path is built from two sides: from the client to the rendezvous point, and from the onion service to the rendezvous point. Neither party can know the other’s IP address.
A Tor circuit consists of several nodes. The Guard is the entry node in a circuit of three nodes: it is the first hop the client connects to, and the rest of the circuit is built through it. The Middle is neither the guard nor the exit node; it acts as a relay between them.
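The idea that each node in a circuit can remove only its own layer of encryption can be sketched as follows. This is a toy illustration, not Tor's actual cryptography: the XOR keystream cipher, the key sizes and the function names (`keystream_xor`, `wrap`, `relay`) are all assumptions made for the example.

```python
import hashlib
import secrets


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with a SHAKE-256 keystream derived from key.

    Applying it twice with the same key restores the input. Illustration only.
    """
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def wrap(payload: bytes, hop_keys: list) -> bytes:
    """Client side: add one layer per hop, innermost layer for the last hop."""
    cell = payload
    for key in reversed(hop_keys):
        cell = keystream_xor(key, cell)
    return cell


def relay(cell: bytes, hop_key: bytes) -> bytes:
    """Relay side: peel exactly one layer with this hop's own key."""
    return keystream_xor(hop_key, cell)


if __name__ == "__main__":
    # Hypothetical per-hop keys for a three-node circuit.
    keys = [secrets.token_bytes(32) for _ in range(3)]
    cell = wrap(b"GET /index.html", keys)
    for key in keys:
        cell = relay(cell, key)  # each hop removes one layer
    print(cell)
```

Only after the last hop has removed its layer does the payload become readable; any single relay sees data that is still wrapped in the other hops' layers. Because XOR layers can be removed in any order, this toy does not enforce the hop order the way a real circuit does.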
Website communication over Tor is a step-by-step process. Let us look into each step, one by one.
At step 1, the onion resource generates a private/public key pair, chooses random nodes as introduction points, and communicates its public key to these points. The introduction point serves only to establish the circuit and is not involved in the data communication.
A web resource’s private key is not known to anybody, and the public key is used to create the website’s domain name. None of the nodes in contact with the onion site can find out its location.
The onion resource hides in the Tor network and protects itself from unwanted visits by allowing access only through the three introduction points through which requests are routed.
At step 2, the private key comes into play. To enable users to find sites and the Tor network to show them, information about onion resources is collected in a distributed hash table, which is essentially a directory stored in parts on different nodes of the network.
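A directory "stored in parts on different nodes" can be pictured with a small hash ring. The sketch below is a simplified model under our own assumptions: the node names, the `ToyDirectory` class, the SHA-256 ring and the replica count are all hypothetical, and Tor's real placement rules are more involved.

```python
import bisect
import hashlib


def ring_position(value: bytes) -> int:
    """Map a node name or a lookup key onto a position on the hash ring."""
    return int.from_bytes(hashlib.sha256(value).digest(), "big")


class ToyDirectory:
    """Toy distributed hash table: each key lives on a few nodes of the ring."""

    def __init__(self, node_names, replicas=2):
        self.replicas = replicas
        self.ring = sorted((ring_position(n.encode()), n) for n in node_names)
        self.storage = {name: {} for name in node_names}

    def responsible_nodes(self, key: str):
        """Return the nodes that follow the key's position on the ring."""
        start = bisect.bisect_left(self.ring, (ring_position(key.encode()), ""))
        count = min(self.replicas, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]

    def publish(self, key: str, descriptor: dict) -> None:
        """Store the descriptor on every node responsible for the key."""
        for node in self.responsible_nodes(key):
            self.storage[node][key] = descriptor

    def fetch(self, key: str):
        """Ask the responsible nodes for the descriptor; None if unknown."""
        for node in self.responsible_nodes(key):
            if key in self.storage[node]:
                return self.storage[node][key]
        return None


if __name__ == "__main__":
    directory = ToyDirectory(["node-%d" % i for i in range(8)])
    directory.publish("xyz.onion", {"intro_points": ["ip-1", "ip-2", "ip-3"]})
    print(directory.responsible_nodes("xyz.onion"))
    print(directory.fetch("xyz.onion"))
```

Both the publisher and the client compute the same ring positions, so the client knows which nodes to ask without any central index.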
To be listed in that directory, an onion resource publishes its descriptor, which is a technical description of the site.
The descriptor contains a list of the introduction points and the authentication keys from step 1. These are required to access the site.
This information is then signed with the private key, the counterpart of the public key that is encoded in the onion resource’s address.
Steps 1 and 2 take place independently of the client, so the user cannot observe them. We described them to explain how sites appear on the Tor network.
Now let's move on to the point where a real client would like to visit a resource.
Let’s assume the user would like to visit an onion resource called SecureDrop. For this, the browser accesses the distributed hash table from step 2 and requests the signed descriptor for SecureDrop.
As a matter of fact, the user accesses the hash table indirectly, through a third-party resource. For example, the user can pick the hyperlink leading to SecureDrop on the regular internet or another onion resource.
When the user follows the site’s address, he/she connects to the distributed table and requests the hash for the resource. A resource’s hash appears as the 16-character string in $hash.onion (this is the legacy v2 format; current v3 addresses are 56 characters long and encode the public key itself). In both cases the string is derived from the service’s public key.
If such a hash exists, the Tor network automatically checks the descriptor against the public key embedded in the onion address. This way, the network confirms that the descriptor in the table is associated with that specific onion resource rather than any other resource. This ensures the security of end-to-end authentication.
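How an address can carry its own public key is easiest to see in code. The sketch below assumes the current v3 address layout as we understand it from Tor's rendezvous specification: a 32-byte Ed25519 public key, a 2-byte SHA3-256 checksum and a version byte, base32-encoded. The function names are our own, and the key bytes used in the example are a placeholder, not a real service key.

```python
import base64
import hashlib

ONION_VERSION = b"\x03"  # version byte of v3 onion addresses


def _checksum(pubkey: bytes) -> bytes:
    """First two bytes of SHA3-256(".onion checksum" | pubkey | version)."""
    digest = hashlib.sha3_256(b".onion checksum" + pubkey + ONION_VERSION)
    return digest.digest()[:2]


def onion_address_from_pubkey(pubkey: bytes) -> str:
    """Encode a 32-byte public key as a 56-character v3 onion address."""
    if len(pubkey) != 32:
        raise ValueError("expected a 32-byte Ed25519 public key")
    raw = pubkey + _checksum(pubkey) + ONION_VERSION
    return base64.b32encode(raw).decode("ascii").lower() + ".onion"


def pubkey_from_onion_address(address: str) -> bytes:
    """Recover the public key from an address, rejecting malformed ones."""
    host = address.lower()
    if not host.endswith(".onion"):
        raise ValueError("not an onion address")
    label = host[: -len(".onion")]
    if len(label) != 56:
        raise ValueError("a v3 onion label must be 56 characters long")
    raw = base64.b32decode(label.upper())  # raises ValueError on bad characters
    pubkey, checksum, version = raw[:32], raw[32:34], raw[34:]
    if version != ONION_VERSION or checksum != _checksum(pubkey):
        raise ValueError("bad version or checksum")
    return pubkey


if __name__ == "__main__":
    placeholder_key = bytes(range(32))  # placeholder bytes, NOT a real key
    address = onion_address_from_pubkey(placeholder_key)
    print(address)
    print(pubkey_from_onion_address(address) == placeholder_key)
```

Because the key is recoverable from the name itself, a client that fetches a descriptor can check its signature against that key without trusting the directory node that served it. The signature check itself is left out here, since Ed25519 is not available in Python's standard library.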
Besides, the descriptor contains the introduction points through which to establish connection to the SecureDrop platform. Basically, thanks to the introduction points the browser knows where to send a request to connect to the site.
Before sending a request to the onion resource, the client picks a random node and connects to it. Essentially, the client asks the node to become its rendezvous point and gives it a "one-time secret" that will be used to connect to the site. The client then puts the rendezvous point address and the one-time secret into an introductory message, encrypts the message with the public key of the hidden service, and sends it to one of the introduction points.
The introduction point passes the user details (the secret string and the rendezvous address) on to the onion resource; the latter runs multiple verification processes and decides whether the user’s request can be trusted.
The hidden service decrypts the introductory message using its private key and thus learns the address of the meeting point and the one-time secret. It then connects to the rendezvous point over an anonymous channel.
The rendezvous point does one final verification to match the secret strings coming from the client and the resource, and then establishes a connection between the client and the site.
As a result, the client and the hidden service have an established communication channel through this rendezvous point. The entire traffic is protected by end-to-end encryption and goes via the rendezvous point that acts as a relay.
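The matching of one-time secrets at the rendezvous point can be modelled in a few lines. This is a toy model under our own assumptions: `ToyRendezvousPoint`, its method names and the circuit labels are hypothetical, the 20-byte secret length is an assumption, and the encryption of the introductory message is left out entirely.

```python
import hmac
import secrets


class ToyRendezvousPoint:
    """Toy relay that joins two circuits when both present the same secret."""

    def __init__(self):
        self._waiting = {}  # one-time secret -> waiting client circuit
        self.links = []     # (client circuit, service circuit) pairs

    def establish(self, secret: bytes, client_circuit: str) -> None:
        """Client side: register a one-time secret and wait for the service."""
        self._waiting[secret] = client_circuit

    def join(self, secret: bytes, service_circuit: str) -> bool:
        """Service side: present the secret; link the circuits if it matches."""
        for known in list(self._waiting):
            if hmac.compare_digest(known, secret):
                client_circuit = self._waiting.pop(known)  # secret is single-use
                self.links.append((client_circuit, service_circuit))
                return True
        return False


if __name__ == "__main__":
    rendezvous = ToyRendezvousPoint()
    one_time_secret = secrets.token_bytes(20)  # assumed length
    rendezvous.establish(one_time_secret, "client-circuit")
    # The service would learn the secret from the introductory message.
    print(rendezvous.join(secrets.token_bytes(20), "impostor-circuit"))
    print(rendezvous.join(one_time_secret, "service-circuit"))
    print(rendezvous.links)
```

The rendezvous point never needs to know who the client or the service is. It only checks that both circuits hold the same secret and then relays traffic between them.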
All in all, six nodes are involved in the end-to-end communication. Three of them are selected by the hidden service, and three, including the rendezvous point, are selected by the client.
Let’s recap what happens at each stage in the diagram:
A hidden server sends its public key to the introduction point. This is done in order to ensure the website owner’s anonymity: the connection will be established via the introduction point, and data communication will go through other points.
The hidden server sends a descriptor into the distributed hash table. The descriptor contains the public key and information about the introduction point; these data are signed with the service’s private key.
The client requests the name of the ‘xyz.onion’ website, where xyz is the name derived from the public key (16 characters in the legacy v2 format, 56 characters in v3).
When the client’s request arrives, information is sent to the introduction point and to the rendezvous point. The rendezvous point is chosen randomly; its job is to relay the data coming from the onion resource. Each user has his/her own route to the rendezvous point, so there is no single way of transmitting information in the Tor network.
Information about the rendezvous point is sent to the introduction point, along with an introductory message encrypted with the public key of the hidden service.
Information from the introduction point is sent to the hidden server, so the latter knows the address of the rendezvous point.
The hidden server sends information to the rendezvous point.
The rendezvous point finally verifies that the secret strings from the user and the resource match.
The rendezvous point sends information about the resource to the client.
The user and the hidden server now have an established connection and can exchange information.
Why not connect to onion sites via proxies like Tor2Web
Proxies such as Tor2Web help connect to onion sites using regular browsers. It works as follows: the proxy server first connects to Tor and then relays the traffic to you via the regular internet.
This option is not secure to use. You lose your anonymity because your ISP, hackers and the authorities will be able to track which site you connected to, what devices you used for the connection, etc.
They will find out that you are using the proxy and track the confidential information you receive from the network. Once the traffic leaves Tor, end-to-end encryption and authentication are lost, making it easier to intercept your traffic.