DISCLAIMER: This is an old method that has since been patched on most content filters. I only recommend trying this at home to see the binary method of internet browsing. The following article was found on http://www.neworder.box.sk/news/16160.
HOWTO: Slip Past Content Filters
Dec 29 2006, 07:39 (UTC+0), s0journist writes:
OVERVIEW
Content filtering software inspects web and FTP requests before they leave
the network. Based on the software’s rule set, it will either allow
or deny the connection. Corporate networks and schools use content
filtering to ensure that employees and students cannot access
‘inappropriate’ web content.
Filtering applications use three criteria to determine if the client is
requesting banned content. The HTTP header is scanned for (1) domain
name, (2) IP address and (3) key words. The application hosts a
database of all blacklisted content. After the HTTP request is sent
from the client, the application compares the HTTP Header to all
listings in the database. When a match is found, the application
reacts.
Three reactions are typical of a content filter.
First, the content filter drops the packet requesting the data. That
means that instead of sending your web request to
http://mosteroticteens.org, it quietly drops the packet. The remote web
server never sees the request.
Additionally, most content filters will redirect your
browser to an internal web page. This is usually an intimidating page
stating that you have broken corporate policy and that the material
you are trying to access is inappropriate.
Finally, most content filters will log the action. The application’s
administrator has a full listing of each filtered web access attempt.
The logs will show who broke the web policy, which workstation sent
the request, what time the event occurred and which web page was
requested. The more sophisticated applications have a snazzy front end
with reports such as ‘top ten offenders’, ‘most popular banned
sites’, and so on. If the redirected page states that your action
has been recorded, it probably has—and in great detail.
WHY CONTENT FILTERS ARE USED
Corporations use content filters for two reasons. The most important reason is to
mitigate liability. Corporations are responsible for the environments
of their employees. If an employee is surfing porn from his desk and
a coworker is offended, the corporation can be sued (quite
successfully) for fostering sexual harassment in the workplace.
Likewise,ion in classes. All of those downloads clogged the campus
networks. In most cases, universities relied on QoS to solve the
issue, instead of censoring web access to dormitories.
The other resource is employee time and attention. Games sites and sports
sites chew up many hours through procrastination. If an employee cannot
update his fantasy football team at work, then he might spend
some of that time filling out spreadsheets instead.
To slip past the filters, one must first understand how they work, and how
your requests get from your browser to the Internet web server and
back.
NAMES
Most users rely on names to connect to resources on the Internet. Names,
however, cannot be routed across networks. The network devices that
interconnect the networks around the world use IP addresses to figure
out where to pass the requests. The entire Internet relies on public
IP addressing to support connections from anywhere to
anywhere.
Users cannot be expected to memorize the IP address
of every computer connected to the Internet. Instead, the user types
the name of the computer that hosts the content he wants. Before the
browser creates a request, it passes the name of the computer to a
Domain Name System (DNS) server.
A DNS server provides a
simple service to users. It waits for a client to send a name. It
then looks up the name in a database and returns the IP address for
the name. The server works much like a telephone directory. When
someone needs to call Bill Gates on the phone, he doesn’t dial the
name. Instead, he uses a directory to find the number listed under that
name. He then dials the number to reach the person he is looking for.
DNS is the IP directory for the Internet. First a browser does a
lookup of the web server’s name, and then it sends packets to the
number.
If content filters only looked at the names of web
servers, it would be very simple to bypass them. A user could simply
ping the web server’s name and connect to the server by IP address.
For example, if http://www.google.com is a banned website, a user can
ping the name. The console would show four replies from 64.233.167.99.
A connection can be made to the search engine by typing either
http://www.google.com or http://64.233.167.99 into a browser.
By the way, even if the name is banned by the
content filter, the user can still safely ping the server. Content
filters only scan HTTP and FTP headers. They do not inspect ICMP
(such as ping) packets.
IP ADDRESS
Knowing the IP address of the web server doesn’t really help if a true
content filter is inspecting all packets leaving the network. The
second criterion the application uses is the IP Address of the remote
server.
This is the crux of the issue. Every device between
the user’s computer and the server located half-way around the
world uses the IP Address to figure out how to forward the request to
the correct server. Without an IP address, the packets go
nowhere.
KEYWORDS
Before the solution to the IP address issue is discussed, the last of the
criteria will be explained. Content filter vendors cannot reliably
list every server in a category of banned content. There are too many
sports sites, game sites, porn sites and financial sites for the
vendor to make a complete list. New sites are also added to the
Internet daily.
To catch any sites that may have been
overlooked when the database was created or updated, the content
filter also scans the text of the URL to find banned words.
For example, the keyword SEX would stop a multitude of web requests, even
if the name of the site or IP address for a porn site was not in the
database. Unfortunately, this would also block access to the web page
for the town of Essex, Connecticut. Unintentionally banning access to
a legitimate website is called a ‘false positive’.
The entire URL is scanned for keywords, not just the domain name. That
means that if SEX appears anywhere in the URL (or in the name of a
displayed picture), the content filter will block access to the
content.
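The keyword scan described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular product works: the keyword list, the function name url_is_blocked and the .example URLs are all made up for the demonstration.

```python
from urllib.parse import urlsplit

# Hypothetical keyword list; a real product would ship a far larger database.
BANNED_KEYWORDS = ["sex"]


def url_is_blocked(url: str) -> bool:
    """Return True if any banned keyword appears anywhere in the URL."""
    parts = urlsplit(url)
    # Scan the host, path and query string together, ignoring case.
    haystack = (parts.netloc + parts.path + "?" + parts.query).lower()
    return any(word in haystack for word in BANNED_KEYWORDS)


# A legitimate town site is caught because "essex" contains "sex".
print(url_is_blocked("http://www.essex-town.example/"))        # True (false positive)
# The keyword also matches in the name of a picture.
print(url_is_blocked("http://photos.example/img/sexy.jpg"))    # True
print(url_is_blocked("http://news.example/weather"))           # False
```

A plain substring match is what produces the Essex false positive. Matching whole words only would avoid it, at the cost of missing banned words buried inside longer names.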
ROUTE WITHOUT IP ADDRESSES
Back to the main issue: how can a client make a connection to an Internet
server without using its Name or IP address? The answer is, you
can’t—but you can still slip the request past the content
filter.
The answer is in how the address is placed in the
packet. In the above example, where the IP address of www.google.com
could be used to get to the search engine, the request would be
blocked. If one knows how to put the IP Address into the packet
header, but write it in a way that fools the content filter, then the
request slips through undetected, unhindered and without being
logged.
The complicated answer is: convert the decimal octets to binary, combine
them into a single 32-bit stream, then convert that stream back to
decimal. If that doesn’t make sense, don’t worry. It only
sounds complicated. Done manually, all that is needed is the built-in
scientific calculator.
SIMPLE IP ADDRESS STRUCTURE
An IP Address is four decimal numbers separated by dots. An example is
an IP Address assigned to www.google.com: 64.233.167.99. Each of
these four numbers can have any value ranging from 0 to 255.
The numbers are referred to as ‘octets’. An octet is an eight-bit
(read as: eight-digit) binary number. Eight bits can represent any
value from 0 (00000000) to 255 (11111111). All IP addresses are 32
bits long. Four octets (4 x 8) represent these 32 bits. Users and
administrators read and write IP addresses in octets because using a
stream of ones and zeroes is impractical—and could give your retina
serious screen burn!
Content filters expect IP addresses in the standard dotted-decimal
notation. We can instead express the same 32-bit number as one big
decimal number rather than four smaller ones.
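To see that the four octets and the one big number are the same 32 bits, here is a minimal Python sketch that shifts each octet into its position in the 32-bit value. The function name octets_to_int is just a label chosen for this example.

```python
def octets_to_int(a: int, b: int, c: int, d: int) -> int:
    """Pack four octets (each 0-255) into a single 32-bit integer."""
    for octet in (a, b, c, d):
        if not 0 <= octet <= 255:
            raise ValueError(f"octet out of range: {octet}")
    # The first octet occupies the top eight bits, the last octet the bottom eight.
    return (a << 24) | (b << 16) | (c << 8) | d


value = octets_to_int(64, 233, 167, 99)
print(value)                 # 1089054563
print(format(value, "032b")) # 01000000111010011010011101100011
```

The shifts are the same as multiplying the octets by 16777216, 65536, 256 and 1 and adding the results.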
SIMPLE MATH FOR A SIMPLE TECHNIQUE
Start by pulling up your scientific calculator. In Windows, type ‘calc’
into the Run prompt. On Linux, type ‘gcalctool’ in the terminal
console.
Once the calculator appears, select ‘Scientific’
from the View menu. This will add lots of buttons and options to your
plain old calculator. Above the buttons, notice the radio buttons
next to each of the number systems: bin (binary), oct (octal), dec
(decimal) and hex (hexadecimal). These buttons are used to switch
back and forth between the different bases, as well as convert the
numbers.
Follow these steps, using the example IP address of
64.233.167.99:
1. Verify that the calculator is in Decimal (‘dec’ should be selected).
2. Type in the first octet of the IP address (64).
3. Convert the number to binary by clicking the ‘bin’ radio button.
4. Write this number down. The calculator displays ‘1000000’. Octets
represent EIGHT digits. The result from the calculator shows only seven
digits. For this technique to work correctly, write each result as
eight digits. Pad the beginning of the number with zeroes until the
octet has eight digits. This means you should write down ‘01000000’.
5. Switch the calculator back to Decimal.
6. Clear the calculator display.
7. Repeat steps 1 through 6 for the remaining octets. Your results
should be: 233 (11101001), 167 (10100111) and 99 (01100011).
8. Switch the calculator to binary.
9. Combine the results of your conversion into a single 32-bit number
(01000000111010011010011101100011). Notice that if you failed to pad the
last number with a zero, the result would be only 31 bits, and the
technique would fail.
10. Type this number into the calculator and convert it to decimal. This
should give you a decimal result of 1089054563.
11. In your browser, type http://1089054563 and hit enter.
12. Notice that the Google search engine appears.
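The calculator steps above can also be followed in a short Python sketch. It mirrors the manual method (binary strings padded to eight digits, joined, then read back as one number) rather than the most compact way to do it. The function name dotted_to_decimal is an assumption made for this example.

```python
def dotted_to_decimal(ip: str) -> int:
    """Convert a dotted-decimal IPv4 address to its single-number form."""
    octets = [int(part) for part in ip.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a valid dotted-decimal address: {ip!r}")
    # Convert each octet to binary, padded with leading zeroes to eight digits.
    padded = [format(o, "08b") for o in octets]
    # Combine the four octets into one 32-digit binary string.
    bits = "".join(padded)
    # Read the 32-bit string back as a decimal number.
    return int(bits, 2)


print(dotted_to_decimal("64.233.167.99"))  # 1089054563

# Skipping the padding yields a shorter bit string and a different number.
unpadded = "".join(format(int(o), "b") for o in "64.233.167.99".split("."))
print(len(unpadded))                        # 30
print(int(unpadded, 2) == 1089054563)       # False
```

The result matches the number typed into the browser in the final steps (http://1089054563).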
A content filter will see a request for a web server named 1089054563.
This does not match (1) the name of a banned server, (2) an IP
address or (3) a keyword. The browser wrote the 32-bit address into the
packet header, but the content filter, which only inspects the HTTP
header, doesn’t notice that the server is blacklisted. Because nothing
in the request looks suspicious, the filter will not flag it. Instead,
it will fetch the content that was requested.
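A toy model of the three checks makes the blind spot easier to see. This is a simplified sketch, not the logic of any real filter. The blacklist entries and the function name naive_filter_blocks are invented for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical blacklist entries, for illustration only.
BANNED_NAMES = {"www.google.com"}
BANNED_IPS = {"64.233.167.99"}
BANNED_KEYWORDS = {"google"}


def naive_filter_blocks(url: str) -> bool:
    """Apply the three checks: domain name, IP address, then keywords."""
    host = (urlsplit(url).hostname or "").lower()
    if host in BANNED_NAMES:   # (1) name of a banned server
        return True
    if host in BANNED_IPS:     # (2) IP address in dotted-decimal text form
        return True
    # (3) banned keyword anywhere in the URL
    return any(word in url.lower() for word in BANNED_KEYWORDS)


print(naive_filter_blocks("http://www.google.com"))   # True
print(naive_filter_blocks("http://64.233.167.99"))    # True
print(naive_filter_blocks("http://1089054563"))       # False
```

All three checks compare text, so the same address written as a single number matches none of them.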
VENDORS AND COMPANIES KNOW ABOUT THIS
Content filter vendors are aware of this vulnerability. A developer can
easily fix the hole in the application, but they won’t. It would be
detrimental to the vendor to do so.
With the current structure
of the application, three separate queries and functions must be run
for every HTTP packet that passes through the device. This delay
causes network latency (in other words, slows down the network).
Three queries need to be run against a database to determine whether
the packet passes through or gets dropped.
To add a feature to
the application that patches this hole, the developer could build
another complete table in the database to hold all of the converted
decimal addresses for blocked content. This would increase the size
of the database by 25%. More importantly, there would be a
corresponding decrease in performance, due to the added query.
An alternative is to build a function into the filter that performs a
text-to-strings-to-decimal-{many mathematical
calculations}-to-string-to-IP operation, followed by a database
query. This is too much processing overhead to perform on every HTTP
packet that passes through the device. This again slows down the
performance of the content filter.
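For illustration, the conversion half of such a function might look like the Python sketch below. It only shows the number-to-octets step and says nothing about what it would cost inside a real product. The name normalize_host is an assumption; ipaddress is Python's standard library module.

```python
import ipaddress


def normalize_host(host: str) -> str:
    """Return the dotted-decimal form if host is a bare 32-bit number."""
    if host.isascii() and host.isdigit():
        value = int(host)
        if value < 2**32:
            # IPv4Address accepts an integer and prints it as four octets.
            return str(ipaddress.IPv4Address(value))
    # Anything else (names, dotted addresses) is returned unchanged.
    return host


print(normalize_host("1089054563"))      # 64.233.167.99
print(normalize_host("www.google.com"))  # www.google.com
```

After this step the result could be checked against the existing IP address table, which is the database query the paragraph above mentions.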
If a vendor chose one of
the above options for the application, his product would perform only
75% as efficiently as his competition. It is a hard sell when your
product is slow and inefficient but protects against a really obscure
method of looking at websites that involves lots of stubby pencil math
to exploit. So for competitive reasons, developers have no interest
in changing the way they filter traffic.
Furthermore, there is
little demand for a product with this feature. Companies are
concerned with legal liability. If an employee goes to the lengths
shown above just to look at a website, then the company has
shown due diligence to protect the work environment. The fact that an
employee must consciously bypass software and devices to get a single
blacklisted page shows that the company did spend time, effort and
money to secure the workplace. If a couple of employees abuse this
hole and a lawsuit is filed, the company is in the clear and the
employee is liable.
WHEN THIS TECHNIQUE DOES NOT WORK
The technique illustrated does not work in all environments. This method
of surfing was not a planned and supported function of web browsers
and servers. The procedures only work by taking advantage of some
features that are a part of web standards. The following
circumstances may break the feature:
Internet Explorer 7:
IE7 closed the hole on this technique. Before sending the web
request, it translates the decimal IP address back to octets. This is
a browser-level function. The latest version of Firefox works
perfectly. IE6 can also be used to successfully bypass
filters.
Websites That Use Host Headers: Host Headers are a
technique for hosting multiple domain names on a single IP address.
If eight sites are running on a single IP, and the header asks for a
long string of numbers instead of a site name, then the web service
will not know which web site the client is requesting.
Sites with URL Security: Some anti-leeching and other settings do not
allow requests framed from a domain name other than the site’s true
name. To see an example of a site that would not accept this technique,
navigate to http://linux.org. Note the error. The site only accepts
requests for http://www.linux.org.
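The Host Header case can be illustrated with a small Python sketch of a server choosing a site by the Host line of the request. The site names, directories and function names here are hypothetical, and real web servers often fall back to a default site instead of failing outright.

```python
# Hypothetical virtual hosts sharing one IP address.
VIRTUAL_HOSTS = {
    "www.site-one.example": "/var/www/site-one",
    "www.site-two.example": "/var/www/site-two",
}


def build_request(host: str, path: str = "/") -> str:
    """Build the text of a minimal HTTP/1.1 request for the given host."""
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"


def pick_site(request: str):
    """Return the document root for the requested Host, or None if unknown."""
    for line in request.split("\r\n"):
        if line.lower().startswith("host:"):
            # Drop the header name and any port number before the lookup.
            host = line.split(":", 1)[1].strip().split(":")[0].lower()
            return VIRTUAL_HOSTS.get(host)
    return None


print(pick_site(build_request("www.site-two.example")))  # /var/www/site-two
print(pick_site(build_request("1089054563")))            # None
```

When the browser is given the single-number address, that number is what ends up in the Host line, so the lookup has no site name to match.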