Using URLConnection with a Proxy Server
By Daniel Horn
Abstract: Using the Java URLConnection and
HttpURLConnection classes with a proxy server is easy, but it has been poorly
documented until now.
Introduction:
A simple way to get started writing
network client applications is to use Java's URLConnection class. Given a URL, such as http://www.borland.com/, a developer can
create an instance of a URLConnection to retrieve information from an HTTP
server.
In practice, a developer will often need to indicate a proxy
server when making a network connection. While this is easy to do, telling a URLConnection to use a proxy server has been
poorly documented, and possibly undocumented, until now.
Typically, corporate sites do not give individuals direct
access to the Internet; instead, access is shared via a proxy server.
When a network client application makes an HTTP request, it actually
sends the request to the proxy server; the proxy then makes the
real request to the Internet and passes the results back to the original
client.
A program will sometimes use a system property to retrieve
the JVM's proxy setting, perhaps set on the command line by something like:
java -Dhttp.proxyHost=myproxyhost MyClass
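As a minimal sketch of how a program might read that setting back, the class below looks up the standard http.proxyHost and http.proxyPort system properties. The class name ProxySettings and the convention of returning -1 when no port is set are assumptions made for illustration; they are not part of the sample program.

```java
// Sketch: read the proxy settings the JVM was started with.
class ProxySettings {
    /** Returns the value of http.proxyHost, or null if it was not set. */
    static String proxyHost() {
        return System.getProperty("http.proxyHost");
    }

    /** Returns http.proxyPort as an int, or -1 (our own convention) if unset or invalid. */
    static int proxyPort() {
        String port = System.getProperty("http.proxyPort");
        if (port == null) {
            return -1;
        }
        try {
            return Integer.parseInt(port.trim());
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println("proxy host: " + proxyHost());
        System.out.println("proxy port: " + proxyPort());
    }
}
```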
However, this is not good enough in many situations. The
command line settings of Java and the program may be obscure enough that the
user should not be adding to them. Or the program may be started by a script
that the user cannot edit. Or, simply, the proxy settings needed in the
program may have to differ from those Java uses by default.
As an example of this last case, consider that some proxy
servers also provide caching functionality. An application might need to
preload a large Web object (such as a large image file) onto a specific proxy-cache
(or a list of proxy-caches). One way of accomplishing this would be to
make the URL request via the proxy (or make the request multiple times,
assigning a different proxy address from the list each time).
(Note that this solution is far from optimal, as the bytes
needed to preload the proxy-cache still have to travel all the way back to the
client.)
In the example code we provide (http://codecentral.borland.com/codecentral/ccWeb.exe/listing?id=19589),
you will see how to specify the proxy server at run time.
How it's done:
We use the Java URL and URLConnection classes to connect to
a proxy.
Ordinarily, one might make a Web request using code that
looks something like:
URL url = new URL("http://www.borland.com/");
URLConnection c = url.openConnection();
To connect via a proxy, we use a different constructor for URL:
public URL(String protocol, String host, int port, String file);
In use, this might look like:
URL url = new URL("http",            // protocol
        "myProxy.com",               // host name or IP of proxy server to use
        -1,                          // proxy port, or -1 to indicate the default port for the protocol
        "http://www.borland.com/");  // the original URL, specified in the file parameter
That's pretty much all there is to it.
The example program, ProxyDemo.java, demonstrates this usage
in the doURLRequest(String strURL, String strProxy, int iProxyPort) method. The
strURL argument is a string representing the URL to request. The strProxy argument should be the host name or IP address of the proxy to use, or null if the request is not to be made through a proxy. The iProxyPort argument is
an integer giving the port of the proxy server, or -1 if the default port should
be used (or if no proxy is to be used).
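A method with that signature might be sketched roughly as follows. This is a reconstruction for illustration only, not the actual ProxyDemo.java listing: the class name ProxyRequestSketch and the buildURL helper are assumptions, and the header-printing loop is just one way to read the response headers.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;

class ProxyRequestSketch {
    /** Builds the URL to open: the target itself, or the proxy with the target as the "file". */
    static URL buildURL(String strURL, String strProxy, int iProxyPort)
            throws MalformedURLException {
        if (strProxy == null) {
            return new URL(strURL);  // no proxy: connect to the target directly
        }
        // Proxy case: the proxy is the host, the original URL is the file parameter.
        return new URL("http", strProxy, iProxyPort, strURL);
    }

    /** Makes the request and prints the response headers. */
    static void doURLRequest(String strURL, String strProxy, int iProxyPort) {
        try {
            URLConnection c = buildURL(strURL, strProxy, iProxyPort).openConnection();
            c.connect();
            // Walk the headers by index until both key and value are null.
            for (int i = 0; ; i++) {
                String key = c.getHeaderFieldKey(i);
                String value = c.getHeaderField(i);
                if (key == null && value == null) {
                    break;
                }
                System.out.println((key == null ? "" : key + ": ") + value);
            }
        } catch (IOException e) {
            System.out.println("**** Connection failure: " + e);
        }
    }

    public static void main(String[] args) throws MalformedURLException {
        // No network access here: just show how the pieces of the URL are split up.
        URL u = buildURL("http://www.borland.com/", "myProxy.com", -1);
        System.out.println("host = " + u.getHost());
        System.out.println("port = " + u.getPort());
        System.out.println("file = " + u.getFile());
    }
}
```

A call such as ProxyRequestSketch.doURLRequest("http://www.borland.com/", "myProxy.com", -1) would then attempt the request through the (placeholder) proxy myProxy.com on the default port.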
You must change the line
ProxyDemo.doURLRequest("http://www.borland.com/", "0.0.0.0", -1); // **** Change this line to use a valid proxy
to use a valid proxy. If you don't change the proxy setting to something valid, then you will get the following error message:
**** Connection failure: java.net.BindException: Cannot assign requested address: connect
How did I figure this out?
Basically I made a good guess, and fortunately, it
worked.
When you make an HTTP GET request as indicated by some URL,
say http://www.borland.com/foo.html,
you are really opening a TCP connection to the host indicated by www.borland.com and sending it an argument
string that looks something like
"GET /foo.html"
to indicate a path or file. (This is a gross simplification, as I am omitting text that indicates the
protocol, request headers, etc.) When
you make the same request through a proxy, the proxy becomes the host for the
request and the argument string becomes the original URL; e.g.,
"GET http://www.borland.com/foo.html".
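The difference between the two request strings can be sketched in a few lines. The class and method names below are hypothetical, and the strings carry the same simplification as the text: the protocol version and request headers are omitted.

```java
import java.net.URL;

class RequestLineSketch {
    /** Request string sent straight to the target host: only the path or file. */
    static String directRequestLine(URL target) {
        String file = target.getFile();
        return "GET " + (file.isEmpty() ? "/" : file);
    }

    /** Request string sent to a proxy: the complete original URL. */
    static String proxyRequestLine(URL target) {
        return "GET " + target.toExternalForm();
    }

    public static void main(String[] args) throws Exception {
        URL target = new URL("http://www.borland.com/foo.html");
        System.out.println(directRequestLine(target)); // prints GET /foo.html
        System.out.println(proxyRequestLine(target));  // prints GET http://www.borland.com/foo.html
    }
}
```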
In the JDK documentation for the URL class, I didn't find
anything saying that I could or could not put a string such as http://www.borland.com/foo.html as
the file parameter to the URL constructor, so I just gave it a try and it
worked.
What's next?
There are several ways that the sample program could be
expanded upon.
The sample program merely reads the response headers; this
was done for the sake of simplicity (and for the sake of having it do something
at all). The sample does not actually
download the requested content, though with simple modifications it could.
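One such modification might look like the sketch below, which reads the response body from a URLConnection into a byte array. The class DownloadSketch and its method names are assumptions for illustration and do not come from the sample program.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URLConnection;

class DownloadSketch {
    /** Reads the stream until end-of-stream and returns everything that was read. */
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    /** Downloads the body of an already-created connection, closing the stream afterwards. */
    static byte[] download(URLConnection c) throws IOException {
        try (InputStream in = c.getInputStream()) {
            return readFully(in);
        }
    }
}
```

For a large object, it would usually be better to write each buffer to a file or discard it rather than hold the whole body in memory.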
In practice, one would usually put the call to make the
connection in a different thread than the application's main thread.
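A minimal sketch of that idea, assuming the request is wrapped in a Runnable, is shown below. The class name BackgroundRequestSketch and the thread name are hypothetical, and the Runnable in main stands in for a real call such as doURLRequest.

```java
class BackgroundRequestSketch {
    /** Starts the given request on its own thread and returns that thread. */
    static Thread runInBackground(Runnable request) {
        Thread t = new Thread(request, "url-request");
        t.setDaemon(true);  // do not keep the JVM alive just for this request
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t = runInBackground(new Runnable() {
            public void run() {
                // A real program would make the network request here.
                System.out.println("request running on " + Thread.currentThread().getName());
            }
        });
        t.join();  // the main thread is free to do other work before this point
    }
}
```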
One might also consider making the class into a JavaBean component.
Questions, comments, and suggestions may be sent to the
author at dan@nerds.com; use the title of the article or the words "BDN
Article" in the email subject.