11

My requirements are fairly simple, but I need to do a lot of this so I'm looking for a robust solution.

Is there a good lightweight library for decomposing URLs into their component parts in Java? I'm referring to the hostname, query string, and so on.

5 Answers

22

I am always forgetting the URI format, so here it is:

<scheme>://<userinfo>@<host>:<port><path>?<query>#<fragment>

And here is an example:

URI uri = new URI("query://jeff@books.com:9000/public/manuals/appliances?stove#ge");

The following will happen:

  • uri.getAuthority() will return "jeff@books.com:9000"
  • uri.getFragment() will return "ge"
  • uri.getHost() will return "books.com"
  • uri.getPath() will return "/public/manuals/appliances"
  • uri.getPort() will return 9000
  • uri.getQuery() will return "stove"
  • uri.getScheme() will return "query"
  • uri.getSchemeSpecificPart() will return "//jeff@books.com:9000/public/manuals/appliances?stove"
  • uri.getUserInfo() will return "jeff"
  • uri.isAbsolute() will return true
  • uri.isOpaque() will return false
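
For anyone who wants to try this, here is a minimal sketch that collects the main getters into a map. The class name UriPartsDemo and the helper decompose are made up for illustration; only the java.net.URI calls come from the JDK.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

class UriPartsDemo {

    /** Splits a URI string into named components using java.net.URI (hypothetical helper). */
    public static Map<String, Object> decompose(String text) {
        // URI.create throws an unchecked IllegalArgumentException on bad syntax,
        // whereas new URI(text) throws the checked URISyntaxException.
        URI uri = URI.create(text);
        Map<String, Object> parts = new LinkedHashMap<>();
        parts.put("scheme", uri.getScheme());
        parts.put("userInfo", uri.getUserInfo());
        parts.put("host", uri.getHost());
        parts.put("port", uri.getPort()); // -1 when the URI has no explicit port
        parts.put("path", uri.getPath());
        parts.put("query", uri.getQuery());
        parts.put("fragment", uri.getFragment());
        return parts;
    }

    public static void main(String[] args) {
        decompose("query://jeff@books.com:9000/public/manuals/appliances?stove#ge")
                .forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Components that are absent come back as null (or -1 for the port), so callers should check before using them.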

I found this blog handy: Exploring Java's Network API: URIs and URLs


5

java.net.URI and java.net.URL do not work for many modern URLs. java.net.URI adheres to RFC 2396, which is a really old standard. java.net.URL sometimes does a good job, but if you're working with URLs as found in the wild, it will fail in many cases.
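
As one illustration of that strictness (my own choice of example, not a case the library documentation singles out), java.net.URI rejects an unescaped pipe or space even though browsers accept such URLs. The class and method names below are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriStrictnessDemo {

    /** Returns true if java.net.URI accepts the text, false if it throws (hypothetical helper). */
    public static boolean parsesAsUri(String text) {
        try {
            new URI(text);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // An unescaped '|' in the query is rejected as an illegal character.
        System.out.println(parsesAsUri("http://example.com/search?q=a|b"));
        // The percent-encoded form of the same URL is accepted.
        System.out.println(parsesAsUri("http://example.com/search?q=a%7Cb"));
        // An unescaped space in the path is rejected too.
        System.out.println(parsesAsUri("http://example.com/a b"));
    }
}
```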

To solve these issues, I wrote galimatias, a URL parsing and normalization library for Java. It will work with almost any URL you can imagine (basically, if it works in a web browser, galimatias will parse it correctly). It also has a very convenient API.

You can get it at: https://github.com/smola/galimatias


4

Take a look at java.net.URL. It has methods for exactly what you're trying to do.

Hostname: getHost()
Query string: getQuery()
Fragment/ref/anchor: getRef()
Path: getPath()
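
A short sketch of those getters in use, assuming a made-up example.com URL. The UrlGettersDemo class and its parse helper are hypothetical; the URL is built through URI.create(...).toURL() because recent JDKs deprecate the URL(String) constructor.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

class UrlGettersDemo {

    /** Parses text into a java.net.URL, rethrowing the checked exception as unchecked. */
    public static URL parse(String text) {
        try {
            return URI.create(text).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a usable URL: " + text, e);
        }
    }

    public static void main(String[] args) {
        URL url = parse("https://example.com:8443/docs/index.html?lang=en#intro");
        System.out.println("host  = " + url.getHost());  // example.com
        System.out.println("query = " + url.getQuery()); // lang=en
        System.out.println("ref   = " + url.getRef());   // intro
        System.out.println("path  = " + url.getPath());  // /docs/index.html
    }
}
```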


0

Look at the getter methods of the URL class.

You have all you need there.


0
URL.getProtocol(), URL.getHost(), URL.getPort()

And so on.
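
One caveat worth knowing with getPort(): it returns -1 when the URL does not spell out a port, and getDefaultPort() gives the protocol's default instead. A minimal sketch, with the hypothetical helper effectivePort combining the two:

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

class EffectivePortDemo {

    /** Returns the explicit port if present, otherwise the protocol default (hypothetical helper). */
    public static int effectivePort(String text) {
        try {
            URL url = URI.create(text).toURL();
            int port = url.getPort(); // -1 means no port was written in the URL
            return port != -1 ? port : url.getDefaultPort();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a usable URL: " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(effectivePort("http://example.com/"));      // 80
        System.out.println(effectivePort("https://example.com/"));     // 443
        System.out.println(effectivePort("http://example.com:8080/")); // 8080
    }
}
```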
