All Implemented Interfaces: Canonicalizer
public class SemanticPreciseCanonicalizer
extends Object
implements Canonicalizer
Precise semantic canonicalizer. "Semantic" here means that the intention is
to canonicalize URLs that "mean" the same thing: URLs that you would expect
to load the same content and look the same if you pasted them into the
location bar of your browser.
Performs every step of WHATWG canonicalization, plus some cleanup:
- sets default scheme http: if scheme is missing
- removes extraneous dots in the host
And these additional steps:
- collapses consecutive slashes in the path
- standardizes percent-encodings so that different encodings of essentially
the same thing match
- sorts query params
- removes userinfo
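
The steps listed above can be illustrated with a minimal, self-contained sketch. This is not the implementation of SemanticPreciseCanonicalizer and does not use the Canonicalizer interface; the class name CanonSketch and its canonicalize method are hypothetical. It assumes input without a port, fragment, or IPv6 host, skips the WHATWG steps entirely, and stands in for the percent-encoding step by only uppercasing the hex digits of existing escapes, which may be narrower than what the class actually does.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the library's code.
class CanonSketch {
    private static final Pattern ESCAPE = Pattern.compile("%[0-9a-fA-F]{2}");

    // Stand-in for percent-encoding standardization (assumption):
    // uppercase the hex digits of each %xx escape.
    static String upperEscapes(String s) {
        Matcher m = ESCAPE.matcher(s);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, m.group().toUpperCase(Locale.ROOT));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    static String canonicalize(String url) {
        // Set the default scheme http: if the scheme is missing.
        if (!url.matches("^[A-Za-z][A-Za-z0-9+.-]*://.*")) {
            url = "http://" + url;
        }
        int schemeEnd = url.indexOf("://") + 3;
        String scheme = url.substring(0, schemeEnd);
        String rest = url.substring(schemeEnd);

        // Split off the query string, if any.
        String query = null;
        int q = rest.indexOf('?');
        if (q >= 0) {
            query = rest.substring(q + 1);
            rest = rest.substring(0, q);
        }

        // Split authority from path; an empty path becomes "/".
        int slash = rest.indexOf('/');
        String authority = slash >= 0 ? rest.substring(0, slash) : rest;
        String path = slash >= 0 ? rest.substring(slash) : "/";

        // Remove userinfo.
        int at = authority.lastIndexOf('@');
        if (at >= 0) {
            authority = authority.substring(at + 1);
        }

        // Remove extraneous dots in the host: repeated, leading, trailing.
        String host = authority.replaceAll("\\.{2,}", ".")
                               .replaceAll("^\\.+|\\.+$", "");

        // Collapse consecutive slashes in the path.
        path = upperEscapes(path.replaceAll("/{2,}", "/"));

        StringBuilder out = new StringBuilder(scheme).append(host).append(path);
        if (query != null) {
            // Sort query params by their raw "key=value" text.
            String[] params = query.split("&");
            Arrays.sort(params);
            out.append('?').append(upperEscapes(String.join("&", params)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(
            canonicalize("user:pw@www..example.com.//a///b?z=1&a=%2f"));
    }
}
```

Under these assumptions, the input in main comes out as http://www.example.com/a/b?a=%2F&z=1, so URLs that differ only in userinfo, stray dots, repeated slashes, escape case, or parameter order compare equal after canonicalization.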