Maven-metadata.xml is being weird

So, I need to access a list of released Forge, SpongeVanilla, and SpongeForge versions, preferably through Maven. I’m doing some testing with the snippet of code below:

    public static void main(String[] args) throws IOException, ParserConfigurationException, SAXException, TransformerException {
        URL versions = new URL("https://files.minecraftforge.net/maven/net/minecraftforge/forge/maven-metadata.xml");
        InputStream is = versions.openConnection().getInputStream();
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(is);
        is.close();

        StringWriter writer = new StringWriter();
        TransformerFactory.newInstance().newTransformer().transform(new DOMSource(doc), new StreamResult(writer));
        writer.close();
        System.out.println(writer.toString());
    }

That produces the expected output. However, if I change the URL to https://repo.spongepowered.org/maven/org/spongepowered/spongevanilla/maven-metadata.xml or https://repo.spongepowered.org/maven/org/spongepowered/spongeforge/maven-metadata.xml, the following error is produced:

Exception in thread "main" com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence.
	at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(UTF8Reader.java:701)
	at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:567)
	at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1896)
	at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.arrangeCapacity(XMLEntityScanner.java:1761)
	at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.skipString(XMLEntityScanner.java:1799)
	at com.sun.org.apache.xerces.internal.impl.XMLVersionDetector.determineDocVersion(XMLVersionDetector.java:156)
	at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:812)
	at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
	at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
	at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:243)
	at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339)
	at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
	at Test.main(Test.java:42)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)

I have checked all three documents with Chrome’s view-source, and I cannot find any noticeable differences. I have tried reading the URL’s content into a byte array and then putting it in a string, using an InputStreamReader, and just plain printing the content of the document. From this, I have discovered that the two repo.spongepowered.org URLs seem to produce many unrecognizable characters, yet Chrome can still read them just fine. Does anyone know what could be going wrong? Thanks for any help!

Is it file encoding (UTF-8 instead of ASCII)?

According to the document headers, every one of them is in UTF-8, and I’ve tried decoding with both ASCII and UTF-8…

I’ve just tested your code and was able to reproduce the problem. It is not actually related to a specific charset but rather to compression: you get this error because you are trying to parse a GZIP-compressed response as an XML file.

Normally, the server should only compress the response if the client states that it supports the specific encoding (e.g. through the Accept-Encoding: gzip HTTP header). However, for some reason our server does not check this header and always compresses the response using GZIP. Chrome sends this header and decompresses the response transparently, which is why the documents look fine in view-source.

While this is something that should also be fixed on our side, there are two ways you can work around it in your code:

  • Use a custom HTTP client like OkHttp which will transparently handle the compression (you don’t need to do anything extra).

  • Modify your code to decompress the response if the server has encoded it using GZIP:

    public static void main(String[] args) throws IOException, ParserConfigurationException, SAXException, TransformerException {
        URL versions = new URL("https://repo.spongepowered.org/maven/org/spongepowered/spongevanilla/maven-metadata.xml");
        URLConnection con = versions.openConnection();

        // State that we support encoding using GZIP
        con.setRequestProperty("Accept-Encoding", "gzip");

        Document doc;
        // try-with-resources closes the stream even if parsing fails
        try (InputStream raw = con.getInputStream()) {
            // Decompress the response if the server replies with GZIP encoding
            InputStream is = "gzip".equalsIgnoreCase(con.getHeaderField("Content-Encoding")) ? new GZIPInputStream(raw) : raw;
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(is);
        }

        // Serialize the parsed document back to text and print it
        StringWriter writer = new StringWriter();
        TransformerFactory.newInstance().newTransformer().transform(new DOMSource(doc), new StreamResult(writer));
        System.out.println(writer.toString());
    }

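If you would rather not rely on the Content-Encoding header at all, one alternative (not what the snippet above does) is to look at the first two bytes of the response: a GZIP stream always starts with the magic bytes 0x1F 0x8B, while an XML document never does. The following is only a minimal sketch of that idea; the class name GzipSniffer and the helpers maybeGunzip and readAll are made up for illustration, and it is exercised with in-memory data rather than a real request.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of any library.
public class GzipSniffer {

    // Wraps the stream in a GZIPInputStream only if it starts with the GZIP magic bytes.
    public static InputStream maybeGunzip(InputStream in) throws IOException {
        PushbackInputStream pb = new PushbackInputStream(in, 2);
        byte[] head = new byte[2];
        int n = 0;
        // Read up to two bytes; a single read() call may return fewer than requested.
        while (n < 2) {
            int r = pb.read(head, n, 2 - n);
            if (r < 0) {
                break;
            }
            n += r;
        }
        // Put the bytes back so the caller (or GZIPInputStream) sees the full stream.
        if (n > 0) {
            pb.unread(head, 0, n);
        }
        boolean gzip = n == 2 && (head[0] & 0xFF) == 0x1F && (head[1] & 0xFF) == 0x8B;
        return gzip ? new GZIPInputStream(pb) : pb;
    }

    // Reads a stream to the end (written out by hand to stay Java 8 compatible).
    public static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int r;
        while ((r = in.read(buf)) != -1) {
            out.write(buf, 0, r);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] plain = "<metadata/>".getBytes(StandardCharsets.UTF_8);

        // Build a compressed copy in memory to stand in for the server response.
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(compressed)) {
            gz.write(plain);
        }

        byte[] fromGzip = readAll(maybeGunzip(new ByteArrayInputStream(compressed.toByteArray())));
        byte[] fromPlain = readAll(maybeGunzip(new ByteArrayInputStream(plain)));
        System.out.println(new String(fromGzip, StandardCharsets.UTF_8));
        System.out.println(new String(fromPlain, StandardCharsets.UTF_8));
    }
}
```

With a helper like this, the same code path should work whether or not a given server compresses the response: pass maybeGunzip(con.getInputStream()) to the DocumentBuilder instead of the raw stream.
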
Additionally, I would suggest that you take a look at our new Downloads API: it was primarily designed for our new downloads page but can also be used by any other tools. It is documented at https://dl-api.spongepowered.org/v1/. If you need any additional routes, feel free to create an issue on the SpongePowered/SpongeDownloads repository on GitHub.


You may be interested in this JSON file with all forge releases listed (note: it’s about 1.4MB)
http://files.minecraftforge.net/maven/net/minecraftforge/forge/json

Also a couple of other links for promoted forge builds
http://files.minecraftforge.net/maven/net/minecraftforge/forge/promotions.json
http://files.minecraftforge.net/maven/net/minecraftforge/forge/promotions_slim.json

Although the forge JSON and sponge download API are certainly valid options, I’d rather use the maven-metadata.xml, since it allows me to write the same code for forge, sponge vanilla, and sponge forge. Thanks for the suggestion though!
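
For reference, here is a rough sketch of what that shared code could look like once the document parses. It assumes the usual maven-metadata.xml layout, where the released versions are `<version>` elements inside `<versioning><versions>`. The class name MetadataVersions, the method parseVersions, and the sample group, artifact, and version values are made up for illustration, and the sample is parsed from an in-memory string rather than fetched from a repository.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of any library.
public class MetadataVersions {

    // Returns the text of every <version> element that sits directly inside <versions>.
    public static List<String> parseVersions(InputStream xml)
            throws IOException, ParserConfigurationException, SAXException {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
        NodeList nodes = doc.getElementsByTagName("version");
        List<String> versions = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            // Skip any <version> element that is not part of the <versions> list.
            Node parent = node.getParentNode();
            if (parent != null && "versions".equals(parent.getNodeName())) {
                versions.add(node.getTextContent().trim());
            }
        }
        return versions;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample in the maven-metadata.xml layout; real files are much longer.
        String sample =
                "<metadata>"
              + "<groupId>com.example</groupId>"
              + "<artifactId>demo</artifactId>"
              + "<versioning>"
              + "<latest>1.1.0</latest>"
              + "<versions><version>1.0.0</version><version>1.1.0</version></versions>"
              + "</versioning>"
              + "</metadata>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        for (String version : parseVersions(in)) {
            System.out.println(version);
        }
    }
}
```

The stream passed to parseVersions would be the (decompressed, if necessary) response for any of the three metadata URLs, so only the URL differs between Forge, SpongeVanilla, and SpongeForge.
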