Description
Environment details
- Specify the API at the beginning of the title: CloudStorageFileSystem
- OS type and version: macOS 13.4.1 (a)
- Java version: OpenJDK 64-Bit Server VM Temurin-11.0.18+10 (build 11.0.18+10, mixed mode)
- version(s): 0.126.20-SNAPSHOT
Steps to reproduce
- Create a bucket in GCS with a name containing an underscore
- Call Paths.get(URI.create("<path_in_your_bucket>")) with the GCS path in your bucket
Code example
Paths.get(URI.create("gs://bucket_with_authority/path"))
Stack trace
java.lang.IllegalArgumentException: Expected scheme-specific part at index 3: gs:
    at com.google.cloud.storage.contrib.nio.CloudStorageUtil.stripPathFromUri(CloudStorageUtil.java:65)
    at com.google.cloud.storage.contrib.nio.CloudStorageFileSystemProvider.getPath(CloudStorageFileSystemProvider.java:282)
    at com.google.cloud.storage.contrib.nio.CloudStorageFileSystemProvider.getPath(CloudStorageFileSystemProvider.java:97)
    at java.base/java.nio.file.Path.of(Path.java:208)
    at java.base/java.nio.file.Paths.get(Paths.java:97)
    at com.google.cloud.storage.contrib.nio.CloudStorageFileSystemProviderTest.testBucketWithAuthority(CloudStorageFileSystemProviderTest.java:836)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at com.google.cloud.testing.junit4.MultipleAttemptsRule$1.evaluate(MultipleAttemptsRule.java:94)
    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
    at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
    at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
    at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
    at com.intellij.rt.junit.IdeaTestRunner$Repeater$1.execute(IdeaTestRunner.java:38)
    at com.intellij.rt.execution.junit.TestsRepeater.repeat(TestsRepeater.java:11)
    at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:35)
    at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:232)
    at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:55)
External references such as API reference guides
- https://docs.oracle.com/javase/8/docs/api/java/net/URI.html
- https://www.ietf.org/rfc/rfc2396.txt
- https://cloud.google.com/storage/docs/buckets
Any additional information below
According to RFC 2396, a URI may contain an “Authority Component” (section 3.2). That authority may be “Registry-based” (section 3.2.1) or “Server-based” (section 3.2.2).
Not all Google bucket names are server-based authorities: a name containing an underscore, for example, is not a valid hostname. The Google bucket naming documentation mentions “You can use a bucket name in a DNS record as part of a CNAME or A redirect.” but it does not say “You must”.
Since bucket names are not necessarily hostnames, under RFC 2396 the URI may still be constructed using the “Registry-based” form. URI.create("gs://bucket_with_authority/path") produces a valid java.net.URI; it is just not “Server-based” according to the RFC.
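To illustrate the point about registry-based authorities, here is a minimal sketch using only java.net.URI. The class name RegistryAuthorityDemo and the describe helper are made up for this example and are not part of the library.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class; only java.net.URI calls are real API.
final class RegistryAuthorityDemo {

    // Summarizes the components that matter for this issue.
    static String describe(URI uri) {
        return "authority=" + uri.getAuthority()
            + ", host=" + uri.getHost()
            + ", path=" + uri.getPath();
    }

    public static void main(String[] args) {
        // The underscore makes the authority invalid as a hostname, so
        // java.net.URI keeps it as a registry-based authority instead.
        URI uri = URI.create("gs://bucket_with_authority/path");
        System.out.println(describe(uri));

        try {
            // Forces a server-based interpretation of the authority.
            uri.parseServerAuthority();
            System.out.println("server-based authority");
        } catch (URISyntaxException e) {
            System.out.println("not server-based: " + e.getReason());
        }
    }
}
```

The URI parses without error, getAuthority() returns the bucket name, and getHost() returns null, which is why any code path that relies on the host ends up with nothing to work with.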
The stack trace appears to be caused by CloudStorageFileSystem's internal use of the java.net.URI API. In a number of places CloudStorageFileSystem attempts to construct URIs from hostnames, or to retrieve the hostname from a URI, which only works for “Server-based” URIs and not for “Registry-based” ones.
The fix is to internally use the java.net.URI APIs that support all RFC 2396 section 3.2.1 and section 3.2.2 authorities, not the APIs that only support the section 3.2.2 "Server-based" authorities.
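A rough sketch of what I mean, not the library's actual code: AuthorityAwareUris and its methods are hypothetical names. I am assuming the failing call builds a URI from getHost(), which would be consistent with the exception message above, and that switching to getAuthority() plus the five-argument URI constructor is an acceptable replacement.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper; it only illustrates the suggested direction.
final class AuthorityAwareUris {

    // Host-based construction (assumed shape of the failing code).
    // For a registry-based URI getHost() is null, so the result is "gs:".
    static URI stripPathHostOnly(URI uri) {
        try {
            return new URI(uri.getScheme(), uri.getUserInfo(), uri.getHost(),
                uri.getPort(), null, uri.getQuery(), uri.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e.getMessage(), e);
        }
    }

    // Authority-based construction: accepts both section 3.2.1 and
    // section 3.2.2 authorities.
    static URI stripPath(URI uri) {
        try {
            return new URI(uri.getScheme(), uri.getAuthority(), null,
                uri.getQuery(), uri.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e.getMessage(), e);
        }
    }

    // Reads the bucket from the authority rather than the host.
    static String bucketOf(URI uri) {
        return uri.getAuthority();
    }

    public static void main(String[] args) {
        URI uri = URI.create("gs://bucket_with_authority/path");
        System.out.println(stripPath(uri));
        System.out.println(bucketOf(uri));
        try {
            stripPathHostOnly(uri);
        } catch (IllegalArgumentException e) {
            System.out.println("host-based construction failed: " + e.getMessage());
        }
    }
}
```

For a plain bucket name such as gs://bucket/path the two variants behave the same, so the authority-based form should not change behavior for buckets that already work.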
Thanks!