Using icgc-get behind a corporate proxy server


#1

We are having difficulties using icgc-get behind our proxy server. I am attempting to download data from the Collaboratory repo, and so far I have tried two ways: with the Docker image, and with a manually installed score-client. The errors I am getting indicate that score-client does not obey proxy settings and attempts to connect to external servers directly. I have the proxy host and port configured for the Docker daemon and the Docker client, as well as for the JVM.

I am unable to post the details of the problem (including error messages, software versions and packet capture results) because I receive the error “New users can only put 2 links in a post”, even though my full post does not include any links. May I ask the moderators to lift the restriction? I will post the details as soon as I am able. Thank you very much.


#2

Hello,

We are also seeing this issue. A user of our HPC service is trying to perform a Collaboratory download using icgc-get / score-client behind a tight corporate firewall with a web proxy. If icgc-get / score-client do not support downloads through a proxy, we would need the full set of IP addresses / subnet ranges in order to obtain specific firewall exceptions from our networking department. Is this information available?

Many thanks.


#3

[Please forgive lack of dots and slashes; board software thinks these are links and doesn’t allow me to post them.]

Here are the IP addresses and ports accessed by icgc-get running in Docker mode when downloading data files from the Collaboratory repo:
142 1 177 85 port 443 (storage cancercollaboratory org)
142 1 177 10 port 9080 (object cancercollaboratory org)
142 1 177 11 port 9080 (object cancercollaboratory org)
142 1 177 199 port 443 (song cancercollaboratory org)
We obtained these addresses by capturing icgc-get packets with tcpdump (tcpdump -n -i docker0) and by creating firewall exceptions one by one. We did not attempt to download data from other repositories (EGA, GDC, PDC), which I assume use different sets of IPs.
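
For anyone who wants to confirm that their own firewall exceptions are in place, here is a minimal Python sketch that attempts a direct TCP connection to each address and port from the list above. The dotted IP addresses are reconstructed from that list and are only a snapshot from our capture, not an official list; the names can_connect and report are made up for this illustration.

```python
import socket

# Addresses and ports reconstructed from the list above (dots restored).
# Assumption: this is a snapshot from one packet capture and may change.
ENDPOINTS = [
    ("142.1.177.85", 443),
    ("142.1.177.10", 9080),
    ("142.1.177.11", 9080),
    ("142.1.177.199", 443),
]


def can_connect(host, port, timeout=3.0):
    """Return True if a direct TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def report(endpoints, timeout=3.0):
    """Return one status line per (host, port) pair."""
    lines = []
    for host, port in endpoints:
        status = "open" if can_connect(host, port, timeout) else "blocked or unreachable"
        lines.append(f"{host}:{port} {status}")
    return lines
```

Running print("\n".join(report(ENDPOINTS))) on the affected host lists which endpoints are reachable. It only tests a plain TCP connection, so it says nothing about proxy behaviour or TLS.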

Another issue was that, before trying to access the above IPs, icgc-get insisted on resolving the hostname “storage cancercollaboratory org” using Google's public DNS servers (8 8 8 8 and 8 8 4 4), which were also blocked by our firewall. This turned out to be a “feature” of Docker, which can be worked around by configuring the Docker daemon (,etc,docker,daemon.json). [Can’t post the exact syntax]
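
As a rough illustration of that kind of workaround (not necessarily the exact configuration we used), the Docker daemon configuration file accepts a "dns" key listing the resolvers that containers should use. The Python sketch below adds that key to an existing configuration and prints the resulting JSON. The resolver addresses are placeholders and the helper name with_dns is made up for this example; substitute your site's internal DNS servers.

```python
import json


def with_dns(config, dns_servers):
    """Return a copy of a daemon configuration dict with the "dns" key set."""
    updated = dict(config)
    updated["dns"] = list(dns_servers)
    return updated


# Placeholder resolver addresses (hypothetical), standing in for internal DNS servers.
print(json.dumps(with_dns({}, ["10.0.0.2", "10.0.0.3"]), indent=2))
```

The printed JSON is what would go into the daemon configuration file; the Docker daemon has to be restarted before the change takes effect.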