Comments (12)

tomaswolf avatar tomaswolf commented on June 2, 2024

Thank you for this test case. It appears that there is indeed something wrong with the FileChannels. The following is in my tests much faster (and on par with OpenSSH or Jsch):

// Stream-based upload; try-with-resources also closes the SFTP client when done
try (SftpClient sftpClient = SftpClientFactory.instance().createSftpClient(session);
     OutputStream out = sftpClient.write("largeFile")) {
    Files.copy(new File(largeFile).toPath(), out);
}

or also

try (SftpFileSystem fs = SftpClientFactory.instance().createSftpFileSystem(session)) {
  Path remoteFile = fs.getPath("largeFile");
  Files.copy(new File(largeFile).toPath(), remoteFile, StandardCopyOption.REPLACE_EXISTING);
}

With the channels and transferTo I see uploads (to localhost, so no network latency) about 4 times slower, and downloads about 25% slower. We'll have to investigate what's going on there...

What is the JSCHED library?

from mina-sshd.

tomaswolf avatar tomaswolf commented on June 2, 2024

Interesting: if you change in your code

readableChannel.transferTo(0, length, writeableChannel);

to

writeableChannel.transferFrom(readableChannel, 0, length);

it will also run much faster (but still 25% slower than the two versions with Files.copy() I posted).

Off-topic note: you should probably also check the return value of transferTo/transferFrom and execute them in a loop until everything is transferred.
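
A minimal sketch of such a loop, using only local temporary files so that it runs without an SFTP server. The class name TransferLoop and the helper transferFully are hypothetical names chosen for illustration; they are not part of Apache MINA SSHD. The same pattern should apply when the target is a remote channel.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class TransferLoop {

    /**
     * Copies up to {@code length} bytes from {@code source} into {@code target},
     * writing from target position 0. Returns the number of bytes actually copied.
     */
    public static long transferFully(ReadableByteChannel source, FileChannel target, long length)
            throws IOException {
        long position = 0;
        while (position < length) {
            // transferFrom may move fewer bytes than requested, so check its result
            long n = target.transferFrom(source, position, length - position);
            if (n <= 0) {
                break; // source exhausted or nothing transferable; avoid spinning forever
            }
            position += n;
        }
        return position;
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("transfer-src", ".bin");
        Path dst = Files.createTempFile("transfer-dst", ".bin");
        try {
            Files.write(src, new byte[1_000_000]);
            try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
                 FileChannel out = FileChannel.open(dst, StandardOpenOption.WRITE)) {
                long copied = transferFully(in, out, in.size());
                System.out.println("copied " + copied + " bytes");
            }
        } finally {
            Files.deleteIfExists(src);
            Files.deleteIfExists(dst);
        }
    }
}
```

The early exit on a non-positive return value is a design choice for this sketch: it prevents an endless loop if the source ends before length bytes have been delivered, and the caller can compare the returned count against the expected length.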

tomaswolf avatar tomaswolf commented on June 2, 2024

After some analysis, here's what's going on:

transferTo/transferFrom, as well as the FileChannel.write() operations, are positional operations. readableChannel.transferTo(0, length, writeableChannel) will essentially read 8kB ByteBuffers from the file and then call writeableChannel.write() for each buffer.

However, SftpRemotePathChannel.write() doesn't know that it is being called essentially for a sequential copy operation, and so it doesn't employ a number of optimizations. The result is the slow transfer.

If you change the logic and use writeableChannel.transferFrom(), then the SftpRemotePathChannel drives the operation, and it knows that it is going to sequentially read buffers. Hence it can employ these optimizations.

When you use OutputStream/InputStream as in my Files.copy() examples, then it is known that a sequential data transfer occurs, and the SFTP implementation can employ its optimizations unconditionally.

Finally, transferTo/transferFrom by default copy data in 8kB chunks. With streams, the chunks are about 32kB. This difference causes the 25% slowdown.
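
To observe the chunking described above without an SFTP server, the following sketch drains a local file through transferTo into a sink that only counts write() calls. WriteCallCounter and CountingChannel are hypothetical names made up for this illustration. The sink stands in for any target that is not itself a JDK file or socket channel; the number and size of the write() calls it reports are a JDK implementation detail and may differ between Java versions.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class WriteCallCounter {

    /** A sink that discards data but records how often write() is called and with how many bytes. */
    public static final class CountingChannel implements WritableByteChannel {
        public long calls;
        public long bytes;
        public int largestWrite;
        private boolean open = true;

        @Override
        public int write(ByteBuffer src) {
            int n = src.remaining();
            src.position(src.limit()); // pretend everything was written
            calls++;
            bytes += n;
            largestWrite = Math.max(largestWrite, n);
            return n;
        }

        @Override
        public boolean isOpen() {
            return open;
        }

        @Override
        public void close() {
            open = false;
        }
    }

    /** Pushes the whole file into a CountingChannel via transferTo and returns the sink. */
    public static CountingChannel drain(Path file) throws IOException {
        CountingChannel sink = new CountingChannel();
        try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = in.size();
            long position = 0;
            while (position < size) {
                long n = in.transferTo(position, size - position, sink);
                if (n <= 0) {
                    break;
                }
                position += n;
            }
        }
        return sink;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("count", ".bin");
        try {
            Files.write(file, new byte[1024 * 1024]);
            CountingChannel sink = drain(file);
            System.out.println(sink.calls + " write() calls, largest " + sink.largestWrite + " bytes");
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

If every one of these write() calls turns into a separate remote write that has to be acknowledged, smaller chunks mean proportionally more round trips for the same file.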

Hence:

  • In general, using streams is the simplest for downloading and uploading files, and gives good performance.
  • If you want to use FileChannels:
    • Always let the remote channel drive the operation: call transferFrom on the remote channel for uploading, and transferTo on the remote channel for downloading.
    • Execute transferTo/From in a loop until all data has been transferred.
    • Increase the transfer buffer size via SftpModuleProperties.COPY_BUF_SIZE.set(session, 32 * 1024);

It might be possible to improve our implementation to handle the case you stumbled upon better, but I'm not sure yet.

kvlnkarthik avatar kvlnkarthik commented on June 2, 2024

We see the same issue in our tests as well. We are using version 2.12.1.

We ran a file transfer test using the Files.copy() approach to transfer a file of about 167 MB to a remote server, and it took around 30 seconds.

If we transfer the same file from the same system to the same remote server, but through an sftp session created with the commands below, it takes around 6 minutes to complete. Performance is severely degraded in this scenario.

" sftp -P 2022 @localhost"
put file /tmp/

We run the SSHD server with the sftp subsystem and a custom FileSystemFactory that creates a remote sftp filesystem. The remote sftp filesystem is created using the code below.

URI sftpUri = SftpFileSystemProvider.createFileSystemURI(sshConnectionDetails.getHostname(), sshConnectionDetails.getSshPort(), sshConnectionDetails.getUsername(), sshConnectionDetails.getPassword());

The Apache MINA code runs on localhost:2022 but creates a remote filesystem. So when we execute put file /tmp/, the file is transferred from our local system to the remote server, i.e., client -> Apache MINA server -> remote server.
We acknowledge that there is an additional hop here, i.e., the file has to be transferred to the server and then on to the remote server, but the transfer rate is far too slow.

We see SftpRemotePathChannel.write invocations in the thread dump during this mode of transfer. Based on our tests and your explanation in the previous comments, this mode of transfer seems to be very slow.

Stack trace:

"sshd-SftpSubsystem-47114-thread-1" #35 daemon prio=5 os_prio=0 tid=0x00007f7cf4070800 nid=0x18e88 in Object.wait() [0x00007f7d30ffc000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:460)
at org.apache.sshd.sftp.client.impl.DefaultSftpClient.receive(DefaultSftpClient.java:351)
- locked <0x000000071c95faa0> (a java.util.HashMap)
at org.apache.sshd.sftp.client.impl.DefaultSftpClient.receive(DefaultSftpClient.java:325)
at org.apache.sshd.sftp.client.impl.AbstractSftpClient.response(AbstractSftpClient.java:181)
at org.apache.sshd.sftp.client.impl.AbstractSftpClient.rpc(AbstractSftpClient.java:169)
at org.apache.sshd.sftp.client.impl.AbstractSftpClient.checkCommandStatus(AbstractSftpClient.java:233)
at org.apache.sshd.sftp.client.impl.AbstractSftpClient.write(AbstractSftpClient.java:783)
at org.apache.sshd.sftp.client.fs.SftpFileSystem$Wrapper.write(SftpFileSystem.java:422)
at org.apache.sshd.sftp.client.impl.SftpRemotePathChannel.doWrite(SftpRemotePathChannel.java:266)
- locked <0x000000071cd40f08> (a java.lang.Object)
at org.apache.sshd.sftp.client.impl.SftpRemotePathChannel.write(SftpRemotePathChannel.java:202)
at org.apache.sshd.sftp.server.FileHandle.write(FileHandle.java:161)
at org.apache.sshd.sftp.server.SftpSubsystem.doWrite(SftpSubsystem.java:884)
at org.apache.sshd.sftp.server.AbstractSftpSubsystemHelper.doWrite(AbstractSftpSubsystemHelper.java:605)
at org.apache.sshd.sftp.server.AbstractSftpSubsystemHelper.doProcess(AbstractSftpSubsystemHelper.java:362)
at org.apache.sshd.sftp.server.SftpSubsystem.doProcess(SftpSubsystem.java:355)
at org.apache.sshd.sftp.server.AbstractSftpSubsystemHelper.process(AbstractSftpSubsystemHelper.java:344)
at org.apache.sshd.sftp.server.SftpSubsystem.run(SftpSubsystem.java:331)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)

Is there any way to force put/get transfers in sftp sessions to use the Files.copy() approach for better performance, or to tune the buffers in SftpRemotePathChannel? Could you please let us know?

We also observed that if we use the OpenSSH sftp client against the remote server directly, without going through our Apache MINA SFTP server code, it takes only 3 seconds to transfer the same file.

Holger-Benz avatar Holger-Benz commented on June 2, 2024

kvlnkarthik avatar kvlnkarthik commented on June 2, 2024

@tomaswolf ,
Any thoughts on my above comment especially on
"""
Is there any way to force/override the file upload/download in sftp sessions through put/get commands to use Files.copy() way in order to see better performance or tune buffers in SftpRemotePathChannel. Could you please let me know.
"""
Thanks,
Karthik

Holger-Benz avatar Holger-Benz commented on June 2, 2024

The SSHD-server is also integrated in our communication software.

After updating the server from version 2.4.0 to 2.12.1, communication with the server has become significantly slower (by a factor of 2).

Is there any way to improve the performance?

tomaswolf avatar tomaswolf commented on June 2, 2024

"The SSHD-server is also integrated in our communication software."

The original report was about the client side. Whatever this may be, it would be a new separate issue. But unless you have more information we can't do anything anyway. Best bet to track it down might be to run with debug logging, once against the old version and once against the new version. Maybe that gives some hints. Also monitor resource consumption (memory etc) on the server side in both cases, and look for differences.

tomaswolf avatar tomaswolf commented on June 2, 2024

Any thoughts on my above comment especially on """ Is there any way to force/override the file upload/download in sftp sessions through put/get commands to use Files.copy() way in order to see better performance or tune buffers in SftpRemotePathChannel. Could you please let me know. """

I don't think so. If I understand it right, your problem is in a server acting as a kind of SFTP proxy. That intermediary server does not see put/get commands, it only sees positional write/read requests.

Holger-Benz avatar Holger-Benz commented on June 2, 2024

I'm sorry, you're right. We will open a new issue when we have the relevant debug data.

benz-ppi avatar benz-ppi commented on June 2, 2024

Even with the changes you have suggested, transferring files with the SFTP Apache client software is significantly slower (> factor 3) than transferring files with jsched or winscp.

Do you intend to improve the performance of the SFTP Apache client software?

tomaswolf avatar tomaswolf commented on June 2, 2024

So far I don't have enough information to do anything. I have run my own speed tests, and I see no performance problem. Before I can do anything, I need to be able to reproduce the problem that you observe.

I would need detailed information about your setup: your client-side code, your test setup, what authentication mechanisms and ciphers are used, what's the size of the files, what Java version do you use, what hardware is your client running on, what server are you testing against and on what hardware or virtual machine or container is it running, what is the network latency, what buffer sizes are used, which of the I/O back-ends in Apache MINA SSHD are you using (NIO2, MINA, Netty?), and what is that "jsched" client that you keep mentioning? I have never heard of that.
