Warning
This is work in progress!
Warning
OutOfMemoryError when ingesting large files

Because of https://issues.jboss.org/browse/MODE-2103, ingesting files larger than the Java heap space with certain Infinispan configurations (e.g. LevelDB) creates an OutOfMemoryError. The following test case can be used to reproduce the issue: https://github.com/futures/large-files-test

Workaround: Use Infinispan file storage with a large heap size (e.g. -Xmx2048m).

Currently the only known workaround is to use a file-based configuration for the Infinispan caches, e.g.: https://github.com/futures/fcrepo4/blob/34aab66bc26edfca3a4cbabecc4870bfd81f05da/fcrepo-http-commons/src/main/resources/config/single-file/repository.json.

This can be done by setting the following property:

-Xmx2048m -Dfcrepo.modeshape.configuration=config/single-file/repository.json

 

Ingesting Large Files via the REST API

Based on the tests below, we believe arbitrarily-large files can be ingested and downloaded via the REST API (tested up to 1TB).  The only apparent limitations are disk space available to store the files, and a sufficiently large Java heap size (tested with -Xmx2048m).
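Since available disk space is one of the two limits named above, a pre-flight check before a large ingest can be sketched as follows. This is a minimal sketch, not part of the original tests: the function name has_free_space and the /tmp directory are hypothetical placeholders, and the check assumes the storage path contains no spaces in its device name.

```shell
# has_free_space DIR REQUIRED_KIB (hypothetical helper, not part of Fedora 4)
# Succeeds if the filesystem holding DIR has at least REQUIRED_KIB KiB available.
has_free_space() {
  # df -P prints the portable POSIX layout; with -k, column 4 of the
  # second line is the available space in 1024-byte blocks.
  available_kib=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
  [ "$available_kib" -ge "$2" ]
}

# Example: is there room for a 256 GiB file under /tmp (placeholder directory)?
required_kib=$((256 * 1024 * 1024))
if has_free_space /tmp "$required_kib"; then
  echo "enough space"
else
  echo "not enough space"
fi
```

The directory passed in should be the one where the repository actually stores its binaries, which depends on the configuration in use.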

REST API Upload/Download Roundtrip

  • Platform: Linux 3.12.1-1-ARCH #1 SMP PREEMPT x86_64 GNU/Linux 16GB RAM
  • Repository Profile: Single-File
  • Workflow Profile: Upload/Download Roundtrip

Because of https://issues.jboss.org/browse/MODE-2103, only the single-file configuration can be used for large file ingests.

This requires setting the Java property fcrepo.modeshape.configuration to classpath:/config/single-file/repository.json and allowing the heap to grow up to 2 GB.

Example

Running Fedora 4 for large file ingests in Tomcat 7:

CATALINA_OPTS="-Xmx2048m -XX:MaxPermSize=256m -Dfcrepo.modeshape.configuration=classpath:/config/single-file/repository.json" bin/catalina.sh run
Tip

Using the single-file configuration, ingest and retrieval of files up to 300 GB via Fedora 4's REST API were tested successfully. The files were ingested sequentially and then retrieved, and a bitwise comparison with the original data was performed. Larger sizes have not been tested due to HDD size limitations.
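The bitwise comparison mentioned above can be sketched with cmp, which reads both files as streams and therefore does not need heap or RAM proportional to the file size. The tool used in the original tests is not stated, so this is only one possible approach; the function name same_content is a hypothetical helper, and the small temporary files stand in for the original and the retrieved copy.

```shell
# same_content ORIGINAL RETRIEVED (hypothetical helper)
# Succeeds only if both files are byte-for-byte identical.
same_content() {
  # -s suppresses output; the exit status alone reports the result.
  cmp -s "$1" "$2"
}

# Demonstration with small temporary files standing in for the
# ingested original and the copy retrieved from the repository.
original=$(mktemp)
retrieved=$(mktemp)
printf 'example payload\n' > "$original"
cp "$original" "$retrieved"

if same_content "$original" "$retrieved"; then
  echo "roundtrip OK"
else
  echo "roundtrip MISMATCH"
fi

rm -f "$original" "$retrieved"
```

For files on different machines, comparing checksums of each copy would be the usual substitute for a direct cmp.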

 

File Size | Upload | Download
256GB | 15,488,156ms (16.9MB/sec) | 3,306,756ms (79.3MB/sec)
512GB | 31,262,610ms (16.77MB/sec) | 5,386,542ms (97.33MB/sec)
1TB | 59,631,142ms (17.58MB/sec) | 15,120,135ms (69.35MB/sec)
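The throughput figures follow from file size divided by duration. The sketch below recomputes two of them, assuming binary units (1 GB = 1024 MB, 1 TB = 1024 GB), which is the assumption under which the reported values are reproduced. The function name throughput_mb_per_sec is a hypothetical helper.

```shell
# throughput_mb_per_sec SIZE_MB DURATION_MS (hypothetical helper)
# Prints the transfer rate in MB/sec, rounded to two decimals.
throughput_mb_per_sec() {
  awk -v size_mb="$1" -v duration_ms="$2" \
    'BEGIN { printf "%.2f\n", size_mb / (duration_ms / 1000) }'
}

# 512GB upload: 512 * 1024 MB in 31,262,610 ms
throughput_mb_per_sec $((512 * 1024)) 31262610    # prints 16.77

# 1TB download: 1024 * 1024 MB in 15,120,135 ms
throughput_mb_per_sec $((1024 * 1024)) 15120135   # prints 69.35
```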

Serving Large Files via Filesystem Federation

Based on the tests below, we believe arbitrarily-large files can be projected into the repository via filesystem federation and downloaded via the REST API (tested up to 1TB).  The only apparent limitations are disk space available to store the files, and a sufficiently large Java heap size (tested with -Xmx2048m).

Filesystem Federation Download Tests

  • Platform: Linux 3.12.1-1-ARCH #1 SMP PREEMPT x86_64 GNU/Linux 16GB RAM
  • Repository Profile: Single-File
  • Workflow Profile: Upload/Download Roundtrip
File Size | Upload | Download
256GB | 15,488,156ms (16.9MB/sec) | 3,306,756ms (79.3MB/sec)
512GB | |

Federated Content Large File Download Roundtrip Tests

  • Platform: Linux 3.12.1-1-ARCH #1 SMP PREEMPT x86_64 GNU/Linux  16GB RAM
  • Repository Profile: Single-File with an additional external Resource:

    "externalSources" : {
        "home-directory" : {
            "classname" : "org.modeshape.connector.filesystem.FileSystemConnector",
            "directoryPath" : "/tmp/projection",
            "projections" : [ "default:/projection => /" ],
            "readOnly" : true,
            "addMimeTypeMixin" : true
        }
    }

 
File Size | Projection Directory Request Duration | First Projected Node Request Duration | Download Duration | Throughput
2 GB | 0m35.117s | 0m34.572s | 0m8.236s | 248.66 MB/sec
10 GB | | | |
100 GB | | | |
300 GB | | | |
10*10 GB | | | |

Filesystem Federation Download Tests

...