File monitor stops with FileSystemException when copying a large directory structure into the watched folder #193

Open
akkie opened this issue Oct 26, 2017 · 10 comments

Comments

@akkie
Contributor

akkie commented Oct 26, 2017

If I copy a large directory structure (for example, the unzipped JMeter package) into a watched folder, the FileMonitor stops with a java.nio.file.FileSystemException:

java.nio.file.FileSystemException: ...: The process cannot access the file because it is being used by another process.

        at sun.nio.fs.WindowsException.translateToIOException(Unknown Source)
        at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
        at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
        at sun.nio.fs.WindowsDirectoryStream.<init>(Unknown Source)
        at sun.nio.fs.WindowsFileSystemProvider.newDirectoryStream(Unknown Source)
        at java.nio.file.Files.newDirectoryStream(Unknown Source)
        at java.nio.file.FileTreeWalker.visit(Unknown Source)
        at java.nio.file.FileTreeWalker.walk(Unknown Source)
        at java.nio.file.FileTreeIterator.<init>(Unknown Source)
        at java.nio.file.Files.walk(Unknown Source)

I've tested this on Windows 7 with the following code:

val watcher = new FileMonitor(directory, recursive = true) {
  override def onEvent(eventType: WatchEvent.Kind[Path], file: File, count: Int) = eventType match {
    case EventType.ENTRY_CREATE => logger.info(s"$file got created")
    case EventType.ENTRY_MODIFY => logger.info(s"$file got modified $count")
    case EventType.ENTRY_DELETE => logger.info(s"$file got deleted")
  }
}
watcher.start()(context.system.dispatcher)
@pathikrit pathikrit added the bug label Oct 26, 2017
@pathikrit
Owner

I don't have access to a Windows system. Can you give me the full stack trace? The one you posted does not show where in better-files the exception came from. Thank you for reporting.

@akkie
Contributor Author

akkie commented Oct 26, 2017

@pathikrit

        at sun.nio.fs.WindowsException.translateToIOException(Unknown Source)
        at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
        at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
        at sun.nio.fs.WindowsDirectoryStream.<init>(Unknown Source)
        at sun.nio.fs.WindowsFileSystemProvider.newDirectoryStream(Unknown Source)
        at java.nio.file.Files.newDirectoryStream(Unknown Source)
        at java.nio.file.FileTreeWalker.visit(Unknown Source)
        at java.nio.file.FileTreeWalker.walk(Unknown Source)
        at java.nio.file.FileTreeIterator.<init>(Unknown Source)
        at java.nio.file.Files.walk(Unknown Source)
        at better.files.File.walk(File.scala:523)
        at better.files.FileMonitor.watch(FileMonitor.scala:47)
        at better.files.FileMonitor.$anonfun$process$1(FileMonitor.scala:36)
        at better.files.FileMonitor.$anonfun$process$1$adapted(FileMonitor.scala:30)
        at scala.collection.Iterator.foreach(Iterator.scala:929)
        at scala.collection.Iterator.foreach$(Iterator.scala:929)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1417)
        at scala.collection.IterableLike.foreach(IterableLike.scala:71)
        at scala.collection.IterableLike.foreach$(IterableLike.scala:70)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
        at better.files.FileMonitor.process(FileMonitor.scala:30)
        at better.files.FileMonitor.$anonfun$start$3(FileMonitor.scala:56)
        at better.files.FileMonitor.$anonfun$start$3$adapted(FileMonitor.scala:56)
        at scala.collection.Iterator.foreach(Iterator.scala:929)
        at scala.collection.Iterator.foreach$(Iterator.scala:929)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1417)
        at better.files.FileMonitor.$anonfun$start$1(FileMonitor.scala:56)
        at scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:140)
        at java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
        at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(Unknown Source)
        at java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
        at java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)

@pathikrit
Owner

Thanks for the stack trace. It looks like Windows does not like a directory being walked while files are still being unzipped into it.

Can you try this:

val yourMonitor = new FileMonitor {
  // Your current code
  override def onCreate(file: File, count: Int) = ...
  override def onModify(file: File, count: Int) = ...
  override def onDelete(file: File, count: Int) = ...
  override def onUnknownEvent(event: WatchEvent[_], count: Int) = ...
  override def onException(exception: Throwable) = ...

  // Modify the library
  override def watch(file: File, depth: Int): Unit = {
    def toWatch: Files = if (file.isDirectory) {
      file.walk(depth).filter(f => f.isDirectory && f.exists)
    } else {
      when(file.exists)(file.parent).iterator  // There is no way to watch a regular file; so watch its parent instead
    }
    toWatch.foreach(f => Try[Unit](f.register(service)).recover(PartialFunction(onException)).get)
  }
}

If that still does not work, try this:

  override def watch(file: File, depth: Int): Unit = {
    val toWatch: Files = if (file.isDirectory) {
      file.walk(depth).filter(f => f.isDirectory && f.exists)
    } else {
      when(file.exists)(file.parent).iterator  // There is no way to watch a regular file; so watch its parent instead
    }
    try {
      toWatch.foreach(f => Try[Unit](f.register(service)).recover(PartialFunction(onException)).get)
    } catch {
      case NonFatal(e) => onException(e)
    }
  }

@akkie
Contributor Author

akkie commented Oct 27, 2017

The last version seems to work. The watcher doesn't stop watching, and it also reports the files for which an exception was thrown:

[error] a.FileImportService - Got exception
java.nio.file.FileSystemException: C:\Users\...\files\import\apache-jmeter-3.3\printable_docs\usermanual: The process cannot access the file because it is being used by another process.

        at sun.nio.fs.WindowsException.translateToIOException(Unknown Source)
        at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
        at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
        at sun.nio.fs.WindowsDirectoryStream.<init>(Unknown Source)
        at sun.nio.fs.WindowsFileSystemProvider.newDirectoryStream(Unknown Source)
        at java.nio.file.Files.newDirectoryStream(Unknown Source)
        at java.nio.file.FileTreeWalker.visit(Unknown Source)
        at java.nio.file.FileTreeWalker.walk(Unknown Source)
        at java.nio.file.FileTreeIterator.<init>(Unknown Source)
        at java.nio.file.Files.walk(Unknown Source)
[info] a.FileImportService - C:\Users\...\files\import\apache-jmeter-3.3\printable_docs\usermanual got created

@akkie
Contributor Author

akkie commented Nov 3, 2017

@pathikrit I think this issue isn't really fixable on Windows. Part of the problem may also be that my notebook's disk is too slow.

When I copy a large directory structure into a watched directory, the copy process locks the files and directories while it creates them. At the same time, the watcher tries to register itself on these newly created directories. Because the directories are locked, registration fails with an exception, and the watcher is then unable to register for any of the inner files and directories. Even if you wait for the lock to be released and register afterwards, the listener is not notified of the creation events, because those files and directories already exist by then.

Anyway, I think the current implementation isn't correct, because it stops watching if an exception occurs. Instead, the watcher should wait for the lock to be released and then try to register again.

With this implementation I get the best results on my system:

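// Needs these imports at the top of the file, plus an implicit ExecutionContext in scope for Future:
//   import org.apache.commons.io.FileUtils
//   import org.apache.commons.io.filefilter.{FalseFileFilter, TrueFileFilter}
//   import scala.concurrent.Future
//   import scala.util.control.NonFatal
override def watch(file: File, depth: Int): Unit = {
  val toWatch: List[File] = if (file.isDirectory) {
    import scala.collection.JavaConverters._
    // FalseFileFilter.FALSE matches no regular files and TrueFileFilter.TRUE matches every
    // subdirectory, so only directories are collected here
    FileUtils.iterateFilesAndDirs(file.toJava, FalseFileFilter.FALSE, TrueFileFilter.TRUE)
      .asScala
      .toList
      .filter(_.exists())
      .map(f => File(f.toPath))
  } else {
    when(file.exists)(file.parent).toList
  }
  def iterate(files: List[File], tries: Int = 0, maxTries: Int = 1000, sleep: Long = 1L): Unit = {
    files match {
      case Nil => ()
      case h :: t =>
        try {
          h.register(service)
        } catch {
          // Still locked: retry this one entry later on another thread
          case NonFatal(_) if tries < maxTries && h.isWriteLocked() =>
            Future {
              Thread.sleep(sleep)
              iterate(List(h), tries + 1, maxTries, sleep)
            }
          case NonFatal(e) => onException(e)
        } finally {
          // Always continue with the remaining entries, with a fresh retry counter
          iterate(t, 0, maxTries, sleep)
        }
    }
  }
  iterate(toWatch)
}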

I use FileUtils.iterateFilesAndDirs because it doesn't throw an exception while iterating over a locked directory. If registering a directory fails because it is locked, iterate waits and tries to register it again. I do this asynchronously so that the main thread isn't blocked. With such a solution it isn't guaranteed that the creation event is triggered for every newly created directory or file, for the reasons I mentioned above. But it will still watch every directory and file and trigger the modification and deletion events for them.

With the current settings (sleep = 1 and maxTries = 1000) it also reports the creation event for every file copied into my watched directory. But this depends on the size and structure of the source directory.
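
For readers who only want the wait-and-retry idea without the Commons IO dependency, here is a minimal sketch. It is not the code from the comment above and not part of better-files: `RetrySketch`, `retry`, `maxTries` and `sleepMillis` are hypothetical names, and unlike the Future-based version it blocks the calling thread between attempts.

```scala
import scala.annotation.tailrec
import scala.util.{Failure, Success, Try}

object RetrySketch {
  /** Runs `action` up to `maxTries` times, sleeping `sleepMillis` after each
    * failed attempt. Returns the first success, or the last failure once the
    * attempts are used up. Try only captures non-fatal exceptions.
    */
  @tailrec
  def retry[A](maxTries: Int, sleepMillis: Long)(action: () => A): Try[A] =
    Try(action()) match {
      case s @ Success(_) => s
      case Failure(_) if maxTries > 1 =>
        Thread.sleep(sleepMillis) // give the copy process time to release its lock
        retry(maxTries - 1, sleepMillis)(action)
      case f => f
    }
}
```

Inside a watch override this could be used roughly as `RetrySketch.retry(1000, 1L)(() => f.register(service))`, passing a final Failure on to onException. As noted above, retrying only ensures the directory ends up being watched; it does not recover creation events that happened before registration succeeded.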

@akkie
Contributor Author

akkie commented Nov 8, 2017

@pathikrit
Owner

I'd like to keep better-files dependency-free, but if @gmethvin would like to port that to Scala and better-files, I would love to have it in :)

@gmethvin
Contributor

gmethvin commented Nov 9, 2017

@pathikrit The FILE_TREE part of it could be ported potentially but I'm not sure if you're comfortable with the JNA dependency for the OS X implementation. Another option is to have a way to plug in alternate implementations for the file monitor in better-files, so I could create a library that provides the custom implementation.

Regarding this bug, the real issue is that there's generally going to be a delay between the time a new directory is created and when we can register a listener on it. We'll miss any events that happen during the time we weren't listening. I don't think there's a foolproof way to eliminate that race condition without hooking into low-level OS functionality.
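
One common mitigation for this window (a sketch only, and not something better-files is confirmed to do) is to register the new directory first and then list what is already inside it, treating those entries as if they had just been created. The names `RegisterThenScan` and `registerAndScan` are hypothetical; the calls themselves are plain java.nio.file.

```scala
import java.nio.file.{Files, Path, StandardWatchEventKinds, WatchService}
import scala.collection.mutable.ListBuffer

object RegisterThenScan {
  /** Registers `dir` with `service`, then returns the entries that already
    * exist in it. Anything created before registration shows up in the
    * returned list; anything created afterwards arrives as a watch event.
    */
  def registerAndScan(dir: Path, service: WatchService): List[Path] = {
    dir.register(
      service,
      StandardWatchEventKinds.ENTRY_CREATE,
      StandardWatchEventKinds.ENTRY_MODIFY,
      StandardWatchEventKinds.ENTRY_DELETE
    )
    val stream = Files.newDirectoryStream(dir)
    try {
      val found = ListBuffer.empty[Path]
      val entries = stream.iterator()
      while (entries.hasNext) found += entries.next()
      found.toList
    } finally stream.close()
  }
}
```

A caller would report each returned path as created and call registerAndScan again on the ones that are directories. An entry created between the register call and the listing can appear both in the list and as an event, so the caller has to de-duplicate. Modifications and deletions that happen inside the window are still lost, and the listing itself can fail on a locked directory, so this narrows the race rather than removing it.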

@pathikrit
Owner

@gmethvin I am fine with a JNA dependency in better-files. Alternatively, yes, there is a pluggable interface for monitoring; see the trait File.Monitor:

trait Monitor extends AutoCloseable {
  val root: File

  /**
    * Dispatch a StandardWatchEventKind to an appropriate callback
    * Override this if you don't want to manually handle onDelete/onCreate/onModify separately
    *
    * @param eventType
    * @param file
    */
  def onEvent(eventType: WatchEvent.Kind[Path], file: File, count: Int): Unit = eventType match {
    case StandardWatchEventKinds.ENTRY_CREATE => onCreate(file, count)
    case StandardWatchEventKinds.ENTRY_MODIFY => onModify(file, count)
    case StandardWatchEventKinds.ENTRY_DELETE => onDelete(file, count)
  }

  def start()(implicit executionContext: ExecutionContext): Unit
  def onCreate(file: File, count: Int): Unit
  def onModify(file: File, count: Int): Unit
  def onDelete(file: File, count: Int): Unit
  def onUnknownEvent(event: WatchEvent[_], count: Int): Unit
  def onException(exception: Throwable): Unit
  def stop(): Unit = close()
}

@gmethvin
Contributor

gmethvin commented Dec 1, 2017

@pathikrit Yes, I'm already using it here: https://github.com/gmethvin/directory-watcher/#better-files-integration-scala. I think the File.Monitor interface is a good integration point for other libraries.
