How to pass a Tensor or data directory path to ChunkDataReader in javacpp-pytorch #1556

Open
mullerhai opened this issue Dec 26, 2024 · 7 comments

@mullerhai

Hi,
We have written some code to extend the storch framework (https://github.com/sbrunk/storch), which brings PyTorch to Scala, and we plan to rewrite the Dataset, DataLoader, and DataReader parts to be simpler. javacpp-pytorch currently has only ChunkDataReader, but I do not know how to pass a data chunk path or a tensor Example to it. Could you give me an example showing how to use it? @h @saudet. By the way, @sbrunk, if you know, please tell me. Thanks.

@saudet
Member

saudet commented Dec 26, 2024

There's some sample code here: https://github.com/bytedeco/javacpp-presets/blob/master/pytorch/samples/TestChunkData.java

@mullerhai
Author

Thank you very much. ChunkDataLoader and ChunkDataset can use ChunkDataReader, but JavaDataset also takes an org.bytedeco.javacpp.Pointer object. Should some DataReader object be passed to it as well? javacpp-pytorch currently has only ChunkDataReader, so what should we pass to JavaDataLoader and JavaDataset? I also thought about passing an InputStream, but it is not a Pointer subclass.


@saudet
Member

saudet commented Dec 27, 2024

I don't know what JavaDataset is for. @HGuillemet ?

@HGuillemet
Collaborator

JavaDataset is the abstract class to subclass for implementing stateless datasets in Java.

@HGuillemet HGuillemet removed their assignment Dec 27, 2024
@mullerhai
Author

How do I use JavaDataset? Please show me a use case. Thanks.

@HGuillemet
Collaborator

Just subclass it:

JavaDataset ds = new JavaDataset() {
    @Override public Example get(long idx) {
        // ...
    }

    @Override public SizeTOptional size() {
        // ...
    }
};

Then use it for instance with a random sampler and a random loader:

DataLoaderOptions opts = new DataLoaderOptions(2);
opts.workers().put(5);
JavaRandomDataLoader loader = new JavaRandomDataLoader(ds, new RandomSampler(ds.size().get()), opts);      

@mullerhai
Author

Thanks, but I do not see how to pass a data directory path or a tensor parameter to the JavaDataset. Do I need to implement JavaDataset myself and define how the data is passed in?
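
A minimal sketch of one possible answer, under stated assumptions: since the dataset is an ordinary Java subclass, a data directory (or tensors) can be handed over through the subclass's own constructor and kept in fields that get and size then read. To stay runnable with the JDK alone, the sketch uses a hypothetical stand-in base class, IndexedDataset, that only mirrors the get/size shape shown earlier in the thread; it is not the real org.bytedeco.pytorch.JavaDataset, and DirectoryDataset and its file handling are likewise illustrative names, not part of javacpp-pytorch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical stand-in for the get/size contract from the thread.
// NOT the real JavaDataset, which returns Example and SizeTOptional.
abstract class IndexedDataset<T> {
    abstract T get(long idx);
    abstract long size();
}

// The data directory is passed through the constructor and kept as state.
class DirectoryDataset extends IndexedDataset<byte[]> {
    private final List<Path> files;

    DirectoryDataset(Path dataDir) throws IOException {
        List<Path> found = new ArrayList<>();
        // Collect the regular files directly inside the directory.
        try (Stream<Path> entries = Files.list(dataDir)) {
            entries.filter(Files::isRegularFile).forEach(found::add);
        }
        // Sort so that a given index always maps to the same file.
        Collections.sort(found);
        this.files = found;
    }

    // Returns the raw bytes of the idx-th file; a real dataset would
    // decode them into tensors here.
    @Override
    byte[] get(long idx) {
        try {
            return Files.readAllBytes(files.get((int) idx));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    long size() {
        return files.size();
    }
}
```

With the real binding the same shape would presumably carry over: the JavaDataset subclass keeps the path or tensors in its fields, get builds an Example from them, and size wraps the element count in a SizeTOptional.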
