New Source: Kaggle Datasets #35251
Replies: 4 comments
-
@paultimothymooney this is an amazing idea!!!
-
Great! Here is some documentation that describes how to use the official Kaggle API: https://github.com/Kaggle/kaggle-api#datasets and/or https://www.kaggle.com/docs/api#interacting-with-datasets
-
Is it correct to say that when we download a dataset, there is no guarantee on the format? It can be any of a variety of file formats (CSV, XLS, ...)? What else have you seen?
-
Yes, that is correct. A single dataset might contain multiple .CSV files, .JSON files, .zip files, text files, image files, or almost any other type of file. You can also download individual files independently (e.g.
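Since the format is not guaranteed, a connector would probably need to inspect what it actually downloaded before deciding how to parse it. Here is a minimal sketch using only the Python standard library; the function name `count_file_types` and the idea of grouping by extension are my own assumptions for illustration, not part of the Kaggle API:

```python
from collections import Counter
from pathlib import Path


def count_file_types(dataset_dir):
    """Count the files under dataset_dir by lowercase file extension.

    Hypothetical helper: it only looks at file names, not file contents.
    """
    counts = Counter()
    for path in Path(dataset_dir).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under "(none)".
            counts[path.suffix.lower() or "(none)"] += 1
    return counts
```

Calling `count_file_types("path/to/unzipped/dataset")` returns a `Counter` keyed by extension (for example `.csv` or `.json`), which could be used to pick a parser per file or to skip unsupported types.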
-
Tell us about the new integration you’d like to have
An integration with Kaggle's public datasets platform: https://www.kaggle.com/datasets. It would be useful to use these datasets as either data sources or data destinations (public or private).
Describe the context around this new integration
Kaggle's public data repository has more than 70,000 public datasets as of February 2021. Some of these datasets have more than 1 million views and more than 100,000 downloads (e.g. https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia/activity). It would be useful to have a connector that connects Airbyte to Kaggle because this would make it easy to both: (A) obtain data that interests you; and (B) share data that you think will interest others.
Describe the alternative you are considering or using
Options for sending data to Kaggle: (1) upload locally; (2) upload using the Kaggle API; (3) upload from a URL; or (4) upload from a Google Cloud Storage account.
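For option (2), my understanding of the Kaggle API documentation is that an upload folder needs a `dataset-metadata.json` file next to the data files. A rough sketch of writing that file is below; the helper name `write_dataset_metadata`, its parameters, and the default license value are assumptions for illustration, so please check the metadata fields against the official docs before relying on them:

```python
import json
from pathlib import Path


def write_dataset_metadata(folder, owner, slug, title, license_name="CC0-1.0"):
    """Write a dataset-metadata.json file into folder and return its path.

    Hypothetical helper. The field names (title, id, licenses) follow the
    Kaggle API documentation as I understand it; verify before use.
    """
    metadata = {
        "title": title,
        "id": f"{owner}/{slug}",  # assumed "<username>/<dataset-slug>" form
        "licenses": [{"name": license_name}],
    }
    path = Path(folder) / "dataset-metadata.json"
    path.write_text(json.dumps(metadata, indent=2))
    return path
```

With the metadata file in place, the upload itself would then be something like `kaggle datasets create -p <folder>` using the official command-line client.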
Options for obtaining data from Kaggle: (1) download the file locally; (2) download the file using the Kaggle API; or (3) use the Google Sheets integration if it is a CSV file.
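To make option (2) for obtaining data a bit more concrete, here is a small sketch that shells out to the official `kaggle` command-line client. It assumes the client is installed and already has credentials configured; the helper names `build_download_command` and `download_dataset` are hypothetical, and the flags (`-d`, `-p`, `-f`, `--unzip`) are the ones I believe the CLI accepts, so double-check them with `kaggle datasets download -h`:

```python
import subprocess


def build_download_command(dataset, dest=".", file_name=None, unzip=False):
    """Build the argument list for a `kaggle datasets download` call.

    dataset is an "<owner>/<dataset-name>" string. Hypothetical helper;
    the flags are assumed from the Kaggle CLI help text.
    """
    cmd = ["kaggle", "datasets", "download", "-d", dataset, "-p", dest]
    if file_name:
        # Download a single file instead of the whole dataset.
        cmd += ["-f", file_name]
    if unzip:
        cmd.append("--unzip")
    return cmd


def download_dataset(dataset, dest=".", file_name=None, unzip=False):
    """Run the download; raises CalledProcessError if the CLI fails."""
    subprocess.run(build_download_command(dataset, dest, file_name, unzip), check=True)
```

For example, `download_dataset("paultimothymooney/chest-xray-pneumonia", dest="data", unzip=True)` would fetch the dataset linked above into a local `data` folder. Keeping the command construction separate from the subprocess call makes it easy to check the arguments without touching the network.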