feat(weave): Upload dataset #3876
base: master
Walkthrough
This pull request introduces changes to enhance the flexibility of drawer components on the home page. A new optional
Changes
Sequence Diagram(s)
sequenceDiagram
participant U as User
participant DP as DatasetsPage
participant DU as DatasetUploadDrawer
participant BE as Backend
U->>DP: Click "Upload" button
DP->>DU: Open upload drawer (set showDatasetUploadDrawer true)
U->>DU: Upload file & configure dataset
DU->>BE: Submit dataset information
BE-->>DU: Return success/failure response
DU->>DP: Trigger onClose to hide drawer
Actionable comments posted: 0
🧹 Nitpick comments (7)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/DatasetsPage/DatasetUploadDrawer.tsx (7)
28-45: Consider consolidating related state variables
The component has many state variables that could be grouped into logical objects (e.g., fileState, columnState, uiState) to simplify state management and reduce the number of useState calls.
- const [step, setStep] = useState<'upload' | 'configure' | 'preview'>('upload');
- const [file, setFile] = useState<File | null>(null);
- const [datasetName, setDatasetName] = useState<string>('');
- const [fileContent, setFileContent] = useState<any[] | null>(null);
- const [columns, setColumns] = useState<Array<{original: string; renamed: string; selected: boolean}>>([]);
- const [previewData, setPreviewData] = useState<any[] | null>(null);
- const [loading, setLoading] = useState<boolean>(false);
- const [publishing, setPublishing] = useState<boolean>(false);
- const [error, setError] = useState<string | null>(null);
- const [selectionModel, setSelectionModel] = useState<GridRowSelectionModel>([]);
- const [isDragging, setIsDragging] = useState<boolean>(false);
+ const [uiState, setUiState] = useState({
+   step: 'upload' as 'upload' | 'configure' | 'preview',
+   loading: false,
+   publishing: false,
+   error: null as string | null,
+   isDragging: false
+ });
+
+ const [fileState, setFileState] = useState({
+   file: null as File | null,
+   datasetName: '',
+   fileContent: null as any[] | null,
+   previewData: null as any[] | null,
+ });
+
+ const [columnState, setColumnState] = useState({
+   columns: [] as Array<{original: string; renamed: string; selected: boolean}>,
+   selectionModel: [] as GridRowSelectionModel
+ });
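One trade-off of grouped state is that every update has to copy the untouched fields. A reducer is one way to keep those updates in a single place. The sketch below is a variation on the suggestion above, not code from the PR: it uses a plain reducer function (usable with React's `useReducer`), and the names `UiState`, `UiAction`, `initialUiState`, and `uiReducer` are hypothetical.

```typescript
// Hypothetical names for illustration; none of these come from the PR.
type Step = 'upload' | 'configure' | 'preview';

type UiState = {
  step: Step;
  loading: boolean;
  publishing: boolean;
  error: string | null;
  isDragging: boolean;
};

type UiAction =
  | {type: 'goTo'; step: Step}
  | {type: 'startLoading'}
  | {type: 'fail'; message: string}
  | {type: 'reset'};

const initialUiState: UiState = {
  step: 'upload',
  loading: false,
  publishing: false,
  error: null,
  isDragging: false,
};

// Pure function: returns a new state object and never mutates the old one.
function uiReducer(state: UiState, action: UiAction): UiState {
  switch (action.type) {
    case 'goTo':
      return {...state, step: action.step};
    case 'startLoading':
      // Starting a new operation clears any stale error.
      return {...state, loading: true, error: null};
    case 'fail':
      return {...state, loading: false, publishing: false, error: action.message};
    case 'reset':
      return initialUiState;
  }
}
```

In a component this would be wired up as `const [uiState, dispatch] = useReducer(uiReducer, initialUiState)`, assuming the grouped shape suggested in the review.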
88-131: Consider using Web Workers for file processing
Processing large files on the main thread could freeze the UI. Consider using Web Workers to handle file processing in a background thread, especially for larger files.
+ const fileProcessingWorker = useMemo(() => {
+   if (typeof Worker !== 'undefined') {
+     const workerCode = `
+       self.onmessage = function(e) {
+         const { file, fileType } = e.data;
+         const reader = new FileReader();
+
+         reader.onload = function(event) {
+           try {
+             const text = event.target.result;
+             let data = [];
+
+             if (fileType === 'csv') {
+               // CSV parsing logic
+             } else if (fileType === 'json') {
+               // JSON parsing logic
+             }
+
+             self.postMessage({ success: true, data });
+           } catch (err) {
+             self.postMessage({ success: false, error: err.message });
+           }
+         };
+
+         reader.onerror = function() {
+           self.postMessage({ success: false, error: 'Error reading file' });
+         };
+
+         reader.readAsText(file);
+       };
+     `;
+
+     const blob = new Blob([workerCode], { type: 'application/javascript' });
+     return new Worker(URL.createObjectURL(blob));
+   }
+   return null;
+ }, []);
+
+ useEffect(() => {
+   if (fileProcessingWorker) {
+     fileProcessingWorker.onmessage = (e) => {
+       const { success, data, error } = e.data;
+       setLoading(false);
+
+       if (success) {
+         setFileContent(data);
+         // Process columns and preview data
+       } else {
+         setError(error);
+       }
+     };
+   }
+
+   return () => {
+     if (fileProcessingWorker) {
+       fileProcessingWorker.terminate();
+     }
+   };
+ }, [fileProcessingWorker]);
203-248: Add more robust error handling for dataset submission
Basic error handling is in place, but consider adding more specific error cases and recovery strategies. Also consider validating column names to ensure they meet any backend requirements.
  const handleSubmit = async () => {
    if (!datasetName.trim()) {
      setError('Please enter a dataset name');
      return;
    }
    if (!file || !fileContent) {
      setError('Please upload a file');
      return;
    }
+
+   // Validate that at least one column is selected
+   if (!columns.some(col => col.selected)) {
+     setError('Please select at least one column to include in the dataset');
+     return;
+   }
+
+   // Validate the selected column names (no duplicates, valid characters)
+   const selectedColumns = columns.filter(col => col.selected);
+   const columnNames = selectedColumns.map(col => col.renamed);
+
+   // Check for duplicate column names
+   const uniqueNames = new Set(columnNames);
+   if (uniqueNames.size !== columnNames.length) {
+     setError('Each column must have a unique name. Please rename duplicate columns.');
+     return;
+   }
+
+   // Check for invalid characters in column names.
+   // Compare against undefined so that an empty name ('' is falsy) is still reported.
+   const invalidColumnName = columnNames.find(name => !/^[A-Za-z0-9_]+$/.test(name));
+   if (invalidColumnName !== undefined) {
+     setError(`Column name "${invalidColumnName}" contains invalid characters. Use only letters, numbers, and underscores.`);
+     return;
+   }
    setPublishing(true);
    setError(null);
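The column checks suggested above can also be pulled out of the handler into a pure function, which makes them easy to unit test. This is a minimal sketch under assumptions: `ColumnConfig` and `validateColumns` are hypothetical names, and the allowed-character rule (letters, digits, underscores) is the one proposed in the review, not a confirmed backend requirement.

```typescript
// Hypothetical helper; the type mirrors the column shape used in the review diff.
type ColumnConfig = {original: string; renamed: string; selected: boolean};

// Returns an error message, or null when the selected columns pass all checks.
function validateColumns(columns: ColumnConfig[]): string | null {
  const names = columns.filter(col => col.selected).map(col => col.renamed);

  if (names.length === 0) {
    return 'Please select at least one column to include in the dataset';
  }

  if (new Set(names).size !== names.length) {
    return 'Each column must have a unique name. Please rename duplicate columns.';
  }

  // Assumed rule: only letters, digits, and underscores; empty names are rejected too.
  const invalid = names.find(name => !/^[A-Za-z0-9_]+$/.test(name));
  if (invalid !== undefined) {
    return `Column name "${invalid}" contains invalid characters. Use only letters, numbers, and underscores.`;
  }

  return null;
}
```

Inside `handleSubmit`, the result could be passed straight to `setError` before returning early.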
282-285: Improve error handling for row update failures
The current implementation only logs errors to the console. Consider showing these errors to the user for better feedback.
  const handleProcessRowUpdateError = (newError: any) => {
    console.error('Error during row update:', newError);
+   setError(`Failed to update column: ${newError instanceof Error ? newError.message : 'Unknown error'}`);
  };
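If the same error-to-message conversion is needed in more than one handler, it could live in a small helper. The sketch below is illustrative only; `toErrorMessage` is a hypothetical name, and accepting `unknown` instead of `any` is a design choice that is not taken from the PR.

```typescript
// Hypothetical helper: turns an arbitrary thrown value into a user-facing message.
function toErrorMessage(err: unknown, fallback = 'Unknown error'): string {
  if (err instanceof Error) {
    return err.message;
  }
  // Some code throws plain strings; surface those as-is when non-empty.
  if (typeof err === 'string' && err.length > 0) {
    return err;
  }
  return fallback;
}
```

The handler above could then call `setError(`Failed to update column: ${toErrorMessage(newError)}`)`.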
299-358: Add accessibility improvements to the file upload UI
Enhance the accessibility of the file upload area by adding appropriate ARIA attributes and keyboard navigation support.
  <label
    htmlFor="file-upload"
-   className="flex cursor-pointer flex-col items-center justify-center gap-3 outline-none">
+   className="flex cursor-pointer flex-col items-center justify-center gap-3 outline-none"
+   role="button"
+   aria-label="Upload file"
+   tabIndex={0}
+   onKeyDown={(e) => {
+     if (e.key === 'Enter' || e.key === ' ') {
+       e.preventDefault();
+       document.getElementById('file-upload')?.click();
+     }
+   }}>
481-518: Add file size limit validation
Add a file size check to prevent processing extremely large files that could cause performance issues or exceed backend limits.
  const readFile = (file: File): Promise<any[]> => {
    return new Promise((resolve, reject) => {
+     // Add file size validation (e.g., 10MB limit)
+     const MAX_FILE_SIZE = 10 * 1024 * 1024; // 10MB
+     if (file.size > MAX_FILE_SIZE) {
+       reject(new Error(`File size exceeds the limit of ${MAX_FILE_SIZE / (1024 * 1024)}MB`));
+       return;
+     }
+
      const reader = new FileReader();
      reader.onload = e => {
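The size check can be expressed as a small pure function so the limit is testable without constructing a `File`. This is a sketch only: `checkFileSize` and `MAX_FILE_SIZE_BYTES` are hypothetical names, and the 10 MB limit is the example value from the review, not a known backend limit.

```typescript
// Assumed limit, taken from the review's example; adjust to the real backend limit.
const MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024;

// Returns an error message when the size exceeds the limit, otherwise null.
function checkFileSize(
  sizeInBytes: number,
  maxBytes: number = MAX_FILE_SIZE_BYTES
): string | null {
  if (sizeInBytes > maxBytes) {
    return `File size exceeds the limit of ${maxBytes / (1024 * 1024)}MB`;
  }
  return null;
}
```

In `readFile`, the caller would pass `file.size` and reject the promise when a message comes back.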
520-545: Improve CSV parsing robustness
The current CSV parsing is basic and might not handle edge cases such as quoted fields containing delimiters or newlines. Consider using a more robust CSV parsing library.
-const parseCSV = (text: string, delimiter: string): any[] => {
-  const lines = text.split('\n');
-  if (lines.length === 0) {
-    return [];
-  }
-
-  const headers = lines[0]
-    .split(delimiter)
-    .map(h => h.trim().replace(/^"|"$/g, ''));
-
-  return lines
-    .slice(1)
-    .filter(line => line.trim()) // Skip empty lines
-    .map(line => {
-      const values = line
-        .split(delimiter)
-        .map(v => v.trim().replace(/^"|"$/g, ''));
-      const row: Record<string, string> = {};
-
-      headers.forEach((header, index) => {
-        row[header] = values[index] || '';
-      });
-
-      return row;
-    });
+const parseCSV = (text: string, delimiter: string): any[] => {
+  // More robust CSV parsing implementation
+  const lines = text.split('\n');
+  if (lines.length === 0) {
+    return [];
+  }
+
+  // Function to parse a single CSV line considering quotes
+  const parseLine = (line: string): string[] => {
+    const values: string[] = [];
+    let currentValue = '';
+    let insideQuote = false;
+
+    for (let i = 0; i < line.length; i++) {
+      const char = line[i];
+
+      if (char === '"') {
+        // Toggle quote state
+        insideQuote = !insideQuote;
+      } else if (char === delimiter && !insideQuote) {
+        // End of value
+        values.push(currentValue.trim());
+        currentValue = '';
+      } else {
+        // Add character to current value
+        currentValue += char;
+      }
+    }
+
+    // Add the last value
+    values.push(currentValue.trim());
+
+    return values.map(v => v.replace(/^"|"$/g, ''));
+  };
+
+  const headers = parseLine(lines[0]);
+
+  return lines
+    .slice(1)
+    .filter(line => line.trim()) // Skip empty lines
+    .map(line => {
+      const values = parseLine(line);
+      const row: Record<string, string> = {};
+
+      headers.forEach((header, index) => {
+        row[header] = values[index] || '';
+      });
+
+      return row;
+    });
 };
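The suggested parser still splits the text on newlines first, so a quoted field containing a line break would be cut in two, and doubled quotes (`""`) inside a quoted field are not treated as an escaped quote. The following sketch scans the whole text character by character to cover both cases. It is an illustration rather than the PR's code: `parseCsvRows` and `parseCsvSketch` are hypothetical names, and a maintained CSV library is likely the better choice in production.

```typescript
// Hypothetical sketch: splits CSV text into rows of raw field values.
// Handles quoted fields, embedded delimiters and newlines, and "" escapes.
function parseCsvRows(text: string, delimiter = ','): string[][] {
  const rows: string[][] = [];
  let row: string[] = [];
  let field = '';
  let insideQuotes = false;

  for (let i = 0; i < text.length; i++) {
    const char = text.charAt(i);
    if (insideQuotes) {
      if (char === '"' && text.charAt(i + 1) === '"') {
        field += '"'; // escaped quote inside a quoted field
        i++;
      } else if (char === '"') {
        insideQuotes = false; // closing quote
      } else {
        field += char; // delimiters and newlines are literal here
      }
    } else if (char === '"') {
      insideQuotes = true;
    } else if (char === delimiter) {
      row.push(field);
      field = '';
    } else if (char === '\n' || char === '\r') {
      if (char === '\r' && text.charAt(i + 1) === '\n') {
        i++; // treat CRLF as a single line break
      }
      row.push(field);
      rows.push(row);
      row = [];
      field = '';
    } else {
      field += char;
    }
  }
  // Flush the last record when the text does not end with a line break.
  if (field !== '' || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}

// Maps rows to records keyed by the header row; blank lines are skipped.
function parseCsvSketch(text: string, delimiter = ','): Record<string, string>[] {
  const rows = parseCsvRows(text, delimiter).filter(r => r.some(v => v.trim() !== ''));
  const headers = rows[0];
  if (headers === undefined) {
    return [];
  }
  return rows.slice(1).map(values => {
    const record: Record<string, string> = {};
    headers.forEach((header, index) => {
      record[header.trim()] = values[index] ?? '';
    });
    return record;
  });
}
```

Unlike the diff above, this version does not trim field values, on the assumption that whitespace inside a quoted field is intentional.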
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
weave-js/src/components/PagePanelComponents/Home/Browse3/ReusableDrawer.tsx (2 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/DatasetsPage/DatasetUploadDrawer.tsx (1 hunks)
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/DatasetsPage/DatasetsPage.tsx (6 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{js,jsx,ts,tsx}: Focus on architectural and logical issues rather than style (assuming ESLint is in place).
Flag potential memory leaks and performance bottlenecks.
Check for proper error handling and async/await usage.
Avoid strict enforcement of try/catch blocks - accept Promise chains, early returns, and other clear error handling patterns. These are acceptable as long as they maintain clarity and predictability.
Ensure proper type usage in TypeScript files.
Look for security vulnerabilities in data handling.
Don't comment on formatting if prettier is configured.
Verify proper React hooks usage and component lifecycle.
Check for proper state management patterns.
weave-js/src/components/PagePanelComponents/Home/Browse3/ReusableDrawer.tsx
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/DatasetsPage/DatasetsPage.tsx
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/DatasetsPage/DatasetUploadDrawer.tsx
⏰ Context from checks skipped due to timeout of 90000ms (167)
- GitHub Check: WeaveJS Lint and Compile
- GitHub Check: Trace nox tests (3, 13, openai)
- GitHub Check: Trace nox tests (3, 13, mistral1)
- GitHub Check: Trace nox tests (3, 13, llamaindex)
- GitHub Check: Trace nox tests (3, 13, groq)
- GitHub Check: Trace nox tests (3, 13, trace_server)
- GitHub Check: Trace nox tests (3, 13, trace)
- GitHub Check: Trace nox tests (3, 12, scorers)
- GitHub Check: Trace nox tests (3, 12, notdiamond)
- GitHub Check: Trace nox tests (3, 12, mistral1)
- GitHub Check: Trace nox tests (3, 12, llamaindex)
- GitHub Check: Trace nox tests (3, 12, dspy)
- GitHub Check: Trace nox tests (3, 12, trace)
- GitHub Check: Trace nox tests (3, 11, scorers)
- GitHub Check: Trace nox tests (3, 11, llamaindex)
- GitHub Check: Trace nox tests (3, 11, trace)
- GitHub Check: Trace nox tests (3, 10, scorers)
- GitHub Check: Trace nox tests (3, 10, llamaindex)
- GitHub Check: Trace nox tests (3, 10, trace)
- GitHub Check: Trace nox tests (3, 9, scorers)
🔇 Additional comments (12)
weave-js/src/components/PagePanelComponents/Home/Browse3/ReusableDrawer.tsx (2)
11-11: Great addition of the optional footer property!
This enhancement makes the ReusableDrawer component more flexible by allowing custom footer content to be passed in as a prop.
63-73: Conditional footer rendering looks good
The implementation properly checks if the footer prop exists before rendering the container, which avoids unnecessary DOM elements when no footer is provided.
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/DatasetsPage/DatasetsPage.tsx (6)
18-18: Good import of the new DatasetUploadDrawer component
This import correctly brings in the new dataset upload functionality.
33-33: State management for drawer visibility is appropriate
Using useState for managing the visibility of the upload drawer follows React best practices.
74-117: Good integration of the upload drawer
The implementation properly wraps the existing SimplePageLayout with a Fragment to allow adding the DatasetUploadDrawer component at the same level. The drawer receives the necessary props and is conditionally rendered based on state.
130-130: Proper addition of onUploadButton callback prop
The callback prop is correctly added to the component interface.
Also applies to: 140-140
159-164: Clean implementation of upload button
The upload button is well-implemented with an appropriate icon and click handler.
170-170: Good addition of the upload button to the UI
The upload button is correctly positioned alongside existing action buttons.
weave-js/src/components/PagePanelComponents/Home/Browse3/pages/DatasetsPage/DatasetUploadDrawer.tsx (4)
22-27: Component props are well-typed and appropriate
The props interface is clear and includes all necessary properties for the component's functionality.
53-67: Form reset logic is well implemented
The effect hook correctly resets all form state when the drawer is opened, ensuring a clean state for each new upload.
133-141: File change handler has appropriate validation
The handler correctly checks if a file is selected before processing it.
143-172: Well-implemented drag and drop functionality
The drag and drop handlers are implemented correctly with proper event prevention and state management.
Preview this PR with FeatureBee: https://beta.wandb.ai/?betaVersion=2941ccf9092a993b61a3ecb6004e3595c1f7a318
Simple dataset upload

Summary by CodeRabbit