feat: implement dataset creation step one with preview functionality #30507

Merged
CodingOnStar merged 4 commits into main from refactor/step-one
Jan 6, 2026

@CodingOnStar
Contributor

Summary

  • Added a new component for dataset creation, integrating file, Notion, and website data sources.
  • Introduced a custom hook to manage preview states for files, Notion pages, and websites.
  • Created reusable components: DataSourceTypeSelector, NextStepButton, and PreviewPanel for better modularity.
  • Implemented conditional rendering for different data source types and integrated billing plan checks.
  • Added tests for the new functionality to ensure proper behavior and state management.

Checklist

  • This change requires a documentation update, included: Dify Document
  • I understand that this PR may be closed in case there was no previous discussion or issues. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.
  • I ran dev/reformat(backend) and cd web && npx lint-staged(frontend) to appease the lint gods
@gemini-code-assist
Contributor

Summary of Changes

Hello @CodingOnStar, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the dataset creation workflow by introducing a new "Step One" interface. It allows users to intuitively select their data source from files, Notion pages, or websites, and provides immediate preview functionality. The changes involve a substantial refactoring of the frontend logic, moving towards a more modular and maintainable architecture by extracting state management into a custom hook and breaking down the main component into smaller, reusable parts. This also includes integrating billing plan considerations for certain features.

Highlights

  • New Dataset Creation Step: Implemented the first step of dataset creation, supporting file, Notion, and website data sources.
  • Preview State Management: Introduced a custom usePreviewState hook to centralize and manage preview states for various data types.
  • Component Refactoring: Created dedicated, reusable components (DataSourceTypeSelector, NextStepButton, PreviewPanel) to improve modularity and maintainability of the StepOne component.
  • Billing Integration: Incorporated conditional rendering and billing plan checks, particularly for features like batch uploads.
  • Comprehensive Testing: Added extensive unit tests for the new hook and components, ensuring robust behavior and state management.
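
A minimal, framework-free sketch of the kind of state a hook like usePreviewState might centralize. The type names, field names, and action names below are assumptions for illustration; the actual hook in this PR is not shown on this page and may be structured differently.

```typescript
// Hypothetical preview payloads; the real Dify types are not shown in this PR page.
type FilePreview = { id: string; name: string }
type NotionPreview = { pageId: string; title: string }
type WebsitePreview = { url: string; title: string }

// One slot per data source; null means nothing is being previewed for it.
type PreviewState = {
  file: FilePreview | null
  notionPage: NotionPreview | null
  website: WebsitePreview | null
}

const initialPreviewState: PreviewState = { file: null, notionPage: null, website: null }

type PreviewAction =
  | { type: 'showFile'; file: FilePreview }
  | { type: 'showNotionPage'; page: NotionPreview }
  | { type: 'showWebsite'; site: WebsitePreview }
  | { type: 'hide'; target: keyof PreviewState }

// Pure transition function: returns a new state object and never mutates the input.
function previewReducer(state: PreviewState, action: PreviewAction): PreviewState {
  switch (action.type) {
    case 'showFile':
      return { ...state, file: action.file }
    case 'showNotionPage':
      return { ...state, notionPage: action.page }
    case 'showWebsite':
      return { ...state, website: action.site }
    case 'hide': {
      const next = { ...state }
      next[action.target] = null
      return next
    }
  }
}
```

In a React component, a reducer of this shape could be passed to useReducer, with the show and hide callbacks wrapped in useCallback; keeping the transitions pure is what makes them easy to unit test in isolation.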

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request is a great refactoring of the dataset creation's first step. The introduction of the usePreviewState hook and breaking down the UI into smaller, more manageable components like DataSourceTypeSelector, NextStepButton, and PreviewPanel significantly improves the code's readability, maintainability, and testability. The new test suite is comprehensive and covers the new functionality well.

I have one suggestion to further improve the code by applying memoization in the DataSourceTypeSelector component for better performance.

CodingOnStar and others added 2 commits January 4, 2026 17:06
…Memo
  • Updated handleTypeChange to use useCallback for better performance.
  • Utilized useMemo for filtering visible data source options based on web enablement.
  • Improved code readability and efficiency in the data source type selector component.
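
As a rough illustration of the memoized filtering described in this commit, the sketch below caches the filtered option list and recomputes it only when the web-enablement flag changes. The option shape, the labels, and the memoizeLast helper are assumptions made for the example; the component itself uses React's useMemo, which is not reproduced here so that the snippet runs without React.

```typescript
// Hypothetical option shape and placeholder labels; not taken from the PR diff.
type DataSourceOption = { type: 'file' | 'notion' | 'web'; label: string }

const ALL_OPTIONS: readonly DataSourceOption[] = [
  { type: 'file', label: 'File' },
  { type: 'notion', label: 'Notion' },
  { type: 'web', label: 'Website' },
]

// Single-slot memoizer: reuses the previous result while the argument is unchanged,
// similar in spirit to useMemo with a one-element dependency list.
function memoizeLast<A, R>(fn: (arg: A) => R): (arg: A) => R {
  let cache: { arg: A; result: R } | null = null
  return (arg: A): R => {
    if (cache === null || cache.arg !== arg)
      cache = { arg, result: fn(arg) }
    return cache.result
  }
}

// The web option is only visible when web data sources are enabled.
const getVisibleOptions = memoizeLast((webEnabled: boolean) =>
  ALL_OPTIONS.filter(option => webEnabled || option.type !== 'web'),
)
```

Returning the same array reference for an unchanged flag is the point of the exercise: child components that receive the list as a prop can skip re-rendering when the reference is stable.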
@CodingOnStar CodingOnStar marked this pull request as ready for review January 4, 2026 09:07
Copilot AI review requested due to automatic review settings January 4, 2026 09:07
@dosubot dosubot bot added the size:XXL and 💪 enhancement labels Jan 4, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR refactors the dataset creation step one component to improve code modularity and maintainability by extracting reusable components and introducing a custom hook for preview state management.

Key Changes

  • Introduced a custom usePreviewState hook to centralize preview state management for files, Notion pages, and websites
  • Extracted three reusable components: DataSourceTypeSelector, NextStepButton, and PreviewPanel for better separation of concerns
  • Refactored the batch upload validation logic using a lookup table (MULTIPLE_ITEMS_CHECK) for cleaner code
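
A lookup table of this kind might look like the sketch below. Only the name MULTIPLE_ITEMS_CHECK comes from the review; the key names, the Selection shape, and the isBatchUploadBlocked helper are assumptions for illustration and may not match the code in index.tsx.

```typescript
// Hypothetical data source keys and selection shape.
type DataSourceType = 'file' | 'notion' | 'web'

type Selection = {
  files: unknown[]
  notionPages: unknown[]
  websitePages: unknown[]
}

// One predicate per data source type, replacing a chain of if/else branches.
// Each predicate reports whether more than one item is selected for that source.
const MULTIPLE_ITEMS_CHECK: Record<DataSourceType, (selection: Selection) => boolean> = {
  file: selection => selection.files.length > 1,
  notion: selection => selection.notionPages.length > 1,
  web: selection => selection.websitePages.length > 1,
}

// Assumed rule: a plan without batch upload may only proceed with a single item.
function isBatchUploadBlocked(
  type: DataSourceType,
  selection: Selection,
  batchUploadAllowed: boolean,
): boolean {
  return !batchUploadAllowed && MULTIPLE_ITEMS_CHECK[type](selection)
}
```

Typing the table as a Record over the full union means the compiler flags a missing entry if a new data source type is added later.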

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| web/app/components/datasets/create/step-one/index.tsx | Main refactor of StepOne component with extracted logic and components, fixed typo in variable name |
| web/app/components/datasets/create/step-one/index.spec.tsx | Comprehensive test suite covering the hook, components, and integration scenarios |
| web/app/components/datasets/create/step-one/hooks/use-preview-state.ts | New custom hook for managing preview states with memoized callbacks |
| web/app/components/datasets/create/step-one/hooks/index.ts | Barrel export file for hooks module |
| web/app/components/datasets/create/step-one/components/preview-panel.tsx | Extracted preview panel component for right-side preview display |
| web/app/components/datasets/create/step-one/components/next-step-button.tsx | Reusable next step button component |
| web/app/components/datasets/create/step-one/components/data-source-type-selector.tsx | Extracted data source type selector with conditional web option rendering |
| web/app/components/datasets/create/step-one/components/index.ts | Barrel export file for components module |

  • Updated logic in StepOne component to default to FILE type for data source when no type is specified from either the creation page or the dataset.
  • Modified DataSourceTypeSelector to handle undefined title gracefully.
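
The fallback described in this commit can be sketched as a small resolver. The names and the precedence between the creation page and the dataset are assumptions; the commit message only states that FILE is used when neither source provides a type.

```typescript
// Hypothetical type and constant names, for illustration only.
type SourceType = 'file' | 'notion' | 'web'

const DEFAULT_SOURCE_TYPE: SourceType = 'file'

// Assumed precedence: creation page first, then the existing dataset, then FILE.
function resolveSourceType(
  fromCreatePage?: SourceType,
  fromDataset?: SourceType,
): SourceType {
  return fromCreatePage ?? fromDataset ?? DEFAULT_SOURCE_TYPE
}
```

Using the nullish coalescing operator rather than || keeps the fallback limited to undefined and null, which matters if a falsy but valid value is ever introduced.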
Member

@WTW0313 WTW0313 left a comment

LGTM

@dosubot dosubot bot added the lgtm label Jan 6, 2026
@CodingOnStar CodingOnStar merged commit 64bfcbc into main Jan 6, 2026
14 checks passed
@CodingOnStar CodingOnStar deleted the refactor/step-one branch January 6, 2026 10:59
Labels

  • 💪 enhancement - New feature or request
  • lgtm - This PR has been approved by a maintainer
  • size:XXL - This PR changes 1000+ lines, ignoring generated files.

3 participants