feat: implement step two of dataset creation with comprehensive UI components and hooks #30681

Merged
CodingOnStar merged 3 commits into main from refactor/step-two on Jan 9, 2026
@CodingOnStar
Contributor

Summary

feat: implement step two of dataset creation with comprehensive UI components and hooks

  • Added new components for general chunking options, parent-child options, preview panel, and step two footer.
  • Introduced hooks for document creation, indexing configuration, indexing estimation, preview state, and segmentation state.
  • Created types for step two props and integrated them into the components.
  • Implemented escape and unescape utility functions for handling special characters.
  • Established a structured approach for managing dataset creation workflow, enhancing user experience and functionality.
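To make the escape/unescape bullet concrete, here is a minimal sketch of what such a pair of helpers could look like. It is not the PR's actual `escape.ts` / `unescape.ts`: the function names, the set of handled characters, and the mapping table are all assumptions chosen for illustration. The idea is to turn control characters into visible sequences (so a separator such as a newline can be shown in a text input) and back again.

```typescript
// Hypothetical sketch only; the PR's real escape.ts / unescape.ts may differ.
// Each pair maps a raw control character to its visible two-character form.
const ESCAPE_PAIRS: ReadonlyArray<readonly [string, string]> = [
  ['\n', '\\n'],
  ['\t', '\\t'],
  ['\r', '\\r'],
]

// Replace raw control characters with visible sequences such as "\n".
function escapeControlChars(input: string): string {
  let out = input
  for (const [raw, visible] of ESCAPE_PAIRS)
    out = out.split(raw).join(visible)
  return out
}

// Replace visible sequences such as "\n" with the raw control characters.
function unescapeControlChars(input: string): string {
  let out = input
  for (const [raw, visible] of ESCAPE_PAIRS)
    out = out.split(visible).join(raw)
  return out
}

// Example: a paragraph separator typed by a user is shown in escaped form.
const shown = escapeControlChars('\n\n')
console.log(shown.length) // 4
```

Using `split`/`join` instead of a regular expression avoids having to escape regex metacharacters in the table entries.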

Checklist

  • This change requires a documentation update, included: Dify Document
  • I understand that this PR may be closed in case there was no previous discussion or issues. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.
  • I ran make lint and make type-check (backend) and cd web && npx lint-staged (frontend) to appease the lint gods
@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@CodingOnStar CodingOnStar marked this pull request as ready for review January 7, 2026 09:08
Copilot AI review requested due to automatic review settings January 7, 2026 09:08
@dosubot dosubot bot added the size:XXL and 💪 enhancement labels Jan 7, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR refactors the step-two component of the dataset creation workflow by extracting logic into custom hooks and smaller components, improving maintainability and testability.

Key Changes:

  • Extracted state management into 5 custom hooks (useSegmentationState, usePreviewState, useIndexingConfig, useIndexingEstimate, useDocumentCreation)
  • Split monolithic component into 5 focused sub-components (GeneralChunkingOptions, ParentChildOptions, IndexingModeSection, PreviewPanel, StepTwoFooter)
  • Added comprehensive test coverage with 2185 lines of tests
  • Created escape/unescape utility functions for handling special characters
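The extraction of state into hooks can be illustrated with a framework-free sketch. The reducer below shows the kind of state a hook such as `useSegmentationState` might own; the field names (`separator`, `maxChunkLength`, `overlap`), the default values, and the action shapes are assumptions for illustration and are not taken from the PR. In a React component this reducer could be passed to `useReducer` inside the hook, which keeps the logic testable without rendering anything.

```typescript
// Hypothetical state shape; the real hook's fields are not shown in this PR page.
interface SegmentationState {
  separator: string
  maxChunkLength: number
  overlap: number
}

type SegmentationAction =
  | { type: 'setSeparator'; value: string }
  | { type: 'setMaxChunkLength'; value: number }
  | { type: 'setOverlap'; value: number }
  | { type: 'reset' }

// Assumed defaults, chosen only to make the example runnable.
const DEFAULT_SEGMENTATION: SegmentationState = {
  separator: '\\n\\n',
  maxChunkLength: 1024,
  overlap: 50,
}

// Pure reducer: every transition returns a new object and clamps numeric input.
function segmentationReducer(
  state: SegmentationState,
  action: SegmentationAction,
): SegmentationState {
  switch (action.type) {
    case 'setSeparator':
      return { ...state, separator: action.value }
    case 'setMaxChunkLength':
      return { ...state, maxChunkLength: Math.max(1, Math.floor(action.value)) }
    case 'setOverlap':
      return { ...state, overlap: Math.max(0, Math.floor(action.value)) }
    case 'reset':
      return DEFAULT_SEGMENTATION
    default:
      return state
  }
}

// Usage: apply a sequence of actions without any UI involved.
const next = segmentationReducer(DEFAULT_SEGMENTATION, { type: 'setMaxChunkLength', value: 512.7 })
console.log(next.maxChunkLength) // 512
```

Keeping transitions pure is one way to get the testability the review overview mentions, since each case can be asserted on directly.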

Reviewed changes

Copilot reviewed 15 out of 19 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| index.tsx | Refactored main component to use extracted hooks and sub-components |
| types.ts | Defined StepTwoProps interface |
| hooks/* | Implemented 5 custom hooks for state management and business logic |
| components/* | Created 5 focused sub-components for UI sections |
| escape.ts/unescape.ts | Utility functions for character escaping |
| index.spec.tsx | Comprehensive test suite covering hooks and components |

  • Updated test case to clarify handling of complex strings without backslashes.
  • Added new test case to document behavior for strings containing existing backslashes, highlighting the non-symmetrical nature of escape/unescape functions.
  • Improved code readability and understanding of string manipulation scenarios.
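One way such a non-symmetrical round trip can arise is sketched below. This assumes a naive mapping that handles only the newline character and does not escape the backslash itself; `naiveEscape` and `naiveUnescape` are hypothetical names, and the PR's real functions may behave differently. A string with a real newline survives the round trip, while a string that already contains a literal backslash followed by `n` does not.

```typescript
// Hypothetical helpers, not the PR's implementation.
// Only the newline is handled, and backslashes are not escaped themselves.
function naiveEscape(input: string): string {
  return input.split('\n').join('\\n')
}

function naiveUnescape(input: string): string {
  return input.split('\\n').join('\n')
}

// Case 1: no backslashes in the input, so the round trip is lossless.
const plain = 'a\nb'
console.log(naiveUnescape(naiveEscape(plain)) === plain) // true

// Case 2: the input already holds a literal backslash followed by "n".
// naiveEscape leaves it untouched, and naiveUnescape then turns it into
// a real newline, so the original string is not recovered.
const withBackslash = 'a\\nb'
console.log(naiveUnescape(naiveEscape(withBackslash)) === withBackslash) // false
```

A symmetric pair would need to escape the backslash character as well, at the cost of a more verbose escaped form.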
Member

@WTW0313 WTW0313 left a comment

LGTM

@dosubot dosubot bot added the lgtm label Jan 8, 2026
@CodingOnStar CodingOnStar merged commit 9848823 into main Jan 9, 2026
14 checks passed
@CodingOnStar CodingOnStar deleted the refactor/step-two branch January 9, 2026 02:21
Labels

  • 💪 enhancement: New feature or request
  • lgtm: This PR has been approved by a maintainer
  • size:XXL: This PR changes 1000+ lines, ignoring generated files.

3 participants