Add video example#28

Open
dreadatour wants to merge 7 commits into main from video-example

@dreadatour dreadatour commented Jan 20, 2025

Video example for datachain-ai/datachain#890

Note: This PR is also required: datachain-ai/datachain#900

@dreadatour dreadatour marked this pull request as ready for review January 27, 2025 18:24

@shcheklein shcheklein left a comment

Why do we work with local storage there? It should be cloud storage.

shcheklein commented Jan 28, 2025

map(lambda file: file.get_info(), output={'meta': Video})

can it be w/o output - it looks heavy for such a basic task. Can it be a single model - File + Meta (not sure how to organize it better tbh)

annotations_dc = DataChain.from_csv( )

can we avoid creating a data model explicitly? from_csv should be able to figure out the schema and create Pydantic model for us

file_stem = file.get_file_stem()
file_ext = file.get_file_ext()
file_name = f"{file_stem}_{timestamp}.jpg"
file_path = f"data/ava/frames/{file_name}"
frame = file.save_frame(round(timestamp * meta.fps), file_path)

This should be part of some helper ... ideally, saving a frame shouldn't require a UDF at all.
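
The frame index and output path in the quoted snippet are plain arithmetic and string formatting, so they could be factored out. A minimal sketch, assuming timestamps are in seconds; `frame_index` and `frame_path` are hypothetical helper names, not DataChain APIs, and the actual frame-saving call is deliberately left out:

```python
from pathlib import PurePosixPath


def frame_index(timestamp: float, fps: float) -> int:
    """Map a timestamp (assumed to be in seconds) to the nearest frame number."""
    return round(timestamp * fps)


def frame_path(file_stem: str, timestamp: float,
               output_dir: str = "data/ava/frames") -> str:
    """Build the output path for a frame, mirroring the naming in the snippet."""
    # Hypothetical helper: joins the directory with "<stem>_<timestamp>.jpg".
    return str(PurePosixPath(output_dir) / f"{file_stem}_{timestamp}.jpg")


if __name__ == "__main__":
    print(frame_index(2.0, 29.97))
    print(frame_path("clip", 902))
```

Keeping these two pieces free of I/O makes them easy to unit test independently of whichever API ends up writing the frame.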

@dreadatour

Updated the example to work with cloud storage (including upload).
Simplified it a lot; it now only covers:

  • getting video meta
  • saving video fragments
  • saving video frames
@dreadatour

> can it be w/o output - it looks heavy for such a basic task. Can it be a single model - File + Meta (not sure how to organize it better tbh)

Done, fixed. It wasn't working before because of the type annotations in the function definition.

Before:

    if TYPE_CHECKING:
        from datachain.lib.file import Video, VideoFile

    def video_info(file: "VideoFile") -> "Video":

After:

    from datachain.lib.file import Video, VideoFile

    def video_info(file: VideoFile) -> Video:
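
One plausible reading of this fix: names imported only under `TYPE_CHECKING` do not exist at runtime, so string annotations that refer to them cannot be resolved when a library inspects the signature. The sketch below reproduces that effect with the standard library only; `Decimal` and `Fraction` stand in for `Video` and `VideoFile`, and whether DataChain resolves annotations through `get_type_hints` is an assumption, not something confirmed in this thread.

```python
from fractions import Fraction
from typing import TYPE_CHECKING, get_type_hints

if TYPE_CHECKING:
    # Seen by static type checkers only; this import never runs.
    from decimal import Decimal


def lazy_annotated(value: "Decimal") -> "Decimal":
    """Annotations are strings naming a type that is absent at runtime."""
    return value


def eager_annotated(value: Fraction) -> Fraction:
    """Annotations are real classes imported at runtime."""
    return value


def resolve(func):
    """Return the resolved type hints, or a message if resolution fails."""
    try:
        return get_type_hints(func)
    except NameError as exc:
        return f"unresolved: {exc}"


if __name__ == "__main__":
    print(resolve(lazy_annotated))   # falls into the NameError branch
    print(resolve(eager_annotated))  # dict of real classes
```

With real runtime imports, the return type is available for inference, which would explain why an explicit `output=` was no longer needed.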
@dreadatour

> can we avoid creating a data model explicitly? from_csv should be able to figure out the schema and create Pydantic model for us

We cannot in this case, because this CSV file does not have headers :(
I cannot update the file manually because I am downloading it from a URL.
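
To illustrate why a header-less file needs the column names supplied from outside, here is a minimal sketch using Python's standard `csv` module rather than DataChain. The column names and sample rows are hypothetical placeholders, not the actual annotation schema:

```python
import csv
import io

# Hypothetical column names; the real file's schema is not shown in this thread.
COLUMNS = ["video_id", "timestamp", "label"]

# Hypothetical sample standing in for the downloaded, header-less CSV.
RAW = "abc,902,12\nabc,903,80\n"


def read_headerless(text: str, columns: list[str]) -> list[dict[str, str]]:
    """Parse CSV text with no header row, using the given column names."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=columns)
    return list(reader)


if __name__ == "__main__":
    for row in read_headerless(RAW, COLUMNS):
        print(row)
```

Without `fieldnames`, `DictReader` would treat the first data row as the header, which is the same ambiguity that prevents automatic schema inference here.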

3 participants