@phofl (Member) commented Mar 17, 2023

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

This cuts runtime by roughly 40% when the requested dtype differs from the dtype of the input values:

# main
%timeit DataFrame(arr, dtype="int32", copy=True)
6.01 ms ± 37.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

# pr
%timeit DataFrame(arr, dtype="int32", copy=True)
3.54 ms ± 25.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
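
For anyone who wants to reproduce the comparison outside IPython, here is a minimal sketch using the standard-library `timeit` module. The post does not show how `arr` was built, so the array shape and the `int64` source dtype are assumptions, and `build` is a hypothetical helper name. Absolute timings will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Assumption: the original post does not define `arr`. A 2-D int64 array is
# used here so that the requested dtype ("int32") differs from the input dtype.
arr = np.arange(1_000_000, dtype="int64").reshape(1000, 1000)


def build():
    # Same call as in the timings above: cast to int32 and request a copy.
    return pd.DataFrame(arr, dtype="int32", copy=True)


df = build()

# The requested dtype is applied to every column.
assert set(df.dtypes) == {np.dtype("int32")}
# The result owns its data and does not share memory with the input array.
assert not np.shares_memory(df.to_numpy(), arr)

# Average over a fixed number of calls and report milliseconds per call.
n_calls = 20
seconds = timeit.timeit(build, number=n_calls)
print(f"{seconds / n_calls * 1e3:.2f} ms per call")
```

Running the script once on `main` and once on the PR branch should give a comparable before/after pair.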
@phofl added this to the 2.0 milestone Mar 18, 2023
@phofl added the Performance (Memory or execution speed performance) and DataFrame (DataFrame data structure) labels Mar 18, 2023
@mroeschke merged commit 7b93b06 into pandas-dev:main Mar 20, 2023
@mroeschke (Member) commented

Thanks @phofl

meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request Mar 20, 2023
mroeschke pushed a commit that referenced this pull request Mar 20, 2023
…py=True and different dtype) (#52088)
Backport PR #52054: PERF: Improve performance with copy=True and different dtype
Co-authored-by: Patrick Hoefler <61934744+phofl@users.noreply.github.com>